This week's workshop is canceled.
Today's speaker: Reagan Mozer (Bentley University), "New approaches for scaling-up human coding efforts in randomized trials with text-based outcomes"
Abstract: Text data have a long history in social science and education research. However, these data are notoriously high-dimensional and characterized by many nuances of language that lack plausible statistical models. As a result, analysis of text data typically involves intensive human coding tasks where particular constructs or features of the text are first defined, and then a collection of documents is inspected and coded for the presence or absence of these constructs. While this process may be feasible in studies with smaller sample sizes, the time and resources required to train and employ multiple human coders frequently pose a challenge for large-scale efforts. In this talk, I will consider how to reliably and efficiently extract meaningful constructs from text documents for the purposes of drawing causal inferences, with an emphasis on the context of experimental studies where some outcomes of interest are features of text generated by the trial’s participants. In particular, I will describe an approach that combines machine learning and survey sampling methods to streamline the process of hand-coding in a way that is automatically verified and validated. To illustrate the proposed methods, I will present results from a pilot analysis of a randomized trial that used student-generated essays to evaluate the impact of an educational intervention on students’ writing abilities.
The Applied Statistics Workshop (Gov 3009) meets all academic year, Wednesdays, 12pm-1:30pm, in CGIS K354. This workshop is a forum for advanced graduate students, faculty, and visiting scholars to present and discuss methodological or empirical work in progress in an interdisciplinary setting. The workshop features a tour of Harvard's statistical innovations and applications with weekly stops in different fields and disciplines and includes occasional presentations by invited speakers. Free lunch is provided.
More information is available at the Gov 3009 website: https://projects.iq.harvard.edu/applied.stats.workshop-gov3009