Edward Kennedy (Workshop in Applied Statistics)

Date: Wednesday, November 2, 2022, 12:00pm to 1:30pm

Location: CGIS Knafel Building, room K354 or Online via Zoom

Today's Speaker

Edward Kennedy (Carnegie Mellon University), "Doubly robust capture-recapture methods for estimating population size"

Abstract

Estimation of population size using incomplete lists (also called the capture-recapture problem) has a long history across many biological and social sciences. For example, human rights and other groups often construct partial and overlapping lists of victims of armed conflicts, with the hope of using this information to estimate the total number of victims. Earlier statistical methods for this setup either use potentially restrictive parametric assumptions, or else rely on typically suboptimal plug-in-type nonparametric estimators; however, both approaches can lead to substantial bias, the former via model misspecification and the latter via smoothing. Under an identifying assumption that two lists are conditionally independent given measured covariate information, we make several contributions. First, we derive the nonparametric efficiency bound for estimating the capture probability, which indicates the best possible performance of any estimator, and sheds light on the statistical limits of capture-recapture methods. Then we present a new estimator, and study its finite-sample properties, showing that it has a double robustness property new to capture-recapture, and that it is near-optimal in a non-asymptotic sense, under relatively mild nonparametric conditions. Next, we give a method for constructing confidence intervals for total population size from generic capture probability estimators, and prove non-asymptotic near-validity. Finally, we study our methods in simulations, and apply them to estimate the number of killings and disappearances attributable to different groups in Peru during its internal armed conflict between 1980 and 2000.
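
For readers new to the capture-recapture setup, the following is a minimal sketch of the classical two-list (Lincoln-Petersen) estimator and a simple stratified variant. It is not the doubly robust estimator presented in the talk; it only illustrates how two overlapping lists can yield a population-size estimate when the lists are assumed independent, either overall or within levels of a discrete covariate. The function names and the toy data are hypothetical and chosen purely for illustration.

```python
def lincoln_petersen(list1, list2):
    """Classical two-list estimate of population size.

    Assumes the two lists are independent samples from the same
    closed population. Returns (estimated population size,
    estimated capture probability).
    """
    a, b = set(list1), set(list2)
    n1, n2 = len(a), len(b)      # sizes of each list
    m = len(a & b)               # individuals appearing on both lists
    if m == 0:
        raise ValueError("no overlap between lists; estimate is undefined")
    n_observed = len(a | b)      # individuals appearing on at least one list
    n_hat = n1 * n2 / m
    capture_prob = n_observed / n_hat
    return n_hat, capture_prob


def stratified_estimate(strata):
    """Sum of per-stratum Lincoln-Petersen estimates.

    `strata` maps a covariate level to a (list1, list2) pair. This
    assumes independence of the lists only within each stratum, a crude
    discrete analogue of conditional independence given covariates.
    """
    return sum(lincoln_petersen(l1, l2)[0] for l1, l2 in strata.values())


if __name__ == "__main__":
    # Toy data (hypothetical): integer IDs stand in for individuals.
    n_hat, psi_hat = lincoln_petersen(range(0, 60), range(40, 100))
    print("estimated population size:", n_hat)
    print("estimated capture probability:", round(psi_hat, 3))

    strata = {
        "urban": (range(0, 60), range(40, 100)),
        "rural": (range(0, 30), range(20, 40)),
    }
    print("stratified estimate:", stratified_estimate(strata))
```

Stratifying in this way is a plug-in approach, and it becomes unreliable when covariates are continuous or high-dimensional, which is the kind of setting the abstract's nonparametric methods are aimed at.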

The Applied Statistics Workshop (Gov 3009) meets all academic year, Wednesdays, 12pm-1:30pm, in CGIS K354. This workshop is a forum for advanced graduate students, faculty, and visiting scholars to present and discuss methodological or empirical work in progress in an interdisciplinary setting. The workshop features a tour of Harvard's statistical innovations and applications with weekly stops in different fields and disciplines and includes occasional presentations by invited speakers.

More information is available at the Gov 3009 website: https://projects.iq.harvard.edu/applied.stats.workshop-gov3009

All interested Harvard affiliates are invited to attend.