April 28, 2009
Please join us for our final meeting tomorrow, when Thomas Yee, Department of Statistics, University of Auckland, will present "Vector generalized linear and additive models". Thomas provided the following abstract for his talk:
The class of vector generalized linear and additive models (VGLMs/VGAMs) is very large and contains many statistical models relevant to quantitative social science, e.g., univariate and multivariate distributions, categorical data analysis, time series, survival analysis, extreme value analysis, mixture models, correlated binary data, and nonlinear regression. I'll first give an overview of the framework and tie it in with practice using my VGAM package for R. Then we will focus on two sub-topics: reduced-rank VGLMs and quantile/expectile regression. The former handles the reduced-rank multinomial logit model (aka stereotype model) and Goodman's row-column association model; applications of the latter are becoming popular in many fields. Time allowing, I'll describe several sub-projects I'm currently working on since arriving at IQSS.
The Applied Statistics Workshop meets each Wednesday in room K-354, CGIS-Knafel (1737 Cambridge St). We start at 12 noon with a light lunch, with presentations beginning around 12:15 pm, and we usually wrap up around 1:30 pm.
Posted by Justin Grimmer at 3:12 PM
April 12, 2009
Please join us this Wednesday for the applied statistics workshop when Alberto Abadie, Professor of Public Policy, will present "A General Theory of Matching Estimation", joint work with Guido Imbens. Alberto provided the following abstract for his talk:
Matching methods provide simple and intuitive tools for adjusting the distribution of covariates among samples from different populations. Probably because of their transparency and intuitive appeal, matching methods are widely used in evaluation research to estimate treatment effects when all treatment confounders are observed (Rubin, 1973, 1977; Rosenbaum, 2002). In spite of their popularity, the problem of establishing the large sample distribution of matching estimators remains largely unsolved, with the exception of some special cases (see Abadie and Imbens, 2006). The reason is that matching estimators are non-smooth functionals of the data, which makes their large sample theory particularly challenging. This talk will describe a new general method to establish the large sample distribution of matching estimators. As an example of the applicability of the method, we will describe how to derive the distribution of matching estimators when matching is carried out without replacement, a result previously unavailable in the literature. We will also discuss how to adjust the standard errors for propensity score matching estimators to take into account first step estimation of the propensity score, a result also previously unavailable.
The Applied Statistics Workshop meets each Wednesday at 12 noon in K-354 CGIS-Knafel (1737 Cambridge St). The workshop begins with a light lunch, and presentations usually start around 12:15 pm and last until about 1:30 pm.
Posted by Justin Grimmer at 7:41 PM
April 7, 2009
We are pleased to announce a special presentation that should be of interest. David Firth, Professor of Statistics at the University of Warwick, will present on Quasi Variances this *Thursday* from 12-2 pm in room K-354 in CGIS-Knafel (1737 Cambridge St, the usual meeting place for the applied statistics workshop). Professor Firth provided the following abstract for his presentation:
The notion of quasi variances, as a device for both simplifying and enhancing the presentation of additive categorical-predictor effects in statistical models, was developed in Firth and de Menezes (Biometrika, 2004, 65-80). The approach generalizes the earlier idea of "floating absolute risk" (Easton et al., Statistics in Medicine, 1991), which has become rather controversial in epidemiology. In this talk I will outline and exemplify the method, and discuss its extension to some other contexts such as parameters that may be arbitrarily scaled and/or rotated.
Everyone (especially graduate students) is welcome and encouraged to attend.
A bit of background on Professor Firth. He is Professor of Statistics at the University of Warwick. He specializes in statistical theory and methods, and has a particular interest in generalized linear models---especially as applied to the social sciences. He has published extensively in the discipline's major journals of record, such as JRSS and Biometrika, and has written several packages for the R language and environment. He has made several significant contributions to the field, and is well known as the inventor of bias-reduced logistic regression (also known as 'Firthit').
He is at IQSS as a Distinguished Visiting Fellow (April 7--17), and will be spending part of his time here working with Arthur Spirling on models of momentum for contest data.
We hope everyone will be able to attend.
Posted by Justin Grimmer at 5:52 PM
The workshop will meet tomorrow, when Sandra Sequeira, a PhD candidate in public policy, will present her work on the efficiency cost of corruption, work that is joint with Simeon Djankov. Sandra provided the following abstract for her talk:
This paper estimates the efficiency cost of corruption. We generate an original dataset on bribe payments at ports in Southern Africa that allows us to take an unusually close look into the black box of corruption, observing how bureaucrats set bribes and measuring their economic costs on firms and on the broader economy. We find that bribes are product-specific, frequent and substantial. Bribes can represent up to a 14% increase in total shipping costs for a standard 20ft container and a 600% increase in the monthly salary of a port official. Bribes are paid primarily to evade tariffs, protect cargo on the docks and avoid costly storage. We further identify three systemic effects associated with this type of corruption: a "diversion effect" where firms go the long way around to avoid the most corrupt port; a "revenue effect" as bribes reduce overall tariff revenue; and a "congestion effect" as the re-routing of firms increases congestion and transport costs by causing imbalanced cargo flows in the transport network. The evidence supports the theory that bribe payments at ports represent a significant distortionary tax on trade, as opposed to just a transfer between shippers and port officials that greases slow-moving clearing queues.
The Applied Statistics Workshop meets each Wednesday at 12 noon in K-354 CGIS-Knafel (1737 Cambridge St). The workshop begins with a light lunch, and presentations usually start around 12:15 pm and last until about 1:30 pm.
Posted by Justin Grimmer at 5:46 PM
March 30, 2009
Please join us this Wednesday when Arthur Spirling, Department of Government, will present "Bargaining Power in Practice: US Treaty-making with American Indians, 1784--1911". Arthur provided the following overview for his talk:
I will discuss a new data set of treaties signed 1784--1911 between the United States government and American Indian tribes, and comment on some early findings using kernel methods to analyze these texts. I particularly welcome feedback and suggestions from the ASW on the appropriateness of the techniques given the problem at hand.
Arthur also provided the following abstract for a paper that is the basis for his talk:
Native Americans are unique among domestic actors in that their relations with the United States government involve treaty-making, with almost 600 such documents signed between the Revolutionary War and the turn of the twentieth century. We obtain and digitize all of these treaties for textual analysis. In particular, we employ new 'kernel methods' to study the evolution of their nature over time and show that the Indian Removal Act of 1830 represents a systematic shift in language. We relate our findings to a bargaining model with the parties---government and tribes---varying in power according to contemporary political and economic events. With a mind to earlier historical and legal literatures, we also show that the 'broken' treaties do not form their own cluster in the data, and that the post-1871 'agreements' represent a straightforward continuation of earlier treaty policy in both style and substance.
The Applied Statistics Workshop meets each Wednesday at 12 noon in K-354 CGIS-Knafel (1737 Cambridge St). The workshop begins with a light lunch, and presentations usually start around 12:15 pm and last until about 1:30 pm.
Posted by Justin Grimmer at 11:14 AM
March 16, 2009
Please join us this Wednesday when Gabriel Lenz, MIT Department of Political Science, will present "Getting Rich(er) in Office? Corruption and Wealth Accumulation in Congress", work that is joint with Kevin Lim. Gabe provided the following abstract:
How corrupt is Congress? We provide an indirect test by comparing wealth accumulation from 1995 to 2005 among members of the U.S. House of Representatives and members of the public. Data on representatives are from Personal Financial Disclosure forms and data on the public are from the Panel Study of Income Dynamics (PSID). To test whether representatives accumulate wealth at a faster rate than expected, we construct counterfactuals based on the PSID with two approaches. We first use statistical models, conditioning on asset distribution over stocks, bonds, businesses, and land, as well as demographic variables. These models find representatives accumulating wealth about 20 percent faster than expected. Second, we employ matching. Unlike the modeling approach, matching finds an almost identical rate of wealth accumulation among both groups. Further analysis reveals that matching reduces bias from several incorrect functional form assumptions in the statistical models. We thus conclude that representatives report accumulating wealth at a rate consistent with similar non-representatives, suggesting no aggregate corruption. Besides examining overall wealth accumulation, we also test for effects of committee assignments, safe seats, career trajectories, and campaign contributions.
The Applied Statistics Workshop meets each Wednesday at 12 noon in K-354 CGIS-Knafel (1737 Cambridge St). The workshop begins with a light lunch, and presentations usually start around 12:15 pm and last until about 1:30 pm.
Posted by Justin Grimmer at 3:34 PM
March 9, 2009
Please join us this Wednesday when Dan Hopkins, Post-Doctoral Fellow at Harvard University (and soon to be Assistant Professor at Georgetown), will present "Making Credible Inferences about the Effects of Local Contexts". Dan provided the following abstract for his presentation:
In the last decade, there has been an explosion of social science research exploring the influence of local contexts on attitudes and behavior. Yet such studies face methodological hurdles, including the endogeneity of individuals' moving decisions, significant measurement error, and ambiguity about their causal interpretation. This presentation reconceptualizes the effects of local contexts as an interaction between the local context and salient national issues. It then uses panel or time-series cross-sectional data to explore the impact of exogenous changes in the salience of national issues on local contextual effects. Across three empirical examples on attitudes toward immigration drawn from two countries, we observe that local contexts only correlate with attitudes when immigration is a nationally salient issue. The effects of local contexts vary in predictable ways with the topics of national politics. All politics might not be local after all.
Dan provided this paper as background for his talk.
The Applied Statistics Workshop meets each Wednesday in room K-354, 1737 Cambridge St (CGIS-Knafel). A light lunch is served at 12 noon, with presentations usually beginning at 12:15 pm, and the workshop usually concludes by 1:30 pm. All are welcome!
Posted by Justin Grimmer at 8:23 PM
March 2, 2009
Please join us this Wednesday, March 4th, when Jamie Robins will present "A Bold Vision of Artificial Intelligence and Philosophy: Finding Causal Effects Without Background Knowledge or Statistical Independences", a project that is joint with Thomas Richardson, Ilya Shpitser, and Steffen Lauritzen. Jamie provided the following abstract:
I describe a statistical methodology based on philosophy, causal directed acyclic graphs, and a pinch of magic and miracle that holds the promise of making a silk purse of causal knowledge out of the sow's ear of an observational data set with no obvious structure. In 10 years or so, for better or worse, this methodology may become part of mainstream genomics.
The workshop will meet at 12 noon in room K-354, CGIS-Knafel (1737 Cambridge St) with a light lunch served. The presentation will begin at 12:15 pm and usually ends around 1:30 pm. All are welcome.
Posted by Justin Grimmer at 6:56 PM
February 23, 2009
Please join us this Wednesday, when Thomas Richardson--Department of Statistics, University of Washington--will present "Analysis of the Binary Instrumental Variable Model", work that is joint with Jamie Robins, Harvard School of Public Health. Thomas provided the following abstract:
In this talk I consider an instrumental variable potential outcomes model in which the instrument (Z), treatment (X) and response (Y) are all binary. It is well known that this model is not identified by the observed joint distribution p(x,y,z). Consequently many statistical analyses impose additional untestable assumptions or change the causal estimand of interest. Here we take a different approach, directly characterizing and graphically displaying the set of distributions over potential outcomes that correspond to a given population distribution p(x,y,z). This provides insights into the variation dependence between the partially identified average causal effects for various compliance groups. The analysis also leads directly to re-parametrization that may be used for Bayesian inference and the development of models that incorporate baseline covariates.
The Applied Statistics Workshop meets each Wednesday at 12 noon in K-354 CGIS-Knafel (1737 Cambridge St). The workshop begins with a light lunch, and presentations usually start around 12:15 pm and last until about 1:30 pm.
Posted by Justin Grimmer at 5:24 PM
February 16, 2009
Please join us this Wednesday when Filiz Garip, Harvard Department of Sociology, will present her joint work with Paul DiMaggio, "How Do Network Externalities Lead to Intergroup Inequality?". Filiz provided the following abstract for her talk:
In this paper, we identify a mechanism, which we contend chronically reproduces and, under some conditions, may generate or even efface intergroup inequality. That mechanism is (a) the diffusion of goods, services, and practices that (b) are characterized by strong network externalities under conditions of (c) social homophily. When the value of a good or practice to an agent is a function of the number of persons in that agent's network who also possess the good or engage in the practice, and when networks are homophilic with respect to certain social characteristics, this mechanism will exacerbate initial individual-level differences in access to the good or practice and, under some conditions, induce persistent intergroup inequality. We illustrate this claim in two empirical contexts. For the first, the diffusion of access to and use of the Internet, we start with observed data on the relationship between cost and adoption and between adoption levels and price, and produce a computational model that permits us to predict variation in intergroup inequality over time as a function of variation in the strength of network externalities and the extent of social homophily. For the second, the practice of rural-to-urban migration by young people in rural Thailand, we use village-level data on family resources and migration patterns to explore the relationship between information sharing, homophily, and intergroup differences in migration.
The Applied Statistics Workshop meets each Wednesday at 12 noon in K-354 CGIS-Knafel (1737 Cambridge St). The workshop begins with a light lunch, and presentations usually start around 12:15 pm and last until about 1:30 pm.
Posted by Justin Grimmer at 8:55 PM
February 9, 2009
Please join us this Wednesday, February 11th when Bruce Western, Professor of Sociology, will present "Analyzing Inequality with Variance Function Regressions". Bruce provided the following abstract:
Regression-based studies of inequality model only between-group differences, yet often these differences are far exceeded by residual inequality. Residual inequality is usually attributed to measurement error or the influence of unobserved characteristics. We present a regression that includes covariates for both the mean and variance of a dependent variable. In this model, the residual variance is treated as a target for analysis. We apply this model to study the effects of union membership decline on the growth in men's earnings inequality from 1970 to 2006. The union membership data offer additional challenge for data analysis, because survey respondents may misreport their union membership status.
The Applied Statistics Workshop meets each Wednesday at 12 noon in K-354 CGIS-Knafel (1737 Cambridge St). The workshop begins with a light lunch, and presentations usually start around 12:15 pm and last until about 1:30 pm.
Posted by Justin Grimmer at 4:47 PM
February 2, 2009
The first meeting of the applied statistics workshop will be this Wednesday, February 4th, when Kari Lock, Graduate Student in the Department of Statistics, will present "Bayesian Combination of State Polls and Election Forecasts". Kari provided the following abstract:
A wide range of potentially useful data are available for election forecasting: the results of previous elections, a multitude of pre-election polls, and predictors such as measures of national and statewide economic performance. How accurate are different forecasts? We estimate predictive uncertainty via analysis of data collected from past elections (actual outcomes, pre-election polls, and model estimates). With these estimated uncertainties, we use Bayesian inference to integrate the various sources of data to form posterior distributions for the state and national two-party Democratic vote shares for the 2008 election. Our key idea is to separately forecast the national popular vote shares and the relative positions of the states.
The Applied Statistics Workshop meets each Wednesday at 12 noon in K-354 CGIS-Knafel (1737 Cambridge St). The workshop begins with a light lunch, and presentations usually start around 12:15 pm and last until about 1:30 pm.
Hope to see you there--
Posted by Justin Grimmer at 2:14 PM
December 8, 2008
Please join us this Wednesday, December 10th, when Amanda Cox of the New York Times will present "Open Problems in NYT Graphics". Amanda provided the following abstract:
The New York Times graphics department is a group of about 30 journalists who make the charts, maps and diagrams for the print and online versions of the paper. This talk is a (completely unofficial) guide to some of the problems the department faces on an ongoing basis, including how to represent uncertainty in an accessible way, and how to move beyond something I call "Here is some data:" toward something closer to inference.
The applied statistics workshop meets at 12 noon in room K-354, CGIS-Knafel (1737 Cambridge St) with a light lunch. Presentations start at 12:15 pm and usually end around 1:30 pm. As always, all are welcome, and please email me with any questions.
Posted by Justin Grimmer at 11:41 AM
December 1, 2008
Please join us this Wednesday, December 3rd, when Michael Peress, Department of Political Science, University of Rochester, will present "Estimating Proposal and Status Quo Locations Using Voting and Cosponsorship Data". Michael provided the following abstract:
Theories of lawmaking generate predictions for the policy outcome as a function of the status quo. These theories are difficult to test because existing ideal point estimation techniques do not recover the locations of proposals or status quos. Instead, such techniques only recover cutpoints. This limitation has meant that existing tests of theories of lawmaking have been indirect in nature. I propose a method of directly measuring ideal points, proposal locations, and status quo locations on the same multidimensional scale, by employing a combination of voting data, bill and amendment cosponsorship data, and the congressional record. My approach works as follows. First, we can identify the locations of legislative proposals (bills and amendments) on the same scale as voter ideal points by jointly scaling voting and cosponsorship data. Next, we can identify the location of the final form of the bill using the location of last successful amendment (which we already know). If the bill was not amended, then the final form is simply the original bill location. Finally, we can identify the status quo point by employing the cutpoint we get from scaling the final passage vote. To implement this procedure, I automatically coded data on the congressional record available from www.thomas.gov. I apply this approach to recent sessions of the U.S. Senate, and use it to test the implications of competing theories of lawmaking.
A copy of the paper is available here.
The applied statistics workshop meets at 12 noon in room K-354, CGIS-Knafel (1737 Cambridge St) with a light lunch. Presentations start at 12:15 pm and usually end around 1:30 pm. As always, all are welcome, and please email me with any questions.
Posted by Justin Grimmer at 10:18 AM
November 17, 2008
Please join us Wednesday, November 19th, when Adam Glynn--Government Department--will present his research, "Assessing the Empirical Evidence for Mechanism Specific Causal Effects". Adam provided the following abstract:
Social scientists often cite the importance of mechanism specific causal knowledge, both for its intrinsic scientific value and as a necessity for informed policy. In this talk, I use counterfactual causal models to re-assess the empirical evidence for two oft cited examples from American and comparative politics: the voting habit effect that is not due to campaign attention and the effect of oil production on the likelihood of civil war onset that is due to the weakening of state capacity. Utilizing decompositions of direct and indirect effects, I discuss a number of identification strategies, and demonstrate through sensitivity and bounding analysis that the evidence for the aforementioned examples is weaker than is typically understood.
The applied statistics workshop meets at 12 noon in room K-354, CGIS-Knafel (1737 Cambridge St) with a light lunch. Presentations start at 12:15 pm and usually end around 1:30 pm. As always, all are welcome, and please email me with any questions.
Update: Adam provided this paper as background for his presentation.
Posted by Justin Grimmer at 7:13 PM
November 7, 2008
Please join us this Wednesday, November 12th when Kosuke Imai will present "Identification and Inference in Causal Mediation Analysis". Kosuke is currently a professor in the Department of Politics at Princeton University and an alum of the Harvard Government Department. He has provided the following abstract for his talk:
Causal mediation analysis is routinely conducted by applied researchers in a variety of disciplines including communications, epidemiology, political science, psychology, and sociology. The goal of such an analysis is to investigate alternative causal mechanisms by examining the roles of intermediate variables that lie in the causal path between the treatment and outcome variables. In this paper, we first prove that under the assumption of sequential ignorability, the average causal mediation effects are nonparametrically identified. This identification result contrasts with previous studies which have concluded that the nonparametric identification of average causal mediation effects requires an additional assumption. Second, we show that under the same sequential ignorability assumption the average causal mediation effects can be identified in the linear structural equation model commonly used by applied researchers. Some practical implications of our identification result are also discussed. Third, we consider a simple nonparametric estimator of the average causal mediation effects and derive its asymptotic variance. Fourth, we offer sensitivity analyses in both parametric and nonparametric settings so that researchers can examine the robustness of their empirical findings to the violation of the sequential ignorability assumption. Finally, we analyze a randomized experiment from political psychology to illustrate the proposed methods.
A paper for the talk is available here.
The applied statistics workshop meets at 12 noon in room K-354, CGIS-Knafel (1737 Cambridge St) with a light lunch. Presentations start at 12:15 pm and usually end around 1:30 pm. As always, all are welcome, and please email me (jgrimmer_at_fas.harvard.edu) with any questions.
Posted by Justin Grimmer at 11:09 AM
October 26, 2008
Please join us this Wednesday, October 29th, when Michael Kellerman, PhD Candidate in the Department of Government, will present his work on "Electoral Punishment as Signaling in Subnational Elections". Mike provided the following abstract:
It is a well-established empirical regularity that parties in federal office suffer setbacks in state-level elections. Many authors attribute this to a desire on the part of voters to balance the policy preferences of the federal incumbent. In this paper, I consider an alternative explanation with a long tradition in the literature: voters punish the party of the federal incumbent in state elections in order to send a signal to the federal government. I construct a simple signaling model to formalize this intuition, which predicts that under most circumstances signaling can occur at only one level of government. I estimate a statistical model allowing for electoral punishment using data from German elections and find support for punishment at the state level, rather than the punishment at both levels implied by balancing theories.
Mike also provided a copy of his paper, available here.
The applied statistics workshop meets each Wednesday in room K-354 CGIS-Knafel, 1737 Cambridge St, Cambridge MA. The workshop convenes at 12 noon with a light lunch; presentations usually begin around 12:15 pm and conclude by 1:30 pm. As always, everyone is welcome!
Posted by Justin Grimmer at 10:03 PM
October 20, 2008
Please note, there has been a scheduling change. Kosuke Imai, Department of Politics, Princeton University, will be presenting on November 12th.
In Kosuke's place, this Wednesday, October 22nd, Don Rubin, Professor of Statistics, Harvard University, will present his paper, "For Objective Causal Inference, Design Trumps Analysis". Don provided the following abstract:
For obtaining causal inferences that are objective, and therefore have the best chance of revealing scientific truths, carefully designed and executed randomized experiments are generally considered to be the gold standard. Observational studies, in contrast, are generally fraught with problems that compromise any claim for objectivity of the resulting causal inferences. The thesis here is that observational studies have to be carefully designed to approximate randomized experiments, in particular, without examining any final outcome data. Often a candidate data set will have to be rejected as inadequate because of lack of data on key covariates, or because of lack of overlap in the distributions of key covariates between treatment and control groups, often revealed by careful propensity score analyses. Sometimes the template for the approximating randomized experiment will have to be altered, and the use of principal stratification can be helpful in doing this. These issues are discussed and illustrated using the framework of potential outcomes to define causal effects, which greatly clarifies critical issues.
Don has provided the full paper, available here.
The applied statistics workshop meets at 12 noon in Room K-354, CGIS Knafel (1737 Cambridge St), with a light lunch. Our presentations begin at 12:15 pm and usually conclude around 1:30 pm. As always, everyone is welcome!
Posted by Justin Grimmer at 3:07 PM
October 14, 2008
The good folks at CNN are hot on the trail of a swing to McCain in Ohio, a crucial battleground state. CNN's headline claims, "Ohio Poll of Polls: McCain Gains Some Ground in Tight Race". From the story, we learn that:
"CNN's new Ohio poll of polls shows Barack Obama leading McCain by three points, 49 to 46 percent. Five percent of the state's voters were unsure about their presidential pick.
The network's last Ohio poll of polls, released October 9, showed Obama leading McCain by four points, 50 to 46 percent. In the September 21 poll of polls, Obama led McCain by a single point, 47 to 46 percent."
This is the smallest possible shift that a network would be willing to report: a one-percentage-point decrease in support for Obama and no change in support for McCain. The survey design and the poll of polls would have to be incredibly powerful to detect this subtle shift in the electorate's preferences.
With my interest piqued, I read further. As it turns out, CNN's analysis of the poll of polls is based on some claims that are suspect:
"The Ohio general election "poll of polls" consists of four surveys: Ohio Newspaper Poll/University of Cincinnati (October 4-8), ARG (October 4-7), CNN/Time/ORC (October 3-6) and ABC/Washington Post (October 3-5). The poll of polls does not have a sampling error."
What? No sampling error?
If CNN thinks that averaging four polls removes all variability, then I have a bridge in Alaska up for sale (and I'll throw in some oceanfront property in Arizona, which also seems appropriate).
It is more likely that the author meant that the margin of error would be hard to calculate. This is not equivalent to the margin of error not existing at all. For example, it is hard to calculate when the Cubs are going to win another World Series. But I pray that this does not mean that the date is undefined (which seems infinitely worse than never).
Of course, news networks want to justify covering politics as a horse race and want to ignore the warnings that small changes in polls are not real, even when you average over four surveys. But this seems like a particularly egregious abuse of polling numbers to make a race seem more fluid than reality (or reasonable statistics) seems to permit.
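To make the point concrete, here is a rough back-of-the-envelope sketch of the sampling error in an average of four polls. The sample sizes are hypothetical (the CNN story does not report them), and the calculation assumes each poll is an independent simple random sample, which ignores design effects and differences between polling houses. The function name is my own, for illustration only.

```python
import math

def average_margin_of_error(share, sample_sizes, z=1.96):
    """Approximate 95% margin of error for the simple average of
    independent polls, each treated as a simple random sample."""
    k = len(sample_sizes)
    # Variance of an unweighted mean of k independent sample proportions.
    variance = sum(share * (1 - share) / n for n in sample_sizes) / k**2
    return z * math.sqrt(variance)

# Hypothetical sample sizes: an assumption, not taken from the CNN story.
assumed_sizes = [800, 800, 800, 800]

# Obama's reported share in the latest poll of polls was 49 percent.
moe = average_margin_of_error(0.49, assumed_sizes)
print(f"Margin of error for the average: +/- {100 * moe:.1f} points")
```

Under these assumptions the margin of error on the average is roughly 1.7 points, so averaging shrinks the sampling error but does not remove it, and the reported one-point shift sits comfortably inside it. The uncertainty on the change between two successive polls of polls would be larger still.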
Posted by Justin Grimmer at 10:45 AM
October 13, 2008
Dear Applied Statistics Community,
Please join us this Wednesday (October 15th) when Stephen Ansolabehere, Professor in Harvard's Department of Government, will present his work on "Vote Validation in the 2006 CCES". Stephen provided the following abstract:
New technology and recent political reform have made vote validation an easier and more reliable process than it has been in the past. We present a basic summary of the vote validation procedure used in the 2006 CCES, a Web-based survey of nearly 35,000 Americans that has been validated electronically with new state-wide voter files. As the validation method in the CCES is quite different from the method used by the National Election Studies (NES) in the 1960s through 1980s, we compare the CCES procedure and results with the most recent midterm elections validated by the NES. We show that while the rate of vote misreporting is substantially higher in the 2006 Web-based survey, the pattern of misreporting is consistent with the NES samples. We also show how the large sample size in the CCES can be exploited to study phenomena beyond vote misreporting using the validated records.
A paper is available for download here.
The applied statistics workshop meets at 12 noon in Room K-354, CGIS Knafel (1737 Cambridge St), with a light lunch. Our presentations begin at 12:15 pm and usually conclude around 1:30 pm. As always, everyone is welcome!
Cheers
Justin Grimmer
Posted by Justin Grimmer at 2:05 PM
October 5, 2008
Please join us this Wednesday, October 8th when Stefano Iacus, Department of Economics, Business and Statistics, University of Milan (yes, in Italy) will be presenting his work on Stochastic differential equations and applied statistics. Stefano provided the following abstract:
Stochastic differential equations (SDEs) arise naturally in many fields of science. Solutions of SDEs are continuous time processes and are usually proposed as alternative models to standard time series. While continuous time modeling seems better in describing the natural evolving nature of the underlying data generating process, observations always come in discrete form. This discrepancy raised new statistical challenges (e.g., the discrete time likelihood is not always available).
In the first part of the talk, we present a few examples (from biostatistics, econometrics, political analysis, etc.) in which SDEs naturally emerge. Then, we present the general statistical issues peculiar to these models and finally we present some new applications (with solutions) like change point analysis, hypothesis testing and cluster analysis for discretely observed stochastic differential equations.
Stefano suggested that the following papers might offer helpful background information for his presentation.
De Gregorio, A., Iacus, S.M. (2008) Clustering of discretely observed diffusion processes
De Gregorio, A., Iacus, S.M. (2008) Divergences Test Statistics for Discretely Observed Diffusion Processes
The workshop will begin at 12 noon in room K-354 in 1737 Cambridge St (CGIS-Knafel) with a light lunch and the presentation will commence around 1215. The workshop usually adjourns around 130 pm. All are welcome!
Posted by Justin Grimmer at 3:36 PM
September 29, 2008
Please join us on Wednesday October 1st when Gary King, the David Florence Professor of Government, will present "Matching for Causal Inference Without Balance Checking". A draft of the paper is available here, and here is the abstract:
We address a major discrepancy in matching methods for causal inference in observational data. Since these data are typically plentiful, the goal of matching is to reduce bias and only secondarily to keep variance low. However, most matching methods seem designed for the opposite problem, guaranteeing sample size ex ante but limiting bias by controlling for covariates through reductions in the imbalance between treated and control groups only ex post and only sometimes. (The resulting practical difficulty may explain why many published applications do not check whether imbalance was reduced and so may not even be decreasing bias.) We introduce a new class of "Monotonic Imbalance Bounding" (MIB) matching methods that enables one to choose a fixed level of maximum imbalance, or to reduce maximum imbalance for one variable without changing it for the others. We then discuss a specific MIB method called "Coarsened Exact Matching" (CEM) which, unlike most existing approaches, also explicitly bounds through ex ante user choice both the degree of model dependence and the causal effect estimation error, eliminates the need for a separate procedure to restrict data to common support, meets the congruence principle, is approximately invariant to measurement error, works well with modern methods of imputation for missing data, is computationally efficient even with massive data sets, and is easy to understand and use. This method can improve causal inferences in a wide range of applications, and may be preferred for simplicity of use even when it is possible to design superior methods for particular problems. We also make available open source software which implements all our suggestions.
The applied statistics workshop meets in room K-354, CGIS-Knafel (1737 Cambridge St) at 12 noon, with a light lunch served. The presentation will begin at 1215 and the workshop usually ends around 130. All are welcome to attend.
Posted by Justin Grimmer at 8:52 PM
September 23, 2008
Please join us tomorrow (Wednesday, 9/24) when we welcome Ben Fry to the applied statistics workshop. Ben's research explores data visualization. More details can be found here, including details of his recently completed book "Data Visualization" and samples from his previous work.
The workshop will meet at 12 noon in room K-354, CGIS-Knafel (1737 Cambridge St) with a light lunch served. The presentation will begin at 1215 and usually ends around 130 pm. All are welcome.
Posted by Justin Grimmer at 10:39 AM
September 10, 2008
Welcome back for the 2008-2009 academic year. The applied statistics workshop has an exciting lineup of speakers this coming semester. The workshop kicks off this coming Wednesday, September 17th, with Andrew Gelman, Department of Statistics and Political Science, Columbia University. Andrew will be presenting results from his recently released book "Red State, Blue State, Rich State, Poor State". Here is an introduction to the book from the publisher:
With wit and prodigious number crunching, Andrew Gelman and his coauthors get to the bottom of why Democrats win elections in wealthy states while Republicans get the votes of richer voters, how the two parties have become ideologically polarized, and other issues. Gelman uses eye-opening, easy-to-read graphics to unravel the mystifying patterns of recent voting, and in doing so paints a vivid portrait of the regional differences that drive American politics. He demonstrates in the plainest possible terms how the real culture war is being waged among affluent Democrats and Republicans, not between the haves and have-nots; how religion matters for higher-income voters; how the rich-poor divide is greater in red not blue states--and much more.
With the excitement surrounding the current presidential races, this presentation promises to be informative to anyone interested in separating the facts from the myths about vote choice in America. For those interested, a blog is available about the book, which is also available for purchase.
As a reminder, the applied statistics workshop meets every Wednesday in CGIS-Knafel, 1737 Cambridge St, room K-354 (Previously N-354, before the Chad Johnson/Prince-esque name change that recently swept through the north building). We start at 12 noon with a light lunch and the presentations usually begin around 1215.
To give Andrew the maximum amount of time, we will skip the normal "business" meeting that usually starts the year. If anyone has any suggestions about how the workshop could improve, or would like to present at the workshop this year, please let me know (email would probably be the quickest and most effective method, jgrimmer at fas dot harvard dot edu).
Posted by Justin Grimmer at 11:14 AM
April 28, 2008
Please join us for the final applied statistics workshop when Jamie Robins, Department of Epidemiology and Biostatistics, Harvard School of Public Health, will present "Estimation of Direct Effects in different contexts: Pure and natural direct effects, Pathway-specific estimation, principal stratification, mendelian randomization, testing the exclusion restriction, and surrogate markers". Jamie will be sampling from the following papers during his talk:
paper 1
paper 2
paper 3
paper 4
The Applied Statistics workshop meets in room N-354, CGIS-Knafel (1737 Cambridge St). The workshop begins at 12 noon with a light lunch, with our presentations beginning at 1215 and usually ending around 130 pm.
Posted by Justin Grimmer at 10:48 PM
April 21, 2008
Please join us this Wednesday when Jeff Gill--Department of Political Science and Director of the Center for Applied Statistics, Washington University in St. Louis--will present "Circular Data in Political Science and How to Handle It", work that is joint with Dominik Hangartner. Jeff and Dominik provided the following abstract:
There has been no attention to circular (purely cyclical) data in political science research. We show that such data exists and is generally mishandled by models that do not take into account the inherently recycling nature of some phenomena. Clock and calendar effects are the obvious cases, but directional data exists as well. We develop a modeling framework based on the von Mises distribution and apply it to two datasets: casualties in the second Iraq war and suicides in Switzerland. Results clearly demonstrate the importance of circular regression models to handle periodic data.
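For readers who have not worked with circular data, a small sketch may help show why ordinary summaries go wrong with clock times. It is not the authors' von Mises regression model, only the standard circular mean computed by averaging unit vectors; the function name and the example times are made up for illustration.

```python
import math

def circular_mean_hours(hours):
    """Mean time of day on a 24-hour clock, computed on the circle.

    Each time is mapped to an angle, the corresponding unit vectors are
    averaged, and the direction of the average vector is mapped back to
    hours. The result is not meaningful when the vectors cancel out
    (for example, times spread evenly around the clock).
    """
    angles = [2 * math.pi * h / 24 for h in hours]
    mean_sin = sum(math.sin(a) for a in angles) / len(angles)
    mean_cos = sum(math.cos(a) for a in angles) / len(angles)
    mean_angle = math.atan2(mean_sin, mean_cos) % (2 * math.pi)
    return 24 * mean_angle / (2 * math.pi)

# Two hypothetical events, one at 23:00 and one at 01:00.
times = [23.0, 1.0]
arithmetic = sum(times) / len(times)   # treats the clock as a straight line
circular = circular_mean_hours(times)  # respects the wrap-around at midnight

print(f"arithmetic mean: {arithmetic:.2f} h")
print(f"circular mean:   {circular:.2f} h (read modulo 24)")
```

The arithmetic mean puts the "typical" time at noon, which is as far from both events as the clock allows, while the circular mean lands at midnight (possibly reported as a value just under 24 because of floating-point rounding).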
A preliminary draft of their paper is available here
The authors also provided an example of circular data analyzed in their paper: the figure below shows the time at which different kinds of violent attacks occur in Iraq.
The applied statistics workshop meets at 12 noon in room N-354 of CGIS-Knafel (1737 Cambridge St), with a light lunch served. The presentations begin around 1215 and conclude at about 130 pm.
Please contact me with any questions
Posted by Justin Grimmer at 1:15 PM
April 14, 2008
Please join us at the applied statistics workshop this Wednesday when Lee Fleming, Harvard Business School, will present “Mobility, Skills, and the Michigan Noncompete Experiment”. Lee provided the following abstract:
While prior research has considered the desirability and implications of employee mobility, less research has considered factors affecting the ease of mobility. This paper explores a legal constraint on mobility —employee noncompete agreements—by exploiting Michigan’s apparently-inadvertent 1985 reversal of its enforcement policy as a natural experiment. Using a differences-in-differences approach, and controlling for changes in the auto industry central to Michigan’s economy, we find that the enforcement of noncompetes indeed attenuates mobility. Moreover, noncompete enforcement decreases mobility most sharply for inventors with firm-specific skills, and for those who specialize in narrow technical fields. The results speak to the literature on mobility constraints while offering a credibly exogenous source of variation that can extend previous research.
The paper for the talk is available here
The applied statistics workshop meets at 12 noon in room N-354, CGIS-Knafel (1737 Cambridge St) with a light lunch. Presentations usually begin around 1215 and usually run until about 130 pm.
Posted by Justin Grimmer at 10:18 AM
April 8, 2008
Please join us this Wednesday (tomorrow) when Judith J. Lok, Harvard School of Public Health, Department of Biostatistics, will present "Optimal start of treatment based on time-dependent covariates". Judith provided the following abstract for her talk:
Using observational data, we estimate the effects of treatment regimes that start treatment once a covariate, X, drops below a certain level, x. This type of analysis is difficult to carry out using experimental data, because the number of possible values of x may be large. In addition, we estimate the optimal value of x, which maximizes the expected value of the outcome of interest within the class of treatment regimes studied in this paper. Our identifying assumption is that there are no unmeasured confounders.
We illustrate our methods using the French Hospital Database on HIV. The best moment to start Highly Active AntiRetroviral Therapy (HAART) in HIV positive patients is unknown. It may be the case that withholding HAART in the beginning is beneficial, because it postpones the time patients develop drug resistance, and hence might improve the patients' long term prognosis. However, it is unknown how long initiation of HAART can safely be postponed.
The paper for the talk can be found here
The applied statistics workshop meets at 12 noon in room N 354, CGIS Knafel (1737 Cambridge Street) with a light lunch. The presentation will begin at 1215pm and usually runs until 130 pm.
Posted by Justin Grimmer at 10:06 AM
March 31, 2008
Please join us this Wednesday when Nicholas Christakis--Professor, Department of Sociology (Harvard University) and Medical Sociology (Harvard Medical School)--will present "Eat Drink and Be Merry: The Spread of Health Phenomena In Social Networks". Nicholas provided the following abstract:
Our work has involved the quantitative investigation of whether and how various health-related phenomena might spread from person to person. For example, we explored the nature and extent of person-to-person spread of obesity. We developed a densely interconnected network of 12,067 people assessed repeatedly from 1971 to 2003. We used longitudinal statistical models and network-scientific methods to examine whether weight gain in one person was associated with weight gain in friends, siblings, spouses, and neighbors. Discernible clusters of obese persons were present in the network at all time points, and the clusters extended three people deep. These clusters were not solely due to selective formation of social ties. A friend becoming obese in a given time interval increased a person's chances of becoming obese by 57% (95% CI: 6%-123%). Among pairs of adult siblings, one becoming obese increased the chance that the other became obese by 40% (21%-60%). Among spouses, one becoming obese increased the likelihood that the other became obese by 37% (7%-73%). Among those working in small firms, a co-worker becoming obese increased a person's chances of becoming obese by 41% (17-59%). Immediate neighbors did not exhibit these effects. We have also conducted similar investigations of other health behaviors, such as smoking, drinking, exercising, and the receipt of health screening, and of other health phenomena, such as happiness and depression. Various aspects of our findings suggest that the spread of social norms may partly underlie inter-personal health effects. Our findings have implications for clinical and public health interventions, and for cost-effectiveness assessments of preventive and therapeutic interventions. They also lay a new foundation for public health by providing a rationale for the claim that health is not just an individual, but also a collective, phenomenon.
Nicholas also provided a link to his paper here
The applied statistics workshop meets in room N354 in CGIS-Knafel (1737 Cambridge St). A light lunch will be served at 12 noon with the presentation beginning around 1215. Please contact me with any questions.
Posted by Justin Grimmer at 10:02 AM
March 17, 2008
Please join us this Wednesday as we welcome Kenneth Hill--Harvard School of Public Health, Department of Population and International Health--who will present his research "Global Health and Global Goals: Do Targets Make a Difference?" Kenneth provided the following paper as background for his presentation:
http://people.fas.harvard.edu/~jgrimmer/Hill319.pdf
The applied statistics workshop meets in room N-354 in CGIS-Knafel, 1737 Cambridge st. The workshop begins at 12 noon with a light lunch, with presentations usually beginning around 1215.
Please contact me with any questions
Justin Grimmer
Posted by Justin Grimmer at 5:21 PM
March 10, 2008
This Wednesday we are excited to welcome Andy Eggers and Jens Hainmueller, Government Department, Harvard University, who will present "MPs for Sale? Estimating Returns to Office in Post-War British Politics". Andy and Jens provided the following abstract:
While the role of money in policymaking is a central question in political
economy research, surprisingly little attention has been given to the rents
politicians actually make from politics. Using an original dataset on the
size of British politicians' estates, we find that gaining a seat in the
House of Commons had a large effect on personal wealth: Conservative Party
MPs died with almost twice as much money, on average, as very similar
Parliamentary candidates who were defeated. We find no financial benefits
for candidates from the Labour party. We argue that Conservative MPs
profited from office in a lax regulatory environment by using their
political positions to obtain outside work as directors, consultants, and
lobbyists, both while in office and after retirement. Our results are
consistent with anecdotal evidence on MPs' outside financial dealings but
suggest that the magnitude of influence peddling was larger than has been
appreciated.
The paper is available here:
http://people.fas.harvard.edu/~jgrimmer/MPsforsale.pdf
The applied statistics workshop meets in room N-354 in CGIS-Knafel, 1737 Cambridge st. The workshop begins at 12 noon with a light lunch, with presentations usually beginning around 1215.
Please contact me with any questions
Justin Grimmer
Posted by Justin Grimmer at 10:42 AM
March 3, 2008
Please join us this Wednesday as we welcome Joseph Blitzstein, Department of Statistics, Harvard University, who will present 'In and Out of Network Sampling'.
Joe provided the following abstract for his talk:
In recent years it has become extremely common to need to work with
network data, in applications such as the study of social networks,
protein interaction networks, and the Internet. This has required the
development of new generative models such as exponential random graph
models and power law models. Yet it is usually prohibitively expensive
to observe or work with the full network, so sampling within the
network is generally required.
Various approaches to network sampling, such as respondent-driven
sampling, have been proposed. But when will the generative model mesh
well with the sampling scheme? This question is crucial for reliable
inference about networks, yet the question is seldom addressed and
much remains unknown. We will discuss generating random networks and
sampling within a network, and their interactions. Based on joint work
with Ben Olding.
The workshop will begin at 12 noon with a light lunch and the presentation will begin at 1215. The workshop is held in room N354, CGIS-Knafel, 1737 Cambridge St.
Posted by Justin Grimmer at 11:00 PM
February 25, 2008
This Wednesday the Applied Statistics Workshop will welcome Matthew Harding, Dept. of Economics, Stanford University. Matthew will be presenting his research, "A Bayesian Mixed Logit-Probit Model for Multinomial Choice", a project that is joint with Jerry Hausman and Martin Burda. Here is an abstract for the presentation:
In this paper we introduce a new flexible mixed model for multinomial discrete choice where the key individual- and alternative-specific parameters of interest are allowed to follow an assumption-free nonparametric density specification while other alternative-specific coefficients are assumed to be drawn from a multivariate normal distribution. A hierarchical specification of our model allows us to break down a complex data structure into a set of submodels with the desired features that are naturally assembled in the original system. We estimate the model using a Bayesian Markov Chain Monte Carlo technique with a multivariate Dirichlet Process (DP) prior on the coefficients with nonparametrically estimated density. We bypass a problem of prior non-conjugacy by employing a "latent class" sampling algorithm for the DP prior. The model is applied to supermarket choices of a panel of Houston households whose shopping behavior was observed over a 24-month period in years 2004-2005. We estimate the nonparametric density of two key variables of interest: the price of a basket of goods based on scanner data, and driving distance to the supermarket based on their respective locations, calculated using GPS software. Supermarket dummies form the parametric part of our model.
The workshop meets at 12 noon with a light lunch and presentations usually begin at 1215. Our workshop is located at 1737 Cambridge St, CGIS-Knafel, room N-354.
Posted by Justin Grimmer at 5:43 PM
February 18, 2008
This Wednesday, 2/20, the applied statistics workshop welcomes Jim Snyder, Arthur and Ruth Sloan Professor of Economics and Political Science at MIT. He will be presenting "The Wealth of Political Office in the US, 1840-1870", work that is joint with Pablo Querubin, Department of Economics, MIT. Jim provided the following abstract and the attached article:
The second half of the 19th century was known as a corrupt era in U.S. politics. Using the censuses of 1850, 1860 and 1870, we find the wealth of all candidates running for the U.S. House of Representatives during the period 1840-1870. We use this data to estimate several quantities of interest, including: How wealthy were these candidates compared to others in the population at the time? How did the wealth accumulation of these candidates compare to others in the population? How did the wealth levels and accumulation vary by party? How did those candidates who won a congressional race by a close margin compare with those who lost by a close margin? This last quantity, which exploits a regression-discontinuity approach, provides a good estimate of the monetary "rents" to a congressional seat at that time.
As always, the workshop will convene at 12 noon with a light lunch and the presentation will begin at 1215. We are located in CGIS-Knafel, 1737 Cambridge St, room N-354.
Posted by Justin Grimmer at 3:05 PM
February 11, 2008
This Wednesday the applied statistics workshop presents Donald Rubin--Department of Statistics, Harvard University--who will present “Direct and Indirect Causal Effects: An unhelpful distinction?” Don has suggested that the following papers provide helpful background for his talk:
2003 - “Assumptions Allowing the Estimation of Direct Causal Effects: Discussion of ‘Healthy, Wealthy, and Wise? Tests for Direct Causal Paths Between Health and Socioeconomic Status’ by Adams et al.” Journal of Econometrics, 112, pp. 79-87. (With F. Mealli.)
2004 - “Direct and Indirect Causal Effects Via Potential Outcomes.” The Scandinavian Journal of Statistics, 31, pp. 161-170; 196-198, with discussion and reply.
2005 - “Causal Inference Using Potential Outcomes: Design, Modeling, Decisions.” The Journal of the American Statistical Association, 100, 469, pp. 322-331.
The workshop meets at 12 noon in room N-354 CGIS-Knafel (1737 Cambridge St) with a light lunch, with presentations usually beginning at 1215.
Posted by Justin Grimmer at 10:33 AM
February 4, 2008
Apologies for the late post—we’ve experienced some last minute scheduling changes. This week Kevin Quinn, Department of Government, will present ‘Assessing Political Positions of Media’ a project that is joint with Daniel Ho, Stanford Law School. Kevin provided the following abstract:
Although central to understanding the role of the media, few quantitative measures of the political positions of media exist. We amass a new, large-scale dataset to shed light on this question. Collecting and classifying over 1500 editorials adopted by 25 major U.S. newspapers on 495 Supreme Court cases from 1994-2004, we apply an item response theoretic approach to place newspapers on a substantively meaningful and long validated scale of political preferences. Our results provide significant insights into the study of the media. We show that 18 of the 25 papers are more likely to be to the left of the median Justice for this period, but also find considerable evidence that this may be an artifact of the liberalness of urban, elite, high circulation papers.
Kevin also provided a link to the paper, which is available here
Our workshop will convene this Wednesday at 12 noon with a light lunch, with the presentation to start at 1215. We are located in CGIS-Knafel (1737 Cambridge St) Room N-354.
Posted by Justin Grimmer at 7:23 PM
January 27, 2008
The applied statistics workshop returns this Wednesday, January 30. We’ll have David Nickerson, Department of Political Science, University of Notre Dame, presenting “How (and how not) to Study Voter Registration Experimentally”.
The workshop will convene at 12 noon with a light lunch served. The presentation will begin at 1215. We are located in CGIS Knafel (1737 Cambridge St), room N-354. We hope to see you there.
Any questions, comments, or concerns? Please send me an email—(jgrimmer@fas.harvard.edu)
Justin Grimmer
Posted by Justin Grimmer at 12:02 PM
December 10, 2007
There will be no applied statistics workshop this Wednesday December 12th.
The workshop will resume on January 30th with a presentation from David Nickerson, University of Notre Dame-Department of Political Science.
Hope to see you all then and have a great holiday season.
Posted by Justin Grimmer at 12:09 PM
December 4, 2007
Please join us for the final applied statistics workshop of the semester when Sendhil Mullainathan, Professor of Economics, Harvard University, will present 'How We Choose: Medicare Drug Plan Selection', work that is joint with Jeff Kling, Eldar Shafir, Lee Vermeulen, and Marian Wrobel.
Sendhil provided the following abstract:
Choices increasingly abound for various government supported services, ranging from charter schools to health plans. 24 million elderly Americans have enrolled in Medicare Part D prescription drug coverage during the past two years, and may choose among at least 40 plans. In this paper we examine the informational context in which choices are made and conduct an experiment of information provision, focusing on the decision about whether to switch plans during the open enrollment period in 2006, one year after the program began. We find that most participants obtain their information from mailings from plans and from Medicare. This information is not personalized, although the costs and benefits for a given plan vary greatly depending on which specific prescriptions are used. Knowledge of how plans work is low. Personalized information is available by calling Medicare, but most participants do not seek information.
Our randomized experiment provided an intervention of personalized information (highlighting the predicted out-of-pocket cost of the current plan and the least expensive plan, and also listing costs of all plans -- based on information about prescription use) in comparison to a group that was provided information about accessing the Medicare website. The intervention group plan-switching rate was 28 percent, while the comparison group rate was 17 percent. The potential cost savings for those affected by the intervention was at least $230 on average. The impacts on switching and potential savings were larger for those with greater absolute and relative potential savings, and for those in small market share plans. The impacts on switching were larger for those initially in low premium plans. We conclude that additional efforts to distribute simple, personalized drug plan information would lead to significant reductions in Medicare beneficiaries' out-of-pocket costs and that the costs of such a program would likely be offset by reduced Medicare expenditures on subsidies to drug plans.
Here is a link to the paper.
As a reminder, the workshop meets at 12 noon and we provide a light lunch. We are located in room N-354, CGIS Knafel, 1737 Cambridge St.
Please contact me with any questions or suggestions for next semester's schedule.
Posted by Justin Grimmer at 10:29 AM
November 26, 2007
The Applied statistics workshop reconvenes this Wednesday, 11/28, for Esther Duflo, the Abdul Latif Jameel Professor of Poverty Alleviation and Development Economics, who will present her work on "Improving School Quality". Esther provided the following link describing the project:
http://www.povertyactionlab.com/projects/project.php?pid=39
As a reminder, the workshop begins with a light lunch at 12 noon. We are located in room N354, CGIS Knafel (1737 Cambridge St).
Posted by Justin Grimmer at 7:40 PM
November 13, 2007
The Applied Statistics Workshop returns tomorrow (11/14) with Chris Paciorek, Department of Biostatistics in the School of Public Health, presenting his work on 'Spatial scale and bias in regression models with spatial confounding'. Chris provided the following abstract for his talk:
When unmeasured confounders vary spatially, a common technique in regression modeling, including spatial epidemiology applications, is to try to account for the unmeasured confounding by modeling residual spatial correlation. The intuition is that modeling the spatial structure will remove large scale variation and allow one to estimate the effect of the covariate of interest based on variation in the outcome isolated at smaller scales. Previous work in the temporal setting indicates that when the variable of interest has an uncorrelated component then such an approach can minimize bias. Here I consider the situation that the variable of interest varies at multiple spatial scales but may not have a non-spatial component. I develop a framework for understanding bias using a simple generalized least squares model with data collected at point locations and fixed and known spatial scales. I show that bias is substantial even when the scales are known, unless the variable of interest has an unconfounded component that varies at a finer spatial scale than the confounder. Using simulation I consider the effect of estimating the scale of the residual spatial correlation on bias, showing that bias is similar when variance and scale parameters are estimated to when they are known. I discuss extensions to data aggregated into areal units and to the setting of measurement error in the covariate of interest.
As always, the workshop will convene at 12 noon, in room N-354, CGIS-Knafel. And a light lunch will be served.
Hope you all can make it!
Posted by Justin Grimmer at 11:18 AM
November 6, 2007
The applied statistics workshop will take a one week hiatus this week (11/7). But be sure to join us next week (11/14) for Chris Paciorek, Department of Biostatistics, who will present 'Spatial scale and bias in regression models with spatial confounding'.
Hope that you all can make it next week--
Posted by Justin Grimmer at 10:17 AM
October 30, 2007
This week, the Applied Statistics Workshop is happy to have Andrew C. Thomas, G-4 Department of Statistics, presenting his work on, "Symmetry and Competition in State Legislative Election Systems". Andrew has provided the following abstract for his presentation:
Drawing of legislative districts has historically been conducted by the legislators themselves; recently, some states have appointed redistricting commissions, the members of which cannot run for seats in the legislature for a period afterwards. I demonstrate that current methods, in particular the Gelman-King model and the JudgeIt R package, can easily diagnose the state of an electoral map given previous electoral conditions. In particular, competition increases in states with commissions, but the impact on symmetry is as yet unclear. I conclude with a discussion on techniques to improve the resolution and measurement of electoral symmetry within states.
Please join us this Wednesday at 12 noon for the presentation and a light lunch. We hold the workshop in Room N-354 of CGIS-Knafel (1737 Cambridge St).
Posted by Justin Grimmer at 1:46 AM
October 22, 2007
The Applied Statistics Workshop is proud to present James Stock, Chair of the Economics Department, as he presents, “Forecasting in Dynamic Factor Models Subject to Structural Instability”. James has provided the following abstract:
Dynamic factor models (DFMs) express the comovements of time series at leads and lags in terms of a small number of latent factors. In macroeconomic applications, the latent factors can be thought of as theoretical constructs (income) that are linked to specific measurements (GDP). The large body of work on DFMs in macroeconomics assumes a stable structure. This paper develops time-varying DFMs and uses implications of time-varying DFMs to shed light on some ongoing macro puzzles such as the Great Moderation and the breakdown of the backward-looking Phillips curve.
The workshop will meet at 12-noon in room N-354, CGIS-Knafel. And a light lunch will be served.
Posted by Justin Grimmer at 8:45 PM
October 15, 2007
The applied statistics workshop is back for another exciting installment. This week we have Damon Centola, RWJ Scholar, Harvard University, presenting 'Diffusion in Social Networks: New Theory and Experiments'. Damon provided the following abstract for his talk:
The strength of weak ties is that they tend to be long – they connect socially distant locations. Research on “small worlds” shows that these long ties can dramatically reduce the “degrees of separation” of a social network, thereby allowing ideas and behaviors to rapidly diffuse. However, I show that the opposite can also be true. Increasing the frequency of long ties in a clustered social network can also inhibit the diffusion of collective behavior across a population. For health related behaviors that require strong social reinforcement, such as dieting, exercising, smoking, or even condom use, successful diffusion may depend primarily on the width of bridges between otherwise distant locations, not just their length. I present formal and computational results that demonstrate these findings, and then propose an experimental design for empirically testing the effects of social network topology on the diffusion of health behavior.
The workshop is held on Wednesday at 12 noon in room N 354, CGIS Knafel (1737 Cambridge St). And a light lunch will be served.
Posted by Justin Grimmer at 5:59 PM
October 8, 2007
Dear Applied Statistics Community,
Please join us for this week's installment of the Applied Statistics workshop, where Fernanda Viegas and Martin Wattenberg will be presenting their talk entitled "From Wikipedia to Visualization and Back". The authors provided the following abstract for their talk:
This talk will be a tour of our recent visualization work, starting with a case study of how a new data visualization technique uncovered dramatic dynamics in Wikipedia. The technique sheds light on the mix of dedication, vandalism, and obsession that underlies the online encyclopedia. We discuss the reaction of the Wikipedia community to this visualization, and how it led to a recent ambitious project to make data visualization technology available to everyone. This project, Many Eyes, is a web site where people may upload their own data, create interactive visualizations, and carry on conversations. The goal is to foster a social style of data analysis in which visualizations serve not only as a discovery tool for individuals but also as a means to spur discussion and collaboration.
Martin and Fernanda have also provided the following set of links as background for the presentation:
http://alumni.media.mit.edu/~fviegas/papers/history_flow.pdf
http://www.research.ibm.com/visual/papers/viegasinfovis07.pdf
And to a website based upon recent work in data visualization
Link to Many Eyes site:
www.many-eyes.com
As always, the workshop meets at 12 noon on Wednesday, in room N-354 CGIS-Knafel. A light lunch will be provided.
Posted by Justin Grimmer at 12:02 PM
October 2, 2007
The Applied Statistics Workshop presents another installment this week with Thomas Cook, Department of Sociology, Northwestern University presenting a talk entitled, "When the causal estimates from randomized experiments and non-experiments coincide: Empirical findings from the within-study comparison literature." Here is an excerpt from the paper:
The present paper has several purposes. It seeks to up-date the literature since Glazerman et al. (2003) and Bloom et al. (2005) and to move it beyond its near exclusive focus on job training. We have examined the job training studies, and find nothing to challenge the past conclusions described above. However, the more recent studies allow us to broach three questions that are more finely differentiated than whether experiments and non-experiments produce comparable findings:
1. Do experiments and RDD studies produce comparable effect sizes? We have found three examples attempting this comparison.
2. Do comparable effect sizes result when the non-experiment depends on selecting one or more intact comparison groups that are deliberately matched on pretest measures of the posttest outcome, as recommended in Cook & Campbell (1979)? Thus, in a non-experiment with schools as the unit of assignment, intervention schools are carefully matched with intact non-intervention schools in the hope that the average treatment and comparison schools will not differ on pretest achievement, let us say, though they may differ on unobservables. We have found three studies with this focus.
3. Do experiments and non-experiments produce comparable effect sizes when the intervention and comparison units do differ at pretest and so statistical adjustments or individual matches are constructed to control for this demonstrated non-equivalence? This question has dominated the literature to date, and we found six studies outside of job training that asked this question.
We will meet at 12 noon in CGIS-Knafel N354 and the talk will begin at 12:15 pm. And of course a delicious, free lunch will be provided.
Posted by Justin Grimmer at 1:19 AM
September 24, 2007
Please join us this Wednesday (9/26) when David Lazer, Associate Professor of Public Policy and Director of the Program on Networked Governance at the Kennedy School of Government, will present "Life in the Network: The Coming Era of Computational Social Science". Professor Lazer provided the following summary of his talk:
An increasing fraction of human behavior (especially relational behavior) leaves substantial digital traces-- whether in the form of phone logs, e-mail, instant messaging, etc. Further, increased computational power allows the analysis of these digital traces-- e.g., through natural language processing, statistical analysis of massive (millions of individuals) longitudinal data, etc. These two points suggest that we are on the precipice of dramatic new insights into collective human behavior. I will discuss the potential future of a "computational social science", with reference to four ongoing research projects.
As always, our workshop begins at 12 noon in CGIS-Knafel room N-354. And a free lunch will be provided.
Posted by Justin Grimmer at 7:07 PM
November 8, 2006
Justin Grimmer
A recent Brookings Institution report on the mathematics scores of junior high and high school students from different nations uncovers some paradoxical correlations. Using standardized test scores, the report shows that the nations with the highest scores also have the students with the lowest confidence in their math ability and the lowest levels of enjoyment from learning math. American students fit the pattern from the other side: they report high confidence and enjoyment, but post only middle-of-the-pack scores on standardized tests.
Casting correlation/causation concerns aside, the Brookings report goes on to argue that the American mathematical education experience is perhaps too enjoyable for students. Rather than informing students about the important mathematical concepts that the foreign textbooks provide, American textbooks are characterized as trying too hard to create an enjoyable classroom experience.
The policy implication provided is to make mathematics less enjoyable in American classrooms by discarding colorful pictures and interesting story problems. At the very least, the report suggests that educators’ attention should be redirected from making math fun to making math education solely about mathematics.
Because of the study’s limited nature, any drastic policy recommendations should be avoided. After all, the report’s argument merely identifies two paradoxical relationships and then speculates about a causal mechanism that provides one potential explanation for the trend. No effort is made to eliminate alternative causal mechanisms. For example, cultural differences, rather than differences in teaching methodologies, could explain the discrepancy between the scores and the confidence ratings. The study also attempts to make an ecological inference, inferring individual-level behavior from aggregated data. While not damning in itself, this does weaken the conclusions.
That being said, perhaps the problem with American mathematics education does not lie in the attempt to make students happy, but in the material that is presented. Rather than providing students with an in-depth understanding of concepts and introducing proof techniques, high school math assignments are often about memorization and a superficial knowledge of the techniques involved. Perhaps, if the focus were changed to make high school mathematics less like balancing a checkbook and more like Real Analysis, American math students would see an increase in both their happiness in the classroom and their test scores.
Posted by Justin Grimmer at 11:51 AM
October 26, 2006
Justin Grimmer
Newcomb’s paradox is a classic problem in philosophy and also an entertaining puzzle to consider. Here is one version of the paradox. Suppose you are presented with two boxes, A and B. You are allowed to take just box A, just box B, or both A and B. There will always be $1,000 in box A, and there will either be $0 or $1,000,000 in box B.
A ‘predictor’ determines the contents of box B before you have arrived, using the following plan. If the predictor believes you will pick both box A and B, then she places nothing in box B, but if she believes that you will only take box B, then she places the $1,000,000 in box B.
What makes this predictor special is her amazing accuracy. In the previous billion plays of the game she has never been wrong.
So, you have the two boxes in front of you. What should you do? Keep in mind, the predictor has already made her decision when you arrive at the boxes, so by our normal rules of causality (events in the future cannot cause past events), our actions cannot change what the predictor has decided.
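To see why the two lines of reasoning pull apart, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `accuracy` parameter (the probability that the predictor foresees your choice correctly) and the function names are hypothetical, and only the two choices covered by the predictor's plan (only box B, or both boxes) are compared. The first function treats your choice as evidence about what the predictor did; the second holds the contents of box B fixed, as the causality argument suggests.

```python
# Payoffs taken from the post: box A always holds $1,000,
# and box B holds either $0 or $1,000,000.
BOX_A = 1_000
BOX_B_FULL = 1_000_000


def expected_payoffs(accuracy):
    """Expected dollars for each choice, treating the choice as evidence of the prediction.

    `accuracy` is an assumed probability that the predictor is right;
    it is not a quantity given in the puzzle itself.
    """
    # Taking only B: box B is full when the predictor correctly foresaw this.
    only_b = accuracy * BOX_B_FULL
    # Taking both: box A is certain; box B is full only when the predictor erred.
    both = BOX_A + (1 - accuracy) * BOX_B_FULL
    return {"only_B": only_b, "both": both}


def dominance_gap():
    """Extra dollars from taking both boxes, holding the contents of box B fixed."""
    return {contents: (BOX_A + contents) - contents for contents in (0, BOX_B_FULL)}


if __name__ == "__main__":
    print(expected_payoffs(1.0))
    print(dominance_gap())
```

Under the first calculation, taking only box B looks better whenever the assumed accuracy is a little above one half. Under the second, taking both boxes is always worth exactly $1,000 more, whatever box B contains. The paradox is that both calculations seem legitimate.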
Posted by Justin Grimmer at 12:00 PM