4 May 2008
IN, NC Predictions
Since I have qualifying exams tomorrow, I'll keep this entry unimaginative. I've re-run my predictions for the Indiana and North Carolina primaries on Tuesday, adding a few new bells and whistles:
- A turnout model
- More covariates in the voting share model


With a turnout model in hand, I can predict the actual election result: for each county, I multiply the predicted turnout rate by population to get votes cast, split those votes using the share model, and add up the totals for Clinton and Obama. When I do that, I get:
Indiana: Clinton 53.5%, Obama 46.5%; turnout 950,000
North Carolina: Obama 58%, Clinton 42%; turnout 1,200,000
Yowzers! We'll see how the real numbers pan out. Here are a few details on the two models:
- The share model is trained on the primary results from Ohio, Pennsylvania and Virginia. This model has R^2 = 0.99, meaning it explains nearly all of the variance in the training data. The residual SE is still 5%, however, so the results could be shaky at the county level.
- The turnout model is trained on the primary results from Ohio. Note that Indiana and North Carolina are open primaries. I didn't use Pennsylvania in this model because it was a closed primary, and I didn't use Virginia because it had a contested Republican election at the time. Ohio's Republican primary was technically contested by Huckabee, but he wasn't a serious factor, whereas he had dedicated substantial resources to competing in Virginia. For this model R^2 = .84 and the residual SE is 2%. My turnout projections are mapped below.
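A minimal sketch of how the two models combine into a statewide number, assuming county-level outputs from both. The `County` fields, the function name and the two toy counties are made up for illustration; they are not the actual data or code behind the predictions.

```python
from dataclasses import dataclass


@dataclass
class County:
    name: str
    population: int       # hypothetical field: population the turnout rate applies to
    turnout_rate: float   # turnout model output, as a fraction of population
    clinton_share: float  # share model output, Clinton's share of the two-way vote


def statewide_result(counties):
    """Add up predicted county votes into statewide shares and total turnout."""
    clinton_votes = 0.0
    obama_votes = 0.0
    for c in counties:
        votes_cast = c.turnout_rate * c.population
        clinton_votes += votes_cast * c.clinton_share
        obama_votes += votes_cast * (1.0 - c.clinton_share)
    total = clinton_votes + obama_votes
    return {
        "clinton_pct": 100.0 * clinton_votes / total,
        "obama_pct": 100.0 * obama_votes / total,
        "turnout": total,
    }


# Toy numbers only, chosen to make the arithmetic easy to follow.
toy_counties = [
    County("Small pro-Clinton county", 100_000, 0.20, 0.60),
    County("Large pro-Obama county", 300_000, 0.10, 0.40),
]

result = statewide_result(toy_counties)
print({key: round(value, 1) for key, value in result.items()})
```

Note that the statewide share is a turnout-weighted average of the county shares, so a county with a high predicted turnout counts for more than its population alone would suggest.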


This time I included even more covariates for both models. Next to the ones found to be important, I've placed their effect in parentheses.
- Kerry's 2004 vote share and its square (pro-Clinton and +turnout)
- Proportions White, Black, Asian, Native American and Hispanic (white pro-Clinton and +turnout, others pro-Obama)
- Proportion male (pro-Clinton, +turnout)
- Proportions 18-21 and 65+ (both pro-Obama, young -turnout, old +turnout)
- Percentage urban
- Log(median household income) (pro-Obama)
- Proportion with a bachelor's degree, proportion with a master's degree (pro-Obama)
- Unemployment rate (high is pro-Clinton)
- Proportions employed in mining, in education, in construction (mining pro-Clinton, education pro-Obama)
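To make the list concrete, here is a rough sketch of how one county's covariates might be turned into a regression row, including the squared Kerry term and the logged income. All of the dictionary keys, the function name and the toy county are hypothetical, and the regression fit itself is left out.

```python
import math


def design_row(county):
    """Build one row of regression inputs from raw county data (hypothetical keys)."""
    kerry = county["kerry_2004_share"]
    return {
        "intercept": 1.0,
        "kerry_2004": kerry,
        "kerry_2004_sq": kerry ** 2,  # the squared term allows a curved effect
        "prop_white": county["prop_white"],
        "prop_black": county["prop_black"],
        "prop_asian": county["prop_asian"],
        "prop_native_american": county["prop_native_american"],
        "prop_hispanic": county["prop_hispanic"],
        "prop_male": county["prop_male"],
        "prop_age_18_21": county["prop_age_18_21"],
        "prop_age_65_plus": county["prop_age_65_plus"],
        "pct_urban": county["pct_urban"],
        # Income enters in logs, so a doubling matters equally at any level.
        "log_median_income": math.log(county["median_household_income"]),
        "prop_bachelors": county["prop_bachelors"],
        "prop_masters": county["prop_masters"],
        "unemployment_rate": county["unemployment_rate"],
        "prop_mining": county["prop_mining"],
        "prop_education": county["prop_education"],
        "prop_construction": county["prop_construction"],
    }


# Toy county with invented values, for illustration only.
toy_county = {
    "kerry_2004_share": 0.40, "prop_white": 0.85, "prop_black": 0.08,
    "prop_asian": 0.01, "prop_native_american": 0.01, "prop_hispanic": 0.05,
    "prop_male": 0.49, "prop_age_18_21": 0.06, "prop_age_65_plus": 0.13,
    "pct_urban": 55.0, "median_household_income": 45000.0,
    "prop_bachelors": 0.15, "prop_masters": 0.05, "unemployment_rate": 0.06,
    "prop_mining": 0.01, "prop_education": 0.09, "prop_construction": 0.07,
}

row = design_row(toy_county)
print(len(row), "inputs, including the intercept")
```

The same row would feed both the share model and the turnout model; only the outcome being predicted differs.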
How do my results stack up against the current polls? In Indiana, the RealClearPolitics average has Clinton +6, only a point from my predicted 7-point margin. In North Carolina, the RCP average has Obama +8, well below my predicted 16-point victory. Two earlier primaries shed light on this discrepancy:
- In neighboring South Carolina, the polling average had Obama +11.6% and he won by 28.9%.
- In neighboring Virginia, the polling average had Obama +17.7% and he won by 28.2%.
So perhaps my analysis isn't so crazy in putting Obama above what the polls say in NC.
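The comparison is simple arithmetic on the margins quoted above, sketched here in Python. The function and variable names are my own, and for North Carolina the second number is the model's prediction rather than an actual result.

```python
def margin_gap(polled_margin, other_margin):
    """Points by which Obama's result (or prediction) exceeds the polling average."""
    return round(other_margin - polled_margin, 1)


# (state, Obama's polling-average margin, Obama's actual or predicted margin)
comparisons = [
    ("South Carolina (actual)", 11.6, 28.9),
    ("Virginia (actual)", 17.7, 28.2),
    ("North Carolina (predicted)", 8.0, 16.0),
]

gaps = {state: margin_gap(polled, other) for state, polled, other in comparisons}
for state, gap in gaps.items():
    print(f"{state}: Obama ahead of the polls by {gap} points")
```

By this measure the gap between the North Carolina prediction and the polls is smaller than the polling miss in either neighboring state.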
We'll see how it pans out on Tuesday. I'm more than willing to eat crow :)
Posted by Kevin Bartz at May 4, 2008 6:38 PM