20 March 2008

Primary Crosstabs

We're lucky to have two contested Presidential primaries. One of my favorite habits is to look at cross-tabs of candidate preferences by party and county. Here's an example of an Iowa cross-tab, showing the number of Iowa counties by Republican winner and Democratic winner:






Iowa        Obama   Clinton   Edwards
McCain          0         0         0
Romney         15         7         2
Huckabee       27        21        27

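This paints a very clear picture: Huckabee won 27 of the 29 Edwards counties (93%), 21 of the 28 Clinton counties (75%), and 27 of the 42 Obama counties (64%). In share terms, he did best in the Edwards counties, somewhat less well in the Clinton counties, and least well in the Obama counties.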
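For readers who want to reproduce the comparison, here is a minimal Python sketch that computes each Republican's share of the counties won by each Democrat. The counts are copied from the Iowa table; the function name `column_shares` is made up for this illustration and is not from the original analysis.

```python
from collections import Counter

# Counts copied from the Iowa table:
# (Republican winner, Democratic winner) -> number of counties
iowa = {
    ("McCain", "Obama"): 0, ("McCain", "Clinton"): 0, ("McCain", "Edwards"): 0,
    ("Romney", "Obama"): 15, ("Romney", "Clinton"): 7, ("Romney", "Edwards"): 2,
    ("Huckabee", "Obama"): 27, ("Huckabee", "Clinton"): 21, ("Huckabee", "Edwards"): 27,
}

def column_shares(table):
    """Return {dem: {rep: share of dem's counties that rep also won}}."""
    col_totals = Counter()
    for (rep, dem), n in table.items():
        col_totals[dem] += n
    shares = {}
    for (rep, dem), n in table.items():
        total = col_totals[dem]
        shares.setdefault(dem, {})[rep] = n / total if total else 0.0
    return shares

shares = column_shares(iowa)
for dem in ("Edwards", "Clinton", "Obama"):
    print(dem, round(shares[dem]["Huckabee"], 2))
# Edwards 0.93
# Clinton 0.75
# Obama 0.64
```

Looking at shares rather than raw counts matters here: Huckabee won 27 Obama counties and 27 Edwards counties, but Obama carried many more counties overall.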

We can visualize cross-tabs using mosaic plots as in "Visualizing Categorical Data." I did it for nine primary states in the image below. The green represents Obama counties, the orange Hillary counties and the purple Edwards counties. Across the columns are the Republican candidates: McCain, Romney, Huckabee. Across the rows, Obama, Hillary and Edwards. Check it out here. If you instead prefer an inverted version, with Republicans across the rows and Democrats across the columns (this makes it easier to compare the Democrats), check it out here.
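To make the geometry of a mosaic plot concrete, the sketch below computes the tile rectangles for the Iowa table in plain Python. It assumes the layout described above (Republicans across the columns, Democrats stacked within each column). The helper `mosaic_tiles` is a made-up name for illustration, not the code used to draw the linked images.

```python
# Counts copied from the Iowa table:
# (Republican winner, Democratic winner) -> number of counties
iowa = {
    ("McCain", "Obama"): 0, ("McCain", "Clinton"): 0, ("McCain", "Edwards"): 0,
    ("Romney", "Obama"): 15, ("Romney", "Clinton"): 7, ("Romney", "Edwards"): 2,
    ("Huckabee", "Obama"): 27, ("Huckabee", "Clinton"): 21, ("Huckabee", "Edwards"): 27,
}

def mosaic_tiles(table, col_order, row_order):
    """Return {(col, row): (x, y, width, height)} inside the unit square.

    Column width is the column's share of all counties; tile height is the
    row's share within that column, so tile area equals the cell's overall share.
    Assumes the table has at least one county in total.
    """
    total = sum(table.values())
    tiles = {}
    x = 0.0
    for col in col_order:
        col_total = sum(table.get((col, row), 0) for row in row_order)
        width = col_total / total
        y = 0.0
        for row in row_order:
            n = table.get((col, row), 0)
            height = n / col_total if col_total else 0.0
            tiles[(col, row)] = (x, y, width, height)
            y += height
        x += width
    return tiles

tiles = mosaic_tiles(
    iowa,
    col_order=("McCain", "Romney", "Huckabee"),
    row_order=("Obama", "Clinton", "Edwards"),
)
x, y, w, h = tiles[("Huckabee", "Edwards")]
print(round(w * h, 3))  # area of the Huckabee/Edwards tile, i.e. 27/99
```

Because McCain won no Iowa counties, his column has zero width, which is the kind of degenerate tile that the population-scaled version later in the post avoids.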

The conclusions are the same over most states: Huckabee and Edwards are clearly the most complementary candidates. They shared counties whenever Edwards was in play (Iowa, Florida); after that, Huckabee shared Clinton counties. In Missouri every single county he won was a Clinton county! Huckabee and Clinton are somewhat complementary. Neither McCain nor Romney is particularly complementary with any Democrat (see California, where McCain and Romney split the Hillary-Obama counties), though both did better in Obama counties when Huckabee was in play.

One distracting feature of the plots above is that counties aren't uniformly populous: Obama won Missouri while carrying only six counties. An alternative is to treat this as an ecological inference problem, in which we try to estimate the population totals in each of the cross-tab cells from the county-level results. This isn't perfectly accurate, since Edwards voters don't actually also vote for Huckabee. But it does provide a nice framework for scaling the mosaic plot by population size and making it look generally less degenerate. I did that using Ryan Moore's eiPack and got this.
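As a rough illustration of the ecological inference framing, the sketch below computes deterministic bounds on a cell share from county-level marginals, then weights them by population. This is the simple method of bounds, not the eiPack model used for the plot. It assumes each county's Republican and Democratic vote shares can be treated as the two margins of a hypothetical joint table. The county figures and function names are invented for illustration and are not real returns.

```python
def cell_bounds(rep_share, dem_share):
    """Bounds on a joint cell share given its two marginal shares."""
    lower = max(0.0, rep_share + dem_share - 1.0)
    upper = min(rep_share, dem_share)
    return lower, upper

def aggregate_bounds(counties, rep, dem):
    """Population-weighted bounds on the statewide (rep, dem) cell share."""
    total_pop = sum(c["pop"] for c in counties)
    lo = hi = 0.0
    for c in counties:
        lower, upper = cell_bounds(c["rep"][rep], c["dem"][dem])
        lo += c["pop"] * lower
        hi += c["pop"] * upper
    return lo / total_pop, hi / total_pop

# Hypothetical counties, invented for illustration only.
counties = [
    {"pop": 1000, "rep": {"Huckabee": 0.7}, "dem": {"Edwards": 0.6}},
    {"pop": 9000, "rep": {"Huckabee": 0.2}, "dem": {"Edwards": 0.1}},
]

lo, hi = aggregate_bounds(counties, "Huckabee", "Edwards")
print(round(lo, 3), round(hi, 3))  # 0.03 0.15
```

In this made-up example, the small county is strongly Huckabee/Edwards, but the large county dominates the statewide cell. This is the effect that scaling by population is meant to capture. A model-based approach such as the one in eiPack would produce point estimates rather than just bounds.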

Posted by Kevin Bartz at March 20, 2008 5:53 PM

Comments

Interesting post. Not being familiar with mosaic plots, it took me a while to wrap my head around what the plots were saying, e.g., to imagine what it would look like with axes reversed. Now I think I see it -- the area of each block is proportional to the number of counties won by a given combination (e.g., Clinton & Romney); putting the Republicans on the x-axis means that the blocks are lined up to make it easy to compare Republicans in terms of the proportion of their counties won by a particular Democrat.

Two things stand out to me: 1) Clinton's advantage in counties won in Texas, California, and Missouri is a lot larger than I would have thought, given the relatively even popular vote and the allocation of delegates. I guess this is mainly because she won in a lot of relatively unpopulous counties. 2) The complementarity of Huckabee and Clinton, or perhaps the uncomplementarity of Huckabee and Obama, is quite strong, particularly in the South -- Obama's counties in Georgia and Alabama seem roughly equally split between Huckabee and McCain, while Clinton's are maybe 85% Huckabee. It looks to me like this kind of imbalance is evident in basically every state except California. Again this fits in with the idea that urban, highly educated, and young people go for Obama and didn't go for Huckabee.

Posted by: Andy Eggers at March 21, 2008 12:09 PM
