October 2008
Sun Mon Tue Wed Thu Fri Sat
      1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31  

Authors' Committee

Chair:

Matt Blackwell (Gov)

Members:

Martin Andersen (HealthPol)
Kevin Bartz (Stats)
Deirdre Bloome (Social Policy)
Andy Eggers (Gov)
John Graves (HealthPol)
Rich Nielsen (Gov)
Maya Sen (Gov)
Gary King (Gov)

Weekly Research Workshop Sponsors

Alberto Abadie, Lee Fleming, Adam Glynn, Guido Imbens, Gary King, Arthur Spirling, Jamie Robins, Don Rubin, Chris Winship

Recent Comments

Recent Entries

Categories

Blogroll

Brad DeLong
Cognitive Daily
Complexity & Social Networks
Developing Intelligence
EconLog
The Education Wonks
Empirical Legal Studies
Free Exchange
Freakonomics
Health Care Economist
Junk Charts
Language Log
Law & Econ Prof Blog
Machine Learning (Theory)
Marginal Revolution
Mixing Memory
Mystery Pollster
New Economist
Political Arithmetik
Political Science Methods
Pure Pedantry
Science & Law Blog
Simon Jackman
Social Science++
Statistical modeling, causal inference, and social science

Archives

Notification

Powered by
Movable Type 4.24-en


« Dan Hopkins on the (non-)Bradley/Wilder Effect | Main | Firefox plugin for webscraping »

22 October 2008

Useful metric for comparing two distributions?

In reading Bill Easterly's working paper "Can the West Save Africa?," I came across an interesting metric Easterly uses to compare African nations with the rest of the world on a set of development indicators. The metric is, "Given that there are K African nations, what percent of the K lowest scoring countries were African?" I don't think I've ever seen anyone use that particular metric, but maybe someone has. Does it have a name? Does it deserve one?

Generally, looking at the percent of units below (or above) a certain percentile that have some feature is a way of describing the composition of that tail of the distribution. What's interesting about using a cutoff corresponding to the total number of units with that feature is that it produces an intuitive measure of overlap of two distributions: it gives us a rough sense of how many countries would have to switch places before all the worst countries were African or, put differently, before all of the African countries are in the worst group. It reminds me a bit of measures of misclassification in machine learning, where here the default classification is, "All the worst countries are African."

Needless to say, the numbers were bleak -- 88% for life expectancy, 84% for percent of population with HIV, 75% for infant mortality.

Posted by Andy Eggers at October 22, 2008 11:02 PM