29 September 2005

Near, Far, Wherever You Are

Sebastian Bauhoff

Tobler's First Law of Geography states that "everything is related to everything else, but near things are more related than distant things." Obviously there are many examples -- an infection is more likely to spread to a nearby person than to a faraway one, a new highway might depress house prices for people living right next to it, and so on. The point is that there can be important dependencies and heterogeneities that vary with space, among other associations. And in those cases the usual assumption that observations or errors are independently distributed doesn't hold. Urgh. Welcome to the world of spatial statistics.

As an estimation problem this is often addressed through clustering methods. Households in a village with some infected persons are at higher risk than households in neighboring villages. Or are they really? Clustering works when the locations are relatively homogeneous and well separated. What if there is no good way to classify observations into clusters, for example, if an area is evenly populated? Or if the infected household lives right at the end of the village road, and some of its neighbors are in the other village? The administrative boundaries commonly used for clustering (village name) might not properly account for the actual proximity, or whatever else defines the space between the observations. If a transmitting mosquito wouldn't care much about the village name when deciding who to bite next, why should an analyst rely on it?
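
To make the village-boundary problem concrete, here is a minimal sketch in Python with NumPy. The coordinates, village labels, 1 km cutoff and function names are all invented for illustration; they are not taken from any dataset or package. It builds one "neighbor" matrix from village membership and another from physical distance, and the two disagree exactly for the household at the end of the village road.

```python
import numpy as np

# Hypothetical household locations (in km) and village labels, made up for this example.
coords = np.array([
    [0.0, 0.0],   # village A
    [0.5, 0.0],   # village A
    [2.9, 0.0],   # village A, at the end of the village road
    [3.1, 0.0],   # village B, just across the boundary
    [6.0, 0.0],   # village B
])
village = np.array(["A", "A", "A", "B", "B"])


def cluster_weights(labels):
    """1 if two distinct households share a cluster label, else 0."""
    w = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(w, 0.0)  # a household is not its own neighbor
    return w


def distance_band_weights(xy, cutoff):
    """1 if two distinct households lie within `cutoff` of each other, else 0."""
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = (dist <= cutoff).astype(float)
    np.fill_diagonal(w, 0.0)
    return w


w_village = cluster_weights(village)
w_near = distance_band_weights(coords, cutoff=1.0)  # the 1 km cutoff is an assumption

# Households 2 and 3 are 0.2 km apart but sit in different villages.
print(w_village[2, 3], w_near[2, 3])
# Households 0 and 2 share a village but are 2.9 km apart.
print(w_village[0, 2], w_near[0, 2])
```

A distance band is only one of many ways to define neighbors (nearest neighbors or inverse-distance weights are others), and choosing the cutoff is itself an assumption the analyst has to defend.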

Clustering is often a good approximation, but in some cases it's not good enough: there can be substantial spatial lags (observations are spatially dependent), spatial errors (error terms are related) and spatial heterogeneity (model parameters vary across space). Those can lead to biased estimates, inefficient ones, or both. The bad news is that those effects can matter a lot. The good news is that there are methods to test for spatial dependence and correlation, and estimation techniques to deal with them.
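
One widely used test for spatial dependence is Moran's I, which I pick here purely as an illustration; the references at the end of this entry cover others. Below is a minimal sketch in Python with NumPy, assuming a binary neighbor matrix `w` and a permutation-based pseudo p-value. The function names and the toy data (six sites along a road, each neighboring the next) are made up for this example.

```python
import numpy as np


def morans_i(y, w):
    """Moran's I = (n / sum(w)) * (z' W z) / (z' z), with z = y - mean(y)."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    z = y - y.mean()
    return (len(y) / w.sum()) * (z @ w @ z) / (z @ z)


def permutation_pvalue(y, w, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation.

    Shuffles the values across locations to mimic 'no spatial pattern'
    and counts how often the shuffled statistic reaches the observed one.
    """
    rng = np.random.default_rng(seed)
    observed = morans_i(y, w)
    sims = np.array([morans_i(rng.permutation(y), w) for _ in range(n_perm)])
    return (1 + np.sum(sims >= observed)) / (n_perm + 1)


# Toy example: six sites in a row; site i neighbors sites i-1 and i+1.
n = 6
w = np.zeros((n, n))
for i in range(n - 1):
    w[i, i + 1] = w[i + 1, i] = 1.0

clustered = [1, 1, 1, 0, 0, 0]    # similar values sit next to each other
alternating = [1, 0, 1, 0, 1, 0]  # neighbors always differ

print(morans_i(clustered, w))    # approximately 0.6
print(morans_i(alternating, w))  # approximately -1.0
print(permutation_pvalue(clustered, w))
```

Values well above zero suggest that neighbors look alike, and values well below zero that they differ. With only six sites the permutation p-value is not very informative; the point is just to show the mechanics.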

Of course the underlying interactions we are trying to better capture can be anything from linear to more complicated relations. It is unlikely that they are perfectly described by any abstract spatial model, so we will still need to make assumptions. But at least there are some methods that can handle cases where the usual assumptions fail, and they can make an important difference to the analysis. I will write more about them in later blog entries. Meanwhile you might be interested in the following texts:

-- James LeSage's Econometrics Toolbox (www.spatial-econometrics.com) has an excellent workbook discussing spatial econometrics and examples for the MATLAB functions provided on the same site
-- Anselin (2002) "Under the Hood: Issues in the Specification and Interpretation of Spatial Regression Models" Agricultural Economics 27: 247-267 provides a quick overview of the issues
-- Anselin (1988) Spatial Econometrics: Methods and Models is the classic and widely quoted reference for spatial statistics

Posted by James Greiner at September 29, 2005 6:00 AM
