12 December 2006
This discussion came up yesterday in the Bayes course. There is a plethora of names for multilevel models. Sociologists seem to prefer "hierarchical," many statisticians say "mixed effects," and usage in economics is heterogeneous. Standardizing seems reasonable, but it is unlikely to happen. Perhaps the most common set of names comes from the following setup. Given two data matrices, x_{ij} for individual i in cluster j, and z_j for cluster j, there are perhaps four canonical models:
"Pooled:" y_{ij} = \alpha + x_{ij}'\beta + z_j'\gamma + e_{ij}
"Fixed Effect:" y_{ij} = \alpha_j + x_{ij}'\beta + e_{ij}
"Random Effect:" y_{ij} = \alpha_j + x_{ij}'\beta + z_j'\gamma + e_{ij}
"Random Intercept and Random Slope:" y_{ij} = \alpha_j + x_{ij}'\beta_j + z_j'\gamma + e_{ij}
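To make the difference between the first two specifications concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than part of the original discussion: the function names (`simulate`, `pooled_slope`, `within_slope`), a single scalar x_{ij}, no z_j, and simulated data in which x_{ij} is correlated with the cluster intercept \alpha_j. It compares the pooled slope with the "fixed effect" slope obtained by demeaning within each cluster.

```python
import random
from collections import defaultdict


def simulate(n_clusters=30, n_per=20, beta=2.0, seed=0):
    """Simulate y_ij = alpha_j + beta * x_ij + e_ij (hypothetical setup).

    x_ij is deliberately correlated with alpha_j, so ignoring the
    cluster intercepts distorts the estimated slope.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_clusters):
        alpha_j = rng.gauss(0.0, 3.0)          # cluster-specific intercept
        for _ in range(n_per):
            x = alpha_j + rng.gauss(0.0, 1.0)  # covariate tied to the cluster
            y = alpha_j + beta * x + rng.gauss(0.0, 1.0)
            rows.append((j, x, y))
    return rows


def slope(xs, ys):
    """Least-squares slope of ys on xs, with an intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def pooled_slope(rows):
    """'Pooled' model: one common intercept, clusters ignored."""
    return slope([x for _, x, _ in rows], [y for _, _, y in rows])


def within_slope(rows):
    """'Fixed effect' model: a separate alpha_j per cluster.

    Demeaning x and y within each cluster removes alpha_j, which gives
    the same slope as including one dummy variable per cluster.
    """
    groups = defaultdict(list)
    for j, x, y in rows:
        groups[j].append((x, y))
    xd, yd = [], []
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        xd.extend(x - mx for x, _ in obs)
        yd.extend(y - my for _, y in obs)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)


if __name__ == "__main__":
    data = simulate()
    print("pooled slope:", round(pooled_slope(data), 3))
    print("within slope:", round(within_slope(data), 3))
```

Under these assumptions the within-cluster slope should land near the true value of 2, while the pooled slope is pulled upward because x_{ij} carries information about \alpha_j. The "random effect" and "random slope" models need an estimate of the variance of \alpha_j (and \beta_j) and are left out of this sketch.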
Some prefer "random intercepts" for "fixed effects," and perhaps we can consider these all to be members of a larger family where indices are turned on and off systematically. On the other hand, maybe it's just terminology and not worth worrying about too much. Thoughts?
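One way to write down the "larger family" idea, offered only as a sketch: take the most general of the four models above and split each cluster-level coefficient into a common part plus a deviation. The deviations a_j and b_j and the variance symbols \sigma_a^2 and \Sigma_b are notation introduced here for illustration, not part of the original setup.

```latex
\begin{align*}
y_{ij}   &= \alpha_j + x_{ij}'\beta_j + z_j'\gamma + e_{ij}, \\
\alpha_j &= \alpha + a_j, \qquad \beta_j = \beta + b_j .
\end{align*}
% Each named model is then a restriction on a_j and b_j:
\begin{tabular}{ll}
Pooled                            & $a_j = 0,\; b_j = 0$ \\
Fixed effect                      & $a_j$ unrestricted constants, $b_j = 0$,
                                    $z_j'\gamma$ absorbed into $\alpha_j$ \\
Random effect                     & $a_j \sim N(0, \sigma_a^2),\; b_j = 0$ \\
Random intercept and random slope & $a_j \sim N(0, \sigma_a^2),\; b_j \sim N(0, \Sigma_b)$
\end{tabular}
```

Read this way, "turning an index on or off" amounts to choosing whether a_j and b_j are fixed at zero, left free, or given a distribution. The normal distributions are an assumption made for concreteness.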
Posted by Jeff Gill at December 12, 2006 10:23 AM
Jeff,
I prefer the term "multilevel" for reasons discussed in the new book. Also, I avoid the terms "fixed" and "random" effects because they are used differently by different people; see here:
http://www.stat.columbia.edu/~cook/movabletype/archives/2005/01/why_i_dont_use.html
and accompanying references and discussion.
Finally, yes, varying-intercept, varying-slope models are the way to go.
Posted by: Andrew Gelman at December 12, 2006 8:53 PM
A similar thing comes up with some time series models. People often refer to models with an autoregressive (AR) ERROR process as an AR model. In fact, this is not the case, since if the dynamic is in the errors, it is a moving average process. A lagged dependent variable is what generates an AR process. So layer that complication in with what you have above (adding a "t" index) and those of us who do time series are even further confounded by the terminology!
Posted by: Patrick Brandt at December 13, 2006 10:18 AM