7 March 2009
I have run into a problem when using a lognormal distribution to model income. A number of people in my dataset report zero income, perhaps because of unemployment, and the logarithm of zero is undefined, so I am wondering how to handle these observations. I have noticed that some researchers simply drop the observations with zero income, while others assign them a small positive amount so that the logarithm can be taken. Obviously, we can try both approaches and see whether the results hold up. But I am wondering whether any experts on this topic can clarify the pros and cons of these and other approaches to treating zero incomes.
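To make the comparison concrete, here is a minimal sketch on synthetic data rather than a real survey. Everything in it is an assumption chosen for illustration: the 10% share of zero incomes, the lognormal parameters used to simulate the positive incomes, and the two replacement amounts `1.0` and `0.01`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic incomes (illustrative values only): about 10% zeros,
# the rest drawn from a lognormal with mu = 10 and sigma = 0.8.
n = 10_000
is_zero = rng.random(n) < 0.10
income = np.where(is_zero, 0.0, rng.lognormal(mean=10.0, sigma=0.8, size=n))


def lognormal_fit(x):
    """Maximum-likelihood (mu, sigma) of a lognormal: mean and std of log(x)."""
    logx = np.log(x)
    return logx.mean(), logx.std()


# Approach 1: drop the zero incomes before taking logs.
mu_drop, sigma_drop = lognormal_fit(income[income > 0])

# Approach 2: replace each zero with a small positive amount eps, then take logs.
fits_shift = {
    eps: lognormal_fit(np.where(income == 0, eps, income))
    for eps in (1.0, 0.01)
}

print(f"drop zeros : mu = {mu_drop:.2f}, sigma = {sigma_drop:.2f}")
for eps, (mu, sigma) in fits_shift.items():
    print(f"eps = {eps:<5}: mu = {mu:.2f}, sigma = {sigma:.2f}")
```

In this setup, dropping the zeros recovers the parameters of the positive incomes but says nothing about the people with no income. Replacing the zeros keeps everyone in the sample, but the fitted mean and spread of log income move with the arbitrary choice of `eps`, so the results are worth checking across several values.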
A related question is which model you think fits the income distribution best: a lognormal, a power-law distribution, a mixture of a Normal and a point mass at zero, or something else.
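One way to picture the mixture idea is a two-part model: a point mass at zero plus a continuous distribution for the positive incomes. The sketch below swaps in a lognormal for the positive part, which is my assumption and not the Normal named above. The function names `fit_two_part` and `two_part_mean` are hypothetical, and the sample incomes are made up.

```python
import numpy as np


def fit_two_part(income):
    """Fit a point mass at zero plus a lognormal for the positive incomes.

    The likelihood factorizes, so the maximum-likelihood estimates are the
    observed share of zeros and the mean/std of log income among positives.
    """
    income = np.asarray(income, dtype=float)
    positive = income[income > 0]
    p_zero = 1.0 - positive.size / income.size
    logx = np.log(positive)
    return p_zero, logx.mean(), logx.std()


def two_part_mean(p_zero, mu, sigma):
    """Mean income implied by the fitted model: P(income > 0) * E[lognormal]."""
    return (1.0 - p_zero) * np.exp(mu + 0.5 * sigma**2)


# Made-up sample: two zero incomes and two positive incomes.
sample = [0.0, 0.0, 20_000.0, 60_000.0]
p_zero, mu, sigma = fit_two_part(sample)
print(p_zero, mu, sigma, two_part_mean(p_zero, mu, sigma))
```

Because the two parts are estimated separately, no observation is dropped and no arbitrary small amount has to be chosen. The cost is an extra parameter, and zero and positive incomes are described by separate components of the model.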
I look forward to your thoughts on these questions.
Lastly, here is an interesting animation of the income distribution in the USA.
Posted by Weihua An at March 7, 2009 6:07 PM