1 November 2005
Jim Greiner
Social science statistics is everywhere. So is law. And both are tangled up with each other. I was forcefully reminded of these facts when my wife pointed out an article on Salon.com about an opinion Samuel Alito (as of yesterday, a nominee to the Supreme Court) wrote while a judge on the United States Court of Appeals for the Third Circuit in a case called Riley v. Taylor. The facts of the specific case, which concerned the potential use of race in peremptory challenges in a death penalty trial, are less important than Judge Alito's approach to statistics and the burden of proof.
Schematically, the facts of the case follow this pattern: Party A has the burden of proof on an issue concerning race. Party A produces some numbers that look funny, meaning instinctively unlikely in a race-neutral world, but conducts no significance test or other formal statistical analysis. The opposing side, Party B, doesn't respond at all, or if it does respond, it simply points out that a million different factors could explain the funny-looking numbers. Party B does not attempt to show that such innocent factors actually do explain the observed numbers, just that they could, and that Party A has failed to eliminate all such alternative explanations.
Such fact patterns arise over and over again in cases involving employment discrimination, housing discrimination, peremptory challenges, and racial profiling, just to name a few. When discussing them, judges inevitably lament the fact that one side or the other did not conduct a multiple regression analysis, as if that technique would provide all the answers (Judge Alito's Riley opinion is no exception here).
The point is, of course, that how a judge views such cases has almost nothing to do with the facts at bar and everything to do with a judge's priors on the role of race in modern society. For judges who believe that race has little relevance in the thought processes of modern decision makers (employers, landlords, prosecutors, cops), Party A in the above situation must eliminate all potential explanatory factors via (alas) multiple regression in order to meet its burden of production. For judges who believe that race still matters, Party B must respond in the above situation or lose the case. Judge Alito's Riley opinion demonstrates where he stands here.
Is there a middle way? Perhaps. In the above situation, what about requiring some sort of significance test from Party A, but not one that eliminates alternative explanations? In the specific facts of Riley, the number-crunching necessary for "some sort of significance test" is the statistical equivalent of riding a tricycle: a two-by-two hypergeometric with row totals of 71 whites and 8 blacks, column totals of 31 strikes and 48 non-strikes, and an observed value of 8 black strikes yields a p-value of 0.
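For readers who want to check the arithmetic, here is a minimal sketch of that two-by-two hypergeometric calculation in Python, using only the margins quoted above. The function name and the one-sided framing (the probability of at least the observed number of black strikes) are assumptions made for illustration, not something taken from the opinion or the briefs.

```python
from math import comb


def hypergeom_upper_tail(n_black, n_white, n_strikes, black_struck):
    """P(X >= black_struck), where X is the number of black venire members
    among n_strikes people drawn without replacement from the whole pool.

    Hypothetical helper written for this illustration.
    """
    total = n_black + n_white
    max_k = min(n_black, n_strikes)
    denom = comb(total, n_strikes)
    # Sum the hypergeometric probabilities from the observed count upward.
    tail = sum(
        comb(n_black, k) * comb(n_white, n_strikes - k)
        for k in range(black_struck, max_k + 1)
    )
    return tail / denom


# Margins from the post: 71 whites, 8 blacks, 31 strikes, 8 black strikes.
p_value = hypergeom_upper_tail(n_black=8, n_white=71, n_strikes=31, black_struck=8)
print(f"one-sided p-value: {p_value:.6f}")
```

Because all 8 black venire members were struck, the tail consists of a single term, and under these margins it works out to roughly 3 in 10,000.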
Posted by James Greiner at November 1, 2005 3:58 AM
Jim,
That's interesting. But if you really want to make it sound like "riding a tricycle," call it a chi-squared test and not a "hypergeometric"!
See more here (including why I don't think the hypergeometric is quite correct).
Posted by: Andrew at November 1, 2005 8:08 PM
Prof Gelman, many thanks for reading the entry and for your comment. (Folks out there, I recommend going to Prof Gelman's link for the full text of his discussion of whether the hypergeometric is the proper distribution). One thought in response: in the specific context of the Riley case, it may be appropriate to condition on both the row totals and the column totals (and thus to use a hypergeometric instead of a chi-squared). We both agree that the race (row) totals are fixed. Re the columns, the number 48 represents 4 death penalty cases x 12 jurors per case, and is thus set in stone (if one conditions on the number of cases, which one has to do if one is fixing the row totals). Once 48 is fixed, the other column total is determined by the fact that the column totals and row totals must sum to the same figure.
In any case, because the chi-squared and the hypergeometric reject the null at any reasonable significance level, the number-crunching involved in Riley really is like riding a tricycle.
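As a rough check on that claim, here is a sketch of the plain Pearson chi-squared statistic (no continuity correction) for the two-by-two table implied by the margins in the post. The cell layout and variable names are assumptions for illustration; the p-value uses the fact that a chi-squared variable with one degree of freedom is the square of a standard normal.

```python
from math import erfc, sqrt

# 2x2 table implied by the margins in the post:
#            struck   not struck
# black         8          0
# white        23         48
a, b, c, d = 8, 0, 23, 48
n = a + b + c + d

# Pearson chi-squared statistic for a 2x2 table, no continuity correction.
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Upper tail of chi-squared with 1 degree of freedom: P(Z^2 > x) = erfc(sqrt(x / 2)).
p_chi2 = erfc(sqrt(chi2 / 2))
print(f"chi-squared = {chi2:.2f}, p-value = {p_chi2:.6f}")
```

The p-value comes out well below 0.001. Note that the expected count in the black-struck cell is only about 3.1 (8 x 31 / 79), so the chi-squared approximation is crude here, but it points the same way as the hypergeometric.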
Posted by: Jim Greiner at November 1, 2005 10:49 PM
I guess my experience differs from yours, but I find this case highly atypical. It seems to me routine that a statistician has put a p-value on the observed data, and the defendant hires a statistician to show that taking a few obvious causal factors into account the anomaly goes away. A case where a statistical anomaly is simply listed with no measure of statistical significance sounds like evidence that the court just ought to ignore, unless it's willing to do the runs itself. But maybe that's just because it's what I do for a living.
Posted by: Jonathan at November 2, 2005 10:07 AM
"the defendant hires a statistician to show that taking a few obvious causal factors into account the anomaly goes away."
Indeed. If, as the FBI Uniform Crime Reports show, African Americans commit violent crimes at a higher rate, it is not out of the question that they are also overrepresented in the far right tail of the distribution with the most severe crimes.
Posted by: blah at November 7, 2005 1:37 PM
Jim,
Also take a look at the latest Chance News for more on Alito's statistical argument.
Posted by: Andrew at November 8, 2005 9:32 PM
Interesting post, and interesting blog.
For what it's worth, I too am a lawyer and a statistician. Not only that, I recently clerked for a federal judge, and had a death penalty case involving a Batson claim in the Third Circuit!
Although the defense never raised this argument, I too wondered about the statistical significance of the prosecutor's peremptory strikes. I too came to the conclusion (independently I should add) that the hypergeometric/Fisher's exact test was the proper test, and got a p-value of about 1 in half a million.
I dropped a footnote about it in the opinion, but eventually took it out because it wasn't quite proper (made the judge look like she was doing the job of the defense attorneys). We were granting relief on other grounds anyway.
Posted by: Mahan Atma at November 10, 2005 12:43 AM