3 October 2008

Biden-Palin Linguistics

This post looks at the linguistics of last night's Biden-Palin debate. Palin used the word "reform" 12 times compared to Biden's none. Biden used "middle class" 12 times to Palin's one.


Here's a sequel to my earlier Obama-Clinton post.

Overall, Biden uttered 7,065 words and Palin 7,646, with a total of 2,117 unique words. Which words did Biden use significantly more or less than Palin? For each word, we apply a chi-squared test of the null hypothesis that the two candidates spoke the word with equal probability. We then sort the list by p-value, so the starkest differences come first. I've eliminated words that appear over 50 times (mostly stop words like "the," which Palin evidently used a couple hundred times more than Biden).

Word     Biden  Palin  p-value
also         3     47   0.0000
their       32      4   0.0000
number      15      0   0.0002
want         0     16   0.0003
united      16      1   0.0004
policy      22      4   0.0004
just         6     28   0.0007
those       10     34   0.0013
too          0     13   0.0014
they        41     18   0.0015
well        24      7   0.0019
these        1     15   0.0020
said        40     18   0.0022
reform       0     12   0.0023
who         11     34   0.0025
even         3     19   0.0025
down        16      3   0.0034
gwen        16      3   0.0034
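
For readers who want to try this at home, here is a minimal sketch of the per-word test in Python. The post does not say which flavor of chi-squared test was run, so the 2x2 table (word vs. all other words, Biden vs. Palin) and the Yates continuity correction are assumptions, and the function name `word_pvalue` is made up for illustration. The word totals are the ones quoted above.

```python
import math

BIDEN_TOTAL = 7065  # total words spoken, as quoted in the post
PALIN_TOTAL = 7646


def word_pvalue(count_a, count_b, total_a, total_b, yates=True):
    """p-value for H0: both speakers use the word with equal probability.

    Builds a 2x2 table (this word vs. every other word, speaker A vs. B)
    and runs a chi-squared test with 1 degree of freedom. The Yates
    correction is an assumption, not something the post specifies.
    """
    if count_a + count_b == 0:
        return 1.0  # word never spoken: nothing to test
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    n = total_a + total_b
    row_sums = [total_a, total_b]
    col_sums = [count_a + count_b, n - count_a - count_b]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                diff = max(0.0, diff - 0.5)
            chi2 += diff ** 2 / expected
    # Survival function of a chi-squared variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


# A few (Biden, Palin) counts taken from the table above.
counts = {"reform": (0, 12), "number": (15, 0), "policy": (22, 4)}
ranked = sorted(
    ((word, word_pvalue(b, p, BIDEN_TOTAL, PALIN_TOTAL))
     for word, (b, p) in counts.items()),
    key=lambda item: item[1],
)
for word, pval in ranked:
    print(f"{word:<8}{pval:.4f}")
```

Sorting by p-value puts the most lopsided words first, which is how the table is ordered. With this many words tested at once, the p-values are best read as a ranking device rather than as formal significance levels.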

Observations:


  • Look at all the connectors in the Palin column: "also," "just," "too." She uses these words to string together her thoughts.

  • Biden's favored connectors are "number" -- from his several "number one, number two" formulations -- and "well."

  • Two interesting ones are "policy" (Biden 22, Palin 4) and "reform" (Palin 12, Biden 0). These were certainly buzzwords from debate prep.

We can also look at bigrams, pairs of words, in a similar way.


  • Biden used "United States" 16 times as opposed to Palin's 1.

  • Like McCain, Palin used the term "middle class" only once, compared to Biden's 12. To be fair, she made several allusions to the middle class without using the term itself.

  • Here's an interesting observation: Palin shares many of Obama's constructions; her favorite connectors are "and that's" and "and i."

Bigram         Biden  Palin  p-value
the united        16      1   0.0004
united states     16      1   0.0004
we have            9     34   0.0007
want to            0     14   0.0009
he said           11      0   0.0016
have got           0     12   0.0023
and i              6     25   0.0025
that is            4     21   0.0026
and that's         1     14   0.0032
middle class      12      1   0.0035
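
Counting bigrams is a small step up from counting words. The sketch below is one way to do it in Python; the tokenizer (lowercase, letters and apostrophes only) is an assumption, since the post does not describe how the transcript was split into words, and `tokenize` and `bigram_counts` are illustrative names.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def bigram_counts(text):
    """Count each pair of adjacent words, joined by a single space."""
    tokens = tokenize(text)
    return Counter(" ".join(pair) for pair in zip(tokens, tokens[1:]))


sample = "The United States and the United Kingdom. And that's that."
for bigram, count in bigram_counts(sample).most_common(3):
    print(bigram, count)
```

Run once per candidate, the resulting counts can be compared with the same chi-squared test used for single words.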

Posted by Kevin Bartz at 11:38 AM

Regression Discontinuity Reversed

I recently came across a new paper by David Card, Alexandre Mas, and Jesse Rothstein entitled "Tipping and the Dynamics of Segregation." What's interesting from a methodological standpoint is that the authors use what may be called "inverted" regression discontinuity methods to test for race-based tipping in neighborhoods in American cities.

In a classic regression discontinuity design, researchers exploit the fact that treatment assignment changes discontinuously as a function of one or more underlying variables. For example, scholarships may be assigned based on whether students exceed a test score threshold (as in the classic paper by Thistlethwaite and Campbell (1960)). Unlucky students who just miss the threshold are assumed to be virtually identical to lucky ones who score just above the cutoff value, so that the threshold offers a clean identification of the counterfactual of interest (assuming no sorting).
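
To make the classic design concrete, here is a minimal Python sketch of a sharp regression discontinuity estimate: fit a separate straight line on each side of a known cutoff, within a bandwidth, and take the gap between the two fitted values at the cutoff. The data are synthetic and the names (`ols_line`, `rd_estimate`), the cutoff of 60, and the bandwidth of 10 are all illustrative assumptions, not taken from any of the papers mentioned.

```python
def ols_line(xs, ys):
    """Least-squares intercept and slope for a single regressor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(scores, outcomes, cutoff, bandwidth):
    """Jump in the outcome at the cutoff, from two local linear fits."""
    below = [(s, o) for s, o in zip(scores, outcomes)
             if cutoff - bandwidth <= s < cutoff]
    above = [(s, o) for s, o in zip(scores, outcomes)
             if cutoff <= s <= cutoff + bandwidth]
    a0, b0 = ols_line(*zip(*below))
    a1, b1 = ols_line(*zip(*above))
    # Difference between the two fitted lines evaluated at the cutoff.
    return (a1 + b1 * cutoff) - (a0 + b0 * cutoff)


# Synthetic example: the outcome rises smoothly with the test score and
# jumps by 3 for students at or above a (hypothetical) cutoff of 60.
scores = list(range(40, 81))
outcomes = [1 + 0.05 * s + (3 if s >= 60 else 0) for s in scores]
print(rd_estimate(scores, outcomes, cutoff=60, bandwidth=10))
```

Real applications add noise, standard errors, and a principled bandwidth choice; the sketch only shows where the estimate comes from.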

In the Card et al. paper, the situation is slightly different because the authors have no hard-and-fast decision rule, but rather a theory that posits that whites' willingness to pay for homes depends on the neighborhood minority share and exhibits tipping behavior. If the minority share exceeds a critical threshold, all the white households will leave. Since the location of the (city-specific) tipping point is unknown, the authors estimate it from the data and find that there are indeed significant discontinuities in the white population growth rate at the identified tipping points. Once the tipping point is located, they go on to examine whether rents or housing prices exhibit non-linearity around the tipping point but find no effects. They also try to explain the location of the tipping points by looking at survey data on racial attitudes of whites. Cities with more tolerant whites appear to have higher tipping points.
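
The "inverted" step, locating a cutoff that is not known in advance, can be illustrated with a simple breakpoint search. The Python sketch below is not the procedure from the paper; it is a generic stand-in that tries every candidate split and keeps the one where a two-level step function fits best (smallest sum of squared errors). The function name `find_break`, the `min_side` parameter, and the synthetic data are all assumptions for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def find_break(x, y, min_side=5):
    """Return (cutoff, jump) for the best-fitting two-level step function.

    Observations with x below the cutoff form the left group; the rest
    form the right group. Each side must keep at least min_side points.
    """
    pairs = sorted(zip(x, y))
    best = None
    for k in range(min_side, len(pairs) - min_side + 1):
        if pairs[k][0] == pairs[k - 1][0]:
            continue  # cannot split between identical x values
        left = [p[1] for p in pairs[:k]]
        right = [p[1] for p in pairs[k:]]
        total = sse(left) + sse(right)
        if best is None or total < best[0]:
            best = (total, pairs[k][0], mean(right) - mean(left))
    return best[1], best[2]


# Synthetic neighborhoods: white population growth drops from 0.1 to -0.2
# once the minority share reaches a (hypothetical) tipping point of 0.13.
share = [i / 100 for i in range(50)]
growth = [0.1 if s < 13 / 100 else -0.2 for s in share]
print(find_break(share, growth))
```

One caveat with any search like this: picking the cutoff and measuring the jump on the same observations tends to overstate the jump, so in practice the two steps are usually separated, for instance by using different subsamples.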

I think this is a very creative paper. The general approach could be useful in other contexts, so take a look!


Posted by Jens Hainmueller at 8:10 AM