November 5, 2009

Final thoughts for now on the Coburn attacks on the online townhall report

This is hopefully my last post for a while on the online townhalls. I do think there is a value in dialog and discourse, and I wanted to excerpt the critiques from some of the more credible online sources and provide my responses, more for posterity than anything else.

I would first note that none of the posts responded to the findings that the online townhalls reached people who do not show up to regular townhalls (based on demographics), where, notably, people who were frustrated with the political system were more likely to show up. Further, none of the criticisms deal with the findings that the people who participated were subsequently more engaged in politics, more likely to vote, more knowledgeable about the policy, and more likely to follow the election. These are the parts of the report that actually have normative bite; clearly, approval of Members of Congress by itself is neither here nor there normatively. These may be things that the bloggers below do not value (in which case they should explain why) or that they actively ignored.

In any case, here is what they did say.

First, the Heritage post, in its words:

[referring to the fact that the research methodology called for offensive questions to be culled] CMF does not say what qualifies as offensive, but if this summer is any indication that definition would include anything that the Congressman did not want to talk about. In other words, this report urges Congressmen not to actually interact with their constituents, but to avoid them altogether by holding safe townhalls they can completely control. And what did CMF find where the results of these Potemkin townhalls?

The online town halls increased constituents' approval of the Member. Every Member involved experienced an increase in approval by the constituents who participated. The average net approval rating (approve minus disapprove) jumped from +29 before the session to +47 after. There were also similar increases in trust and perceptions of personal qualities - such as whether they were compassionate, hardworking, accessible, etc. - of the Member.

The lesson: avoid your constituents' inconvenient questions and your approval ratings will rise. And this is a taxpayer funded study. Here is the grant from the National Science Foundation.

Congress is actually using your tax dollars to pay social scientists to find ways they can avoid actually talking to their constituents while improving their chances of reelection.

Response: as noted in the report, the possibility of screening anything as "offensive" was theoretical. We did not actually exclude any questions for this reason (we did state this in a footnote rather than in the text). We had pretty high (low?) standards for offensive--e.g., we would not post questions that included expletives. It was part of our research protocols, and thus our instinct was to mention it.

That said, it is worth noting that the medium is potentially manipulable, and there is nothing to stop someone who is doing an online townhall from excluding difficult questions. (Of course, all communication media are manipulable in some way, so it is not obvious that this is an advantage or disadvantage of online townhalls.) We had a neutral moderator, and included all questions that time would allow, in the order that they were posted. This included some that were pretty hostile to the Member. Our assessment (and recommendation) was that these very confrontations made the events more effective, because they reflected the authenticity of the event. In short, the Members' approval ratings increased because they had done the right thing.


The Wall Street Journal:

The National Science Foundation prides itself on making research grants that lead to path-breaking discoveries. So it seemed odd to Coburn, a physician known as the Senate's 'Dr. No,' that science foundation money was being used to show legislators how to exile angry town-hall mobs to cyberspace.

Response: It is unclear whether the blogger is speaking in her voice, or Senator Coburn's. In any case, the criticism that we were trying to eliminate traditional townhalls came up repeatedly through the blogs. We do not advocate this in the report (or elsewhere). But the online and tele-townhalls do allow Members to reach many more people than they can via traditional townhalls, and people they could not otherwise reach. Indeed, Senator Coburn himself has participated in many tele-townhalls, presumably without crowding out traditional townhalls.

From Fox (John Stossel's blog):

This summer's town hall meetings made many congressmen and senators uncomfortable. No worries. The sycophants they fund have used your tax money to fund a study that advises politicians how they can avoid seeing you altogether.

[The rest is just derivative from the Heritage blog]

Response: All of the NSF funded data collection was conducted in the summer of 2006, three years before the town hall meetings of this past summer. Further, the participating members included some conservative Republicans, i.e., there was no partisan or ideological tilt in this research.

Otherwise, I'd note that name calling is a poor substitute for a persuasive argument. 'Nuff said.

----------------------------
The more serious argument is the resource constraint argument that Coburn himself raises in the flyer below. Taxpayer dollars are wrested, involuntarily, from individuals who have worked hard for their money. Should these dollars be used to fund political science research, especially in these resource-constrained times? Could these funds be used, for example, to help cure cancer (something that the Coburn amendment did not mandate, I should note)? (NB: a useful fact, we currently spend more than 2000 times as much money on NIH as on the NSF political science program.) Does this research provide public value equal to or greater than its cost?

This requires a more thorough answer than I can give here and now. The broad intellectual argument for giving even one penny for research from the government is that certain types of knowledge, for which there is no mechanism for intellectual property right protection, will be underproduced in our market system. And, to the extent it is produced, it might remain proprietary in a fashion that reduces its possible positive impact. For example, in the political arena, I am sure Senator Coburn (and every other Senator) has spent tons of money on politically related research by consultants, etc. (across all politicians, vastly more than is spent on the NSF Political Science Program). But little of that money is aimed at producing knowledge that might improve democracy, nor would any insights along those lines be made publicly available.

To take the boundary case for supporting political science research, say the authors of the Federalist papers, the inspiration for our research, had been drawing government salaries on the time that they were writing those foundational papers (if any readers of the blog can speak to this historical fact, I would be interested in knowing). Would Senator Coburn say that this had been a waste of taxpayer money?

So, in the case of our more modest research, we produced something of a model for how democracy could be done, with the Internet facilitating direct interactions between Members of Congress and citizens with whom they had, before, been unable to have a conversation. We applied cutting edge scientific methods to test the effects--field experiments, with control groups, the whole nine yards. Any criteria you might throw out there regarding "scientific methods" we match--rigor, research design, replicability, inferential power, etc. And we got some really compelling results from a normative point of view. I am not sure what the dollar value was, but I am happy for our report to be held out as a poster child for the public value that federally funded research for political science can produce. And on the more general issue, presumably enhanced understanding of the inner workings of democracy, the causes of war and peace, the difficulties and challenges of institution building in damaged states, and the causes of the growth and decline of terrorist networks (all subjects of federally funded political science research) plausibly produces public value, and these topics are worth more rigorous treatment than pundits and talking heads on Fox, CNN, and MSNBC might provide.

November 3, 2009

Following the digital breadcrumbs...

A reader of the blog found an antecedent for the photograph on the Coburn flyer:

[Image: Little girl with doll.jpg]

can be found at Progressive States:
[Image: progressive states.JPG]

Putting aside the small irony, there is an interesting lesson here about being able to track the linkages among objects in digitized information, and, in turn, what those linkages reveal. More on that another day.

October 7, 2009

Delete - The Virtue of Forgetting in the Digital Age

I knew I was in for a treat when I sat down to listen to Viktor Mayer-Schoenberger at NYU's Law School yesterday afternoon. Viktor discussed his new book, Delete - The Virtue of Forgetting in the Digital Age, and kicked off a book tour that will take him to several US locations (I've listed upcoming talks below). Although he had arrived from Singapore only hours prior to giving his talk, he engaged the audience with his clever presentation, leaving us wanting more even after 45 minutes of Q&A.

Mayer-Schoenberger beautifully illustrates our society's transition from "biological forgetting to digital remembering". While for generations our efforts have concentrated on trying to remember events, actions, etc. and preserve them for posterity, in today's world we are facing the opposite problem: The digital memory is here to stay. However, the book argues, forgetting has its virtues, and needs to be reintroduced. The solution is simple: Put an expiration date on information.
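
As a rough illustration of what "an expiration date on information" could look like in software, here is a minimal sketch. It is not taken from the book; the `Record` and `ForgettingStore` names, the in-memory dictionary, and the rule that a record is dropped the first time it is requested after its expiration are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass
class Record:
    content: str
    expires_at: datetime  # the "expiration date" attached when the record is stored


class ForgettingStore:
    """Hypothetical in-memory store that forgets records once they expire."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def remember(self, key: str, content: str, keep_for: timedelta, now: datetime) -> None:
        # Every record must be stored together with how long it may be kept.
        self._records[key] = Record(content, now + keep_for)

    def recall(self, key: str, now: datetime) -> Optional[str]:
        record = self._records.get(key)
        if record is None:
            return None
        if now >= record.expires_at:
            # Past its expiration date: delete it and behave as if it never existed.
            del self._records[key]
            return None
        return record.content


store = ForgettingStore()
stored_on = datetime(2009, 10, 7)
store.remember("photo", "party picture", timedelta(days=30), now=stored_on)
next_day = store.recall("photo", now=stored_on + timedelta(days=1))
next_month = store.recall("photo", now=stored_on + timedelta(days=31))
```

In this sketch `next_day` still holds the stored text, while `next_month` is `None`, because the record has passed its date and been deleted. The design choice worth noticing is that the expiration is set by whoever stores the information, at the moment of storing it.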

The book is a great read (as soon as I got it, my non-academic spouse snagged it and took it on a business trip, which usually doesn't happen with the books I order), and I am not even close to doing it justice with this description, so if you find yourselves near any of the locations of the book tour, make sure to stop by and join the discussion.

Future stops (from here):
• Harvard's Berkman Center on October 7 at 6 pm
• Princeton University's Center for Information Technology Policy on October 8 at 4.30 pm
• Town Hall Seattle on October 19 at 7.30 pm
• University of California Berkeley Law School on October 22 at 4 pm

June 6, 2009

The President's Grassroots Network- alive and well?

The unwritten political story of 2009 is the effort of the Democratic Party to mobilize Obama supporters from 2008 to push forward a health care proposal. There has been a steady beat of e-mails every 2-3 weeks from "Organizing for America", a project of the Democratic National Committee, highlighting today as the start of a grassroots effort for health care reform (see e-mail below). In particular, today they have called for "Health Care Organizing Kickoff" parties across the country. The question: how are they doing? How many parties are there today? How many people in attendance? What is the geographic dispersion of these events (are they only taking place in true blue regions, or purple and red as well)?

Clearly, the Obama campaign did a remarkable job working back and forth between the "real" and virtual worlds, getting e-mails at campaign events, mobilizing financial support with those e-mails, as well as volunteer action. As I have written before, the question is whether it is possible to mobilize these political resources in support of a policy agenda. There should be, for example, thousands of Obama contributors, on average, in each Congressional district. Do you get some nontrivial fraction of those people knocking on doors, writing letters, mobilizing others, contributing to a costly campaign for health care reform? This is not how Presidential leadership has worked in the past, and it is an open question how effective this will be in the present, but some of the answers are emerging on the ground today.


Sample e-mail below. Key features include:

1) Obama will be on a conference call to the parties;
2) The twin emphases on action and fun. The cause is front and center, but the e-mail closes with social aspects of the gatherings--that they offer an opportunity for meeting like-minded people, to have fun.


John Smith --

Remember this date: Saturday, June 6th, 2009. We will look back on that day as the moment when the fight for real health care reform began in your neighborhood -- perhaps even in your own living room.

On June 6th, in thousands of homes across the country, we'll gather to launch our grassroots campaign for health care. We'll watch a special message from the President. We'll build the teams and draw up the plans for winning health care reform the same way we won the election: Building support one block, one neighbor, one conversation at a time. And we'll put those plans into action.

These kickoffs are so crucial that President Obama will join confirmed hosts and attendees on a live conference call.

Sign up today to host or attend a Health Care Organizing Kickoff.

Host a Health Care Organizing Kickoff

There's no prior experience required. We'll send you the details for dialing into the President's call and provide you everything you need to make your meeting a success.

After the election, people gathered at over 9,000 meetings across every state to set priorities for health care reform. Our voices were heard. Now the race is on to make sure Congress produces a plan that reflects the President's call for reduced costs, guaranteed choice, and quality care for all.

To make that happen, we need to build a groundswell of support in every district and every state, and we have no time to lose. All summer we'll be reaching out to our neighbors, knocking on doors, serving in our communities, and building a grassroots network strong enough to win.

These gatherings on June 6th are just the beginning of a battle between those who fought and believe in change and those who would protect a broken status quo. The stakes for our country could not be greater.

Some call this strategy pie-in-the-sky. They say we'll never have enough volunteers to make a real impact; that you need insiders and Washington lobbyists to make a difference. But you and I know firsthand how wrong they are. Starting June 6th, it's once again time to show this country how bottom-up change is done.

Please sign up today to host or attend a kickoff near you.

http://my.barackobama.com/HCkickoff

These kickoffs will be both effective and fun. You'll meet likeminded supporters in your neighborhood, share stories, enjoy good company and a shared mission, and know that no matter what this effort requires of us, if we work together we'll be ready to face it and persevere.

I look forward to joining you and the President to chart our course.

David Plouffe
Organizing for America

P.S. -- This week, President Obama asked us to send in our personal health care stories. Hundreds of thousands of people have already responded, and the stories are simply incredible. Here are just few that help remind us what we're fighting for:

I am a single parent and have lost my teaching job effective in June. I'm scared to death because my son has a serious pre-existing condition (Neurofibromatosis) and can't go without medical insurance. However, my employer has just informed me that continuing my family coverage under COBRA will cost $1,400.00 a month! That's a house payment for me. Or three times my car payment! How can I keep my family covered without going under financially?

--Cathy
Apple Valley, Minnesota

Since I lost my job in 2006, I have had no health insurance. After paying for insurance through my employer for 30 years, I have no major medical. But now that I am approaching 60, I may need insurance more than ever. I have not had a mammogram for three years because it would be too stressful to find anything suspicious. Risky but true.

--Kathy
Macon, Georgia

My husband isn't getting enough hours at his job to qualify for health insurance so we have been looking around for a provider. He has a pre-existing health condition (non-epileptic seizures) and he is being denied left and right. We don't make a lot of money, about $23,000/year and we can't afford to not have insurance, in case he needs to go to the doctor. And it looks like we can't afford to have it either. We are stuck.

--Amanda
Pasadena, California

Please donate

May 2, 2009

Networked governance and the swine flu

Riffing off of Ines' post, there was an interesting piece by David Brooks on governance and the swine flu last Monday. Not quite right, but definitely interesting. And it has many of my favorite words (complex, networks, emergent). Some extended excerpts:

In these post-cold war days, we don't face a single concentrated threat. We face a series of decentralized, transnational threats: jihadi terrorism, a global financial crisis, global warming, energy scarcity, nuclear proliferation and, as we're reminded today, possible health pandemics like swine flu.

These decentralized threats grow out of the widening spread and quickening pace of globalization and are magnified by it. Instant global communication and rapid international travel can sometimes lead to universal, systemic shocks. A bank meltdown or a virus will not stay isolated. They have the potential to hit nearly everywhere at once. They can wreck the key nodes of complex international systems.

So how do we deal with these situations? Do we build centralized global institutions that are strong enough to respond to transnational threats? Or do we rely on diverse and decentralized communities and nation-states?
...
If you apply [the logic of a centralized response] to the swine flu, you could say that the world should beef up the World Health Organization to give it the power to analyze the spread of the disease, decide when and where quarantines are necessary and organize a single global response.
...
Those dangers are all real. Yet, so far, that's not the lesson of this crisis. The response to swine flu suggests that a decentralized approach is best. This crisis is only days old, yet we've already seen a bottom-up, highly aggressive response.

In the first place, the decentralized approach is much faster....

Second, the decentralized approach is more credible. It is a fact of human nature that in times of crisis, people like to feel protected by one of their own....

Finally, the decentralized approach has coped reasonably well with uncertainty....
A single global response would produce a uniform approach. A decentralized response fosters experimentation.

The bottom line is that the swine flu crisis is two emergent problems piled on top of one another. At bottom, there is the dynamic network of the outbreak. It is fueled by complex feedback loops consisting of the virus itself, human mobility to spread it and environmental factors to make it potent. On top, there is the psychology of fear caused by the disease. It emerges from rumors, news reports, Tweets and expert warnings.

The correct response to these dynamic, decentralized, emergent problems is to create dynamic, decentralized, emergent authorities: chains of local officials, state agencies, national governments and international bodies that are as flexible as the problem itself.
Swine flu isn't only a health emergency. It's a test for how we're going to organize the 21st century. Subsidiarity works best.

---

I have seen remarkable evidence of the bottom up response, just in my little corner of the world. I have been deluged by e-mails from Harvard, my kids' schools, my synagogue. Each of these e-mails has updated me on the status of the potential pandemic vis a vis that particular institution, and instructed me on appropriate behavior on my part (stay calm and wash my hands, basically).

The reason such a response can work well is that there is a reasonably good alignment of individual incentives and global effects. If I wash my hands, it reduces the likelihood that I will get sick, which, in turn, is good for people I know who might get the flu from me were I to get sick. Such an approach would work less well if there were no such alignment (e.g., vis a vis CO2 emissions).

However, I would note that these are not distinctively 21st century governance problems-- indeed, human history is littered with pandemics/epidemics that have killed millions (see Spanish Flu, 1918). Of course, epidemics can travel much faster now in the jet age (although jets are not particularly new technologies anymore either). Further, our global authority structures have not changed dramatically in a long time. The idea of a global centralized authority that could respond to the pandemic is nothing that we are going to see any time soon. This is not a governance choice that has ever been on the table. It is inconceivable that there could be an uber World Health Organization that could order my youngest's school shut.

What is different is that we have much more effective tools with which to communicate about these issues. And further, I would guess that we actually have more powerful institutions at the center of the storm (the WHO, the CDC) than we have ever had before. They likely have more resources, and vastly superior mechanisms with which to disseminate information and recommended practices, and to work with local authorities to evaluate whether the flu is present in particular jurisdictions, etc. I would guess that there is actually less variation and experimentation in local practice and more (voluntary) following of systemic authorities than ever before.

Thus, arguably in this case "subsidiarity works." But it is because (1) the incentives of the individual decision makers in the system are reasonably well aligned with global outcomes, (2) there are substantial centralized capacities, and (3) because of current communication technologies, local decision makers (generally) are acting voluntarily in a fashion consistent with the preferences of those central agencies.

April 13, 2009

Twitter? You can not be serious!!!!

Do any of you actually use Twitter for work?


Sure, it's fun, and does something that Facebook does a bit better ("What's on your mind?").
I see it as the next generation of social software ......
friendster --> myspace --> facebook --> twitter
(with of course hundreds of other, equivalent, competing platforms)

But to use it for work stuff? To use it for anything more than goofing around with your buddies? Really? You are not serious, are you?


February 24, 2009

A tale of high school tribes: the round tables and the long tables

High schools present a primal soup of sorts of human relationships. You have a reasonably closed social system, with rather intense and enforced daily interaction (at least relative to adults) of a population of individuals who are still searching for their identities, simultaneously seeking to conform and stand out. So it was with interest that I was listening to my oldest daughter and her friend talk the other day, as I drove them from point A to point B, about "round table" people versus "long table" people. What, I asked, are round and long table people? The answer was simple, but most surprising. There are two types of tables in the cafeteria: round (circular) and long (rectangular). Round table people sit at round tables at lunchtime, and long table people at long tables. You make a decision on the first day of high school which type of person you are, and that's the shape of the table where you sit for the rest of high school. You _never_ change the type of table you sit at, and you _never_ sit at the wrong shaped table.

My daughter and her friend are round tablers. What defines the difference between these two tribes? Not race, gender, or ethnicity, but more nebulous characteristics, such as "creativity", academic orientation, and so on.

A few striking observations: while, prior to high school, they had each had a number of "long table" individuals in their social circles, those relationships faded in high school, so that both are now close friends only with round tablers. This sounds like the typical pattern. It is remarkable, really, that all of the many ways that students at this high school differ have reduced themselves to a single dimension, which (it appears) powerfully drives the social network at the school.

There seems to be no memory of when this started. The furniture goes back a fair number of years, so this may have emerged many generations of students ago. After listening to my daughter and her friend talk about the distinctions between the two groups, I would also note that there is a predictable degree of ingroup favoritism. That said, Facebook friendships do cross these tribal boundaries (presumably because they encompass weaker ties), and dating does occur across table types.

The case of the round and long tablers serves as a useful metaphor, more broadly, for the interplay of arbitrary design choices--the creation of neighborhoods, the design of buildings, the drawing of boundaries--from which identities are constructed, and which then result in emergent but subsequently inescapable social structures. Inescapable, in this case at least, until the school gets around to buying new cafeteria tables.

February 5, 2009

Paper in Science tomorrow on "Computational Social Science"

One of the key themes of this blog has been that social science will/should undergo a transformation over the next generation, driven by the availability of new data sources, as well as the computational power to analyze those data. I, along with many collaborators, address these issues in a paper coming out tomorrow in Science on "Computational social science" (the original title-- Life in the network: the coming age of computational social science-- was more evocative but too wordy). In any case, while I cannot post the final version of the paper, I can post the version we submitted:


Computational social science


David Lazer (Harvard University), Alex (Sandy) Pentland (MIT), Lada Adamic (University of Michigan), Sinan Aral (NYU), Albert Laszlo Barabási (Northeastern University), Devon Brewer (Interdisciplinary Scientific Research), Nicholas Christakis (Harvard University), Noshir Contractor (Northwestern University), James Fowler (UCSD), Myron Gutmann (University of Michigan), Tony Jebara (Columbia University), Gary King (Harvard University), Michael Macy (Cornell University), Deb Roy (MIT), Marshall Van Alstyne (Boston University)


We live life in the network. When we wake up in the morning, we check our e-mail, make a quick phone call, walk outside (our movements captured by a high definition video camera), get on the bus (swiping our RFID mass transit cards) or drive (using a transponder to zip through the tolls). We arrive at the airport, making sure to purchase a sandwich with a credit card before boarding the plane, and check our BlackBerries shortly before takeoff. Or we visit the doctor or the car mechanic, generating digital records of what our medical or automotive problems are. We post blog entries confiding to the world our thoughts and feelings, or maintain personal social network profiles revealing our friendships and our tastes. Each of these transactions leaves digital breadcrumbs which, when pulled together, offer increasingly comprehensive pictures of both individuals and groups, with the potential of transforming our understanding of our lives, organizations, and societies in a fashion that was barely conceivable just a few years ago.

The capacity to collect and analyze massive amounts of data has unambiguously transformed such fields as biology and physics. The emergence of such a data-driven "computational social science" has been much slower, largely spearheaded by a few intrepid computer scientists, physicists, and social scientists. If one were to look at the leading disciplinary journals in economics, sociology, and political science, there would be minimal evidence of an emerging computational social science engaged in quantitative modeling of these new kinds of digital traces. However, computational social science is occurring, and on a large scale, in places like Google, Yahoo, and the National Security Agency. Computational social science could easily become the almost exclusive domain of private companies and government agencies. Alternatively, there might emerge a "Dead Sea Scrolls" model, with a privileged set of academic researchers sitting on private data from which they produce papers that cannot be critiqued or replicated. Neither scenario will serve the long-term public interest in the accumulation, verification, and dissemination of knowledge.

What potential value might a computational social science, based in an open academic environment, offer society, through an enhanced understanding of individuals and collectives? What are the obstacles that stand in the way of a computational social science?

From individuals to societies

To date the vast majority of existing research on human interactions has relied on one-shot self-reported data on relationships. New technologies, such as video surveillance, e-mail, and 'smart' name badges offer a remarkable, second-by-second picture of interactions over extended periods of time, providing information about both the structure and content of relationships. Consider examples of data collection in this area and of the questions they might address:

Video recording and analysis of the first two years of a child's life (1): Precisely what kind of interactions with others underlies the development of language? What might be early indicators of autism?

Examination of group interactions through e-mail data: What are the temporal dynamics of human communications--that is, do work groups reach a stasis with little change, or do they dramatically change over time (2 , 3)? What interaction patterns predict highly productive groups and individuals? Can the diversity of news and content we receive predict our power or performance (4)?

Examination of face-to-face group interactions over time using sociometers: Small electronics packages ('sociometers') worn like a standard ID badge can capture physical proximity, location, movement, and other facets of individual behavior and collective interactions. What are patterns of proximity and communication within an organization, and what flow patterns are associated with high performance at the individual and group levels (5)?

Macro communication patterns: Phone companies have records of call patterns among their customers extending over multiple years, and e-Commerce portals such as Google and Yahoo collect instant messaging data on global communication. Do these data paint a comprehensive picture of societal-level communication patterns? What does the "macro" social network of society look like (6), and how does it evolve over time? In what ways do these interactions affect economic productivity or public health?

Tracking movement: With GPS and related technologies, it is increasingly easy to track the movements of people (7, 8). Mobile phones, in particular, allow the large scale tracing of people's movements and physical proximities over time (9), where it may be possible to infer even cognitive relationships, such as friendship, from observed behavior (10). How might a pathogen, such as influenza, driven by physical proximity, spread through a population (11)?

Internet: The Internet offers an entirely different channel for understanding what people are saying, and how they are connecting (12). Consider, for example, in this political season, tracing the spread of arguments/rumors/positions in the blogosphere (13), as well as the behavior of individuals surfing the Internet (14), where the concerns of an electorate become visible in the searches they conduct. Virtual worlds, by their nature capturing a complete record of individual behavior, offer ample opportunities for research, for example, experimentation that would be impossible or unacceptable (15). Similarly, social network websites offer an unprecedented opportunity to understand the impact of a person's structural position on everything from their tastes to their moods to their health (16), while Natural Language Processing offers increased capacity to organize and analyze the vast amounts of text from the Internet and other sources (17).
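
To make the e-mail example above a little more concrete, the question of whether a work group reaches a stasis or keeps changing can be approximated by comparing who communicates with whom from one period to the next. The following is only a sketch under stated assumptions, not the method of the cited studies: the `(period, sender, recipient)` message format, the function names, and the use of Jaccard similarity as the stability measure are all choices made for this illustration.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

# Hypothetical message format: (period index, sender, recipient).
Message = Tuple[int, str, str]
Edge = FrozenSet[str]


def edges_by_period(messages: Iterable[Message]) -> Dict[int, Set[Edge]]:
    """Group messages into one undirected communication network per period."""
    periods: Dict[int, Set[Edge]] = defaultdict(set)
    for period, sender, recipient in messages:
        if sender != recipient:
            periods[period].add(frozenset((sender, recipient)))
    return dict(periods)


def jaccard(a: Set[Edge], b: Set[Edge]) -> float:
    """Share of ties present in both periods among ties present in either."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def stability(messages: Iterable[Message]) -> List[Tuple[int, int, float]]:
    """Similarity of each period's network to the next period's network."""
    periods = edges_by_period(messages)
    ordered = sorted(periods)
    return [(p, q, jaccard(periods[p], periods[q])) for p, q in zip(ordered, ordered[1:])]


# Made-up example data: two periods of e-mail among four people.
example = [
    (0, "a", "b"), (0, "b", "c"),
    (1, "a", "b"), (1, "c", "d"),
]
result = stability(example)
```

Here the two periods share one tie (a-b) out of three distinct ties overall, so the similarity between period 0 and period 1 is one third. Values that stay near 1 across periods would suggest stasis, and values near 0 would suggest a group that keeps rewiring itself.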

In short, a computational social science is emerging that leverages the capacity to collect and analyze data with an unprecedented breadth, depth, and scale. Substantial barriers, however, might limit progress. Existing ways of conceiving human behavior were developed without access to terabytes of data describing the minute-by-minute interactions and locations of entire populations of individuals. For example, what does existing sociological network theory, built mostly on a foundation of one-time 'snapshot' data, typically with only dozens of people, tell us about massively longitudinal datasets of millions of people, including location, financial transactions, and communications? The answer is clearly "something," but, as with the blind men feeling parts of the elephant, limited perspectives provide only limited insights. These emerging data sets surely must offer some qualitatively new perspectives on collective human behavior.

There are significant barriers to the advancement of a computational social science both in approach and in infrastructure. In terms of approach, the subjects of inquiry in physics and biology present different challenges to observation and intervention. Quarks and cells neither mind when we discover their secrets nor protest if we alter their environments during the discovery process (although, as discussed below, biological research involving humans offers some similar concerns regarding privacy). In terms of infrastructure, the leap from social science to a computational social science is larger than from, say, biology to a computational biology, in large part due to the requirements of distributed monitoring, permission seeking, and encryption. The resources available in the social sciences are significantly smaller, and even the physical (and administrative) distance between social science departments and engineering or computer science departments tends to be greater than for the other sciences. The availability of easy-to-use programs and techniques would greatly magnify the presence of a computational social science. Just as mass-market CAD software revolutionized the engineering world decades ago, common computational social science analysis tools and the sharing of data will lead to significant advances. The development of these tools can, in part, piggyback on those developed in biology, physics and other fields, but also requires substantial investments in applications customized to social science needs.

Perhaps the thorniest challenges exist on the data side, with respect to access and privacy. Many, though not all, of these data are proprietary (e.g., mobile phone and financial transactional data). The debacle following AOL's public release of "anonymized" search records of many of its customers highlights the potential risk to individuals and corporations in the sharing of personal data by private companies (18). Robust models of collaboration and data sharing between industry and the academy need to be developed that safeguard the privacy of consumers and provide liability protection for corporations.

More generally, properly managing privacy issues is essential. As the recent NRC report on GIS data highlights, it is often possible to pull individual profiles out of even carefully anonymized data (19). To take a non-social science example: this past summer, NIH and the Wellcome Trust abruptly removed a number of genetic databases from online access (20). These databases were seemingly anonymized, simply reporting the aggregate frequency of particular genetic markers. However, research revealed the potential for de-anonymization, based on the statistical power of the sheer quantity of data collected from each individual in the database (21).

A single dramatic incident involving a breach of privacy could produce a set of statutes, rules, and prohibitions that could strangle the nascent field of computational social science in its crib. What is necessary, now, is to produce a self-regulatory regime of procedures, technologies, and rules that reduce this risk but preserve most of the research potential. As a cornerstone of such a self-regulatory regime, Institutional Review Boards (IRBs) must increase their technical knowledge enormously to understand the potential for intrusion and individual harm because new possibilities do not fit their current paradigms for harm. For example, many IRBs today would be poorly equipped to evaluate the possibility that complex data could be de-anonymized. Further, it may be necessary for IRBs to oversee the creation of a secure, centralized data infrastructure. Certainly, the status quo is a recipe for disaster, where existing data sets are scattered among many different groups, with uneven skills and understanding of data security, with widely varying protocols.

Researchers themselves must tackle the privacy issue head on by developing technologies that protect privacy while preserving data essential for research (22). These systems, in turn, may prove useful for industry in managing privacy of customers and security of their proprietary data.

Finally, the emergence of a computational social science shares with other nascent interdisciplinary fields (e.g., sustainability science) the need to develop a paradigm for training new scholars. A key requirement for the emergence of an interdisciplinary area of study is the development of complementary and synergistic explanations spanning different fields and scales. Tenure committees and editorial boards need to understand and reward the effort to publish across disciplines (23). Certainly, in the short run, computational social science needs to be the work of teams of social and computer scientists. In the longer run, the question will be: should academia be building computational social scientists, or teams of computationally literate social scientists and socially literate computer scientists?

The emergence of cognitive science in the 1960s and 1970s offers a powerful model for the development of a computational social science. Cognitive science emerged out of the power of the computational metaphor of the human mind. It has involved fields ranging from neurobiology to philosophy to computer science. It attracted the investment of substantial resources to establish a common field, and it has created enormous progress for public good in the last generation. We would argue that a computational social science has a similar potential, and is worthy of similar investments.

References:

1. D. Roy, R. Patel, P. DeCamp, R. Kubat, M. Fleischman, B. Roy, N. Mavridis, S. Tellex, A. Salata, J. Guiness, M. Levit, P. Gorniak. 2006. "The Human Speechome Project," Twenty-eighth Annual Meeting of the Cognitive Science Society.
2. J.-P. Eckmann, E. Moses, D. Sergi. 2004. "Entropy of dialogues creates coherent structures in e-mail traffic," Proceedings of the National Academy of Sciences of the United States of America 101: 14333-14337.
3. G. Kossinets, D. Watts. 2006. "Empirical Analysis of an Evolving Social Network," Science 311(5757): 88-90.
4. S. Aral, M. Van Alstyne. 2007. "Network Structure & Information Advantage" Proceedings of the Academy of Management Conference, Philadelphia, PA.
5. A. Pentland. 2008. Honest Signals: How They Shape Our World. MIT Press, Cambridge, MA.
6. J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, A.-L. Barabási. 2007. "Structure and tie strengths in mobile communication networks," Proceedings of the National Academy of Sciences of the United States of America.
7. B. Shaw, T. Jebara. 2007. "Minimum Volume Embedding," Proceedings of the Conference on Artificial Intelligence and Statistics.
8. T. Jebara, Y. Song, K. Thadani. 2007. "Spectral Clustering and Embedding with Hidden Markov Models," Proceedings of the European Conference on Machine Learning.
9. M. C. González, C. A. Hidalgo, A.-L. Barabási. 2008. "Understanding individual human mobility patterns," Nature 453: 779-782.
10. N. Eagle, A. Pentland, D. Lazer. 2008. "Inferring friendships from behavioral data," HKS working paper.
11. V. Colizza, A.Barrat, M. Barthelemy, and A. Vespignani. 2006. "Prediction and predictability of global epidemics: the role of the airline transportation network," Proceedings of the National Academy of Sciences of the United States of America, 103: 2015-2020.
12. D. Watts. 2007. "Connections: A twenty-first century science," Nature 445: 489.
13. L. Adamic, N. Glance. 2005. "The Political Blogosphere and the 2004 U.S. Election: Divided They Blog," LinkKDD-2005, Chicago, IL.
14. J. Teevan. 2008. "How People Recall, Recognize and Re-Use Search Results," To appear in ACM Transactions on Information Systems (TOIS) special issue on Keeping, Re-finding, and Sharing Personal Information.
15. W. Bainbridge. 2007. "The scientific research potential of virtual worlds," Science 317. no. 5837: 472 - 476.
16. K. Lewis, J. Kaufman, M. Gonzalez, A. Wimmer, and N. Christakis. 2009. "Tastes, Ties, and Time: A New (Cultural, Multiplex, and Longitudinal) Social Network Dataset Using Facebook.com." Social Networks, in press.
17. C. Cardie, J. Wilkerson. 2008. "Text annotation for political science research," Journal of Information Technology and Politics 5: 1-6.
18. M. Barbaro, T. Zeller Jr. 2006. "A Face Is Exposed for AOL Searcher No. 4417749," New York Times (August 9).
19. National Research Council. 2007. Putting People on the Map: Protecting Confidentiality with Linked Social-Spatial Data. Ed. Myron P. Gutmann and Paul Stern. Washington: National Academy Press.
20. J. Felch. 2008. "DNA databases blocked from the public," LA Times (August 29).
21. N Homer, S Szelinger, M Redman, D Duggan, W Tembe. 2008. "Resolving Individuals Contributing Trace Amounts of DNA to Highly Complex Mixtures Using High-Density SNP Genotyping Microarrays," PLoS Genetics 4(8): e1000167. doi:10.1371/journal.pgen.1000167
22. L. Backstrom, C. Dwork, J. Kleinberg. 2007. Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography. Proc. 16th Intl. World Wide Web Conference.
23. M. Van Alstyne, E. Brynjolfsson. 1996. "Could the Internet Balkanize Science?" Science. 274: 1479-1480.

------------------------------------------------
Full reference for this paper: David Lazer, Alex Pentland, Lada Adamic, Sinan Aral, Albert-Laszlo Barabasi, Devon Brewer, Nicholas Christakis, Noshir Contractor, James Fowler, Myron Gutmann, Tony Jebara, Gary King, Michael Macy, Deb Roy, and Marshall Van Alstyne, "Computational Social Science," Science 6 February 2009: 721-723.

February 2, 2009

Long Time Gone .....

I haven't posted an entry here for almost six months ..... damn this administrative job .....

I am back, and have stored up a number of things to write about. So, stay tuned. Same channel, of course. When? Soon .....

September 30, 2008

Regulating the madness of crowds...

Before there was "crowd sourcing" there was the "madness of crowds." The current economic crisis raises an interesting question: how does one "regulate away" the madness of crowds? While there was likely a degree of moral hazard in the behavior of the financial sector over the last few years (i.e., rationally taking on excessive risk in the belief that the government will bail them out should things get really bad), my intuition is that the main thing that has created fragility in the system is a convergence in mindset. That is, arguably, there was a convergence in belief about the value of particular assets-- these subprime-based securities. Given modern-day communication technology, and the generally small world of the financial sector, would it be surprising for there to be a convergence in beliefs about what these securities were worth? And this convergence must create some fragility-- even if it is unbiased-- because if the mean person is wrong, it means everyone is wrong, and everyone falls off the cliff at the same time.
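To make the "unbiased but fragile" point concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration (the function name, the normal shocks, the `shared_weight` knob); it is not a model of any actual market. Each agent's valuation error mixes one shock common to everyone (the shared mindset) with a private shock. In both scenarios the crowd is right on average, but when beliefs have converged the crowd's average error no longer shrinks as the crowd grows.

```python
import random
import statistics


def crowd_mean_errors(n_agents, shared_weight, n_trials=2000, seed=0):
    """Return the crowd's average valuation error in each simulated trial.

    shared_weight is a hypothetical knob: 0.0 means fully independent
    beliefs, 1.0 means everyone holds exactly the same belief.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)  # shock common to every agent
        agent_errors = [
            shared_weight * shared + (1.0 - shared_weight) * rng.gauss(0.0, 1.0)
            for _ in range(n_agents)
        ]
        errors.append(statistics.fmean(agent_errors))
    return errors


independent = crowd_mean_errors(n_agents=100, shared_weight=0.0)
converged = crowd_mean_errors(n_agents=100, shared_weight=0.9)

# Spread of the crowd's average error across trials.
print("independent beliefs:", round(statistics.pstdev(independent), 3))
print("converged beliefs:  ", round(statistics.pstdev(converged), 3))
```

With independent beliefs, individual mistakes largely cancel out in the average; with converged beliefs they do not, so a bad draw of the shared shock takes nearly everyone over the cliff together.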

Of course, the madness of crowds (as the book illustrates) goes back a long ways. And we have built rather robust financial regulatory institutions that have, generally, since the Great Depression, reduced the vulnerability of the financial sector. But it does seem that there has been a paradigm change over the last generation, with the development of more complex (and ambiguously valued) financial instruments, and increased global communication, that perhaps make us more vulnerable now than we have been in 80 years to the madness of crowds. Which raises the question: given an increased systemic tendency to "global groupthink", and given systemic benefits of diversity, how does one regulate the collective mind? (I don't have an answer at this time, but thought I would pose the question.)

May 19, 2008

Family ties as bridging ties

I have been a low volume blogger over the last several months due to a full load of teaching obligations, but will crank it up a bit over the next month.

I always learn something when I teach my networks class. One of the early exercises that I have students do is a personal network self evaluation (adapted from Baker’s book on social capital). The students’ networks tend to be highly educated, very international, ages 25-30. Essentially, the networks of students tend to look a lot like themselves. What is notable (I base this on the last 3 years of doing this exercise) is where the networks look most different from the students, which is, more often than not, in ties to family members. Family ties are somewhat less likely to be highly educated, and when they are, it is often in a different domain than that of the other people in the students’ network, and, unsurprisingly, they are often (literally) of a different generation.

I think this reflects a broader truth about our ties to family and friends. We are born into our families, but we choose our friends. That flexibility of friendship tends to bring with it homogeneity. If we differ too much from our friends, we get new friends. If we differ too much from our family, we just don’t bring up sex/religion/politics during the holidays.

The notable corollary to this is that while social network scholars often think of family ties as the ultimate in bonding/strong ties, they, in fact, may offer bridges to groups/perspectives/skills that may not exist in our friendship networks.

February 28, 2008

Interview: Thorsten Jacobi on the current state and trends in social software

I have come up with a new format for our blog. In the next couple of months I will post interviews with leading Internet entrepreneurs and venture capitalists who can share their insider knowledge on the current state and future of social software/Web 2.0. Hopefully this is inspiring to those with entrepreneurial ambitions in the area as well as interesting to researchers who want to work on the "next big thing".

Dear Thorsten, you have been involved in various internet ventures either as part of the management team (21Publish, kinkaa, Newtron, Creative Weblogging) or as investor. We are very happy that you are taking the time to answer our questions.

Please tell us about your latest activities.

Hehe - that is a broad question - I did run my first marathon, saw my first two kids born and I do continue to bootstrap two startups - Kinkaa (a meta travel search engine for Europe) and Creative Weblogging (a blog media network).

Will the social software industry be affected by the economic downturn? Have you recognized or experienced a change in entrepreneurial or investor activity within the last couple of months?

People are certainly more cautious as everyone is trying to figure out what the impacts could be (less marketing spend, less advertising). However, it's just psychology so far - I haven't seen any early stage deals fall apart (as happened with many private equity deals). Overall it seems that early stage deals show a healthy consolidation, but it's hard for me to forecast this further.

Let's say someone would like to start a social networking venture today. What would be your recommendations? Do you believe that the ideas are still of interest to angel investors or VCs?

They are - just look at a Hamburg (Germany) based social network for classic car ('oldtimer') lovers. It raised funds at the end of last year. Social networks must show convincing organic growth and should target a specific demographic. If there is a good business model or a good idea to make money besides running ads, that can indeed be an enticing mix for investors.

Google executives recently said that it is harder than expected to generate revenue from online social networks. What is your opinion on the potential revenue models for social networks?

CPMs (price per 1000 impressions) will continue to be below average compared to other internet services. Nevertheless social networks market themselves mostly and can claim enormous amounts of users with very little marketing needed. So most will break even eventually.

Many social networking platforms have made it easier for companies to mine their user data for marketing purposes. Do you think this is the right move or will the internet community strike back?

I feel it's not a good idea to move in that direction. It felt like going 'under your skin' as a user. Most initiatives have already backtracked from their former stance.

A follow-up question. Aren't you tired of keeping your profile up to date in all those social networks? Wouldn't the best way be to create a single XML-type online identity?

Absolutely - but remember each social network has a (slightly) different purpose - my identity in LinkedIn and MySpace may never be the same.

Merging data from social software with the real world has been discussed in the past under the name of "location based services". Though we have yet to see applications and devices that are available and used by a majority of users. When and how could this change?

It must become ubiquitous - all (or most) phones need GPS. Data plans must be included in a normal mobile phone subscription. Mobile phone displays and UI must improve so that even your grandma can log in to Facebook from her mobile phone. Seems a couple of years off - judging from my grandma, who has yet to buy a mobile phone...

As a pan-European investor you are seeing and hearing about trends before they emerge. Is there an area of social software that we should be aware of in the near future? Will we be seeing more crowd-based business concepts such as the trend to allow users to share almost anything?

I like the idea of Amazon Mechanical Turk a lot - it's basically an API for the human mind. It's still a lot of theory and only so much practice, but social networks with all the user data could eventually build their business on a similar platform. Social networks as an API to knowledge and human services?
...
I kept it short - hope it still helps. Please let me know if you have any more questions.
Best, TJ

Torsten Jacobi or 'TJ' is a serial entrepreneur and investor with experience in the software and media industry in the U.S. and Europe. He lives close to Silicon Valley with his family.

February 9, 2008

Searching for the next President...

It is an interesting and important question how voters search for information about candidates. Presumably, in a democracy, one hopes that institutions push people to deliberate about the choices before them, because (1) it would normatively be a good thing for the collective choice to reflect the balance of well-thought-out opinions about the direction of the country, (2) the (individual) deliberative process has a cumulative effect on people’s knowledge and preferences, and (3) thinking about the collective choice helps forge a civic identity.

Political science often offers a rather pessimistic view of the capacity of voters to make informed decisions. However, this view is often based on an overly static view of the voter. (I will blog another day regarding a paper that I am working on with Kevin Esterling and Mike Neblo on this subject.) Voters (arguably) search for information on an as-needed basis. Much of that search is around nonhuman sources (e.g., news media), and much of it around human sources (e.g., friends and family). One of the nice things about the Internet, as I have written before, is the digital record people leave of their behavior (as compared, for example, to casual conversations). So here’s a little fun with Google Trends: below are two figures from Google Trends, one for Google searches for the names of each of the remaining candidates for President over the last 30 days in the whole country, and one for Washington State, which is holding its caucuses today.


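[Figure: Google Trends, searches for each candidate's name over the last 30 days, whole country]

[Figure: Google Trends, searches for each candidate's name over the last 30 days, Washington State]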

I would note that for the first figure, of the top 10 states that conducted these searches, 8 were Super Tuesday states, one was DC, and one was South Carolina. Put together, these figures offer a glimpse of how the campaign drives people’s search for information about the candidates; and they also offer a hint of who people are looking for information about. An emphasis needs to be put on the word “glimpse” because we are drawing on a particular sample of searching behavior. What types of people look where for what types of information? The contrast between Paul and Huckabee is illuminating: Paul is relying on an Internet-based network, and Huckabee more on a church-based network. One suspects that this figure thus over-represents the relative number of people looking for information about Paul.

HERE IS THE KEY:

Blue: Obama
Red: Clinton
Yellow: McCain
Green: Paul
Purple: Huckabee

February 5, 2008

More on the Massachusetts primary...

A few more observations about the primary in Massachusetts: I voted this morning, apparently in the midst of heavy turnout. There is this charming tradition of sign holding for candidates outside of the voting areas. I don't know how common this is-- certainly, it wasn't the case in other states I have lived and voted in (Michigan and New Jersey). It is all quite civil-- I walked by the Obama (two) and Clinton (one) sign holders and they were talking about local affairs. There were no Romney signs, notably, since this is his home town (although I am sure he votes in the higher rent precinct). And, of course, the sign holders are members of the community, so people who pass by are always stopping to chat with them. These little exchanges, I think, are the microscopic civic foundations of a democracy.

February 4, 2008

The election comes to Massachusetts

A small but telling note about the "invisible" networks mobilizing around the primary-- a canvasser for Obama knocked on my door yesterday. A few observations: the canvasser was not someone I knew, but was from the neighborhood (4 or so blocks from my house); she had a checklist of doors to knock on; my town is not obviously promising Obama territory, since the median voter is probably about 50 and white; and, more generally, surveys have suggested that Clinton has the biggest lead in Massachusetts of any of the Super Tuesday states (see wonderful pollster.com graphic). No other canvassers came to our door, and I saw no others going door to door.

What does this tell us? It suggests that there is a formidable and sophisticated grassroots mobilization effort by the Obama campaign-- with primaries tomorrow covering a population of about 150 million people, the Obama campaign had someone from my neighborhood knock on my door. The fact that it was someone local reflects the mobilization of local networks; the fact that she had a computerized printout of people to talk to suggests the sophisticated and centralized dimension of those efforts. Field studies suggest that these mobilization efforts can be very effective, and the effect must be amplified if it is someone from the neighborhood (as compared to outside canvassers). It also highlights the intensity of the Obama support, which is only partially reflected in voting (since the votes of high intensity people count no more than those of low intensity people, although presumably they are more likely to turn out), because intensity is related also to efforts to persuade. (My intuition therefore is that, at least in Massachusetts, polls are understating Obama's support-- although that intuition may wilt in about 28 hours.)

The other thing that this highlights is the role that elections play in spurring civic discussions. It is one of the curses of the electoral college system that the election does not come to states that are uncompetitive. It has been a long time since Republicans and Democrats fought over Massachusetts. There are votes (on the margin) to be fought over here, probably as many as in any state, but no electoral college votes to be won. As a result, the election settles with a vengeance in places like Ohio and Florida, but states like Massachusetts and Utah are utterly ignored. The Democratic and Republican primaries offer a nice contrast, where the Democratic delegates are awarded roughly proportional to the number of votes Clinton and Obama get in each Congressional district, whereas the Republican primary is winner take all. In part as a result there is vastly more competition for votes on the Democratic side than the Republican side, because McCain cannot hope to win delegates here, and Obama can.
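The difference between the two allocation rules can be sketched in a few lines. This is an illustrative simplification, not either party's actual rules (real delegate selection involves thresholds, district-level and statewide pools, and other details); the candidate labels, vote counts, and delegate total below are made up, and the largest-remainder rounding is my own choice for the sketch.

```python
def winner_take_all(votes, delegates):
    """Give every delegate to the candidate with the most votes."""
    winner = max(votes, key=votes.get)
    return {c: (delegates if c == winner else 0) for c in votes}


def proportional(votes, delegates):
    """Split delegates by vote share, rounding with the largest-remainder rule."""
    total = sum(votes.values())
    quotas = {c: delegates * v / total for c, v in votes.items()}
    alloc = {c: int(q) for c, q in quotas.items()}
    leftover = delegates - sum(alloc.values())
    # Hand out the remaining seats to the largest fractional remainders.
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - alloc[c], reverse=True)
    for c in by_remainder[:leftover]:
        alloc[c] += 1
    return alloc


# Hypothetical district: candidate A leads B by 15 points.
votes = {"A": 560, "B": 410, "C": 30}
print(winner_take_all(votes, 10))
print(proportional(votes, 10))
```

Under winner-take-all, the trailing candidate gains nothing from narrowing the gap short of winning outright; under the proportional rule, every few points of vote share can move a delegate, which is why there is something to fight over even in a state a candidate expects to lose.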

Once tomorrow is past, the election is pretty much over for Massachusetts, except for commercials aimed at New Hampshire this fall (since New Hampshire will be a competitive state). And I think that is unfortunate, because these kinds of election-spurred discussions are healthy for our democracy.

December 4, 2007

Netcentric travel

As I waited for the bus a good chunk of yesterday morning, in the drizzly slush that is Boston's lot in winter, I was contemplating how travel should work in a networked world. In particular, let me propose the idea of "netcentric travel", where service providers push out information so as to allow consumers to assist in the co-production of good outcomes. Example: it should not have been necessary for me to stand outside for the better part of an hour waiting for a wildly out of sync MBTA bus. Instead, one could imagine placing GPS devices on each of the buses, which could be tracked real-time on the Internet. Given such a system I could have seen when the bus was going to arrive and gone out shortly before (or seen that it was not going to come in time and driven in).

Such a system would be fairly cheap to implement, and, I would note, could plausibly be built through a public-private partnership. You could imagine a private company funding the installation of GPS devices and earning revenue through web-based advertising.

The data produced through such a system could also prove a valuable operations management tool for mass transit.
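As a rough sketch of the rider's side of the bus example above: given a live arrival estimate derived from the bus's GPS position, the decision I wanted to make at the bus stop reduces to a few comparisons. The function and parameter names here are hypothetical, and the arrival estimate is assumed to come from some tracking feed that is not modeled.

```python
def advise_rider(bus_eta_min, walk_min, max_eta_min, buffer_min=2):
    """Suggest what a rider should do given a live bus arrival estimate.

    bus_eta_min: minutes until the bus reaches the stop (assumed to come
                 from a GPS-based tracking feed, not modeled here)
    walk_min:    minutes it takes the rider to walk to the stop
    max_eta_min: latest arrival the rider can tolerate before driving instead
    buffer_min:  safety margin so the rider is at the stop slightly early
    """
    if bus_eta_min > max_eta_min:
        return ("drive", None)      # bus will not come in time
    if bus_eta_min < walk_min:
        return ("next_bus", None)   # cannot reach the stop before this bus
    wait_indoors = max(0, bus_eta_min - walk_min - buffer_min)
    return ("leave_in", wait_indoors)


print(advise_rider(bus_eta_min=25, walk_min=5, max_eta_min=40))
print(advise_rider(bus_eta_min=55, walk_min=5, max_eta_min=40))
```

The point is less the code than where the decision sits: once the provider pushes out the arrival estimate, the waiting happens indoors and the rider does the adapting.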

Another example: I have had a series of bad experiences with air travel over the last year (I know: join the club). One particular example was a US Airways flight I was on the day after they merged their database with America West's. Needless to say, this merging of databases went rather poorly, putting all of their check-in kiosks out of commission. When I arrived at the airport, there was a 2+ hour line, resulting in many missed flights. One could have imagined, as soon as this problem arose, their communicating to customers (via e-mail and phone) that they should (1) arrive at the airport especially early, and (2) print out their boarding passes at home if at all possible. This could have greatly reduced the impact of the disruption on customers.

In any case, the key insight here is to recognize that in travel (as in many domains) consumers co-produce systemic outcomes, and will adapt in ways that are good for the system given the right informational tools. Such a system exists to some extent-- e.g., one can see if flights are delayed online; some highways have a radio station that transmits information about traffic; you can go to traffic.com to find out real time where there is traffic congestion. But I think that these bottom-up possibilities are underexploited by some of the more top-down parts of our travel infrastructure.