31 January 2006
A current article in the German online newspaper SPIEGEL reports that Wikipedia entries regarding US Congress members have been manipulated (in German). As the related wiki entry notes "these edits had, among others, added libelous statements, removed content with malice, added childish insults, violated Wikipedia Policy."
This raises an interesting issue. We need to further our understanding of how well such "open source" knowledge bases can be used to find reliable data. Building on recent discussions, such as Censorship and Google, we need to ask whether or not it is time to start thinking about the "rules and regulations of internetworking". The internet on the one hand allows for accessing a vast variety of information, but on the other hand raises serious concerns regarding the reliability of such "freely available information". However, one might also argue that there is no need to regulate, because the very nature of "open source" regulates the flow of information automatically (as the current example shows). What do you think?
See also:
Report in Sun Lowell on staffer edits on the bio of US Rep Marty Meehan. A discussion of Wikipedia Immunity by Anita Ramasastry. Further details on Wikipedia and US Congress.
Posted by Thomas Langenberg at 1:28 PM | Comments (1)
As Alexander mentioned earlier today, Google has famously agreed to tailor the results of their new Chinese product. (Summary and links here) The image search has also been affected. Floating around the anti-censorship community is a very evocative comparison of the American and Chinese Google Image searches for "tiananmen":
http://images.google.cn/images?q=tiananmen
http://images.google.com/images?q=tiananmen
Is this Google censorship?
An easy observation is that the google.cn results are all from the .cn domain. Indeed, a google.com search of images from the .cn domain yields a similarly sanitized image collection. But google.cn does draw from the wider web, not just local sites: running google.cn searches with a misspelled "tianamen" or, even more damning, a capitalized "Tiananmen" reveals many pictures of the infamous 1989 incident. This is not search optimized for local or cultural interests; this is top-down intervention in a very successful bottom-up search algorithm.
Posted by Allan Friedman at 9:44 AM | Comments (0)
In my entry on Google bombs on 11/19/2005 I raised the following question:
"How will governments react to such movements of altering the search results in an unfavorable way in the future as knowledge becomes more important? How will search engine providers react? The easiest way to approach this would be to influence or enforce rules on search engine vendors. Hence, we could ask whether search engine providers need to be kept as autonomous as central banks with respect to knowledge?"
Well, as of 1/25/2006 we got an answer to this when reports on Google's self-censored search engine for China came out. However, as other reports show, censorship also exists in other countries like Germany or France for certain terms. So in fact there is a need to watch developments in this regard carefully... What do you think or propose?
Related articles:
Harvard Law School, Berkman Center for Internet & Society
NY Times on Google and China search engine version
Wired on Google and their geolocations on searches
NY Times on Google and Privacy
Newscientist on China and Google search
Washington Post on Geolocator
Posted by Alexander Schellong at 5:00 AM | Comments (0)
30 January 2006
Before I leave the NSA thread, one more thought about the utility of data mining phone log data, recording phone calls, etc. It is important to note that pattern recognition is only as powerful as the ancillary intelligence that the government would have to complement the phone data. Obviously, in the absence of any indications of who is high risk, the signal in the data is infinitesimal.
A corollary to this is that pattern recognition of today's data will be much more effective in the future-- because the amount of complementary information will only go up. We may have no reason to pay attention to Joe, except that he was 3 degrees removed from someone in some terrorist's phonebook. However, in the future we may receive solid intelligence that Joe is a terrorist. At that point, it would be a pretty good idea to go through all of the data you collected about Joe-- if you have it sitting in a hard drive somewhere. If you haven't been recording, and collecting information from the switches, etc, it may be impossible to reconstruct afterwards. That is, "retrospective" pattern analysis is likely to be much more effective than prospective pattern analysis. Of course, this in turn points to a strategy of collecting data now and asking questions later, which again brings us back to the issues around privacy and collateral usage in aces.
Posted by David Lazer at 2:29 PM | Comments (2)
I'm happy to announce the release of a report that I wrote with Barry Wellman and the Pew Internet & American Life Project titled, "The Strength of Internet Ties."
Disputing concerns that heavy use of the internet might diminish people’s social relations, the report shows that the internet fits seamlessly with Americans’ in-person and phone encounters. With the help of the internet, people are able to maintain active contact with sizable social networks, even though many of the people in those networks do not live close to them.
The report highlights how email supplements, rather than replaces, the communication people have with others in their network.
The report "The Strength of Internet Ties" is available as PDF.
A BBC news story about the report is available online at:
http://news.bbc.co.uk/2/hi/technology/4644666.stm
Posted by Jeff Boase at 12:46 PM | Comments (0)
28 January 2006
This event is the first this year of the Transatlantic Initiative on Complex Organizations and Networks (TAICON), which is co-chaired by Lars-Erik Cederman, of ETH-Zurich, and David Lazer, of Harvard. It will take place on January 30 at noon at the Swiss House for Advanced Research and Education (SHARE), 420 Broadway in Cambridge.
"The architecture of real networks: from the Web to social networks"
Speaker: Albert-László Barabási
Abstract: Networks with complex topology describe systems as diverse as the society, cell, or the World Wide Web.
The emergence of most networks is driven by self-organizing processes that are governed by simple but generic laws. The analysis of social, biological and technological systems shows that nature and human designs share the same large-scale topology, and are governed by similar evolutionary laws. I will show that the structure of these complex webs have important consequences on their robustness against failures and attacks, with implications on drug design, the Internet's ability to survive attacks and failures, and the ability of ideas and innovations to spread on the network.
Posted by David Lazer at 5:25 PM | Comments (1)
27 January 2006
Many social science articles focusing on networks also discuss social capital. Sometimes the two terms are used nearly synonymously. Some scholars bristle at the conflation, others are indifferent.
Does a network map necessarily provide insight into social capital?
Does understanding social capital require a network map?
As food for thought, consider these two potentially problematizing examples (continued below in the extended entry)...
1. Social capital may have meaning beyond networks:
One finding under the social capital umbrella is that the crime rate in a neighborhood is negatively associated with the percent of residents who know their neighbors' first names. In this case, social capital can be a kind of public good, where an individual resident who does not know his or her neighbors can still derive free-rider benefits from the social capital of others.
Does a network-based definition of social capital account for such phenomena that are more associated with a general social structure than a specific network?
2. Social networks may be independent from social capital.
Social capital is sometimes characterized explicitly by network structure. For example, the person located at the hub of a star network is considered to have the most social capital in that network. But networks are defined in part by the nature of their ties, not just their existence. Does the star network give us information about the relative social capital of its members if the tie is something like "makes eye contact with during the course of the day," "has sexual relations with," or "prepares food for?" These ties are very important to epidemiologists in studying the spread of infectious diseases or other pathogens. In these settings, individuals like bus drivers, sex workers, and food service workers are quite influential. But is this influence an example or form of social capital?
My own take is that social networks are a representational methodology – a way of representing social reality that focuses on the quantification of relationships. Other representational methods with other foci include not only approaches like ethnography, censuses, sample surveys, records of economic transactions, and the like, but also documentary film making, photography, portraiture, and other artistic and interpretive/expressive methods.
Social capital, on the other hand, is a construct. Because this construct deals explicitly with social structure and individual interdependencies with others, a representational methodology that quantifies relationships among individuals is particularly well-suited for operationalizing this construct for formal analysis.
As an SAT-style analogy, social networks are to social capital as a piano is to a musical composition. The latter in each pair can exist independently of the former, but we sometimes use the former to understand and interpret the latter. The analogy is far from perfect, but hopefully illustrative of my understanding of the difference between social networks and social capital.
I welcome all blog readers to weigh in with your views.
Posted by Brian Rubineau at 7:28 PM | Comments (2)
23 January 2006
Just closing the loop on the NSA data collection for now. Having discussed the potential utility of the data for investigation of terrorists, I now turn to the potential collateral use of these data. The potential is considerable—probably a lot greater than for combating terrorism— especially if the data are retained.
First of all, the data could be used for other criminal investigations. In fact, it is difficult to draw a logical line separating terrorist activity from other criminal acts—e.g., many more people are harmed by other types of crime each year in the US than by terrorism. Obviously, there is a potential for terrorist acts that are especially devastating, and terrorism has particular political significance. But from an empirical/utilitarian calculus, it is difficult to justify why different tools are appropriate for preventing deaths from terrorist activities and not for other criminal activities.
In fact, one would anticipate that for people who “live on the grid” in the US, the data produced could be very powerful for investigations. Consider simply the potential use of locational information from cellular phones, where one might be able to identify who was near the crime scene at the time of the crime. Even if the perpetrator of the crime was sensible enough to turn off their cell phone (and many would not), this would be an effective way to identify material witnesses.
Second, there is the possibility that these data could be used for political purposes. Consider the value of tracking the communication and location of the news media and political opponents. One could see which opponents were talking with each other, and which of your erstwhile allies were talking with opponents. If one suspected that an opponent was having an affair, one could correlate their locations and communications with those of other people, etc.
In a political culture which is increasingly polarized, where the “other side” is increasingly demonized, it is plausible that such tactics in the future could be rationalized by those in power, if they felt that there was a sufficiently low probability of being caught.
Just the possibility of such abuses, arguably, is damaging to our discursive space—potentially undermining our news media, discouraging communications with certain people because it might provoke enhanced scrutiny from the government, etc.
The questions that this significant “dark side” to the data sweeps highlights are (1) what restrictions should be placed on the government in its access to data like these, and (2) what governance structures are necessary to assure accountability. Currently, there are no apparent limits, and no governance structures to deal with these data sweeps. The claim of the administration is that they have dominion over any bit of data that they believe has some positive probability of being useful in fighting terrorism. This principle, however, allows no accountability, and points to no boundary between what data government should and should not “sweep up.”
While the NSA data collection has not provoked a massive popular uproar, it should be the political class (of both parties) that should be most concerned about these data sweeps, because their dirty laundry is also being swept up. Should there be abuses, it would be their lives that would come under scrutiny first. I should hope, therefore, that the political class can produce a governance structure that simultaneously assures our security, but provides sufficient limits on data mining to assure individual freedom.
….
A small point that did not fit in with the above, but which I wanted to make before I left this thread. In a number of places in news reports, unnamed defenders of the data sweeps have claimed that having computers monitor communications was less intrusive than having people do so. This is not necessarily so. Given a choice between two FBI agents listening to my conversations, and taping on an analog tape that would go in a file somewhere, and a digital recording, I would go with the FBI agents in a moment. The digital recording, with voice recognition, provides an easily searchable and recallable data source, which would be a lifelong intrusion, whereas the analog tape would not be easily searchable, and the memories of the agents would fade over time. This is not to say, however, that this is what the NSA is doing (but it is consistent with what information we have received about their activities).
Posted by David Lazer at 11:03 AM | Comments (5)
22 January 2006
My next entries will discuss the application of Customer Relationship Management in the public sector. Other terms used are citizen or constituent relationship management. As this is a relatively new topic and a less widely applied concept in the public sector, I hope our visitors are interested in sharing some of their ideas or questions with me.
What is CiRM?
In what way is CiRM different from CRM?
How is it understood in government?
How is CiRM implemented?
Will it have an impact on customer service in the public sector? What other impacts do you expect?
What other questions should we ask?
I am looking forward to your input. I will provide further information on Citizen Relationship Management at my website.
Posted by Alexander Schellong at 6:40 PM | Comments (1)
21 January 2006
In the upcoming week, Las Vegas will host the 2006 SIA Snowsports Trade Show. As the SIA announces on its website, the trade show is the premier show in the world market. Sports manufacturers, distributors, buyers, sales representatives, press/media, industry professionals and athletes come together both to discuss the latest trends for the new season and to extend their "social networks".
LINE Skis is an entrepreneurial firm in the snowsports industry, based in Burlington, Vermont. Within the last couple of years, the firm has significantly contributed to the development of what today is known as "newschool" or "freeride/freestyle" skiing. In a recent interview on http://newschoolers.com, the founder and owner of the company, J. Levinthal, announced that the company would not participate in the SIA trade show, although (a) it is the world's greatest and most important event of its kind, and (b) the company's network of distributors and sales reps was obviously not happy about the decision.
One of the key arguments Mr. Levinthal brought up to explain his decision goes like this: The main theme of the company is to promote alpine skiing. It is not about making profits; instead, it is about spreading the word and building an (international) community of alpine skiers.
In consequence, the company is making serious efforts to establish a constant information exchange with its "customers" through the company website and diverse online communities, such as newschoolers.com. Skiers from all over the place "meet online" to share ideas, movies, pictures, and plans for building an ever growing community of skiers!
So what? The aforementioned development raises a series of interesting questions for innovation and network scholars alike:
- Will online communities become a novel type of business model for "action-sports entrepreneurs"?
- Will online communities help firms collect novel knowledge to foster (technological) innovation?
- Is creating "social capital" a new way of funding entrepreneurial activity?
- How do strong ties between action-sport communities and young entrepreneurial firms as well as common beliefs affect traditional incumbent firms?
Posted by Thomas Langenberg at 11:29 PM | Comments (3)
19 January 2006
The crawlers of Google, Yahoo, MSN and other search engine providers are automatically indexing the web. All of the web? No, not all of it, just the surface, which consists of billions of documents like HTML pages or directly linked files of any kind (e.g., mp3, PDF, doc, zip...). If we use complex search strings we are also able to plunge a little into the grey matter below the surface web (SW). Many documents are not directly linked but are still indexed.
The deep web (DW) is web data that resides in databases and is only dynamically available in response to queries (e.g., when you run a search on a specific website, log in, or load a page). It is supposedly much bigger and provides more valuable data than the surface web.
Bergman (2001) estimates that the DW contains 7,500 terabytes of data compared to 19 terabytes in the SW. Current link analysis and crawler activity do not help to tap into those sources. Doing so is much more complex and labor intensive and would probably exceed the storage capabilities currently available to Google. A market exists, as organisations such as the CIA, FBI or private companies have an interest in using those additional (high quality) resources.
Related links:
Search engine trends, marketing
The Deep Web: Surfacing Hidden Value by Bergman (2001, U Michigan)
Search for the invisible web by Sherman (2001, The Guardian)
Index Structure for querying the Deep Web by Qiu/Shao/Zatsman/Shanmugasundaram (2003, Cornell U)
Accessing the Deep Web: A Survey by He/Patel/Zhang/Chen-Chuan Chang (2004, U Urbana-Champaign)
Deep Web Search engine
Introduction to the deep web by Laura Cohen (2005, SUNY Albany)
Finding unpublished research by Mathews (2004, ACRL)
Posted by Alexander Schellong at 11:38 AM | Comments (0)
17 January 2006
Excerpts from today's front page NYT article below. Three things to take away from this article: (1) that there was a snowball-type of analysis from the relational data collected; (2) there was some degree of content analysis of communication (although not clear that voice recognition was involved); and (3) the signal to noise ratio seems to be unacceptable (to the point of being humorous).
The next entry on this (and then I will return to more traditional points of discussion for the blog) will be on potential collateral uses of these data.
January 17, 2006
Spy Agency Data After Sept. 11 Led F.B.I. to Dead Ends
By LOWELL BERGMAN, ERIC LICHTBLAU, SCOTT SHANE and DON VAN NATTA Jr.
In the anxious months after the Sept. 11 attacks, the National Security Agency began sending a steady stream of telephone numbers, e-mail addresses and names to the F.B.I. in search of terrorists. The stream soon became a flood, requiring hundreds of agents to check out thousands of tips a month....
"We'd chase a number, find it's a schoolteacher with no indication they've ever been involved in international terrorism - case closed," said one former F.B.I. official, who was aware of the program and the data it generated for the bureau. "After you get a thousand numbers and not one is turning up anything, you get some frustration."...
Officials who were briefed on the N.S.A. program said the agency collected much of the data passed on to the F.B.I. as tips by tracing phone numbers in the United States called by suspects overseas, and then by following the domestic numbers to other numbers called. In other cases, lists of phone numbers appeared to result from the agency's computerized scanning of communications coming into and going out of the country for names and keywords that might be of interest. The deliberate blurring of the source of the tips caused some frustration among those who had to follow up.
F.B.I. field agents, who were not told of the domestic surveillance programs, complained that they often were given no information about why names or numbers had come under suspicion. A former senior prosecutor who was familiar with the eavesdropping programs said intelligence officials turning over the tips "would always say that we had information whose source we can't share, but it indicates that this person has been communicating with a suspected Qaeda operative." He said, "I would always wonder, what does 'suspected' mean?"
"The information was so thin," he said, "and the connections were so remote, that they never led to anything, and I never heard any follow-up."...
But in bureau field offices, the N.S.A. material continued to be viewed as unproductive, prompting agents to joke that a new bunch of tips meant more "calls to Pizza Hut," one official, who supervised field agents, said.
Posted by David Lazer at 10:38 PM | Comments (0)
10 January 2006
A bit of a digression from social network analysis, but following from the recent items on the NSA, some results from a Washington Post/ABC survey on surveillance/domestic spying.
7. In investigating terrorism, do you think federal agencies are or are not intruding on some Americans' privacy rights?
| Date | Are | Are not | No opinion |
|---|---|---|---|
| 1/8/06 | 64 | 32 | 4 |
8. (IF FEDERAL AGENCIES ARE INTRUDING, Q7) Do you think those intrusions are justified or not justified?
| Date | Justified | Not justified | No opinion |
|---|---|---|---|
| 1/8/06 | 49 | 46 | 5 |
Q7/8 NET:
| Date | Intrusion: NET | Intrusion: Just. | Intrusion: Not just. | Intrusion: DK | Not an intrusion | No opin. |
|---|---|---|---|---|---|---|
| 1/8/06 | 64 | 31 | 30 | 4 | 32 | 4 |
Compare to:
In investigating terrorism, do you think the federal agencies like the FBI are or are not intruding on some Americans' privacy rights?
| Date | Intrusion: NET | Intrusion: Just. | Intrusion: Not just. | Not an intrusion | No opin. |
|---|---|---|---|---|---|
| 9/7/03 | 58 | 36 | 17 | 33 | 8 |
9. Which worries you more: (that Bush will not go far enough to investigate terrorism because of concerns about constitutional rights), or (that Bush will go too far in compromising constitutional rights in order to investigate terrorism)?
| Date | Will not go far enough | Will go too far | Neither (vol.) | No opinion |
|---|---|---|---|---|
| 1/8/06 | 48 | 44 | 6 | 2 |
10. According to recent news reports, the National Security Agency has been investigating people suspected of involvement with terrorism by secretly listening in on telephone calls and reading e-mails between some people in the United States and other countries, without first getting court approval to do so. How closely have you been following this story - very closely, somewhat closely, not too closely or not closely at all?
| Date | Closely: NET | Closely: Very | Closely: Somewhat | Not closely: NET | Not closely: Not too | Not closely: Not at all | No opinion |
|---|---|---|---|---|---|---|---|
| 1/8/06 | 66 | 20 | 46 | 34 | 21 | 13 | * |
11. Would you consider this wiretapping of telephone calls and e-mails without court approval as an acceptable or unacceptable way for the federal government to investigate terrorism? Do you feel that way strongly or somewhat?
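| Date | Acceptable: NET | Acceptable: Strongly | Acceptable: Somewhat | Unacceptable: NET | Unacceptable: Somewhat | Unacceptable: Strongly | No opin. |
|---|---|---|---|---|---|---|---|
| 1/8/06 | 51 | 35 | 15 | 47 | 14 | 33 | 2 |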
Posted by David Lazer at 11:06 PM | Comments (0)
9 January 2006
A more expansive scenario would be that the NSA collects all phone log data from US sources as well as non-US calls that pass through US switches, plus locational information from cell phones where available (+ e-mail traffic, etc).
The expansive scenario offers significant security and logistical advantages to the NSA. The security advantage is that under the more limited scenario, the NSA would have to share critical security information with telecomms, by asking them for information about only certain individuals. That delimited information is terribly sensitive intelligence: by telling telecomms who they want to monitor, etc., it is essentially telling them who the government has received intelligence about.
The logistical advantage is that as the NSA finds out about potentially risky individuals, they can avoid the hurdles of making requests of the telecomms—they could just instantly access the information as they needed it.
Would such a massive data set be useful? Probably. Certainly, the locational information would be very helpful—one would be able to evaluate the physical proximity of people. Further, some of the patterns one would look for would involve the locations of individuals making and receiving calls—a set of calls to different numbers to Washington, DC, from a high risk source might be indicative of a potential event there.
One could also refine the techniques to identify members of a loosely connected set of people by testing them on known sets of people. There has been a lot of work recently on identifying groups of people from network data, e.g., by Ken Frank, Mark Newman, and Bernardo Huberman. I’m not sure how their algorithms would scale up to such a massive data set. I suspect that you could produce some algorithm that could do something similar (if not as well) for a dataset of hundreds of millions, although perhaps I am wrong.
This problem is significantly different, since you would be starting with somewhat more information—e.g., that a handful of people belong to a particular group—and not really want to produce a list of all groups in the data set. Further, you would have more information than just who talks with whom, but when they talked, and through what medium.
So it would be necessary to produce a new, and much more sophisticated, algorithm, testing on a variety of groups where you could validate the results. For example, one could test it on social network analysts—start with a handful of people who you know do social network analysis, and produce an algorithm that does a reasonable job of finding other social network analysts, adjusting the parameters of the algorithm to fit. Repeat this with a variety of groups, until you produced a reasonably robust algorithm.
Hard to say how effective this would be without doing it. And it seems rather unlikely that any groups that you could validate would have communication patterns like those of terrorists. Of course, you might be willing to settle for lots of false positives if you are looking for a terrorist (e.g., 100 or 1000 false positives to find one true positive).
Of course, such data would have a lot of potential for "collateral" usage, which I will turn to next.
See web pages of Ken Frank, Mark Newman, and Bernardo Huberman.
Posted by David Lazer at 8:02 PM | Comments (1)
7 January 2006
So, what data mining could one do with the data the NSA has collected from telecomm companies? Obviously, it is still unclear as to what is being collected, so this is quite speculative, which is a little different from my normal role of cautious academic. My hope is that this speculation, in the end, will yield some productive discourse about this important subject. I also want to make clear that I am not endorsing (or condemning) such data mining for now. Later I will discuss some of the privacy and policy issues. For now, I just want to do a thought experiment of how one might analyze these data in a fashion that might detect terrorist activity.
My assumption here is that the objective is to identify candidate nodes (individuals) for surveillance.
I am going to start with what I consider a less expansive scenario. In this particular scenario, one is starting out with some phone numbers and e-mails that are designated as “high risk” (e.g., from other intelligence). A simple analysis would simply snowball outwards from these high risk nodes to their contacts, and to their contacts’ contacts, etc. As one snowballs outwards, one will likely find overlaps, where some nodes are members of multiple circles. In the simplest analysis, the more circles that a node is a member of (and the closer to the center of those circles), the higher risk they should be considered.
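The snowball idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the contact graph, the node names, the two-step limit, and the 1/steps weighting are all hypothetical choices made for the example, not anything known about an actual analysis.

```python
from collections import deque

def snowball(graph, seed, max_steps):
    """Breadth-first expansion from one seed: {node: steps from seed}."""
    steps = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if steps[node] == max_steps:
            continue  # do not expand beyond the outermost circle
        for contact in graph.get(node, ()):
            if contact not in steps:
                steps[contact] = steps[node] + 1
                queue.append(contact)
    return steps

def circle_scores(graph, seeds, max_steps=2):
    """Add 1/steps for every seed circle a node falls in (assumed weighting)."""
    scores = {}
    for seed in seeds:
        for node, d in snowball(graph, seed, max_steps).items():
            if node in seeds:
                continue  # seeds are already known; score only their contacts
            scores[node] = scores.get(node, 0.0) + 1.0 / d
    return scores

# Hypothetical contact graph, undirected, stored as adjacency lists.
graph = {
    "s1": ["a", "b"],
    "s2": ["b", "c"],
    "a": ["s1", "d"],
    "b": ["s1", "s2", "d"],
    "c": ["s2"],
    "d": ["a", "b"],
}
seeds = {"s1", "s2"}  # the nodes designated "high risk" in this toy example

scores = circle_scores(graph, seeds)
ranked = sorted(scores, key=lambda n: (-scores[n], n))
print(ranked)
```

In this toy graph, node "b" is a direct contact of both seeds and so ranks first, which is the "member of multiple circles, close to the center" intuition in its simplest form.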
Obviously, the analysis should get substantially hairier than that, because of the nature of the sampling from the network. For example, I am guessing that the identifications of high risk nodes are not independent events. Imagine that an Al Qaeda cell is identified and its members apprehended in Jordan, and their computers, address books (or equivalents) acquired. One would then snowball outwards from these contacts. However, to find overlap among the contacts of these cell members presumably conveys different information than if one found overlap among the contacts of different cells from different countries (presumably the latter would be more significant).
One could devise a weighting system that depends on the number of paths that go through a particular node, other information about nodes, etc, to develop a ranking of who should be watched. These weights could be validated by fitting them to part of the network data, and then examining whether the technique was effective at identifying those nodes that you knew were already “high risk.”
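One way to picture the validation step is a leave-one-out check: hide one known high-risk node, score everyone else from the remaining known nodes, and see whether the hidden node comes out on top. The sketch below assumes a simple distance-decay weight (`decay ** steps`), a three-step limit, and a made-up graph; the function names and the hit-rate measure are hypothetical, chosen only to make the idea concrete.

```python
from collections import deque

def distances(graph, seed, max_steps):
    """Breadth-first distances from seed, out to max_steps."""
    steps = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if steps[node] == max_steps:
            continue
        for contact in graph.get(node, ()):
            if contact not in steps:
                steps[contact] = steps[node] + 1
                queue.append(contact)
    return steps

def score(graph, seeds, decay, max_steps=3):
    """Sum decay**steps over all seeds that reach a node (assumed weighting)."""
    scores = {}
    for seed in seeds:
        for node, d in distances(graph, seed, max_steps).items():
            if node not in seeds:
                scores[node] = scores.get(node, 0.0) + decay ** d
    return scores

def holdout_hit_rate(graph, known, decay, top_k=1):
    """Hide each known node in turn; count how often it ranks in the top_k."""
    hits = 0
    for hidden in sorted(known):
        scores = score(graph, known - {hidden}, decay)
        ranked = sorted(scores, key=lambda n: (-scores[n], n))
        hits += hidden in ranked[:top_k]
    return hits / len(known)

# Hypothetical network: three known high-risk nodes plus two bystanders.
graph = {
    "k1": ["k2", "k3", "x"],
    "k2": ["k1", "k3"],
    "k3": ["k1", "k2"],
    "x": ["k1", "y"],
    "y": ["x"],
}
known = {"k1", "k2", "k3"}

for decay in (0.25, 0.5):
    print(decay, holdout_hit_rate(graph, known, decay))
```

In practice one would compare many weightings this way and keep the one that recovers the hidden nodes most reliably, bearing in mind that the known nodes are themselves a non-random sample.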
Ideally, one would use communication data going back in time as far as possible—thus, while telecomm companies are sharing data, you would want them to go back as far as possible. This would also be useful in case you wanted to do sequence and timing analysis—e.g., it’s not just who you call, but it’s when you call (say after some event), or that you called Anne after Joe called you.
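As a rough illustration of sequence and timing analysis, the sketch below scans a call log for the "you called Anne after Joe called you" pattern. The log format (timestamp, caller, callee), the names, and the one-hour window are assumptions made for the example.

```python
from datetime import datetime, timedelta

def relay_patterns(calls, window):
    """Find (a, b, c, gap): a called b, then b called c within `window`."""
    calls = sorted(calls)  # order by timestamp
    hits = []
    for i, (t1, a, b) in enumerate(calls):
        for t2, caller, c in calls[i + 1:]:
            if t2 - t1 > window:
                break  # later calls are outside the time window
            if caller == b and c != a and t2 > t1:
                hits.append((a, b, c, t2 - t1))
    return hits

# Hypothetical call log: (timestamp, caller, callee).
calls = [
    (datetime(2006, 1, 7, 9, 0), "joe", "me"),
    (datetime(2006, 1, 7, 9, 20), "me", "anne"),
    (datetime(2006, 1, 7, 15, 0), "me", "bob"),
    (datetime(2006, 1, 8, 9, 0), "joe", "me"),
]

hits = relay_patterns(calls, window=timedelta(hours=1))
for a, b, c, gap in hits:
    print(f"{a} -> {b} -> {c} within {gap}")
```

With this toy log, only the joe, me, anne chain falls inside the window; the call to bob six hours later does not. The same scan could be anchored on an external event time instead of an incoming call.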
Obviously, there are lots of difficult issues re sampling. Further, one would hypothesize that any terrorist worth their salt would be careful about recording contact information, and, more generally, their use of electronic communication. And I would guess that most of the people that terrorists communicate with are non-terrorists, and their contacts, in turn, are even less likely to be terrorists, so the vast majority of people caught in this net are going to be non-terrorists. So, to mix metaphors, one may have removed from the haystack proportionally more hay than needles, but you are still left with a very large haystack with just a few needles.
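The haystack point is essentially base-rate arithmetic, and a small calculation makes it concrete. Every number below is hypothetical (the size of the net, the number of true targets, and the filter's hit and false-alarm rates); the sketch only shows how quickly false positives swamp true ones when the base rate is tiny.

```python
def screening_outcome(population, true_targets, hit_rate, false_alarm_rate):
    """Expected flagged counts and precision for a screening filter."""
    flagged_true = true_targets * hit_rate
    flagged_false = (population - true_targets) * false_alarm_rate
    precision = flagged_true / (flagged_true + flagged_false)
    return flagged_true, flagged_false, precision

# Hypothetical: 1,000,000 people in the net, 10 of them actual targets,
# and a filter that flags 90% of targets but also 1% of everyone else.
tp, fp, precision = screening_outcome(1_000_000, 10, 0.9, 0.01)

print(f"true positives flagged:  {tp:.0f}")
print(f"false positives flagged: {fp:.0f}")
print(f"false positives per true positive: {fp / tp:.0f}")
print(f"share of flagged people who are targets: {precision:.4%}")
```

Under these made-up numbers, even a filter that looks quite accurate leaves over a thousand false positives for every true one.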
Once one has identified some risky nodes, the next step would be to monitor actual communications. Presumably, the NSA has finite capacity to have humans listen to conversations, and thus the key management question is how to allocate this scarce resource. The first level of monitoring would therefore simply be recording of conversations. Presumably, this is fairly cheap to do, so, putting civil liberties concerns aside, one would adopt a pretty low risk threshold for recording. This would allow going back in time for human monitoring if an individual were subsequently identified as high risk. A second level, if it is technically possible (at some level it surely is), would be to apply voice recognition to those recordings, where the content of conversations would adjust the evaluated risk level of those nodes. Further, such voice recognition could pick out candidate snippets of conversations for human monitoring. Such “snippet-based” monitoring, I think, would explain why the FISA court process was circumvented, since it might result in the brief, human-based monitoring of a very large number of people (conceivably exceeding the number of warrants approved by the FISA court in its history very quickly), and in the computerized monitoring of a still larger number of people. That is, the oversight process specified by FISA would be unable to cope with the sheer volume of requests. Further, the basis of monitoring these snippets is probably weaker than what has traditionally been brought before the FISA court. It would also explain why some defenders of the policy (who presumably know more than has been publicly released) have stated that having a computer monitor your conversation was not a privacy intrusion (thus suggesting that a major component of the program did involve computerized monitoring).
This is the less expansive scenario that I have come up with (although how expansive it is depends on a number of parameters—how many steps out one goes from the initial sample, what is the threshold for monitoring, etc, so the actual numbers of people who are in some fashion caught in the net might number anywhere from thousands to millions). This is a pretty rudimentary analysis, as compared to how one would actually do it, but I think has the essential ingredients. My next entry will consider a more expansive scenario.
Posted by David Lazer at 12:53 PM | Comments (1)
4 January 2006
Before I get to what might be done with the data, a little more on the data that has been collected, from James Risen’s book, State of War: The Secret History of the CIA and the Bush Administration (2006):
Following President Bush’s order, US intelligence officials secretly arranged with top officials of major telecommunications switches carrying the bulk of America’s phone calls. The NSA also gained access to the vast majority of American e-mail traffic that flows through the US telecommunications system. (p. 48)
The telephone network today is digital and computerized, but is still built around a switching system that routes calls from city to city, or country to country, as efficiently and quickly as possible…. In addition to handling telephone calls from, say, Los Angeles to New York, the switches also act as gateways into and out of the United States for international communications…. [I]t is now difficult to tell where the domestic telephone system ends and the international network begins. (pp. 49-50)
One of the secrets of the Internet is that its infrastructure is dominated by the United States, and that much of the world’s e-mail traffic, at one time or another, flows through telecommunications networks that are physically on American soil. (p. 51)
Posted by David Lazer at 9:57 PM | Comments (0)