Why Data Center Tapping is (Legally) Different

Last week the Washington Post broke news that the National Security Agency has collected international traffic between Google and Yahoo data centers. I happened to be delivering a course lecture on signals intelligence the same day, so I made brief mention of the program—and how it appears particularly aggressive under the Fourth Amendment.

A sharp student pressed for specifics. How, he asked, could data center tapping be more legally questionable than previously leaked surveillance initiatives? This post is an expanded and refined version of my response.

In short: The firms evade Fourth Amendment pitfalls of citizenship, personal interest, and metadata. They also have enough evidence to establish standing. Finally, the NSA would have difficulty demonstrating that its surveillance was reasonable.

… 

The Web Is Flat

Consider this a bug report for the National Security Agency and its overseers. Dragnet online surveillance may be directed at international activity. But it nonetheless ensnares ordinary Americans as they browse domestic websites.

The spy outfit admits to vacuuming vast quantities of network traffic as it passes through the United States. Some taps are on the nation’s borders; others are on the domestic Internet backbone. International partner agencies, most prominently the UK’s Government Communications Headquarters, contribute to the NSA’s reach. Recent leaks have provided substantial detail: Under the Marina program, the agency appears to retain web browsing activity for a year.1 The XKeyscore system offers at least one way for analysts at the NSA and cooperating services to efficiently query both historical and real-time data.

Agency apologists are quick to point out that the snooping has limits. The NSA only acquires online communications when a sender or recipient seems international. Doing otherwise might, in their view, violate congressional restrictions or constitutional protections.

Tough luck for foreigners. But if you’re within the United States, the notion goes, you don’t have much cause for concern.

That’s wrong. Americans routinely send personal data outside the country. They just might not know it.
… 

Do Not Track in California

Both houses of the California legislature have unanimously approved AB 370, a Do Not Track initiative that is backed by Attorney General Harris. If Governor Brown signs the bill, it will be the first Do Not Track law worldwide. So, what would it do? More and less than a casual reader might expect.
… 

Legislating NSA Crypto Circumvention

The National Security Agency works to circumvent cryptography. In the abstract, that’s hardly objectionable—legitimate intelligence targets may adopt security measures. Concerns arise, however, when the NSA subverts the technologies that ordinary consumers and businesses rely upon. Longstanding conventional wisdom in the computer security community has been that the NSA works to insert backdoors into crypto standards and security products, and that the agency hoards vulnerabilities in popular crypto algorithms and implementations. Widely read reports recently confirmed these views.

The go-to recommendation among many security experts has been deployment of additional protective measures. That’s an appealing near-term option for sophisticated users and companies. It’s largely impractical for ordinary users, however. And adding more crypto won’t restore damaged trust, shut potentially risky backdoors, or patch vulnerable systems.
… 

Advancing Empirical Legal Scholarship: Federal Trial Opinions and Rules

In earlier posts I have shared XML versions of certain legal materials, including federal statutes, appellate opinions, and appellate rules. My aim has been to assist empirical legal scholars by providing machine-readable government documents.

Additional legal materials accompany this post, including federal trial-level opinions and rules. Suggestions from the research community remain very much welcome.

… 

Next Steps for the Firefox Cookie Policy

Consumers neither expect nor approve of web tracking.1 Mozilla has been a frequent advocate for its users, advancing technologies that signal preferences (Do Not Track), lend transparency (Collusion), and facilitate privacy-friendly web services (Persona and Social API). Last fall, the Mozilla community began a concerted effort in a new direction: technical countermeasures against tracking.2 One of our first projects has been a revision of the Firefox cookie policy.3

Cookie policies are inherently imprecise. Some unwanted tracking cookies might slip through, compromising user privacy (“underblocking”). And some non-tracking cookies might get blocked, breaking the web experience (“overblocking”). The challenge in designing a cookie policy is calibrating the tradeoff between underblocking and overblocking.4
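To make the tradeoff concrete, here is a minimal sketch of how the two error rates could be measured for any candidate policy. It is an illustration only, not the Firefox implementation: the `Cookie` type, the ground-truth `is_tracking` label, the example hostnames, and the toy blocklist policy are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class Cookie:
    host: str
    is_tracking: bool  # hypothetical ground-truth label, known only for evaluation


def blocking_error_rates(
    cookies: Iterable[Cookie],
    policy_blocks: Callable[[Cookie], bool],
) -> Tuple[float, float]:
    """Return (underblocking rate, overblocking rate) for a policy.

    Underblocking: share of tracking cookies the policy lets through.
    Overblocking: share of non-tracking cookies the policy blocks.
    """
    cookies = list(cookies)
    tracking = [c for c in cookies if c.is_tracking]
    benign = [c for c in cookies if not c.is_tracking]

    missed = sum(1 for c in tracking if not policy_blocks(c))
    broken = sum(1 for c in benign if policy_blocks(c))

    under_rate = missed / len(tracking) if tracking else 0.0
    over_rate = broken / len(benign) if benign else 0.0
    return under_rate, over_rate


# Hypothetical sample: two tracking cookies, two non-tracking cookies.
SAMPLE = [
    Cookie("tracker-a.example", True),
    Cookie("tracker-b.example", True),
    Cookie("login.example", False),
    Cookie("cdn.example", False),
]

# Toy policy (an assumption, not any browser's rule): block a fixed host list.
BLOCKLIST = {"tracker-a.example", "cdn.example"}


def toy_policy(cookie: Cookie) -> bool:
    return cookie.host in BLOCKLIST


if __name__ == "__main__":
    under, over = blocking_error_rates(SAMPLE, toy_policy)
    print(f"underblocking={under:.2f} overblocking={over:.2f}")
```

A policy that blocks everything drives underblocking to zero at the cost of total overblocking, and a policy that blocks nothing does the reverse. Calibrating a real policy means finding an acceptable point between those extremes.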

The patch that I developed is an intentionally cautious first step: it aims to substantially reduce underblocking with little (if any) overblocking. The revised policy is so cautious, it isn’t even new: it’s drawn directly from Safari.5 Almost every iPhone, iPad, and iPod Touch user is already running the revised Firefox cookie policy. Web engineers are already familiar with designing to accommodate the policy. The notion is simple: start by raising Firefox to the present best practice among competing browsers, then iteratively innovate improvements.

Firefox’s revised cookie policy landed in the pre-alpha build in late February. Since then, Mozillians and I have carefully monitored bug reports. It appears that we achieved our aim: there are only two confirmations of inadvertent breakage.6 We did not hear any novel concerns when the patch advanced to alpha in early April. This past week, Mozilla’s CTO requested a hold on the revised policy for an extra release cycle to measure its performance. At the same time, he reaffirmed that Mozilla is “committed to user privacy” and “committed to shipping a version of the patch that is ‘on’ by default.”

I agree that we should be quantitatively rigorous in our approach to iterating the Firefox cookie policy. An extra six-week release cycle will allow us to further validate our hypothesis that the patch delivers improved privacy without breakage,7 as well as lay the groundwork for future updates. Going forward, our challenge will be to understand and improve the underblocking and overblocking properties of the Firefox cookie policy.
… 

Advancing Empirical Legal Scholarship: Federal Appellate Opinions and Rules

Last December I shared XML versions of the U.S. Code and Supreme Court opinions through early 2012. My intent was and remains to facilitate empirical legal scholarship by providing government-authored materials in a machine-readable format.

This post is accompanied by additional documents: opinions and rules of various federal appellate tribunals. As before, I welcome feedback from the academic research community.
… 

The New Firefox Cookie Policy

The default Firefox cookie policy will, beginning with release 22, more closely reflect user privacy preferences. This mini-FAQ addresses some of the questions that I’ve received from Mozillians, web developers, and users.
… 

Electronic Privacy and Economic Choice

Critics of consumer privacy protections frequently invoke revealed preference as a justification for laissez-faire policy. If users really cared about their privacy, the argument goes, we should expect to see revolts against intrusive practices. A number of scholars have demonstrated pervasive information asymmetries1 and bounded rationality2 in consumer privacy choices; the decisions that users actually make about online privacy can hardly be expected to reflect their actual preferences.

But let’s suppose that consumers and online firms are fully informed and completely rational. The economic story that consumers value their privacy less than the marginal income from privacy intrusions is certainly consistent with market behavior.

We should not, however, conclude that the status quo is optimal. There is another congruent economic story, where privacy intrusions are inefficient but nevertheless result owing to transaction costs and competition barriers. This post relates the alternative economic story with two possible examples, then closes with policy implications.
… 

Advancing Empirical Legal Scholarship

Modern quantitative analysis has upended the social sciences and, in recent years, made exciting inroads with law. How complex are the nation’s statutes?1 Did a shift in Supreme Court voting dodge President Roosevelt’s court-packing plan?2 How do courts apply fair use doctrine in copyright cases?3 What factors determine the outcome of intellectual property litigation?4 Researchers have begun to answer these and many more questions through the use of empirical methodologies.

Academics have vaulted numerous hurdles to advance this far, including deep institutional siloing and specialization. But barriers do still exist, and one of the greatest remaining is, quite simply, data. There is no easy-to-get, easy-to-process compilation of America’s primary legal materials. In the status quo, researchers are compelled to spend far too much of their time foraging for datasets instead of conducting valuable analysis. Consequences include diminished scholarly productivity, scant uniformity among published works, and—most frustratingly—deterrence for prospective researchers.

My hope is to facilitate empirical legal scholarship by providing machine-readable primary legal materials. In this first release of data, I have prepared XML versions of the U.S. Code and opinions of the Supreme Court of the United States, through approximately early 2012. Subsequent releases may include additional primary legal materials. I would greatly appreciate feedback from the academic community, particularly with regard to the XML schema, text formatting, and prioritizing materials for release.

Update January 13, 2014: The data is now hosted on Amazon S3 in a Requester Pays bucket, so the person downloading pays the transfer charges. If your request is not authenticated and configured to accept those charges, you will receive an “Access Denied” error.

United States Code: ZIP (110 MB)
Supreme Court of the United States Opinions: ZIP (348 MB)
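
For readers who hit the “Access Denied” error, here is a minimal sketch of a download that acknowledges the requester-pays charge, assuming the third-party `boto3` library and AWS credentials already configured on your machine. The bucket name and object key below are placeholders, not the real locations of these files; substitute the values from the download links above.

```python
from typing import Dict


def requester_pays_args(bucket: str, key: str) -> Dict[str, str]:
    """Build get_object arguments that accept the transfer charges.

    The RequestPayer field is what distinguishes this from an ordinary
    request; without it, S3 rejects access to a Requester Pays bucket.
    """
    return {"Bucket": bucket, "Key": key, "RequestPayer": "requester"}


def download(bucket: str, key: str, destination: str) -> None:
    """Stream one object to a local file, billing the caller's AWS account."""
    import boto3  # third-party dependency, imported lazily

    s3 = boto3.client("s3")  # uses your configured credentials
    response = s3.get_object(**requester_pays_args(bucket, key))
    with open(destination, "wb") as out:
        # Read in 1 MB chunks so large archives are not held in memory.
        for chunk in response["Body"].iter_chunks(chunk_size=1024 * 1024):
            out.write(chunk)


if __name__ == "__main__":
    # Placeholder names: replace with the actual bucket and key.
    download("EXAMPLE-BUCKET", "EXAMPLE-KEY.zip", "download.zip")
```

Anonymous requests will not work here, because S3 needs an authenticated account to bill for the transfer.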


Please note, this is a personal project. It is not related to my coursework or research at Stanford University.

1. Michael J. Bommarito II & Daniel M. Katz, A Mathematical Approach to the Study of the United States Code, 389 Physica A 4195 (2010), available at http://www.sciencedirect.com/science/article/pii/S0378437110004875.
2. Daniel E. Ho & Kevin M. Quinn, Did a Switch in Time Save Nine?, 2 J. Legal Analysis 69 (2010), available at http://jla.oxfordjournals.org/content/2/1/69.full.pdf.
3. Matthew Sag, Predicting Fair Use, 73 Ohio St. L.J. 47 (2012), available at http://moritzlaw.osu.edu/students/groups/oslj/files/2012/05/73.1.Sag_.pdf.
4. Mihai Surdeanu et al., Risk Analysis for Intellectual Property Litigation, Proc. 13th Int’l Conf. on Artificial Intelligence & L. 116 (2011), available at http://dl.acm.org/citation.cfm?id=2018375.