Presidential Identifying Information

Sunday’s New York Times included a story about how the presidential campaigns are making extensive use of third-party web trackers. In response to privacy concerns, “[o]fficials with both campaigns emphasize[d] that [tracking] data collection is ‘anonymous.’” [1]

The campaigns are wrong: tracking data is very often identified or identifiable. Arvind Narayanan has previously written a comprehensive and accessible explanation of why web tracking is hardly anonymous; my survey paper on web tracking provides more extensive discussion.

One of the ways in which web tracking data can become identified or identifiable is “leakage”: data flowing to trackers from the websites that users interact with. Leakage most commonly occurs when a website includes identifying information in a page URL or title. Embedded third parties receive the identifying information if they receive the URL (e.g., in a referrer header) or the title (e.g., by reading document.title). Even a little identifying information leakage thoroughly undermines the privacy properties of web tracking: once a user’s identity leaks to a tracker, all of the tracker’s past, present, and future data about the user becomes identifiable.
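As a rough illustration of the URL case, the Python sketch below scans the URL that a third party would see in a referrer header for a known identifier, after percent-decoding. The helper name `find_leaks`, the example URLs, and the email address are all hypothetical; this is not the methodology of the study, and a real measurement would also have to consider hashed or otherwise encoded identifiers.

```python
from urllib.parse import urlsplit, parse_qsl, unquote


def find_leaks(referrer_url, identifiers):
    """Return (location, identifier) pairs found in a referrer URL.

    Hypothetical helper: checks the decoded path and each decoded query
    parameter value for any known identifier, case-insensitively.
    """
    parts = urlsplit(referrer_url)
    path = unquote(parts.path).lower()
    # parse_qsl percent-decodes parameter names and values for us.
    params = parse_qsl(parts.query, keep_blank_values=True)

    leaks = []
    for ident in identifiers:
        needle = ident.lower()
        if needle in path:
            leaks.append(("path", ident))
        for name, value in params:
            if needle in value.lower():
                leaks.append(("query:" + name, ident))
    return leaks


# Made-up URL of the kind a page might expose to embedded third parties.
example = "https://shop.example/confirm?email=jane.doe%40example.com&step=2"
print(find_leaks(example, ["jane.doe@example.com"]))
```

The same check could be applied to a page title by passing the title text through the same substring comparison.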

Web services frequently fail to account for information leakage in their design and testing; a study I conducted last year found that over half of popular websites were leaking identifying information. [2] More than a few website operators have made inaccurate representations about the information they share with third parties; in just the past year the Federal Trade Commission settled deception claims against both Facebook and Myspace for falsely disclaiming identifying information leakage.

The Times coverage piqued my curiosity: Are the campaigns identifying their supporters to third-party trackers? Are they directly undermining the anonymity properties that they are so quick to invoke?

Yes, they are. I tested the two leading candidate websites using the methodology from my prior study of identifying information leakage. Both leak. The following sections describe my observations from the Barack Obama and Mitt Romney campaign websites.
… 

Safari Trackers

Apple’s Safari web browser is configured to block third-party cookies by default. We identified four advertising companies that unexpectedly place trackable cookies in Safari. Google and Vibrant Media intentionally circumvent Safari’s privacy feature. Media Innovation Group and PointRoll serve scripts that appear to be derived from circumvention example code.

In the interest of clearly establishing facts on the ground, this post provides technical analysis of Safari’s cookie blocking feature and the four companies’ practices. It does not address policy or legal issues. (More on that soon.)

Before proceeding further, I want to thank the countless friends and colleagues who provided invaluable feedback on this project. In particular: ★★★★★, whose insights have been vital at every step, and Ashkan Soltani, whose crawling data was instrumental in uncovering PointRoll’s practices and understanding the prevalence of cookie blocking circumvention.

… 

Tracking the Trackers: Where Everybody Knows Your Username

Original at the Stanford Center for Internet and Society.

Click the local Home Depot ad and your email address gets handed to a dozen companies monitoring you. Your web browsing, past, present, and future, is now associated with your identity. Swap photos with friends on Photobucket and clue a couple dozen more into your username. Keep tabs on your favorite teams with Bleacher Report and you pass your full name to a dozen again. This isn’t a 1984-esque scaremongering hypothetical. This is what’s happening today.

[Update 10/11: Since several readers have asked – this study was funded exclusively by Stanford University and research grants to the Stanford Security Lab. It was not supported by any advocacy organization.]

… 

Tracking the Trackers: Self-Help Tools

Original at the Stanford Center for Internet and Society.

A number of technologies have been touted to offer consumers control over third-party web tracking. This post reviews the tools that are available and presents empirical evidence on their effectiveness. Here are the key takeaways:

  1. Most desktop browsers currently do not support effective self-help tools. Mobile users are almost completely out of luck.
  2. Self-help tools vary substantially in performance.
  3. The most effective self-help tools block third-party advertising.

Following the usage model in the FTC staff’s 2010 preliminary online privacy report, this post is oriented towards the user who wants a simple, persistent, comprehensive solution such that with high confidence no third party collects her browsing history. We assume that some third-party trackers will use non-cookie tracking methods including supercookies and fingerprinting (e.g. Microsoft, KISSmetrics, Epic Marketplace, BlueCava, Interclick, Quantcast).

Thanks to Jovanni Hernandez and Akshay Jagadeesh for assisting with data collection, and to Arvind Narayanan and Peter Eckersley for input on drafts.

… 

Tracking the Trackers: Microsoft Advertising

Original at the Stanford Center for Internet and Society.

Despite all the attention they’ve received in the debates around online privacy, cookies are far from the only way to track a user. Broadly speaking, a website can either stash a unique identifier anyplace in the browser (“tagging”) [1] or probe features of the browser until the combination is unique (“fingerprinting”). [2] Tracking technologies that do not rely on cookies are often referred to as “supercookies,” and they are widely viewed as unsavory in the computer security community because they continue tracking even when a user clears her cookies to preserve privacy. Sometimes a site will use a supercookie to “respawn” its original identifier cookie, creating a “zombie cookie,” which has been the basis of several lawsuits.
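The difference between the two approaches can be sketched in a few lines of Python. This is a toy model, not any tracker's actual code: the dictionary standing in for browser storage, the feature names, and the functions `tag` and `fingerprint` are all assumptions made for illustration.

```python
import hashlib
import uuid


def tag(storage):
    """Tagging: stash a random identifier in some storage and reuse it."""
    if "uid" not in storage:
        storage["uid"] = uuid.uuid4().hex
    return storage["uid"]


def fingerprint(features):
    """Fingerprinting: derive an identifier from observable features."""
    # Sort the keys so the same features always hash to the same value.
    canonical = "|".join(f"{key}={features[key]}" for key in sorted(features))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical browser state.
cookies = {}
features = {"user_agent": "ExampleBrowser/1.0", "screen": "1280x800", "timezone": "-480"}

first_tag = tag(cookies)
first_print = fingerprint(features)

cookies.clear()  # the user clears her cookies

print(tag(cookies) == first_tag)            # a tag kept only in cookies is lost
print(fingerprint(features) == first_print)  # the fingerprint is unaffected
```

A tag stored somewhere other than cookies survives the clearing step in the same way the fingerprint does, which is what makes respawning possible.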

In one of our recent FourthParty web measurement crawls we included a cookie clearing step to emulate a user’s privacy choice. We observed that after clearing the browser’s cookies an identifier cookie (named “MUID” for “machine unique identifier”) respawned on live.com, a Microsoft domain. We dug into Microsoft’s cross-domain cookie syncing code and discovered two independent supercookie mechanisms, one of which was respawning cookies. We contacted Microsoft with our observations, and we have collaborated to assist in rectifying the issues we uncovered. Here is what we know.
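The comparison behind an observation like this can be sketched as follows. This is a simplified illustration and not FourthParty's actual schema or analysis code: the snapshot format, the function `respawned_cookies`, the length threshold, and the cookie values are all hypothetical placeholders.

```python
def respawned_cookies(before, after, min_length=8):
    """Return keys of cookies whose values reappear after a clear.

    `before` is a snapshot taken just before clearing cookies and
    `after` is a snapshot taken after clearing and revisiting the site.
    Both map (domain, name) -> value. Short values are skipped because
    flags such as "1" or "en" recur without identifying anyone.
    """
    return sorted(
        key
        for key, value in after.items()
        if len(value) >= min_length and before.get(key) == value
    )


# Placeholder snapshots; the values are invented for the example.
before = {
    ("live.com", "MUID"): "0123456789abcdef",
    ("live.com", "lang"): "en",
}
after = {
    ("live.com", "MUID"): "0123456789abcdef",
    ("live.com", "lang"): "en",
}

print(respawned_cookies(before, after))
```

A long, identical value reappearing after a clear is only a signal that some non-cookie store exists; finding the mechanism still requires reading the site's code.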

Thanks, once again, to Jovanni Hernandez and Akshay Jagadeesh for their indispensable research assistance.

… 

Tracking the Trackers: The AdChoices Icon

Original at the Stanford Center for Internet and Society.

Jovanni Hernandez and Akshay Jagadeesh are the first authors of this study.

Responding to pressure from the Federal Trade Commission, in mid-2009 the largest advertising industry trade groups joined forces to develop a new self-regulatory program for behavioral advertising: the Digital Advertising Alliance (DAA). Like the parallel self-regulatory program for advertising networks, the Network Advertising Initiative (NAI), the DAA makes no promises about providing privacy choices: DAA members must only provide an opt out of seeing advertising that is based on tracking, not an opt out of tracking itself. [1] As Chris Hoofnagle at Berkeley Law has noted on several occasions, the word “privacy” scarcely even appears in the DAA’s documents.

… 

FourthParty: A New Approach to Web Measurement

Original at the Stanford Center for Internet and Society.

Last week marked the twentieth anniversary of the public World Wide Web, and there is much to celebrate. The early web consisted of a few text pages linked together; the modern web supports audio, video, interactivity, complex storage, and even native applications. Both Microsoft and Google are now developing entire operating systems around web technologies.

Tools for measuring the web have not kept pace. Many studies still rely on HTTP header logging and static analysis of HTML, CSS, and JavaScript. Researchers who want to go beyond these simple tools are often forced to develop purpose-built software from scratch.

Today we’re releasing FourthParty, an open-source platform for web measurement. FourthParty is built on Mozilla Firefox and the Add-on SDK, making it fast, modular, easy to use, multi-platform, and up-to-date with the latest web technologies. And FourthParty is already generating research results: it’s the tool we’ve been using in our Tracking the Trackers studies (1, 2). To learn more and get started, visit fourthparty.info.