<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Web Policy</title>
	<atom:link href="http://webpolicy.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://webpolicy.org</link>
	<description>a blog about technology, policy, and law</description>
	<lastBuildDate>Sat, 18 May 2013 23:38:05 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Advancing Empirical Legal Scholarship: Federal Appellate Opinions and Rules</title>
		<link>http://webpolicy.org/2013/05/03/advancing-empirical-legal-scholarship-federal-appellate-opinions-and-rules/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=advancing-empirical-legal-scholarship-federal-appellate-opinions-and-rules</link>
		<comments>http://webpolicy.org/2013/05/03/advancing-empirical-legal-scholarship-federal-appellate-opinions-and-rules/#comments</comments>
		<pubDate>Sat, 04 May 2013 00:00:57 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Empirical Law]]></category>
		<category><![CDATA[Legal Data]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=558</guid>
		<description><![CDATA[Last December I shared XML versions of the U.S. Code and Supreme Court opinions through early 2012. My intent was and remains to facilitate empirical legal scholarship by providing government-authored materials in a machine-readable format. This post is accompanied by additional documents: opinions and rules of various federal appellate tribunals. As before, I welcome feedback [...]]]></description>
				<content:encoded><![CDATA[<p>Last December I <a href="http://webpolicy.org/2012/12/28/advancing-empirical-legal-scholarship/">shared</a> XML versions of the U.S. Code and Supreme Court opinions through early 2012. My intent was and remains to facilitate empirical legal scholarship by providing government-authored materials in a machine-readable format.</p>
<p>This post is accompanied by additional documents: opinions and rules of various federal appellate tribunals.  As before, I welcome feedback from the academic research community.<br />
<span id="more-558"></span><br />
United States Court of Appeals for the First Circuit Opinions: <a href="http://x.co/1stcircuit">ZIP (152 MB)</a><br />
United States Court of Appeals for the Second Circuit Opinions: <a href="http://x.co/2dcircuit">ZIP (311 MB)</a><br />
United States Court of Appeals for the Third Circuit Opinions: <a href="http://x.co/3dcircuit">ZIP (239 MB)</a><br />
United States Court of Appeals for the Fourth Circuit Opinions: <a href="http://x.co/4thcircuit">ZIP (190 MB)</a><br />
United States Court of Appeals for the Fifth Circuit Opinions: <a href="http://x.co/5thcircuit">ZIP (409 MB)</a><br />
United States Court of Appeals for the Sixth Circuit Opinions: <a href="http://x.co/6thcircuit">ZIP (244 MB)</a><br />
United States Court of Appeals for the Seventh Circuit Opinions: <a href="http://x.co/7thcircuit">ZIP (305 MB)</a><br />
United States Court of Appeals for the Eighth Circuit Opinions: <a href="http://x.co/8thcircuit">ZIP (263 MB)</a><br />
United States Court of Appeals for the Ninth Circuit Opinions: <a href="http://x.co/9thcircuit">ZIP (442 MB)</a><br />
United States Court of Appeals for the Tenth Circuit Opinions: <a href="http://x.co/10circuit">ZIP (211 MB)</a><br />
United States Court of Appeals for the Eleventh Circuit Opinions: <a href="http://x.co/11circuit">ZIP (180 MB)</a><br />
United States Court of Appeals for the District of Columbia Circuit Opinions: <a href="http://x.co/dccircuit">ZIP (209 MB)</a><br />
United States Court of Appeals for the Federal Circuit Opinions: <a href="http://x.co/fedcircuit">ZIP (164 MB)</a></p>
<p>Federal Rules of Appellate Procedure: <a href="http://x.co/fedrapp">ZIP (82 KB)</a><br />
First Circuit Bankruptcy Appellate Rules: <a href="http://x.co/1cirbrappr">ZIP (37 KB)</a><br />
Sixth Circuit Bankruptcy Appellate Rules: <a href="http://x.co/6cirbrappr">ZIP (29 KB)</a><br />
Eighth Circuit Bankruptcy Appellate Rules: <a href="http://x.co/8cirbrappr">ZIP (37 KB)</a><br />
Ninth Circuit Bankruptcy Appellate Rules: <a href="http://x.co/9cirbrappr">ZIP (86 KB)</a><br />
Tenth Circuit Bankruptcy Appellate Rules: <a href="http://x.co/10cbrappr">ZIP (37 KB)</a></p>
<p>United States Board of Immigration Appeals Opinions: <a href="http://x.co/usbia">ZIP (21 MB)</a></p>
<hr/>
<p>Please note, this is a personal project. It is not related to my coursework or research at Stanford University.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2013/05/03/advancing-empirical-legal-scholarship-federal-appellate-opinions-and-rules/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The New Firefox Cookie Policy</title>
		<link>http://webpolicy.org/2013/02/22/the-new-firefox-cookie-policy/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-new-firefox-cookie-policy</link>
		<comments>http://webpolicy.org/2013/02/22/the-new-firefox-cookie-policy/#comments</comments>
		<pubDate>Fri, 22 Feb 2013 17:56:36 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=553</guid>
		<description><![CDATA[The default Firefox cookie policy will, beginning with release 22, more closely reflect user privacy preferences. This mini-FAQ addresses some of the questions that I’ve received from Mozillians, web developers, and users. How does the new Firefox cookie policy work? Roughly: Only websites that you actually visit can use cookies to track you across the [...]]]></description>
				<content:encoded><![CDATA[<p>The default Firefox cookie policy will, beginning with release 22, <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=818340">more closely reflect user privacy preferences</a>. This mini-FAQ addresses some of the questions that I’ve received from Mozillians, web developers, and users.</p>
<h3>How does the new Firefox cookie policy work?</h3>
<p>Roughly: Only websites that you actually visit can use cookies to track you across the web.</p>
<p>More precisely: If content has a first-party origin,<sup><a href="#firefox-cookie-policy-fn1" id="firefox-cookie-policy-fnref:1">1</a></sup> nothing changes. Content from a third-party origin only has cookie permissions if its origin already has at least one cookie set.</p>
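<p>The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: <code>CookieJar</code> and <code>may_use_cookies</code> are hypothetical names, not identifiers from the Firefox codebase, and origins are assumed to be precomputed strings.</p>

```python
from dataclasses import dataclass, field


@dataclass
class CookieJar:
    """Hypothetical stand-in for the browser cookie store, keyed by origin."""
    cookies: dict = field(default_factory=dict)

    def has_cookies(self, origin: str) -> bool:
        return bool(self.cookies.get(origin))

    def set_cookie(self, origin: str, name: str, value: str) -> None:
        self.cookies.setdefault(origin, {})[name] = value


def may_use_cookies(jar: CookieJar, content_origin: str, page_origin: str) -> bool:
    """Decide whether content may use cookies under the policy described above."""
    # First-party content: nothing changes.
    if content_origin == page_origin:
        return True
    # Third-party content: allowed only if its origin already holds a cookie.
    return jar.has_cookies(content_origin)


jar = CookieJar()
# A tracker the user never visited has no cookie permissions as a third party.
blocked = may_use_cookies(jar, "tracker.example", "news.example")
# The user visits social.example directly, so it can set a first-party cookie.
if may_use_cookies(jar, "social.example", "social.example"):
    jar.set_cookie("social.example", "session", "abc")
# Its widget embedded on news.example now has cookie permissions.
allowed = may_use_cookies(jar, "social.example", "news.example")
```

<p>In this sketch <code>blocked</code> is false and <code>allowed</code> is true: only the origin that the user actually visited gains third-party cookie permissions.</p>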
<h3>How does Firefox’s new policy compare to the other major browsers?</h3>
<p><strong>Chrome</strong> &#8211; Allows all cookies.</p>
<p><strong>Internet Explorer</strong> &#8211; Cookie permissions <a href="http://msdn.microsoft.com/en-us/library/ms537343(v=vs.85).aspx">vary by P3P compact policy</a>. In practice, almost all third-party tracking cookies are allowed.<sup><a href="#firefox-cookie-policy-fn2" id="firefox-cookie-policy-fnref:2">2</a></sup></p>
<p><strong>Safari</strong> &#8211; First-party content has cookie permissions. Third-party content only has cookie permissions if the content already has at least one cookie set.</p>
<p>In short, the new Firefox policy is a slightly relaxed version of the Safari policy.<sup><a href="#firefox-cookie-policy-fn3" id="firefox-cookie-policy-fnref:3">3</a></sup></p>
<h3>Will the new Firefox policy break websites?</h3>
<p>Collateral impact should be limited. Safari’s cookie policy has been in place for over a decade, and it is included in both the desktop and iOS versions of the browser. A few websites may require a tiny code change to accommodate Firefox in the same way as Safari.</p>
<p>Just to be sure, the Mozilla privacy team is closely monitoring the policy before final <a href="https://www.mozilla.org/en-US/firefox/new/">release</a>. The patch will spend about 6 weeks each in the <a href="http://nightly.mozilla.org/">pre-alpha</a>, <a href="https://www.mozilla.org/en-US/firefox/aurora/">alpha</a>, and <a href="https://www.mozilla.org/en-US/firefox/beta/">beta</a> builds. If you spot any oddities, please report them to <a href="http://support.mozilla.org/en-US/home">Mozilla support</a>!</p>
<h3>How can I test whether my website has cookie permissions?</h3>
<p>Easy: try to set a cookie, then check whether it persists. This test can be built into both server-side and client-side code.</p>
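<p>A minimal server-side sketch of that test, using only the Python standard library. The cookie name <code>cookie_probe</code> and both function names are assumptions made for illustration; a real service would wire these into its own request handling.</p>

```python
from http.cookies import SimpleCookie

PROBE_NAME = "cookie_probe"  # hypothetical name for a harmless test cookie


def probe_set_cookie_header() -> str:
    """Build a Set-Cookie header value that attempts to set the probe cookie."""
    cookie = SimpleCookie()
    cookie[PROBE_NAME] = "1"
    cookie[PROBE_NAME]["path"] = "/"
    return cookie[PROBE_NAME].OutputString()


def has_cookie_permissions(cookie_header: str) -> bool:
    """Return True if the probe cookie came back on a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    return PROBE_NAME in cookie
```

<p>The first response sends the value from <code>probe_set_cookie_header()</code>; a later request whose <code>Cookie</code> header passes <code>has_cookie_permissions()</code> shows that the browser accepted it. No browser sniffing is involved.</p>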
<p>Browser sniffing is generally disfavored since it can be unreliable and requires updating. Moreover, sniffing will not accommodate Chrome and Internet Explorer users who have switched from the default cookie policy.</p>
<h3>I operate a third-party website that uses cookies. What should I do?</h3>
<p>If a Firefox user appears to have intentionally interacted with your content, take the same approach as for Safari users.<sup><a href="#firefox-cookie-policy-fn4" id="firefox-cookie-policy-fnref:4">4</a></sup> Examples of content within this category include Facebook apps and comment widgets where a user has typed text.</p>
<p>If a user does not seem to have intentionally interacted with your content, or if you’re uncertain, you should ask for permission before setting cookies. Most analytics services, advertising networks, and unclicked social widgets would come within this category.</p>
<p>In sum, working around the policy&#8217;s technical limits may be reasonable in certain cases, but undermining the policy&#8217;s privacy purpose is never acceptable.</p>
<h3>What happens to preexisting cookies?</h3>
<p>The new policy does not make any special provision for preexisting cookies. Current Firefox users should clear their cookies to fully benefit from the new policy.<sup><a href="#firefox-cookie-policy-fn5" id="firefox-cookie-policy-fnref:5">5</a></sup></p>
<h3>What comes next for the Firefox cookie policy?</h3>
<p>There’s still plenty of work to do. Some possible directions that I’m interested in:</p>
<ul>
<li>Extending the cookie policy to other storage technologies (e.g. <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=536509">HTML5 Web Storage</a>).</li>
<li>Providing a uniform mechanism for requesting storage permissions.</li>
<li>Relaxing the cookie policy for websites that honor <a href="http://donottrack.us/">Do Not Track</a>.</li>
</ul>
<p>Please share your ideas on the <a href="https://groups.google.com/forum/?fromgroups#!forum/mozilla.dev.privacy">mozilla.dev.privacy</a> mailing list!</p>
<hr />
<p>All views are solely my own. I do not speak for Mozilla.</p>
<p>This was my first contribution to the Firefox codebase. Huge thanks to <a href="http://www.sidstamm.com/">Sid Stamm</a>, Monica Chew, <a href="https://brendaneich.com/">Brendan Eich</a>, <a href="http://weblogs.mozillazine.org/asa/">Asa Dotzler</a>, <a href="http://www.joshmatthews.net/">Josh Matthews</a>, <a href="https://blog.mozilla.org/dolske/">Justin Dolske</a>, Daniel Veditz, and many other members of the Mozilla community for their advice, guidance, and tolerance of my inexperience.</p>
<p><span id="firefox-cookie-policy-fn1">1. </span> An origin is determined by <a href="http://publicsuffix.org/">public suffix</a> + 1. <a href="#firefox-cookie-policy-fnref:1" title="return to article" class="reversefootnote">&#160;&#8617;</a></p>
<p><span id="firefox-cookie-policy-fn2">2. </span> Many researchers have criticized Microsoft’s approach for being ineffective, convoluted, and relying on the de facto deprecated P3P standard. For background, see <em>Token Attempt: The Misrepresentation of Website Privacy Policies Through the Misuse of P3P Compact Policy Tokens</em> by Leon et al. <a href="#firefox-cookie-policy-fnref:2" title="return to article" class="reversefootnote">&#160;&#8617;</a></p>
<p><span id="firefox-cookie-policy-fn3">3. </span> The difference is primarily owing to engineering convenience. <a href="#firefox-cookie-policy-fnref:3" title="return to article" class="reversefootnote">&#160;&#8617;</a></p>
<p><span id="firefox-cookie-policy-fn4">4. </span> The most transparent practice is for you to redirect the user through your origin. You could also use a non-cookie storage technology, though alternatives may be limited by this policy in future. <a href="#firefox-cookie-policy-fnref:4" title="return to article" class="reversefootnote">&#160;&#8617;</a></p>
<p><span id="firefox-cookie-policy-fn5">5. </span> Conventional wisdom in the web privacy community is that users clear their cookies every few months. <a href="#firefox-cookie-policy-fnref:5" title="return to article" class="reversefootnote">&#160;&#8617;</a></p>
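<p>To illustrate footnote 1, here is a simplified Python sketch of computing an origin as public suffix + 1. The suffix set is a tiny sample chosen for this example, not the real list from publicsuffix.org, and the sketch ignores the wildcard and exception rules that the real list contains. The function name <code>origin_for</code> is hypothetical.</p>

```python
# Tiny illustrative subset; the real list is published at publicsuffix.org.
SAMPLE_PUBLIC_SUFFIXES = {"com", "org", "uk", "co.uk"}


def origin_for(hostname: str) -> str:
    """Return the public suffix plus one more label (simplified)."""
    labels = hostname.lower().rstrip(".").split(".")
    # Scan candidate suffixes from longest to shortest so "co.uk" wins over "uk".
    for start in range(len(labels)):
        suffix = ".".join(labels[start:])
        if suffix in SAMPLE_PUBLIC_SUFFIXES:
            if start == 0:
                return suffix  # the hostname is itself a public suffix
            return ".".join(labels[start - 1:])
    # Unknown suffix: fall back to the last two labels.
    return ".".join(labels[-2:])
```

<p>Under these assumptions, <code>origin_for("www.example.co.uk")</code> and <code>origin_for("cdn.example.co.uk")</code> both yield <code>example.co.uk</code>, so the two hostnames are treated as one origin.</p>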
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2013/02/22/the-new-firefox-cookie-policy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Electronic Privacy and Economic Choice</title>
		<link>http://webpolicy.org/2013/01/28/electronic-privacy-and-economic-choice/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=electronic-privacy-and-economic-choice</link>
		<comments>http://webpolicy.org/2013/01/28/electronic-privacy-and-economic-choice/#comments</comments>
		<pubDate>Mon, 28 Jan 2013 17:00:07 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Economics]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=549</guid>
		<description><![CDATA[Critics of consumer privacy protections frequently invoke revealed preference as a justification for laissez-faire policy. If users really cared about their privacy, the argument goes, we should expect to see revolts against intrusive practices. A number of scholars have demonstrated pervasive information asymmetries and bounded rationality in consumer privacy choices; the decisions that users actually [...]]]></description>
				<content:encoded><![CDATA[<p>Critics of consumer privacy protections frequently invoke revealed preference as a justification for laissez-faire policy. If users really cared about their privacy, the argument goes, we should expect to see revolts against intrusive practices. A number of scholars have demonstrated pervasive information asymmetries<sup><a href="#privacy_and_coercion_fn_1">1</a></sup> and bounded rationality<sup><a href="#privacy_and_coercion_fn_2">2</a></sup> in consumer privacy choices; the decisions that users actually make about online privacy can hardly be expected to reflect their actual preferences.</p>
<p>But let’s suppose that consumers and online firms are fully informed and completely rational. The economic story that consumers value their privacy less than the marginal income from privacy intrusions is certainly consistent with market behavior.</p>
<p>We should not, however, conclude that the status quo is optimal. There is another congruent economic story, where privacy intrusions are inefficient but nevertheless result owing to transaction costs and competition barriers. This post relates the alternative economic story with two possible examples, then closes with policy implications.<br />
<span id="more-549"></span><br />
<b>Facebook and Instagram</b></p>
<p>Consider the <a href="http://bits.blogs.nytimes.com/2012/12/20/instagram-does-about-face-reverts-to-previous-policy/">recent kerfuffle</a> over Instagram’s user agreement after Facebook acquired the company. An avid Instagram user may have significant concerns about how Facebook might use his or her likeness in advertising products to friends, and the value of those concerns to the user could well exceed the marginal value of new social advertising features to Facebook. The efficient (i.e. <a href="https://en.wikipedia.org/wiki/Kaldor%E2%80%93Hicks_efficiency">welfare-maximizing</a>) outcome would be for Facebook to maintain the preexisting Instagram user agreement.</p>
<p>In a conventional Coasean analysis, Facebook would choose to respect user privacy and extract the welfare gain by charging for its service. From a traditional competition standpoint, if Facebook were to make an inefficient decision to invade consumer privacy, a pro-privacy competitor would spring up and pilfer the site’s users. But what if there are significant transaction costs and competition barriers?<sup><a href="#privacy_and_coercion_fn_3">3</a></sup> If Facebook cannot realistically charge its users,<sup><a href="#privacy_and_coercion_fn_4">4</a></sup> and competition is limited,<sup><a href="#privacy_and_coercion_fn_5">5</a></sup> then Facebook’s income-maximizing choice is to inefficiently invade consumer privacy. So long as users value social networking on Facebook more than the associated privacy risks, they will continue using the service.</p>
<p><b>Behavioral Advertising</b></p>
<p>Behavioral advertising is another possible example. Users may value privacy in their online activities more than the marginal value of tracking-based advertising.<sup><a href="#privacy_and_coercion_fn_6">6</a></sup> In the absence of transaction costs, online services might do away with behavioral advertising and charge consumers for the content that they access. If there were no competition barriers, services that rely on behavioral advertising might be forced under by free, pro-privacy competitors. Depending on the sector of the online economy, however, a service may have significant transaction costs and competition barriers. The alternative economic story has a measure of predictive power: In some markets with high transaction costs and high barriers to competition (e.g. web search), behavioral advertising is an ordinary practice. Meanwhile, in some markets with low transaction costs and low barriers to competition (e.g. paid mobile apps), behavioral advertising is a rarity.</p>
<p><b>Policy Implications</b></p>
<p>If consumer privacy practices are inefficient, then privacy protections could be viewed as mechanisms for correcting structural market failures. Contemporary economic analysis has several lenses to offer:</p>
<ul>
<li><b>Internalizing externalities.</b> Online services visit negative privacy externalities upon users; privacy protections compel a service to internalize those externalities.</li>
<li><b>Solving a collective action problem.</b> If users could collectively negotiate, they would require online services to adopt pro-privacy practices. Users cannot, of course, realistically organize and bind themselves for bargaining at the scale of a mammoth online service. Privacy regulation solves this collective action problem.</li>
<li><b>Simulating competition.</b> Without competition barriers, online services would be compelled to adopt better privacy practices. Privacy protections stand in for absent effects of competition.</li>
<li><b>Eliminating an inefficient and unnecessary subsidy.</b> Privacy regulations nix an unjustified payout to online services.</li>
</ul>
<p>Consumer privacy decision making might also be properly characterized by <a href="https://www.princeton.edu/~tleonard/papers/coercion.pdf">two choice architecture frames</a>. In the stronger frame, the user is <b>coerced</b>: against a background of society where certain online services are a norm or requirement, the user has no real choice but to give up his or her privacy.<sup><a href="#privacy_and_coercion_fn_7">7</a></sup> In the weaker frame, the user is <b>exploited</b>: the user has no baseline statistical expectation or moral claim of using an online service, but the value substantially exceeds the privacy risks, and the service would willingly provide functionality without privacy intrusions. If these views are accurate, privacy regulation would constitute a legitimate prohibition against consumer coercion or exploitation.</p>
<p><b>Parting Thoughts</b></p>
<p>Privacy reform proponents are quick to cite information asymmetry and bounded rationality as justifications for policy intervention. And they should: the body of research evidence supporting those views is substantial. My aim with this piece is to demonstrate the availability of a second set of arguments, grounded in conventional economics of transaction costs and competition barriers, that would also justify privacy regulation.</p>
<p>If users care about their privacy, why don’t they act like it? Actually, it’s quite possible that they do.</p>
<hr/>
<p>Thanks to <a href="http://www.cs.princeton.edu/~felten/">Ed Felten</a> and <a href="http://randomwalker.info/">Arvind Narayanan</a> for comments on an early draft. All views and errors are solely my own.</p>
<p><a name="privacy_and_coercion_fn_1"></a>1. For background on information asymmetry in consumer privacy choice, I recommend beginning with work by <a href="http://lorrie.cranor.org/">Lorrie Cranor</a> and <a href="http://aleecia.com/">Aleecia McDonald</a>. </p>
<p><a name="privacy_and_coercion_fn_2"></a>2. I similarly recommend research by <a href="http://www.heinz.cmu.edu/~acquisti/">Alessandro Acquisti</a> and <a href="http://people.ischool.berkeley.edu/~jensg/">Jens Grossklags</a> for an introduction to bounded rationality in consumer privacy choice.</p>
<p><a name="privacy_and_coercion_fn_3"></a>3. A more formal treatment of the two economic stories follows. Assume a user values an online service at S > 0 and his or her marginal privacy on that service at P > 0. An online service marginally values the privacy intrusion at I > 0 and has a baseline income from providing functionality of B. In the oft-invoked revealed preference story, I > P, and privacy regulation imposes a societal loss of I &#8211; P. In this alternative economic story, P > I, and lack of privacy regulation imposes a societal loss of P &#8211; I. Where there are transaction costs, a transfer is not possible; the only outcomes are (S + P, B) and (S, B + I). In the absence of competition, the service will select an outcome based solely on income maximization. A combination of transaction costs and competition barriers, then, will cause an online service to always invade privacy when I is positive—no matter the relative magnitude of P.</p>
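<p>The formal treatment above can be checked with a short Python sketch. The function names are hypothetical and the numbers in the usage note are made up for illustration; the symbols follow the footnote, written in lowercase: <code>s</code> and <code>p</code> for the user, <code>i</code> and <code>b</code> for the service.</p>

```python
def outcomes(s, p, i, b):
    """The two (user, service) payoffs available when no transfer is possible."""
    return {"respect": (s + p, b), "invade": (s, b + i)}


def service_choice(i):
    """With no competition, the service picks purely by income maximization."""
    return "invade" if i > 0 else "respect"


def societal_loss(s, p, i, b):
    """Welfare forgone when the service's choice is not the efficient outcome."""
    welfare = {name: sum(pair) for name, pair in outcomes(s, p, i, b).items()}
    return max(welfare.values()) - welfare[service_choice(i)]
```

<p>For example, <code>societal_loss(10, 5, 2, 1)</code> is 3, which is P &#8211; I: the service invades privacy even though the user values it more. With <code>societal_loss(10, 1, 4, 1)</code> the loss is 0, matching the revealed preference story where I > P.</p>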
<p>A brief side note: there are three other analytical scenarios worth mentioning.</p>
<ul>
<li><b>No transaction costs, no competition barriers.</b> The user would transfer to the service between S + P (the user’s reservation price) and -B (the service’s reservation price). Owing to competitive pressure, the transfer should be closer to -B.</li>
<li><b>No transaction costs, competition barriers.</b> The user would transfer to the service between S + P (the user’s reservation price) and -B (the service’s reservation price). Since there is no competition, the transfer should be closer to S + P.</li>
<li><b>Transaction costs, no competition barriers.</b> We would expect an equilibrium respecting privacy where B > 0, and intruding upon privacy where 0 > B. Intuitively, if a pro-privacy competitor would be profitable, it would emerge and undercut the service.</li>
</ul>
<p><a name="privacy_and_coercion_fn_4"></a>4. There are a number of reasons why Facebook cannot, in practice, charge for its service. A few of the leading considerations:</p>
<ul>
<li><b>Network effects.</b> A social network’s value is bound up in the size and engagement of its user base. While some users might pay for privacy, others would not or could not. If Facebook is unable to differentiate between the users it can and cannot charge, then it has to give away the service for free to preserve the value of the social network.</li>
<li><b>Past promises.</b> Facebook has frequently reaffirmed that its service will always be free. The current landing page, in fact, reads: &#8220;It’s free and always will be.&#8221; Violating past promises of free service could have significant legal and business implications.</li>
<li><b>Transaction burdens.</b> Beyond the immediate financial costs, consumers must also incur financial management costs associated with keeping up with a monthly service. From Facebook’s perspective, the firm would have to divert precious attention and resources to developing an unprecedented subscription billing capacity.</li>
</ul>
<p>The consumer psychology of free vs. paid products is, to be sure, a dominant factor. For purposes of this post, however, set aside the bounded rationality limitation.</p>
<p><a name="privacy_and_coercion_fn_5"></a>5. Many <a href="http://online.wsj.com/article/SB10001424052748704635704575604993311538482.html">authors</a> and <a href="http://www.theatlantic.com/technology/archive/2012/05/the-case-for-facebook/257767/">investors</a> have argued that Facebook holds something of a monopoly in social networking.</p>
<p><a name="privacy_and_coercion_fn_6"></a>6. For a discussion of the economics of third-party behavioral advertising, see Part VI of &#8220;<a href="https://www.stanford.edu/~jmayer/papers/trackingsurvey12.pdf">Third-Party Web Tracking: Policy and Technology</a>.&#8221;</p>
<p><a name="privacy_and_coercion_fn_7"></a>7. The notion of moral rights in online services is <a href="https://en.wikipedia.org/wiki/Right_to_Internet_access">hotly contested</a>. I do not mean to take a position on the issue here.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2013/01/28/electronic-privacy-and-economic-choice/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Advancing Empirical Legal Scholarship</title>
		<link>http://webpolicy.org/2012/12/28/advancing-empirical-legal-scholarship/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=advancing-empirical-legal-scholarship</link>
		<comments>http://webpolicy.org/2012/12/28/advancing-empirical-legal-scholarship/#comments</comments>
		<pubDate>Sat, 29 Dec 2012 01:00:27 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Empirical Law]]></category>
		<category><![CDATA[Legal Data]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=546</guid>
		<description><![CDATA[Modern quantitative analysis has upended the social sciences and, in recent years, made exciting inroads with law. How complex are the nation&#8217;s statutes? Did a shift in Supreme Court voting dodge President Roosevelt&#8217;s court-packing plan? How do courts apply fair use doctrine in copyright cases? What factors determine the outcome of intellectual property litigation? Researchers [...]]]></description>
				<content:encoded><![CDATA[<p>Modern quantitative analysis has upended the social sciences and, in recent years, made exciting inroads with law. How complex are the nation&#8217;s statutes?<sup><a href="#empirical_legal_scholarship_fn_1">1</a></sup> Did a shift in Supreme Court voting dodge President Roosevelt&#8217;s court-packing plan?<sup><a href="#empirical_legal_scholarship_fn_2">2</a></sup> How do courts apply fair use doctrine in copyright cases?<sup><a href="#empirical_legal_scholarship_fn_3">3</a></sup> What factors determine the outcome of intellectual property litigation?<sup><a href="#empirical_legal_scholarship_fn_4">4</a></sup> Researchers have begun to answer these and many more questions through the use of empirical methodologies.</p>
<p>Academics have vaulted numerous hurdles to advance this far, including deep institutional siloing and specialization. But barriers do still exist, and one of the greatest remaining is, quite simply, data. There is no easy-to-get, easy-to-process compilation of America&#8217;s primary legal materials. In the status quo, researchers are compelled to spend far too much of their time foraging for datasets instead of conducting valuable analysis. Consequences include diminished scholarly productivity, scant uniformity among published works, and—most frustratingly—deterrence for prospective researchers.</p>
<p>My hope is to facilitate empirical legal scholarship by providing machine-readable primary legal materials. In this first release of data, I have prepared XML versions of the U.S. Code and opinions of the Supreme Court of the United States, through approximately early 2012. Subsequent releases may include additional primary legal materials. I would greatly appreciate feedback from the academic community, particularly with regard to the XML schema, text formatting, and prioritizing materials for release.</p>
<p>United States Code: <a href="http://x.co/uscode">ZIP (110 MB)</a><br />
Supreme Court of the United States Opinions: <a href="http://x.co/scotusopns">ZIP (348 MB)</a></p>
<hr />
<p>Please note, this is a personal project. It is not related to my coursework or research at Stanford University.</p>
<p><a name="empirical_legal_scholarship_fn_1"></a>1. Michael J. Bommarito II &#038; Daniel M. Katz, <i>A Mathematical Approach to the Study of the United States Code</i>, 389 <span style="font-variant:small-caps">Physica A</span> 4195 (2010), <i>available at</i> <a href="http://www.sciencedirect.com/science/article/pii/S0378437110004875">http://www.sciencedirect.com/science/article/pii/S0378437110004875</a>.<br />
<a name="empirical_legal_scholarship_fn_2"></a>2. Daniel E. Ho &#038; Kevin M. Quinn, <i>Did a Switch in Time Save Nine?</i>, 2 <span style="font-variant:small-caps">J. Legal Analysis</span> 69 (2010), <i>available at</i> <a href="http://jla.oxfordjournals.org/content/2/1/69.full.pdf">http://jla.oxfordjournals.org/content/2/1/69.full.pdf</a>.<br />
<a name="empirical_legal_scholarship_fn_3"></a>3. Matthew Sag, <i>Predicting Fair Use</i>, 73 <span style="font-variant:small-caps">Ohio St. L.J.</span> 47 (2012), <i>available at</i> <a href="http://moritzlaw.osu.edu/students/groups/oslj/files/2012/05/73.1.Sag_.pdf">http://moritzlaw.osu.edu/students/groups/oslj/files/2012/05/73.1.Sag_.pdf</a>.<br />
<a name="empirical_legal_scholarship_fn_4"></a>4. Mihai Surdeanu et al., <i>Risk Analysis for Intellectual Property Litigation</i>, <span style="font-variant:small-caps">Proc. 13th Int&#8217;l Conf. on Artificial Intelligence &#038; L.</span> 116 (2011), <i>available at</i> <a href="http://nlp.stanford.edu/pubs/icail11.pdf">http://dl.acm.org/citation.cfm?id=2018375</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2012/12/28/advancing-empirical-legal-scholarship/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Presidential Identifying Information</title>
		<link>http://webpolicy.org/2012/11/01/presidential-identifying-information/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=presidential-identifying-information</link>
		<comments>http://webpolicy.org/2012/11/01/presidential-identifying-information/#comments</comments>
		<pubDate>Thu, 01 Nov 2012 15:59:59 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Anonymity]]></category>
		<category><![CDATA[Measurement]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=543</guid>
		<description><![CDATA[Sunday&#8217;s New York Times included a story about how the presidential campaigns are making extensive use of third-party web trackers. In response to privacy concerns, &#8220;[o]fficials with both campaigns emphasize[d] that [tracking] data collection is &#8216;anonymous.&#8217;&#8221; The campaigns are wrong: tracking data is very often identified or identifiable. Arvind Narayanan has previously written a comprehensive [...]]]></description>
				<content:encoded><![CDATA[<p>Sunday&rsquo;s New York Times included a <a href="http://www.nytimes.com/2012/10/28/us/politics/tracking-clicks-online-to-try-to-sway-voters.html">story</a> about how the presidential campaigns are making extensive use of third-party web trackers. In response to privacy concerns, &ldquo;[o]fficials with both campaigns emphasize[d] that [tracking] data collection is &lsquo;anonymous.&rsquo;&rdquo;<sup><a href="#presidential_identifying_information_footnote_1">1</a></sup></p>
<p>The campaigns are wrong: tracking data is very often identified or identifiable. Arvind Narayanan has previously written a comprehensive and accessible <a href="http://cyberlaw.stanford.edu/blog/2011/07/there-no-such-thing-anonymous-online-tracking">explanation</a> of why web tracking is hardly anonymous; my <a href="https://stanford.edu/~jmayer/papers/trackingsurvey12.pdf">survey paper</a> on web tracking provides more extensive discussion.</p>
<p>One of the ways in which web tracking data can become identified or identifiable is &ldquo;leakage&rdquo;&mdash;data flowing to trackers from the websites that users interact with. Leakage most commonly occurs when a website includes identifying information in a page URL or title. Embedded third parties receive the identifying information if they receive the URL (e.g. <a href="https://en.wikipedia.org/wiki/HTTP_referer">referrer headers</a>) or the title (e.g. <a href="https://developer.mozilla.org/en-US/docs/DOM/document.title"><code>document.title</code></a>). Even a little identifying information leakage thoroughly undermines the privacy properties of web tracking: once a user&rsquo;s identity leaks to a tracker, all of the tracker&rsquo;s past, present, and future data about the user becomes identifiable.</p>
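<p>As a rough illustration of the leakage check described above, the following Python sketch tests whether known identifying strings appear in a page URL (which an embedded third party may receive in a referrer header) or in a page title. The function name <code>find_leaks</code> and the sample inputs are hypothetical; this is a minimal sketch of the idea, not the tooling used in the study.</p>

```python
from urllib.parse import unquote_plus


def find_leaks(page_url, page_title, identifiers):
    """Return (identifier, channel) pairs for identifiers visible in the URL or title."""
    # Decode percent-escapes and '+' so encoded values can still be matched.
    url_text = unquote_plus(page_url).lower()
    title_text = page_title.lower()
    leaks = []
    for ident in identifiers:
        needle = ident.lower()
        if needle in url_text:
            leaks.append((ident, "url"))
        if needle in title_text:
            leaks.append((ident, "title"))
    return leaks


# Hypothetical inputs modeled on the example URL and title formats in the footnotes.
leaks = find_leaks(
    "https://dashboard.barackobama.com/people/robber.baron",
    "Dashboard - Leland Stanford",
    ["robber.baron", "Leland Stanford"],
)
print(leaks)
```

<p>A plain substring match is a deliberately simple choice; it would miss identifiers that are hashed or otherwise transformed before they reach the URL or title.</p>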
<p>Web services frequently fail to account for information leakage in their design and testing; a <a href="https://cyberlaw.stanford.edu/blog/2011/10/tracking-trackers-where-everybody-knows-your-username">study</a> I conducted last year found that over half of popular websites were leaking identifying information.<sup><a href="#presidential_identifying_information_footnote_2">2</a></sup> More than a few website operators have made inaccurate representations about the information they share with third parties; in just the past year the Federal Trade Commission settled deception claims against both <a href="http://www.ftc.gov/os/caselist/0923184/">Facebook</a> and <a href="http://www.ftc.gov/os/caselist/1023058/index.shtm">Myspace</a> for falsely disclaiming identifying information leakage.</p>
<p>The Times coverage piqued my curiosity: Are the campaigns identifying their supporters to third-party trackers? Are they directly undermining the anonymity properties that they are so quick to invoke?</p>
<p>Yes, they are. I tested the two leading candidate websites using the methodology from my <a href="https://cyberlaw.stanford.edu/blog/2011/10/tracking-trackers-where-everybody-knows-your-username">prior study</a> of identifying information leakage. Both leak. The following sections describe my observations from the Barack Obama and Mitt Romney campaign websites.<br />
<span id="more-543"></span><br />
<b>Barack Obama</b></p>
<ul>
<li><b>Username.</b> Several pages include the username in their URL or title, including the user preferences page, the social organizing &ldquo;Dashboard&rdquo; profile page, the Dashboard profile editing page, and the Dashboard personal statistics page.<sup><a href="#presidential_identifying_information_footnote_3">3</a></sup>
<p><center></p>
<div style="border: 2px gray solid; padding: 15px 15px 0px 15px; display: inline-block;">
<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_edit_preferences.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_edit_preferences.png" width="225px"></img></a>&nbsp;&nbsp;&nbsp;<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_profile.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_profile.png" width="225px"></img></a></p>
<p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_edit_profile.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_edit_profile.png" width="225px"></img></a>&nbsp;&nbsp;&nbsp;<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_numbers.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_numbers.png" width="225px"></img></a></p>
<p>
A sample of pages that include the username in their URL or title.
</p>
</div>
<p></center></p>
<p>In my testing, the username leaked to ten companies.<sup><a href="#presidential_identifying_information_footnote_4">4</a></sup></p>
<p>A username is <a href="https://cyberlaw.stanford.edu/blog/2011/10/tracking-trackers-where-everybody-knows-your-username">often personally identifying</a>. It might simply be a user&rsquo;s name, or it could enable linking other public accounts and information about the user. Several companies have already deployed effective username linkage in their products.</p>
<p>The default username selection on <code>barackobama.com</code> facilitates identifying users. If the user registers with a Facebook account, the username defaults to the user&rsquo;s name in dot-separated format (e.g. <code>leland.stanford</code>). If the user signs up with just an email address, the default username is the first part of the user&rsquo;s email address&mdash;which will often be some form of the user&rsquo;s name or a fanciful username shared with other services.</p>
<p>The design of the Dashboard website also enables connecting a username to a user&rsquo;s identity. Any signed-in user (including someone trying to identify tracking data) can look up a user&rsquo;s profile from their username. Unless a user opts out, their profile page will include their name.</p>
<p><center></p>
<div style="border: 2px gray solid; padding: 15px 15px 0px 15px; display: inline-block;">
<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_other_user_profile_page.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_other_user_profile_page.png" width="225px"></img></a>&nbsp;&nbsp;&nbsp;<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_edit_profile_name_option.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_edit_profile_name_option.png" width="225px"></a></p>
<p>
Left: Logged-in view of another user&rsquo;s Dashboard profile page.<br />
Right: Option to display last initial instead of last name on the user&rsquo;s profile page.
</p>
</div>
<p></center></p>
<li><b>Name.</b> The title of the Dashboard profile page incorporates the user&rsquo;s name. A script on the page reports impression information to Chartbeat, including the page title.<sup><a href="#presidential_identifying_information_footnote_5">5</a></sup>
<p><center></p>
<div style="border: 2px gray solid; padding: 15px 15px 0px 15px; display: inline-block;">
<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_profile_page_title.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_profile_page_title.png" width="225px"></img></a></p>
<p>
A user&rsquo;s Dashboard profile page.
</p>
</div>
<p></center></p>
<li><b>Street Address and ZIP Code.</b> If a user searches for an organizing team in Dashboard, the results page includes the query street address and ZIP code in its URL. It appears new Dashboard users are required to search for a team.
<p><center></p>
<div style="border: 2px gray solid; padding: 15px 15px 0px 15px; display: inline-block;">
<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_dashboard_join.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_dashboard_join.png" width="225px"></img></a>&nbsp;&nbsp;&nbsp;<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_team_search.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_team_search.png" width="225px"></img></a></p>
<p>
Left: Dashboard landing page.<br />
Right: Searching for a team in Dashboard.
</p>
</div>
<p></center></p>
<p>Similarly, the results page for finding an event includes the query ZIP code in its URL.</p>
<p><center></p>
<div style="border: 2px gray solid; padding: 15px 15px 0px 15px; display: inline-block;">
<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_event_search.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/obama_event_search.png" width="225px"></img></a></p>
<p>
Searching for an event.
</p>
</div>
<p></center></p>
<p>I spotted the street address and ZIP code leaking to nine companies, and just the ZIP code leaking to one other company.<sup><a href="#presidential_identifying_information_footnote_6">6</a></sup></p>
</ul>
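<p>To make the street address and ZIP code leak concrete, here is a small Python sketch of what any recipient of a team search results URL could recover from it. The URL follows the format shown in footnote 6 with the elided portion left out; the helper name <code>query_fields</code> is hypothetical.</p>

```python
from urllib.parse import urlsplit, parse_qs


def query_fields(referrer_url):
    """Return the decoded query parameters of a URL as a flat dict."""
    query = urlsplit(referrer_url).query
    # parse_qs decodes percent-escapes and '+' and returns a list per key;
    # keep the first value of each for readability.
    return {key: values[0] for key, values in parse_qs(query).items()}


fields = query_fields(
    "https://dashboard.barackobama.com/teams/match?street=353+Serra+Mall&zip=94305"
)
print(fields)
```

<p>No special access is required: the query string arrives in plain form, so standard URL parsing is enough to read the address back out.</p>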
<p><b>Mitt Romney</b></p>
<ul>
<li><b>Name.</b> The post-login landing page and most preference pages include the user&rsquo;s name in their title.
<p><center></p>
<div style="border: 2px gray solid; padding: 15px 15px 0px 15px; display: inline-block;">
<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_landing.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_landing.png" width="225px"></img></a>&nbsp;&nbsp;&nbsp;<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_account.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_account.png" width="225px"></img></a></p>
<p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_personal.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_personal.png" width="225px"></img></a>&nbsp;&nbsp;&nbsp;<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_volunteer.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_volunteer.png" width="225px"></img></a></p>
<p>
A post-login landing page (email signup) and a sample of preference pages.
</p>
</div>
<p></center></p>
<p>Scripts from two companies collect the page title as part of their impression reporting.<sup><a href="#presidential_identifying_information_footnote_7">7</a></sup></p>
<li><b>Partial Email Address.</b> If a user registers with their Facebook account, the post-login landing page URL incorporates the first part of the user&rsquo;s email address (with non-alphanumeric characters removed).
<p><center></p>
<div style="border: 2px gray solid; padding: 15px 15px 0px 15px; display: inline-block;">
<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_landing_facebook.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_landing_facebook.png" width="225px"></img></a></p>
<p>
Mock-up of a post-login landing page (Facebook signup).
</p>
</div>
<p></center></p>
<p>Thirteen companies received the partial email address.<sup><a href="#presidential_identifying_information_footnote_8">8</a></sup></p>
<li><b>User ID.</b> Many pages include a unique user ID in their URL, which leaks to the same companies.<sup><a href="#presidential_identifying_information_footnote_9">9</a></sup>
<p><center></p>
<div style="border: 2px gray solid; padding: 15px 15px 0px 15px; display: inline-block;">
<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_account_id.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_account_id.png" width="225px"></img></a>&nbsp;&nbsp;&nbsp;<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_personal_id.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_personal_id.png" width="225px"></img></a></p>
<p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_communities_id.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_communities_id.png" width="225px"></img></a>&nbsp;&nbsp;&nbsp;<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_payment_id.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_edit_payment_id.png" width="225px"></img></a></p>
<p>
A sample of pages that include the user ID in their URL.
</p>
</div>
<p></center></p>
<p>The ID itself is not identifying information, and <code>mittromney.com</code> does not provide social functionality that would facilitate mapping a user ID to a user&rsquo;s name. It appears, however, that a quirk in <code>mittromney.com</code> can allow anyone (even when not logged in) to determine a user&rsquo;s name from their ID. If the user has recently visited a URL that includes their user ID, anyone who visits that URL in the following (very roughly) fifteen minutes can view the user&rsquo;s name in the page heading.<sup><a href="#presidential_identifying_information_footnote_10">10</a></sup></p>
<p><center></p>
<div style="border: 2px gray solid; padding: 15px 15px 0px 15px; display: inline-block;">
<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_profile_page.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_profile_page.png" width="225px"></img></a></p>
<p>
Logged-out view of a recent user&rsquo;s profile editing page.
</p>
</div>
<p></center></p>
<p>A tracker could identify users by waiting for them to land on one of these URLs, then visiting it and extracting the user&rsquo;s name. Alternatively, anyone in possession of tracking data could periodically test these user ID URLs.</p>
<li><b>ZIP Code.</b> The results page for an events search includes the query ZIP code in its URL.<sup><a href="#presidential_identifying_information_footnote_11">11</a></sup>
<p><center></p>
<div style="border: 2px gray solid; padding: 15px 15px 0px 15px; display: inline-block;">
<a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_events_search.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/presidential_leakage_study/romney_events_search.png" width="225px"></img></a></p>
<p>
Results page for an events search.
</p>
</div>
<p></center></p>
<p>The ZIP code leaked to the same companies as the partial email address and user ID.</p>
</ul>
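<p>The partial email address in the Facebook landing page URL can be approximated with a few lines of Python. This is a sketch of the transformation as I observed it (the first part of the email address with non-alphanumeric characters removed); the function name <code>landing_slug</code> and the sample address are hypothetical, and I am assuming letter case is left unchanged.</p>

```python
import re


def landing_slug(email):
    """Approximate the URL slug: the email's local part, alphanumerics only."""
    local_part = email.split("@", 1)[0]
    # Assumption: case is preserved; only non-alphanumeric characters are dropped.
    return re.sub(r"[^A-Za-z0-9]", "", local_part)


print(landing_slug("leland.stanford@example.edu"))
```

<p>Because so little is removed, the slug will usually still resemble the user&rsquo;s name or a username shared with other services.</p>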
<p><b>Takeaways</b></p>
<p>The major presidential campaigns both fell short of best practices in their website design and testing, and they both misrepresented their privacy practices to the Times. The Gray Lady also deserves a light rap on the knuckles for insufficiently scrutinizing the campaigns&rsquo; anonymity assertions.</p>
<p>But, in my view, the greatest takeaway is that the myth of web tracking&rsquo;s anonymity has proven remarkably resilient, despite compelling research results and practical experience to the contrary. Companies and trade groups in the tracking business community frequently invoke unfounded claims of anonymity. Policymakers, website operators, and journalists all too often repeat those claims (even, apparently, when they&rsquo;re of the highest caliber).</p>
<p>My hope is that this episode serves as a learning opportunity and a reminder: there is no such thing as anonymous web tracking.</p>
<hr/>
<p>Thanks to <a href="http://www.cs.princeton.edu/~felten/">Ed Felten</a> and <a href="http://randomwalker.info/">Arvind Narayanan</a> for valuable comments on a draft. All errors are solely my own.</p>
<p><a name="presidential_identifying_information_footnote_1"></a>1. An Obama campaign spokesman went even further, asserting &ldquo;[w]e do not provide any personal information to outside entities.&rdquo; The <code>barackobama.com</code> <a href="http://www.barackobama.com/privacy-policy">privacy policy</a> disclaims responsibility for third-party data collection. The <code>mittromney.com</code> <a href="http://www.mittromney.com/privacy">privacy policy</a> reserves unfettered discretion to share information with third parties. Strangely, it also includes a provision about &ldquo;opt[ing] out from our cookies&rdquo; and other information collected by the website&mdash;and then provides a link to opt out of Google&rsquo;s third-party advertising cookies.</p>
<p><a name="presidential_identifying_information_footnote_2"></a>2. <a href="http://www2.research.att.com/~bala/papers/">Balachander Krishnamurthy</a>, <a href="http://web.cs.wpi.edu/~cew/">Craig Wills</a>, and colleagues conducted the seminal studies of identifying information leakage (<a href="http://www2.research.att.com/~bala/papers/pfp-imc06.pdf">1</a>, <a href="http://www2.research.att.com/~bala/papers/soups07.pdf">2</a>, <a href="http://www2.research.att.com/~bala/papers/wosn09.pdf">3</a>, <a href="http://www2.research.att.com/~bala/papers/pmob.pdf">4</a>, <a href="http://www2.research.att.com/~bala/papers/w2sp11.pdf">5</a>).</p>
<p><a name="presidential_identifying_information_footnote_3"></a>3. For example, respectively, <code>https://www.barackobama.com/account/robber.baron</code>, <code>https://dashboard.barackobama.com/people/robber.baron</code>, <code>https://dashboard.barackobama.com/people/robber.baron/edit</code>, and <code>https://dashboard.barackobama.com/people/robber.baron/numbers</code>.</p>
<p><a name="presidential_identifying_information_footnote_4"></a>4. The companies were: Akamai (CDN used by Chartbeat), Amazon (Amazon Web Services used by the campaign and New Relic), BrightTag, Chartbeat, Facebook, Google (Analytics, DoubleClick, and Hosted Libraries), Hoefler &#038; Frere-Jones (typography.com), New Relic, Think Realtime, and Zendesk. Here and throughout this post I err on the side of comprehensiveness in listing third parties that receive data. Opinions differ on the privacy risks associated with various service providers (e.g. Akamai, Amazon Web Services, and Google Analytics). My intent is not to take a position on that issue, but rather, convey sufficient information to satisfy readers across the spectrum of views.</p>
<p><a name="presidential_identifying_information_footnote_5"></a>5. The page title might be, for example, <code>Dashboard - Leland Stanford</code>. The Chartbeat script would report back to a URL like <code>https://ping.chartbeat.net/ping?...i=Leland%20Stanford%20-%20Dashboard...</code>.</p>
<p><a name="presidential_identifying_information_footnote_6"></a>6. The results page for a Dashboard teams search has a URL formatted like <code>https://dashboard.barackobama.com/teams/match?street=353+Serra+Mall&#038;zip=94305...</code>. The results page URL for an events search has a format like <code>https://my.barackobama.com/page/event/search_results?...zip_radius%5B0%5D=94305</code>. I observed the street address and ZIP code leak to: Akamai (CDN used by Chartbeat), Amazon (Amazon Web Services used by the campaign and New Relic), Chartbeat, Facebook, Google (Analytics), Hoefler &#038; Frere-Jones (typography.com), New Relic, Optimizely, and Zendesk. ZIP code also leaked to BrightTag and Google (Maps API).</p>
<p><a name="presidential_identifying_information_footnote_7"></a>7. The post-login landing page title has the form <code>Leland Stanford | Mitt Romney for President</code>. A ShareThis script reports back to a URL like <code>https://l.sharethis.com/pview?...title=Leland%20Stanford%20%7C%20Mitt%20Romney%20for%20President...</code>, and a Syncapse script contacts a URL like <code>https://cn.clickable.net/?...title=Leland%20Stanford%20%7C%20Mitt%20Romney%20for%20President...</code>.</p>
<p><a name="presidential_identifying_information_footnote_8"></a>8. An example post-login landing page URL for a Facebook user: <code>https://www.mittromney.com/users/lelandstanford</code>. The thirteen companies who received the first portion of the user&rsquo;s email address were: Adobe (Typekit), Akamai (hosting used by the campaign), Amazon (Amazon Web Services used by New Relic and Search Discovery), Compete, comScore (Scorecard Research), Facebook, Google (Ad Services and DoubleClick), Lotame, New Relic, Optimizely, Search Discovery, ShareThis, and Syncapse.</p>
<p><a name="presidential_identifying_information_footnote_9"></a>9. No matter how a user registers, many pages include a unique user ID in their URL. A preferences page, for example, might have the URL <code>https://www.mittromney.com/user/123456789/edit</code>. If the user signs up without a social network login, the post-login landing page has the generic URL <code>https://www.mittromney.com/user</code>. If the user signs up with a Facebook or Twitter account, however, the landing page URL also includes a unique user ID&mdash;but assigned with a different scheme. The Facebook ID allocation system is discussed above; Twitter post-login URLs take the form <code>https://www.mittromney.com/users/12345</code>.</p>
<p><a name="presidential_identifying_information_footnote_10"></a>10. My best hypothesis is that this property arises from a caching misconfiguration; page content is correctly dynamic between users, but page headings are incorrectly cached for a period independent of user permissions.</p>
<p><a name="presidential_identifying_information_footnote_11"></a>11. Thanks to <a href="http://topics.nytimes.com/topics/reference/timestopics/people/s/natasha_singer/index.html">Natasha Singer</a> for identifying ZIP code leakage on <code>mittromney.com</code>.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2012/11/01/presidential-identifying-information/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Trouble with ID Cookies: Why Do Not Track Must Mean Do Not Collect</title>
		<link>http://webpolicy.org/2012/08/10/the-trouble-with-id-cookies-why-do-not-track-must-mean-do-not-collect/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-trouble-with-id-cookies-why-do-not-track-must-mean-do-not-collect</link>
		<comments>http://webpolicy.org/2012/08/10/the-trouble-with-id-cookies-why-do-not-track-must-mean-do-not-collect/#comments</comments>
		<pubDate>Fri, 10 Aug 2012 19:00:50 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=533</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Co-authored by Arvind Narayanan. The debate over the meaning of Do Not Track has raged for well over a year now. The primary forum is the W3C Tracking Protection Working Group, with frequent sparring in the press and capitals worldwide. There are, broadly, two Do Not [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="https://cyberlaw.stanford.edu/blog/2012/08/trouble-id-cookies-why-do-not-track-must-mean-do-not-collect">Stanford Center for Internet and Society</a>.</em></p>
<p><em>Co-authored by <a href="http://randomwalker.info/">Arvind Narayanan</a>.</em></p>
<p>The debate over the meaning of <a href="http://donottrack.us/">Do Not Track</a> has raged for well over a year now. The primary forum is the W3C <a href="http://www.w3.org/2011/tracking-protection/">Tracking Protection Working Group</a>, with frequent sparring in the press and capitals worldwide. There are, broadly, two Do Not Track proposals: <a href="http://www.aboutads.info/principles/">one chiefly backed by the ad industry</a>, and <a href="http://jonathanmayer.github.com/dnt-compromise/compromise-proposal.html">another advanced by privacy advocates</a> [1]. These proposals reflect vastly different visions for Do Not Track with vastly different practical consequences. The two sides have unsurprisingly been at loggerheads, with scant movement towards resolution of the key issues.<br />
<span id="more-533"></span><br />
The <a href="https://cyberlaw.stanford.edu/blog/2011/11/brief-overview-supplementary-daa-principles">ad industry position</a> is, and has been for over a decade, that data collection and retention should be largely unfettered so long as they are associated with a permitted business use [2]. At present these permitted-use exceptions totally swallow the rule, in practice barring little more than behavioral advertisement targeting (<a href="https://cyberlaw.stanford.edu/blog/2011/11/brief-overview-supplementary-daa-principles">1</a>, <a href="http://commerce.senate.gov/public/?a=Files.Serve&amp;File_id=4c73aa3c-5626-42d6-b6fe-31e3ec6ad1ca">2</a>). (Critics often deride the status quo as “Do Not Target.”) A <a href="http://lists.w3.org/Archives/Public/public-tracking/2012Jun/0232.html">recent proposal</a> by Yahoo! would add, in our view, <a href="http://lists.w3.org/Archives/Public/public-tracking/2012Jun/0233.html">only modest transparency requirements</a> to the industry position.</p>
<p>But suppose the advertising industry were to meaningfully tighten its permitted uses and retention periods. Would privacy advocates, academics, and policymakers continue to object?</p>
<p>Yes. The industry approach to Do Not Track entirely misses the most serious privacy concerns associated with tracking, including:</p>
<p><span style="font-weight: bold; ">Sensitive information.</span> A user’s browsing history can include remarkably sensitive information, such as medical conditions and financial challenges (e.g. <a href="http://www2.research.att.com/~bala/papers/w2sp11.pdf">1</a>, <a href="http://mmt.me.uk/blog/2010/11/21/nhs-and-tracking/">2</a>). Individual users are often identified or easily identifiable (<a href="https://cyberlaw.stanford.edu/node/6701">1</a>, <a href="https://cyberlaw.stanford.edu/node/6740">2</a>, <a href="http://www2.research.att.com/~bala/papers/w2sp11.pdf">3</a>).</p>
<p><span style="font-weight: bold; ">Lack of consumer control.</span> Users are generally unaware of who’s tracking them and how. Existing consumer control tools are <a href="http://www.cylab.cmu.edu/research/techreports/2011/tr_cylab11017.html">difficult to discover and use</a>, and they <a href="https://cyberlaw.stanford.edu/node/6730">vary significantly in effectiveness</a>.</p>
<p><span style="font-weight: bold; ">Lack of market pressure.</span> Since consumers are unaware of and lack control over tracking, third-party websites are under limited pressure to implement adequate security and privacy protections. Furthermore, many third parties are small, young, growth-oriented companies; security and privacy are not priorities.</p>
<p><span style="font-weight: bold; ">Surveillance.</span> Government requests for data stored in the cloud are becoming a regular occurrence, and many companies <a href="https://www.eff.org/pages/who-has-your-back/">hand over data</a> in response to such requests without informing users. If ad companies’ claims about the inferential power of tracking data are correct, then the potential for surveillance is correspondingly worrisome.</p>
<p>A toughened version of the industry’s position would also have significant practical shortcomings.</p>
<p><span style="font-weight: bold; ">Fragile.</span> Many systems are configured for comprehensive logging by default. It takes only the slightest oversight to begin unintentionally amassing data.</p>
<p><span style="font-weight: bold; ">Unverifiable.</span> There is no straightforward way to externally test whether a company is limiting its information retention and use [3].</p>
<p><span style="font-weight: bold; ">Lock-in.</span> As the online economy and its technology infrastructure change, use-based definitions are likely to become dated. A rigid use-based approach could lock in current advertising business practices, stifling innovation, or motivate some companies to bend the rules and justify tracking for an ever-expanding set of uses.</p>
<p>The privacy advocates’ definition of Do Not Track takes a much different tack: it would allow (just about) any third-party business practice, so long as it does not impose the privacy risk of collecting a user’s browsing history.  A cookie that remembers a language preference would be allowed, for example, while a unique ID cookie would not be allowed [4].</p>
<p>The advocates’ solution avoids the shortcomings of the ad industry approach, and is particularly elegant for two reasons.</p>
<p><span style="font-weight: bold; ">Privacy-preserving alternatives.</span>  There are simple technological solutions to implement most or all current advertising ecosystem functionality, as we have detailed in the “Tracking Not Required” series (<a href="https://air.mozilla.org/tracking-not-required/">overview talk</a>, <a href="http://webpolicy.org/2012/04/23/tracking-not-required-frequency-capping/">frequency capping</a>, <a href="http://33bits.org/2012/06/11/tracking-not-required-behavioral-targeting/">behavioral targeting</a>, <a href="http://webpolicy.org/2012/07/24/tracking-not-required-advertising-measurement/">measurement</a>). Shifting to these architectures would involve switching costs, and in some use cases they would underperform current implementations. That said, we believe it’s quite reasonable for ad companies to incur these minor burdens in exchange for the significant privacy benefits.</p>
<p><span style="font-weight: bold; ">Verifiable.</span>  Tracking carried out in violation of this interpretation of DNT is <span style="font-style: italic; ">externally detectable</span>. This is a crucial point. Some tracking techniques store a unique ID in a user’s device (“supercookies”); others read attributes from a user’s device that, in combination, become unique (“fingerprinting”).  Both approaches require accessing browser functionality in a manner that is, in principle, detectable.</p>
<p>It would also be detectable in practice: a <a href="http://33bits.org/2012/06/04/web-privacy-measurement-genesis-of-a-community/">“Web Privacy Measurement” community</a> has sprung up that has the tools and motivation to police the web for DNT violations. Automated external detection will never achieve 100% accuracy, but it has proven highly effective at flagging possible privacy-violating information flows for manual inspection by analysts. In the worst case, it provides a basis of suspicion for regulators to conduct audits, whereas with the use-based approach audits would essentially have to be conducted blindly. As long as there is a significant chance that violators will be caught, external policing will have a strong deterrent effect. Companies will be both disincentivized from intentionally gaming DNT and incentivized to institutionalize practices that ensure compliance [5].</p>
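<p>To illustrate why fingerprinting works as an identifier, the following Python sketch combines several device attributes into a single stable value. The attribute names and values are hypothetical, and real fingerprinting scripts read many more properties; we offer this only as a minimal sketch of the principle that individually common attributes can become unique in combination.</p>

```python
import hashlib


def fingerprint(attributes):
    """Hash a dict of device attributes into one stable identifier."""
    # Sort keys so the same attributes always produce the same digest.
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute sets for two devices that differ only in installed fonts.
device_a = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1440x900",
    "timezone": "UTC-8",
    "fonts": "Arial,Helvetica",
}
device_b = dict(device_a, fonts="Arial,Courier")

print(fingerprint(device_a) == fingerprint(device_b))
```

<p>Each attribute has to be read through browser functionality before it can be combined, and those reads are what external measurement can, in principle, observe.</p>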
<p>In conclusion, the Do Not Track negotiations are nearing an impasse, while third-party tracking continues at unprecedented scale. If advertising companies and other third parties don’t step up to the plate, browser vendors and regulators will likely turn to heavy-handed alternatives. We reiterate our belief that a collection-based definition of Do Not Track combined with a deployment of client-side functionality is the ideal outcome for all stakeholders.</p>
<p>[1] The proposal is co-authored by Jonathan Mayer who is also one of the authors of this post.</p>
<p>[2] <b id="internal-source-marker_0.3988948173355311" style="font-weight: normal; ">The paper “<a href="https://www.stanford.edu/~jmayer/papers/trackingsurvey12.pdf">Third-Party Web Tracking: Policy and Technology</a>” includes an expanded explanation of industry self-regulatory initiatives.</p>
<p>[3] This <a href="http://www.cylab.cmu.edu/files/pdfs/tech_reports/CMUCyLab11005.pdf">CMU Cylab study</a> is one of many demonstrating widespread non-compliance with stated policies.</p>
<p>[4] Protocol information (including IP address and User-Agent string) could still be collected and retained for a short duration. This assuredly introduces some privacy risk, but it is much smaller than the risk associated with long-term retention of uniquely identifying information.</p>
<p>[5] Some smaller players, especially those located in jurisdictions where there is no potential legal liability for non-compliance, might simply ignore DNT. The dynamics of the online advertising market mitigate the privacy risks associated with these companies; reputable first-party websites are unlikely to deploy these services. Furthermore, some technical countermeasures (i.e. blocking) are possible against non-compliant companies. The more privacy-forward browser vendors might even choose to enable countermeasures by default.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2012/08/10/the-trouble-with-id-cookies-why-do-not-track-must-mean-do-not-collect/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tracking Not Required: Advertising Measurement</title>
		<link>http://webpolicy.org/2012/07/24/tracking-not-required-advertising-measurement/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=tracking-not-required-advertising-measurement</link>
		<comments>http://webpolicy.org/2012/07/24/tracking-not-required-advertising-measurement/#comments</comments>
		<pubDate>Tue, 24 Jul 2012 19:38:36 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>
		<category><![CDATA[Privacy-Preserving]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=530</guid>
		<description><![CDATA[Co-authored by Arvind Narayanan. Measurement is central to online advertising: it facilitates billing, performance measurement, targeting decisions, spending allocation, and more. In a pair of earlier posts we explained how advertisement frequency capping and behavioral targeting are achievable without compiling a user&#8217;s browsing history. This post similarly proposes practical, privacy-improved approaches to advertising measurement. There [...]]]></description>
				<content:encoded><![CDATA[<p><em>Co-authored by <a href="http://randomwalker.info/">Arvind Narayanan</a>.</em></p>
<p>Measurement is central to online advertising: it facilitates billing, performance measurement, targeting decisions, spending allocation, and more. In a pair of earlier posts we explained how advertisement <a href="http://webpolicy.org/2012/04/23/tracking-not-required-frequency-capping/">frequency capping</a> and <a href="http://33bits.org/2012/06/11/tracking-not-required-behavioral-targeting/">behavioral targeting</a> are achievable without compiling a user&#8217;s browsing history. This post similarly proposes practical, privacy-improved approaches to advertising measurement.</p>
<p><span id="more-530"></span></p>
<p>There are, broadly, three advertising events that might require measurement.</p>
<ol>
<li><strong>Impression.</strong> An advertisement is displayed to a user. Measured details might include the webpage the ad was served on, the time the ad was served, the user&#8217;s location, and the user&#8217;s browser.<a href="#advertising_measurement_footnote_1" name="advertising_measurement_footnote_return_1"><sup>1</sup></a></li>
<li><strong>Click.</strong> The user clicks the ad.</li>
<li><strong>Action.</strong> After viewing the ad, the user later does something on a different webpage. For example, the user might buy the product or service that was advertised. An action may occur days or weeks after an impression.</li>
</ol>
<p>Sometimes advertisers pay per impression (&#8220;CPM&#8221; billing), sometimes per click (&#8220;CPC&#8221;), sometimes per action (&#8220;CPA&#8221;), and sometimes for a combination of these events (&#8220;hybrid&#8221;).</p>
<p>The following sections explain how to conduct a privacy-improved measurement of each advertising event.</p>
<p><strong>Impression</strong></p>
<p>When a user&#8217;s browser loads an advertisement, the company serving the ad ordinarily learns the URL of the current webpage,<a href="#advertising_measurement_footnote_2" name="advertising_measurement_footnote_return_2"><sup>2</sup></a> as well as the user&#8217;s IP address and <code>User-Agent</code> string.<a href="#advertising_measurement_footnote_3" name="advertising_measurement_footnote_return_3"><sup>3</sup></a> Practical, privacy-improved impression measurement is a granularity problem: How can a website generalize the impression data it collects without substantially compromising the utility of that data?</p>
<p>Requirements will undoubtedly vary by service. We present here a rough design spectrum of the information that a third-party website might retain.<a href="#advertising_measurement_footnote_4" name="advertising_measurement_footnote_return_4"><sup>4</sup></a></p>
<table>
<col class="left" />
<col class="center" />
<col class="center" />
<col class="center" />
<thead>
<tr>
<th class="left"><strong>Information</strong></th>
<th class="center"><strong>Current Approach</strong></th>
<th class="center"><strong>Better Approach</strong></th>
<th class="center"><strong>Even Better Approach</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td class="left"><strong>Webpage</strong></td>
<td class="center">URL</td>
<td class="center">Fully qualified domain name</td>
<td class="center">Public suffix + 1</td>
</tr>
<tr>
<td class="left"><strong>Time</strong></td>
<td class="center">Precise timestamp</td>
<td class="center">Day</td>
<td class="center">Week</td>
</tr>
<tr>
<td class="left"><strong>User Location</strong></td>
<td class="center">IP address</td>
<td class="center">Truncated IP address</td>
<td class="center">Coarse geolocation</td>
</tr>
<tr>
<td class="left"><strong>Browser</strong></td>
<td class="center"><code>User-Agent</code> string</td>
<td class="center">Browser/OS major versions</td>
<td class="center">Browser/OS</td>
</tr>
</tbody>
</table>
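<p>As a rough illustration of the &#8220;Better Approach&#8221; column, the following JavaScript sketch generalizes a raw impression record before it is retained. The function name, the record fields, and the choice to zero the last IPv4 octet are our assumptions for this example; the table does not prescribe a particular truncation. Computing &#8220;public suffix + 1&#8221; or parsing a <code>User-Agent</code> string would need additional data and is left out.</p>
<pre>
```javascript
// Hypothetical helper: reduce a raw impression record to coarser fields.
function generalizeImpression(raw) {
  // Keep only the fully qualified domain name, dropping path and query string.
  const webpage = new URL(raw.url).hostname;
  // Keep only the UTC calendar day, dropping the time of day.
  const day = new Date(raw.timestamp).toISOString().slice(0, 10);
  // Zero the last octet of an IPv4 address (an illustrative truncation).
  const octets = raw.ip.split(".");
  const truncatedIp = octets.slice(0, 3).concat("0").join(".");
  return { webpage, day, truncatedIp };
}

const example = generalizeImpression({
  url: "http://news.example.com/politics/story.html?id=42",
  timestamp: "2012-07-24T19:38:36Z",
  ip: "192.0.2.123",
});
console.log(example);
```
</pre>
<p>The point of the sketch is that generalization can happen before storage, so the precise values never need to be retained.</p>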
<p><strong>Click</strong></p>
<p>Click measurement is exactly the same as impression measurement, with just one extra piece of information: whether the user clicked the ad.<a href="#advertising_measurement_footnote_5" name="advertising_measurement_footnote_return_5"><sup>5</sup></a></p>
<p><strong>Action</strong></p>
<p>Action measurement is a more difficult engineering problem. An ad impression is, for measurement purposes, a one-shot event: it occurs within the context of a single webpage. Measuring an action, on the other hand, requires linking an ad impression on one webpage with a subsequent action on another webpage.<a href="#advertising_measurement_footnote_6" name="advertising_measurement_footnote_return_6"><sup>6</sup></a></p>
<p>A pairing of client-side storage and selective information disclosure can enable privacy-improved action measurement, much like our previous approaches to <a href="http://webpolicy.org/2012/04/23/tracking-not-required-frequency-capping/">frequency capping</a> and <a href="http://33bits.org/2012/06/11/tracking-not-required-behavioral-targeting/">behavioral targeting</a>. When an ad is displayed, information about the impression can be stored in the browser. If the user later completes an action, the ad company can query the browser for relevant impression information.</p>
<p>Implementing a prototype of our action measurement algorithm was straightforward using HTML 5 local storage. Source is <a href="https://github.com/jonathanmayer/Tracking-Not-Required/tree/master/conversion-measurement">available on GitHub</a>. Performance is a non-issue, as with our prototypes for <a href="https://github.com/jonathanmayer/Tracking-Not-Required/tree/master/frequency-capping">frequency capping</a> and <a href="https://github.com/jonathanmayer/Tracking-Not-Required/tree/master/behavioral-targeting">behavioral targeting</a>.</p>
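<p>The store-then-query idea can be sketched in a few lines of JavaScript. This is a minimal illustration, not the code in the GitHub repository: the storage key, the function names, and the in-memory stand-in for <code>localStorage</code> are all hypothetical. In a browser, <code>window.localStorage</code> could be passed in place of the stand-in.</p>
<pre>
```javascript
// Hypothetical storage key; not taken from the GitHub prototype.
const IMPRESSIONS_KEY = "adImpressions";

// When an ad is displayed, record the impression in client-side storage.
function recordImpression(storage, campaignId, day) {
  const impressions = JSON.parse(storage.getItem(IMPRESSIONS_KEY) || "{}");
  impressions[campaignId] = day; // keep only the most recent day per campaign
  storage.setItem(IMPRESSIONS_KEY, JSON.stringify(impressions));
}

// On the action page, disclose only the impression for the relevant campaign.
function queryImpression(storage, campaignId) {
  const impressions = JSON.parse(storage.getItem(IMPRESSIONS_KEY) || "{}");
  return Object.prototype.hasOwnProperty.call(impressions, campaignId)
    ? { campaignId, day: impressions[campaignId] }
    : null;
}

// In-memory stand-in exposing the two Web Storage methods used above.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
  };
}

const storage = createMemoryStorage();
recordImpression(storage, "campaign-17", "2012-07-24");
console.log(queryImpression(storage, "campaign-17"));
console.log(queryImpression(storage, "campaign-99")); // null: nothing to disclose
```
</pre>
<p>Only the queried campaign is revealed; impressions for other campaigns stay in the browser.</p>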
<hr />
<ol>
<li><a name="advertising_measurement_footnote_1"></a>
<p>Many advertising companies also record whether this was a first-time (&#8220;unique&#8221;) impression. Our algorithm for <a href="http://webpolicy.org/2012/04/23/tracking-not-required-frequency-capping/">frequency capping</a> can be trivially modified to provide this functionality. <a href="#advertising_measurement_footnote_return_1">&#160;&#8617;</a></p>
</li>
<li><a name="advertising_measurement_footnote_2"></a>
<p>Third-party websites usually learn the first-party webpage URL from a <code>Referer</code> header or explicit <code>Request-URI</code> parameter. There are some methods for a first-party webpage to hide its URL from third-party content, including <code>iframe</code> sandboxing and the HTML 5 <code>noreferrer</code> link annotation. For the moment, these techniques are not sufficiently simple, comprehensive, or supported to anticipate widespread use. <a href="#advertising_measurement_footnote_return_2">&#160;&#8617;</a></p>
</li>
<li><a name="advertising_measurement_footnote_3"></a>
<p>A semi-trusted intermediary or anonymizing network could conceal or generalize a user&#8217;s IP address, <code>User-Agent</code> string, and other information. See <a href="http://crypto.stanford.edu/adnostic/">Adnostic</a> and <a href="http://static.usenix.org/event/nsdi11/tech/full_papers/Guha.pdf">Privad</a> for examples. These approaches are, at present, not practical for broad deployment. <a href="#advertising_measurement_footnote_return_3">&#160;&#8617;</a></p>
</li>
<li><a name="advertising_measurement_footnote_4"></a>
<p>Present Do Not Track proposals diverge on impression measurement; some would allow current approaches to continue, while others would require quickly generalizing impression data. <a href="#advertising_measurement_footnote_return_4">&#160;&#8617;</a></p>
</li>
<li><a name="advertising_measurement_footnote_5"></a>
<p>Do Not Track would allow the current approach to click measurement since the user has (somewhat constructively) interacted with content from the advertising company. <a href="#advertising_measurement_footnote_return_5">&#160;&#8617;</a></p>
</li>
<li><a name="advertising_measurement_footnote_6"></a>
<p>Because of this property, some Do Not Track proposals would require a privacy-improved approach to action measurement. <a href="#advertising_measurement_footnote_return_6">&#160;&#8617;</a></p>
</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2012/07/24/tracking-not-required-advertising-measurement/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Do Track: Browser-Based Do Not Track Exceptions</title>
		<link>http://webpolicy.org/2012/07/02/do-track-browser-based-do-not-track-exceptions/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=do-track-browser-based-do-not-track-exceptions</link>
		<comments>http://webpolicy.org/2012/07/02/do-track-browser-based-do-not-track-exceptions/#comments</comments>
		<pubDate>Mon, 02 Jul 2012 07:54:14 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>
		<category><![CDATA[Privacy-Preserving]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=503</guid>
		<description><![CDATA[Users hold widely varying preferences on web tracking. Some don&#8217;t mind the practice. Some object to it entirely. Many trust certain organizations to follow them around the web. Do Not Track accommodates these divergent preferences in two ways. First, browsers and other user agents include an option for universally signaling a preference against tracking (&#8220;DNT: [...]]]></description>
				<content:encoded><![CDATA[<p>Users hold widely varying preferences on web tracking.<a href="#dnt_exceptions_fn_1"><sup>1</sup></a> Some don&#8217;t mind the practice. Some object to it entirely. Many trust certain organizations to follow them around the web.</p>
<p><a href="http://donottrack.us/">Do Not Track</a> accommodates these divergent preferences in two ways. First, browsers and other user agents include an option for universally signaling a preference against tracking (&ldquo;DNT: 1&rdquo;). <a href="http://support.mozilla.org/en-US/kb/how-do-i-turn-do-not-track-feature">Firefox</a>, <a href="http://ie.microsoft.com/testdrive/Browser/DoNotTrack/Default.html">Internet Explorer</a>, and <a href="https://www.computerworld.com/common/images/site/features/2012/02/Safari_do_not_track.jpg">Safari</a> have all integrated this feature, and Chrome will support it by the end of the year. Second, a user can configure exceptions to the universal signal. Some websites may choose to build a proprietary &ldquo;out-of-band&rdquo; exception mechanism, using ordinary web technologies, that trumps the &ldquo;DNT: 1&rdquo; signal. The <a href="http://donottrack.us/cookbook/">Do Not Track Cookbook</a> includes an example of how a Facebook out-of-band exception mechanism might appear.</p>
<p>The <a href="http://www.w3.org/2011/tracking-protection/">W3C Do Not Track standard</a> will provide another option: a simple JavaScript interface that allows a website to request an exception, paired with a signal that some tracking is allowed (&ldquo;DNT: 0&rdquo;).</p>
<p><span id="more-503"></span></p>
<p>There are many benefits to managing Do Not Track exceptions through the browser.</p>
<ul>
<li><b>Avoids Duplication of Effort.</b> Browser vendors implement an exception mechanism once. Websites can then take advantage of the mechanism with just a few lines of code.
<li><b>Persistence.</b> A user is unlikely to accidentally clear browser-based exceptions, in contrast to cookies and other out-of-band exception storage mechanisms.<a href="#dnt_exceptions_fn_2"><sup>2</sup></a>
<li><b>Centralized Management.</b> Users can adjust all their Do Not Track preferences in one place. Out-of-band exceptions might be scattered across the web.<a href="#dnt_exceptions_fn_3"><sup>3</sup></a>
<li><b>Consistent User Interface.</b> The Do Not Track exception user interface will be the same across sites and integrated into the browser&#8217;s privacy controls.
<li><b>Design Quality.</b> Browser vendors employ some of the brightest minds in user interface design.
<li><b>Usability Incentives.</b> Web browsers compete on usability and frequently iterate with user interface improvements. Browser development teams are, for the most part, incentivized to provide users with adequate information about and control over a Do Not Track exception. A website that seeks a Do Not Track exception, on the other hand, is incentivized to push the boundaries of acceptable notice and choice to get that exception.<a href="#dnt_exceptions_fn_4"><sup>4</sup></a>
</ul>
<p>In order to better understand the technical challenges associated with browser-based Do Not Track exceptions, I implemented a prototype as a Firefox add-on.</p>
<div style="text-align:center;">
<div>
<p><a href="http://dl.dropbox.com/u/37533397/do_not_track/dnt-exceptions-prototype-explicit-explicit.png"><img src="http://dl.dropbox.com/u/37533397/do_not_track/dnt-exceptions-prototype-explicit-explicit.png" width="150px"></img></a>&nbsp;&nbsp;&nbsp;&nbsp;<a href="http://dl.dropbox.com/u/37533397/do_not_track/dnt-exceptions-prototype-web-wide.png"><img src="http://dl.dropbox.com/u/37533397/do_not_track/dnt-exceptions-prototype-web-wide.png" width="150px"></img></a>&nbsp;&nbsp;&nbsp;&nbsp;<a href="http://dl.dropbox.com/u/37533397/do_not_track/dnt-exceptions-prototype-site-wide.png"><img src="http://dl.dropbox.com/u/37533397/do_not_track/dnt-exceptions-prototype-site-wide.png" width="150px"></img></a></p>
</div>
<p>Example exception requests.</p>
<div>
<p><a href="http://dl.dropbox.com/u/37533397/do_not_track/dnt-exceptions-prototype-management.png"><img src="http://dl.dropbox.com/u/37533397/do_not_track/dnt-exceptions-prototype-management.png" width="150px"></img></a></p>
</div>
<p>A centralized preference management interface.</p>
</div>
<p>The source is <a href="https://github.com/jonathanmayer/Do-Not-Track/tree/master/exceptions">available on GitHub</a>. I want to emphasize that this is a prototype: it does not conform to the current W3C specification draft and it is insecure, buggy, and slow.</p>
<p>I learned several lessons from the project.</p>
<ul>
<li><b>Browser-based exceptions are not very difficult to implement.</b> The Firefox prototype required only a couple days of straightforward development.
<li><b>Browser-based exceptions are markedly more difficult to implement than the &ldquo;DNT: 1&rdquo; header.</b> As a very rough comparison, my reference &ldquo;DNT: 1&rdquo; <a href="https://github.com/jonathanmayer/Do-Not-Track/tree/master/chrome">Chrome extension</a> is 9 lines of code, while my prototype Firefox <a href="https://github.com/jonathanmayer/Do-Not-Track/tree/master/exceptions">exceptions extension</a> is already 521 lines. Furthermore, implementing &ldquo;DNT: 1&rdquo; is largely a systems engineering project, while an exception mechanism necessitates both systems and user interface effort.
<li><b>The Do Not Track exception user interface is hard to get right.</b> After several iterations, the user interface in my prototype still leaves much room for improvement. I expect Do Not Track will, like other browser features, benefit from long-term user interface evolution.
</ul>
<p>The prototype add-on validates what Do Not Track proponents have long recognized: Do Not Track is not a crude on/off switch. Rather, it begins a conversation between websites and users about privacy preferences. Browsers will play a central role in facilitating that conversation.</p>
<hr />
<p>Thanks to <a href="http://randomwalker.info/">Arvind Narayanan</a> for comments on a draft.</p>
<p><a name="dnt_exceptions_fn_1"></a>1. A number of surveys have reflected mixed consumer preferences on web tracking. See, e.g., <a href="http://pewinternet.org/Reports/2012/Search-Engine-Use-2012.aspx">Pew Internet 2012</a>, <a href="http://truste.com/ad-privacy/">TRUSTe/Harris Interactive 2011</a>, <a href="http://gallup.com/poll/File/145334/Internet Ads Dec 21 2010.pdf">USA Today/Gallup 2010</a>, <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1989092">McDonald and Cranor 2010</a>, and <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1478214">Turow et al. 2009</a>.</p>
<p><a name="dnt_exceptions_fn_2"></a>2. If a website maintains user accounts, associating an out-of-band exception with an account can greatly reduce the risk of accidental clearing.</p>
<p><a name="dnt_exceptions_fn_3"></a>3. This problem could be somewhat mitigated with a browser scripting interface for registering an out-of-band exception mechanism. I proposed the approach <a href="https://groups.google.com/group/do-not-track/msg/d4fa205a5c3e5f59">last year</a>, and there was renewed interest at a recent W3C working group meeting.</p>
<p><a name="dnt_exceptions_fn_4"></a>4. See <a href="http://www.cylab.cmu.edu/research/techreports/2011/tr_cylab11017.html">Leon et al. 2011</a> for a usability analysis of current online advertising user control mechanisms.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2012/07/02/do-track-browser-based-do-not-track-exceptions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tracking Not Required: Behavioral Targeting</title>
		<link>http://webpolicy.org/2012/06/11/tracking-not-required-behavioral-targeting/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=tracking-not-required-behavioral-targeting</link>
		<comments>http://webpolicy.org/2012/06/11/tracking-not-required-behavioral-targeting/#comments</comments>
		<pubDate>Mon, 11 Jun 2012 21:42:59 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>
		<category><![CDATA[Privacy-Preserving]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=526</guid>
		<description><![CDATA[Original at 33 Bits of Entropy. Co-authored by Arvind Narayanan and Subodh Iyengar. In the first installment of the Tracking Not Required series, we discussed a relatively straightforward case: frequency capping. Now let&#8217;s get to the 800-pound gorilla, behaviorally targeted advertising, putatively the main driver of online tracking. We will show how to swap a [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at <a href="http://33bits.org/2012/06/11/tracking-not-required-behavioral-targeting/">33 Bits of Entropy</a>.</em></p>
<p><em>Co-authored by <a href="http://randomwalker.info/">Arvind Narayanan</a> and <a href="https://sites.google.com/site/subodhiye/">Subodh Iyengar</a>.</em></p>
<p>In the <a href="http://webpolicy.org/2012/04/23/tracking-not-required-frequency-capping/">first installment</a> of the <em>Tracking Not Required</em> series, we discussed a relatively straightforward case: frequency capping. Now let&#8217;s get to the 800-pound gorilla, behaviorally targeted advertising, putatively the main driver of online tracking. We will show how to swap a <em>little</em> functionality for a <em>lot</em> of privacy.</p>
<p>Admittedly, implementing behavioral targeting on the client is hard and will require some technical wizardry. It doesn&#8217;t come for &#8220;free&#8221; in that it requires a trade-off in terms of various privacy and deployability desiderata. Fortunately, this has been a fertile topic of research over the past several years, and there are papers describing solutions at a variety of points on the privacy-deployability spectrum. This post will survey these papers and propose a simplification of the <a href="http://crypto.stanford.edu/adnostic/">Adnostic</a> approach, along with prototype code, that offers significant privacy and is straightforward to implement.</p>
<p><span id="more-526"></span></p>
<p><strong>Goals</strong>. Carrying out behavioral advertising without tracking requires several things. First, the user needs to be profiled and categorized based on their browsing history. In nearly all proposed solutions, this happens in the user’s browser. Second, we need an algorithm for selecting targeted ads to display each time the user visits a page. If the profile is stored locally and not shared with the advertising company, this is quite nontrivial. The final component is the reporting of ad impressions and clicks. This component must also deal with click fraud, impression fraud, and other threats.</p>
<p><strong>Existing approaches</strong></p>
<p>The chart presents an overview of existing and proposed architectures. </p>
<p><img src="http://dl.dropbox.com/u/37533397/tracking_not_required/ad-privacy-spectrum.png" alt="" width="400px" height="300px" /></p>
<p>“Cookies” refers to the status quo of server-side tracking; all other architectures are presented in research papers summarized in the <a href="http://donottrack.us/bib/">Do Not Track bibliography</a> page. CoP stands for “Client-only Profiles,” the architecture proposed by Bilenko and Richardson.</p>
<p>Several points of note. First, everything except PrivAd — which uses an anonymizing proxy — reveals the IP address, and typically the User Agent and Referer to the ad company as part of normal HTTP requests. Second, everything except CoP (and the status quo of tracking cookies) requires software installation. Opinions vary on just how much of a barrier this is. Third, we don&#8217;t take a stance on whether PrivAd is more deployable than ObliviAd or vice-versa; they both face significant hurdles. Finally, Adnostic can be used in one of two modes, hence it is listed twice.</p>
<p>There is an interesting technological approach, not listed above, that works by exposing more limited referer information. Without the referer header (or an equivalent), the ad server may identify the user but will not learn the first-party URL, and thus will not be able to track. This will be explored in more depth in a future article.</p>
<p><strong>New approach</strong>. In the solution we propose here, the server is recruited for profiling, but doesn&#8217;t store the profile. This avoids the need for software installation and allows easy deployability. In addition, non-tracking is <em>externally verifiable</em>, to the extent that IP address + User-Agent is not nearly as effective for tracking as cookie-based unique identifiers.[1] Like CoP, and unlike Adnostic, each ad company can only profile users during visits to pages that it has a third-party presence on, rather than all pages.</p>
<p><strong>Profiling algorithm</strong>.</p>
<p>1. The user visits a page that has embedded content from the ad company.</p>
<p>2. JavaScript in the ad company’s content sends the top-level URL to a special classifier service run by the ad company.  (The classifier is run on a separate domain.  It does not have any cookies or other information specific to the user.)</p>
<p>3. The classifier returns a topic classification of the page.</p>
<p>4. The ad company’s JavaScript receives the page classification and uses it to update the user’s behavioral profile in HTML5 storage.  The JavaScript may also consider other factors, such as how long the user stayed on the page.</p>
<p>There is a fair degree of flexibility in steps 3 and 4 — essentially any profiling algorithm can be implemented by appropriately splitting it into a server-side component that classifies individual web pages and a client-side component that analyzes the user’s interaction with these pages.</p>
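<p>To make the split between classification and profile updating concrete, here is a small JavaScript sketch. Everything in it is illustrative: the static host-to-topic map stands in for the classifier service, the profile is a plain object and not HTML5 storage, and the visit-count threshold is just one simple way to turn a profile into interests.</p>
<pre>
```javascript
// Hypothetical static mapping standing in for the classifier service (steps 2 and 3).
const TOPIC_BY_HOST = new Map([
  ["cars.example.com", "autos"],
  ["recipes.example.org", "cooking"],
]);

function classifyPage(pageUrl) {
  const host = new URL(pageUrl).hostname;
  return TOPIC_BY_HOST.has(host) ? TOPIC_BY_HOST.get(host) : null;
}

// Step 4: fold one page classification into the locally held profile.
function updateProfile(profile, topic) {
  if (topic === null) return profile; // unclassified pages leave the profile unchanged
  const updated = Object.assign({}, profile);
  updated[topic] = (updated[topic] || 0) + 1;
  return updated;
}

// Binary threshold: a topic counts as an interest after enough visits.
function interests(profile, threshold) {
  return Object.keys(profile).filter((topic) => profile[topic] >= threshold).sort();
}

let profile = {};
for (const url of [
  "http://cars.example.com/reviews",
  "http://cars.example.com/deals",
  "http://recipes.example.org/soup",
  "http://unknown.example.net/",
]) {
  profile = updateProfile(profile, classifyPage(url));
}
console.log(interests(profile, 2));
```
</pre>
<p>In a deployment along the lines described above, <code>classifyPage</code> would be a request to the cookieless classifier domain, and the profile would be persisted in the browser between page loads.</p>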
<p><strong>Ad serving and accounting</strong>.</p>
<p>The ad serving process in our proposal is the same as in <a href="http://crypto.stanford.edu/adnostic/">Adnostic</a> — the server sends a list of ads along with metadata describing each ad, and the client-side component picks the ad that best matches the locally stored profile. To avoid revealing which ad was displayed, the client can either download all (say, 10) ads in the list while displaying only one, or the client downloads only one ad, but ads are served from a different domain which does not share cookies with the tracking domain. Note the similarity to our <a href="http://webpolicy.org/2012/04/23/tracking-not-required-frequency-capping/">frequency capping approach</a>, both in terms of the algorithm and its privacy properties.</p>
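<p>The client-side selection step might look like the following JavaScript sketch. The ad metadata format (a list of topic labels per ad) and the overlap-count scoring rule are assumptions made for illustration; Adnostic and our prototype may score ads differently.</p>
<pre>
```javascript
// Pick the ad whose metadata best matches the locally stored interests.
// Ties go to the ad that appears earliest in the server's list.
function selectAd(ads, interestSet) {
  let best = null;
  let bestScore = -1;
  for (const ad of ads) {
    const score = ad.topics.filter((topic) => interestSet.has(topic)).length;
    if (score > bestScore) {
      best = ad;
      bestScore = score;
    }
  }
  return best; // null when the server sent no ads
}

const ads = [
  { id: "ad-generic", topics: [] },
  { id: "ad-sedan", topics: ["autos"] },
  { id: "ad-roadtrip", topics: ["autos", "travel"] },
];
const chosen = selectAd(ads, new Set(["autos", "travel"]));
console.log(chosen.id);
```
</pre>
<p>Because the comparison runs in the browser, the server sees the list it sent but not the profile that drove the choice.</p>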
<p>Accounting, i.e., billing the right advertiser, is also identical to Adnostic for the cost-per-click and cost-per-impression models; we refer the reader there. Discussing the cost-per-action model is deferred to a future post.</p>
<p><strong>Implementation</strong>. We <a href="https://github.com/jonathanmayer/Tracking-Not-Required/tree/master/behavioral-targeting">implemented</a> our behavioral targeting algorithm using HTML 5 local storage. As with our frequency capping implementation, we found performance was exceptionally fast in modern desktop and mobile browsers. For simplicity, our implementation uses a static local database mapping websites to interest segments and a binary threshold for determining interests. In practice, we expect implementers would maintain the mapping server-side and apply more sophisticated logic client-side.</p>
<p>We also present a different work-in-progress <a href="https://github.com/siyengar/cookbook-prototype">implementation</a> that’s broader in scope, encompassing retargeting, behavioral targeting and frequency capping.</p>
<p><strong>Conclusion</strong>. Certainly there are costs to our approach — a “thick-client” model will always be slightly more inconvenient to deploy and maintain than a server-based model, and will probably have a lower targeting accuracy. However, we view these costs as minimal compared to the benefits. Some compromise is necessary to get past the current stalemate in web tracking.</p>
<p>Technological feasibility is necessary, but not sufficient, to change the status quo in online tracking. The other key component is incentives. That is why <a href="http://donottrack.us/">Do Not Track</a>, standards and advocacy are crucial to the online privacy equation.</p>
<p>[1] The engineering and business reasons for this difference in effectiveness will be discussed in a future post.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2012/06/11/tracking-not-required-behavioral-targeting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tracking Not Required: Frequency Capping</title>
		<link>http://webpolicy.org/2012/04/23/tracking-not-required-frequency-capping/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=tracking-not-required-frequency-capping</link>
		<comments>http://webpolicy.org/2012/04/23/tracking-not-required-frequency-capping/#comments</comments>
		<pubDate>Mon, 23 Apr 2012 18:00:18 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>
		<category><![CDATA[Privacy-Preserving]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=468</guid>
		<description><![CDATA[Co-authored by Arvind Narayanan. Debates over web tracking and Do Not Track tend to be framed as a clash between consumer privacy and business need. That&#8217;s not quite right. There is, in fact, a spectrum of possible tradeoffs between business interests and consumer privacy. Our aim with the Tracking Not Required series is to show [...]]]></description>
				<content:encoded><![CDATA[<p><em>Co-authored by <a href="http://randomwalker.info/">Arvind Narayanan</a>.</em></p>
<p>Debates over web tracking and Do Not Track tend to be framed as a clash between consumer privacy and business need. That&#8217;s not quite right. There is, in fact, a spectrum of possible tradeoffs between business interests and consumer privacy.</p>
<p>Our aim with the Tracking Not Required series is to show how those tradeoffs are not at all linear; it is possible to swap a <em>little</em> functionality for a <em>lot</em> of privacy. We only use technologies that are already deployed in browsers, and the solutions we propose are externally verifiable.<sup><a href="#privacy_preserving_advertising_fn_1">1</a></sup></p>
<p>We focus on issues at the center of <a href="https://www.eff.org/deeplinks/2012/04/will-industry-agree-meaningful-do-not-track">Do Not Track negotiations</a> in the World Wide Web Consortium. Advertising companies have <a href="https://www.eff.org/deeplinks/2012/02/white-house-google-and-other-advertising-companies-commit-supporting-do-not-track">pledged</a> to stop forms of ad targeting once a user enables Do Not Track, but many maintain that tracking is essential for a litany of &#8220;operational uses.&#8221; The Tracking Not Required series demonstrates how business functionality can be implemented without exposing users to the risks of tracking.</p>
<p>This first post addresses frequency capping in online advertising, the most frequently cited &#8220;operational use&#8221; necessitating tracking.</p>
<p><span id="more-468"></span></p>
<p><strong>Background</strong></p>
<p>When an online advertiser places a bid, it often sets a &#8220;frequency cap&#8221; on how many times a user may see a particular ad or ad campaign. Many advertising companies implement frequency capping with a unique ID cookie; when a user loads an ad, the ad company looks up past ad views using the ID and imposes frequency caps. The ID cookie approach is understandable: frequency capping becomes a straightforward database lookup. But ID cookies come at a significant privacy cost: they enable effective tracking of a user&#8217;s browsing activities.</p>
<p><strong>Algorithm</strong></p>
<p>When the browser loads a page, for each ad slot:</p>
<ol>
<li> The advertising company sends a list of ads it is considering for display, including a frequency cap for each ad. The list is ordered by preference.
<li> A script iterates through the list. For each ad in the list, the script checks a local ad viewing history to determine whether the ad is frequency capped. It selects the first ad that is uncapped.
<li> The script sends the uncapped ad back to the ad company for display or entry into an ad exchange auction.
</ol>
<p>When an ad is displayed:</p>
<ol>
<li> The script records the impression in the local ad viewing history.
<li> The advertising company bills the ad as usual (per impression, per click, per action,<sup><a href="#privacy_preserving_advertising_fn_2">2</a></sup> or a hybrid).
</ol>
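<p>The steps above can be sketched in a few lines. This is a minimal illustration only, not the prototype implementation: it is written in Python for brevity, a dictionary stands in for browser-side storage, and the names <code>CandidateAd</code>, <code>LocalHistory</code>, and <code>select_first_uncapped</code> are hypothetical.</p>

```python
from dataclasses import dataclass, field


@dataclass
class CandidateAd:
    """One entry in the preference-ordered list sent by the ad company."""
    ad_id: str
    cap: int  # maximum number of impressions allowed for this ad


@dataclass
class LocalHistory:
    """Stand-in for the client-side ad viewing history (assumed storage)."""
    views: dict = field(default_factory=dict)

    def count(self, ad_id):
        # Ads never seen before have a view count of zero.
        return self.views.get(ad_id, 0)

    def record_impression(self, ad_id):
        self.views[ad_id] = self.count(ad_id) + 1


def select_first_uncapped(candidates, history):
    """Return the id of the first ad whose cap has not been reached.

    Only this single id would be reported back to the ad company;
    view counts and the capping state of other ads stay on the client.
    Returns None if every candidate is capped.
    """
    for ad in candidates:
        if history.count(ad.ad_id) < ad.cap:
            return ad.ad_id
    return None


history = LocalHistory()
candidates = [
    CandidateAd("preferred", cap=2),
    CandidateAd("second-choice", cap=1),
    CandidateAd("fallback", cap=10**9),  # effectively uncapped fallback
]

shown = []
for _ in range(4):
    chosen = select_first_uncapped(candidates, history)
    history.record_impression(chosen)  # recorded when the ad is displayed
    shown.append(chosen)
```

<p>In this sketch the four page loads display the preferred ad twice, then the second choice once, then the fallback. Designating a fallback with a very high cap is one way to ensure that some ad is always uncapped.</p>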
<p><strong>Explanation</strong></p>
<p>Our algorithm leverages several design features to improve the privacy properties of frequency capping.</p>
<ol>
<li><b>Client-side storage</b>. The ad company does not store the user&#8217;s ad viewing history.
<li><b>Query-response</b>. Our algorithm shares information only about ads that might be shown, not all ads the user has seen.
<li><b>Client-side logic</b>. The ad company learns whether certain ads are capped or not; it does not learn their view counts.
<li><b>Server-side preprocessing</b>. Our algorithm shares only the first uncapped ad, not the capping state of each ad in the list.
</ol>
<p>Our algorithm protects user privacy by limiting the number of states the browser can be in. When $latex m$ ads are under consideration, our algorithm only allows the browser to be in one of $latex m$ states&#8211;each of the ads might be selected as the first uncapped ad.<sup><a href="#privacy_preserving_advertising_fn_3">3</a></sup> The maximum information entropy contributed is $latex \log_2 m$ bits.<sup><a href="#privacy_preserving_advertising_fn_4">4</a>, <a href="#privacy_preserving_advertising_fn_5">5</a></sup></p>
<p>The discussion has thus far centered on choosing a single ad. An ad company will often select multiple ads of multiple formats when populating a page. The associated maximum information entropy is $latex \sum_{i=1}^{k}{\log_2{m_i \choose n_i}}$ for a page with $latex k$ ad formats, where $latex m_i$ is the number of ads considered in the $latex i$th format and $latex n_i$ is the number of ads selected in the $latex i$th format.</p>
<p>For a concrete example, consider a New York Times article page, which often features a banner graphical ad, a sidebar graphical ad, and two footer text ads. If there are three ad formats and five possible ads for each format, an ad network that populates all four slots would gain at most 7.97 bits of information entropy from frequency capping. In other words, the ad network cannot learn more about the average user than if it had set a one-character cookie!</p>
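<p>The 7.97 bit figure can be checked directly from the formula. The short Python sketch below is illustrative only; the function names are hypothetical, and each format is given as a pair of ads considered and ads selected. The second function computes the larger bound from footnote 3, which applies when no fallback ads guarantee enough uncapped candidates.</p>

```python
from math import comb, log2


def max_entropy_bits(formats):
    """Upper bound in bits, assuming at least n uncapped ads per format.

    formats is a list of (m, n) pairs: m ads considered, n ads selected.
    """
    return sum(log2(comb(m, n)) for m, n in formats)


def max_entropy_bits_without_fallbacks(formats):
    """Upper bound in bits when fewer than n ads may be uncapped."""
    return sum(
        log2(sum(comb(m, j) for j in range(n + 1)))
        for m, n in formats
    )


# Banner (1 of 5), sidebar (1 of 5), and footer text ads (2 of 5).
example = [(5, 1), (5, 1), (5, 2)]
print(round(max_entropy_bits(example), 2))  # 7.97
```

<p>The three terms are $latex \log_2 5$, $latex \log_2 5$, and $latex \log_2 10$, which sum to just under 8 bits.</p>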
<p><strong>Implementation</strong></p>
<p>Several advertising companies have objected that privacy-preserving frequency capping is not feasible to implement. We respectfully disagree.</p>
<p>Performance is a non-issue. Our algorithm imposes negligible requirements on browser storage<sup><a href="#privacy_preserving_advertising_fn_6">6</a></sup> and computation.<sup><a href="#privacy_preserving_advertising_fn_7">7</a></sup> As for network latency, the approach would at most require an additional roundtrip&#8211;and for the many ad companies that already load an ad over multiple roundtrips, it wouldn&#8217;t necessitate any extra roundtrip.</p>
<p>There would, of course, be some software engineering required for implementation. The necessary scripting is straightforward; we developed a prototype implementation of our algorithm in hours. (Source is <a href="https://github.com/jonathanmayer/Tracking-Not-Required/tree/master/frequency-capping">available on GitHub</a>.) Backend changes may be more demanding; ad companies would have to shift from selecting particular ads for display to generating preference-ordered lists of possible ads. While assuredly not trivial, we do not see how the engineering would be unusually complicated.</p>
<hr />
<p><a name="privacy_preserving_advertising_fn_1"></a>1. A website&#8217;s compliance with our proposals could be externally verified in a number of ways, such as with an automated auditing tool (e.g. <a href="http://fourthparty.info">FourthParty</a>), through use of a trusted implementation, or by an independent auditing firm.</p>
<p><a name="privacy_preserving_advertising_fn_2"></a>2. Privacy-preserving per-action and hybrid billing can be accomplished by querying the impression history in local storage when a billable action occurs. </p>
<p><a name="privacy_preserving_advertising_fn_3"></a>3. We assume here that at least one ad is uncapped, and we assume in the later discussion that at least $latex n_i$ ads are uncapped. One way to guarantee those assumptions is to designate certain ads as fallbacks. If those assumptions do not hold, the maximum information entropy is $latex \log_2 \left(m+1\right)$ bits for a single format, single ad selection and $latex \sum_{i=1}^{k}{\log_2{\sum_{j=0}^{n_i}{m_i \choose j}}}$ bits for multiple format, multiple ad selection.</p>
<p><a name="privacy_preserving_advertising_fn_4"></a>4. This property follows from the <a href="https://en.wikipedia.org/wiki/Gibbs'_inequality">Gibbs inequality</a>. The information entropy is $latex \log_2 m$ bits only if users are evenly distributed across the state space. Where the state space is dynamic and the distribution of users among states is skewed, both of which are very likely to occur with the approach we propose, the information entropy will be significantly less than $latex \log_2 m$ bits.</p>
<p><a name="privacy_preserving_advertising_fn_5"></a>5. We analyze our algorithm for marginal privacy impact, that is, the extent to which it makes user privacy worse off. IP address and User-Agent already contribute substantial information entropy and allow tracking many users.</p>
<p><a name="privacy_preserving_advertising_fn_6"></a>6. For example, if an ad is represented by a 4 byte identifier and its view count is stored in 1 byte, the viewing history for 100,000 ads consumes merely 500 KB (5 bytes per ad). HTML 5 <a href="http://dev.w3.org/html5/webstorage/">local storage</a> can hold at least 5 MB in the major browsers; <a href="http://www.w3.org/TR/IndexedDB/">Indexed Database</a> and <a href="http://www.w3.org/TR/webdatabase/">Web SQL</a> can store even more.</p>
<p><a name="privacy_preserving_advertising_fn_7"></a>7. In our testing, modern browsers can perform dozens of lookups in an HTML 5 local storage instance with over 100,000 keys in mere milliseconds.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2012/04/23/tracking-not-required-frequency-capping/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Third-Party Web Tracking: Policy and Technology</title>
		<link>http://webpolicy.org/2012/03/13/third-party-web-tracking-policy-and-technology/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=third-party-web-tracking-policy-and-technology</link>
		<comments>http://webpolicy.org/2012/03/13/third-party-web-tracking-policy-and-technology/#comments</comments>
		<pubDate>Tue, 13 Mar 2012 12:30:27 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Paper]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=418</guid>
		<description><![CDATA[John Mitchell and I have written a new paper that synthesizes research on policy and technology issues surrounding third-party web tracking. It will appear at the IEEE Symposium on Security and Privacy in May. Abstract In the early days of the web, content was designed and hosted by a single person, group, or organization. No [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://theory.stanford.edu/people/jcm/">John Mitchell</a> and I have written a <a href="https://www.stanford.edu/~jmayer/papers/trackingsurvey12.pdf">new paper</a> that synthesizes research on policy and technology issues surrounding third-party web tracking. It will appear at the <a href="http://www.ieee-security.org/TC/SP2012/">IEEE Symposium on Security and Privacy</a> in May.</p>
<p><strong>Abstract</strong></p>
<p>In the early days of the web, content was designed and hosted by a single person, group, or organization. No longer. Webpages are increasingly composed of content from myriad unrelated &#8220;third-party&#8221; websites in the business of advertising, analytics, social networking, and more. Third-party services have tremendous value: they support free content and facilitate web innovation. But third-party services come at a privacy cost: researchers, civil society organizations, and policymakers have increasingly called attention to how third parties can track a user&#8217;s browsing activities across websites.</p>
<p>This paper surveys the current policy debate surrounding third-party web tracking and explains the relevant technology. It also presents the <em>FourthParty</em> web measurement platform and studies we have conducted with it. Our aim is to inform researchers with essential background and tools for contributing to public understanding and policy debates about web tracking.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2012/03/13/third-party-web-tracking-policy-and-technology/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The FTC&#8217;s Chairman Groks Do Not Track</title>
		<link>http://webpolicy.org/2012/02/29/the-ftcs-chairman-groks-do-not-track/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-ftcs-chairman-groks-do-not-track</link>
		<comments>http://webpolicy.org/2012/02/29/the-ftcs-chairman-groks-do-not-track/#comments</comments>
		<pubDate>Wed, 29 Feb 2012 11:00:20 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=380</guid>
		<description><![CDATA[Last Thursday the White House hosted a major event on online privacy. Much of the public attention focused on a long-awaited White House report and a commitment by an online advertising self-regulatory group to implement components of the Do Not Track technology. Both the Electronic Frontier Foundation and the Center for Democracy and Technology have [...]]]></description>
				<content:encoded><![CDATA[<p>Last Thursday the White House hosted a major event on online privacy. Much of the public attention focused on a long-awaited White House <a href="http://www.whitehouse.gov/sites/default/files/privacy-final.pdf">report</a> and a <a href="http://www.aboutads.info/resource/download/DAA_Commitment.pdf">commitment</a> by an online advertising self-regulatory group to implement components of the Do Not Track technology. Both the <a href="https://www.eff.org/deeplinks/2012/02/white-house-google-and-other-advertising-companies-commit-supporting-do-not-track">Electronic Frontier Foundation</a> and the <a href="https://www.cdt.org/blogs/justin-brookman/2402two-steps-forward-privacy">Center for Democracy and Technology</a> have written detailed reviews of what transpired.</p>
<p>There has been scant focus on Federal Trade Commission Chairman Jon Leibowitz&#8217;s brief <a href="http://ftc.gov/speeches/leibowitz/120223whitehouse-privacy.pdf">remarks</a> on Do Not Track. That&#8217;s a mistake.</p>
<p><span id="more-380"></span></p>
<p>The FTC was an early, vocal proponent of Do Not Track with its December 2010 <a href="http://www.ftc.gov/os/2010/12/101201privacyreport.pdf">preliminary staff report</a> on online privacy.  FTC staff have frequently cajoled companies and trade groups to implement Do Not Track, and they have participated in every meeting of the World Wide Web Consortium&#8217;s <a href="http://www.w3.org/2011/tracking-protection/">Do Not Track working group</a>.  Chairman Leibowitz himself attended a recent meeting in Brussels.</p>
<p>The FTC&#8217;s thinking matters.  The agency can—and does—bring enforcement actions against web trackers (e.g. <a href="http://www.ftc.gov/opa/2011/03/chitika.shtm">Chitika</a>, <a href="http://www.ftc.gov/opa/2011/12/scanscout.shtm">ScanScout</a>) and the websites that facilitate them (e.g. <a href="http://ftc.gov/opa/2011/11/privacysettlement.shtm">Facebook</a>). The FTC can call for Do Not Track legislation. Further, under the new White House proposal, the agency would be vested with both veto power over self-regulatory codes and enforcement authority for baseline privacy requirements.</p>
<p>FTC commissioners tend to shy away from staking out their individual policy views. The FTC is a law enforcement agency, and it only &#8220;speaks&#8221; through a vote of the commission. Chairman Leibowitz in particular has relied on subtlety and nuance in his policy addresses.</p>
<p>Last Thursday&#8217;s speech was unusually direct. Chairman Leibowitz gave the clearest articulation yet of his thinking on Do Not Track—and he got it completely right. Here&#8217;s a summary.</p>
<ul>
<li>The FTC does policy, not just enforcement. And it&#8217;s the agency that will continue to lead on Do Not Track, not the White House or the Commerce Department.<br />
<blockquote>
<p>Since our founding in 1914, the FTC also has had a policy function, which has focused recently on privacy.</p>
</blockquote>
<blockquote>
<p>With the encouragement of this Administration – which has so keenly recognized the link between protecting consumer privacy online and engendering consumer trust in Internet commerce – an impressive public-private partnership has made a beginning, coming together around one small agency’s Do Not Track initiative.</p>
</blockquote>
</li>
<li>
<p>A consumer&#8217;s Do Not Track preference must do more than stop behavioral ad targeting.</p>
<blockquote>
<p>We envisioned a [Do Not Track] mechanism that would . . . allow consumers to limit how much data is gathered about them online (and not just how many targeted ads they see) . . . .</p>
</blockquote>
</li>
<li>At present, online advertising self-regulation only stops behavioral ad targeting. (Some stakeholders have termed the program &#8220;Do Not Target.&#8221;)<br />
<blockquote>
<p>For the past several years, the online advertising industry has been working to develop an icon that consumers could click to opt out of receiving targeted ads.</p>
</blockquote>
</li>
<li>A consumer&#8217;s Do Not Track preference must affect all third-party websites, including advertising companies, analytics services, and social networks.<br />
<blockquote>
<p>While these developments are encouraging, we still need to ensure that <u>all</u> companies that track users – not just advertisers – are at the table.</p>
</blockquote>
</li>
<li>The World Wide Web Consortium is the multi-stakeholder forum for establishing Do Not Track technology and policy.<br />
<blockquote>
<p>To that end, the World Wide Web Consortium (W3C), an Internet standards-setting body, gathered engineers, consumer groups, and participants across the broad technology industry to create a universal standard for Do Not Track.  We look forward to their deliberations also bearing fruit over the coming year.</p>
</blockquote>
</li>
<li>The FTC will continue to enforce on web tracking issues, especially when a company violates a consent order with the agency.<br />
<blockquote>
<p>Today, although it is still a work in progress, the ad industry has obtained buy-in from companies that deliver 90 percent of online behavioral advertisements; and, with the Better Business Bureau, it has established a mechanism with teeth to address non-compliance, backed up with FTC enforcement.  Said differently, if they don’t enforce it, we will. </p>
</blockquote>
<blockquote>
<p>Most notably, last year, two of the largest Internet companies entered into consent orders with the FTC that require both to honor their privacy commitments to hundreds of millions of consumers worldwide and to hire outside auditors to monitor their privacy practices.</p>
</blockquote>
</li>
<li>The Do Not Track technology is the &#8220;<code>DNT: 1</code>&#8221; preference signaling mechanism, and consumers are already using it. Microsoft&#8217;s Tracking Protection List technology has merit, but it&#8217;s a different proposal.<br />
<blockquote>
<p>In a related effort, very early on the companies that make web browsers stepped up to our challenge to give consumers choice about how they are tracked online, sometimes known as the browser header approach.  Just after the FTC’s call for Do Not Track, Microsoft developed a system to let users of Internet Explorer prevent tracking by different companies and sites.  Mozilla introduced a Do Not Track privacy control for its Firefox browser that an impressive number of consumers have adopted; Apple included a similar Do Not Track control in Safari.</p>
</blockquote>
</li>
</ul>
<p><strong>Implications</strong></p>
<p>European policymakers have already articulated their positions on advertising self-regulation and Do Not Track. In December the European Union&#8217;s Data Protection Working Party issued a <a href="http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2011/wp188_en.pdf">formal opinion</a> that found current advertising self-regulation inadequate under EU privacy law and that signaled support for the W3C&#8217;s Do Not Track standards process. European Commission Vice-President Neelie Kroes indicated in January that she <a href="http://blogs.ec.europa.eu/neelie-kroes/donottrack/">shares those views</a>; she <a href="http://blogs.ec.europa.eu/neelie-kroes/usa-do-not-track/">reaffirmed her position</a> in response to last week&#8217;s event.</p>
<p>Now the head of the chief U.S. consumer protection agency has chimed in, and he agrees. The FTC&#8217;s upcoming report on consumer privacy online will clarify where the other commissioners stand.</p>
<p>There is an increasingly clear transatlantic consensus: online advertising self-regulation is not enough. The W3C&#8217;s Do Not Track standards process is the way forward for providing meaningful consumer choice about third-party web tracking.</p>
<hr />
<p>Thanks to <a href="http://ashkansoltani.org/">Ashkan Soltani</a>, <a href="http://www.dubfire.net/">Chris Soghoian</a>, and &#9733;&#9733;&#9733;&#9733;&#9733; for conversations that informed this post. Thanks also to <a href="https://www.eff.org/about/staff/lee-tien">Lee Tien</a> for input on a draft. All views, especially wrong ones, are my own.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2012/02/29/the-ftcs-chairman-groks-do-not-track/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Setting the Record Straight on Google&#8217;s Safari Tracking</title>
		<link>http://webpolicy.org/2012/02/20/setting-the-record-straight-on-googles-safari-tracking/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=setting-the-record-straight-on-googles-safari-tracking</link>
		<comments>http://webpolicy.org/2012/02/20/setting-the-record-straight-on-googles-safari-tracking/#comments</comments>
		<pubDate>Tue, 21 Feb 2012 05:52:46 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Measurement]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=343</guid>
		<description><![CDATA[Our recent research on Google&#8217;s circumvention of the Safari cookie blocking feature has led to some confusion, in part owing to the company&#8217;s statement in response (reproduced in its entirety below). This post is an attempt to elucidate the central issues. As with the original writeup, I aim for a neutral viewpoint in the interest [...]]]></description>
				<content:encoded><![CDATA[<p>Our <a href="http://webpolicy.org/2012/02/17/safari-trackers/">recent research</a> on Google&#8217;s circumvention of the Safari cookie blocking feature has led to some confusion, in part owing to the company&#8217;s statement in response (reproduced in its entirety <a href="#google_safari_statement">below</a>). This post is an attempt to elucidate the central issues. As with the original writeup, I aim for a neutral viewpoint in the interest of establishing a common factual understanding.</p>
<p><span id="more-343"></span></p>
<p>To begin, I&#8217;d like to lend some structure to ongoing policy discussions by unpacking the four business practices that are at issue.</p>
<ol>
<li><strong>Social advertising.</strong>  Google is leveraging user account information to personalize its advertising on non-Google websites.  To do that, Google now identifies its users when they view ads on non-Google websites.</li>
<li><strong>Social advertising circumvention.</strong>  Google intentionally bypassed Safari&#8217;s cookie blocking feature to place an identifying cookie that it uses for social advertising.</li>
<li><strong>Ordinary advertising circumvention.</strong>  Google&#8217;s social circumvention had a collateral effect: it enabled Google to place its ordinary advertising tracking cookie.</li>
<li><strong>Representation.</strong>  A Google instructional webpage claimed that Safari&#8217;s cookie blocking feature &#8220;effectively accomplishes the same thing&#8221; as opting out of Google&#8217;s advertising cookies.
<div style="text-align:center;">
<p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_safari_instructions.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_safari_instructions.png" width="150px" /></a></p>
<p>Safari Advertising Cookie Opt-Out Instructions</p>
</div>
</li>
</ol>
<p>I&#8217;d next like to clarify some key points about our findings.</p>
<ul>
<li><strong>No account, login, or user preference was required for circumvention.</strong> The circumvention behaviors affected all users, independent of whether they had a Google account, were logged into a Google account, or had made a choice about social advertising.</li>
<li><strong>Identifying and identifiable information was collected.</strong> Google&#8217;s social advertising technology is designed to identify the user—that&#8217;s how it shows your friends&#8217; pictures!  Google&#8217;s <a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_social_syncing_documentation.pdf">design document</a> provides additional detail on the feature.  For discussion of how third-party web tracking is in general not anonymous, see <a href="http://randomwalker.info/">Arvind Narayanan</a>&#8217;s explanation &#8220;<a href="http://cyberlaw.stanford.edu/node/6701">There is no such thing as anonymous web tracking</a>&#8221; and our <a href="http://cyberlaw.stanford.edu/node/6740">research on identifying information leakage</a>.</li>
<li><strong>Circumvention is not a commonly accepted business practice.</strong> We only identified four advertising companies that deployed technology for circumventing Safari&#8217;s cookie blocking, and all have since stopped the practice. Furthermore, a self-regulatory organization for the online advertising industry <a href="http://www.networkadvertising.org/managing/faqs.asp#question_13">cites</a> Safari&#8217;s cookie blocking feature as a way to stop cookies from advertising companies: &#8220;[Safari's] default setting will block all third-party cookies, including those of our member ad networks and those of other, non-member ad networks.&#8221;</li>
<li><strong>Apple&#8217;s intent was to block advertising-related tracking.</strong> The language in Safari&#8217;s <a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/safari_mac_3p_cookies.png">preferences menu</a>, Apple&#8217;s <a href="http://www.apple.com/safari/features.html#security">promotional materials</a>, and <a href="https://bugs.webkit.org/show_bug.cgi?id=35824">developer discussions</a> all indicate that advertising-related tracking was a central motivation for the cookie blocking feature.</li>
<li><strong>Apple&#8217;s purpose was not messing with Google.</strong> The default cookie blocking feature that Google circumvented was implemented in Safari 1.0, which shipped in 2003—long before Google was in the third-party display advertising business, and long before relations between the companies soured over smartphones. Furthermore, Safari has repeatedly been a pioneer in browser privacy. <a href="http://web.archive.org/web/20030603213933/http://www.apple.com/safari/">Safari 1.0</a> included a simple &#8220;privacy reset&#8221; choice for clearing browser settings; the other major browsers followed with similar features. <a href="http://web.archive.org/web/20050509010738/http://www.apple.com/macosx/features/safari/">Safari 2.0</a>, released in 2005, was the first browser to provide a &#8220;private browsing&#8221; mode; again, <a href="http://en.wikipedia.org/wiki/Privacy_mode">all the other major browsers followed</a>.</li>
<li><strong>No +1 button was visible on circumvention ads.</strong> We never saw an ad with the +1 button in our testing. The circumvention behaviors occurred in ordinary-looking ads. In the special case of YouTube&#8217;s homepage, there was no visible ad at all.</li>
<li><strong>Circumvention was not needed for social sharing.</strong> Google&#8217;s circumvention was not necessary to make the +1 button clickable. (For the geeks in the audience: Google could have trivially routed clicks through <code>google.com</code>.) The circumvention was only needed<sup><a href="#google_safari_fn_1">1</a></sup> to <em>personalize</em> ads—for example, to show friends&#8217; pictures near the +1 button, or in future to target ads based on Google+ social networking data.</li>
<li><strong>Users likely did not understand their social advertising setting.</strong> New users are by default opted into social advertising on signup.
<div style="display:block;height:15px;"></div>
<div style="text-align:center;">
<p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_plusone_signup_highlighted.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_plusone_signup_highlighted.png" width="150px" /></a></p>
<p>Social Advertising Default on Account Signup</p>
</div>
<div style="display:block;height:5px;"></div>
<p>My understanding is that users with accounts predating the +1 button have social advertising disabled, but are eventually prompted about the setting with &#8220;Enable&#8221; selected by default. Disabling the feature requires going to Accounts &rarr; Google+, locating the buried &#8220;+1 on non-Google sites&#8221; setting, then toggling it to &#8220;Disable&#8221;.  Google&#8217;s <a href="http://support.google.com/plus/bin/answer.py?hl=en&amp;answer=1152622">description of the feature</a> does not clearly communicate that it allows Google to identify the user on non-Google websites. The description also does not indicate that the feature would override a browser privacy setting.</p>
<div style="display:block;height:15px;"></div>
<div style="text-align:center;">
<div style="display:inline-block;"><p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_plus_settings_highlighted.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_plus_settings_highlighted.png" width="150px" /></a></p>
<p>Social Advertising Opt-Out Location</p></div>
<div style="display:inline-block;width:30px;"></div>
<div style="display:inline-block;"><p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_plusone_choice.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_plusone_choice.png" width="200px" /></a></p>
<p>Social Advertising Opt-Out Page</p></div>
</div>
<div style="display:block;height:15px;"></div>
</li>
<li><strong>Google&#8217;s circumvention only affected Google services.</strong> It did not allow other advertising companies to track the user.</li>
</ul>
<p>Finally, I&#8217;d like to note a couple of questions that remain open for Google.</p>
<ul>
<li><strong>Users impacted.</strong> Our measurement data suggests a large number of Safari users may have been affected by Google&#8217;s circumvention. Google has not yet indicated how many users were impacted.</li>
<li><strong>Profit.</strong> Google held an advantage over its advertising competitors that did not track Safari browsers. That advantage may have resulted in profit. Google has not yet publicized an estimate of its income from tracking Safari browsers.</li>
</ul>
<div style="display:block;height:15px;"></div>
<hr />
<p><a name="google_safari_statement"></a></p>
<p>Google circulated the following statement to media outlets and policymakers on Friday. The company did not post the statement on its website, and my understanding is that Google representatives declined to answer questions about the statement.</p>
<blockquote>
<p>The Journal mischaracterizes what happened and why. We used known Safari functionality to provide features that signed-in Google users had enabled. It’s important to stress that these advertising cookies do not collect personal information.</p>
<p>Unlike other major browsers, Apple’s Safari browser blocks third-party cookies by default.  However, Safari enables many web features for its users that rely on third parties and third-party cookies, such as “Like” buttons.  Last year, we began using this functionality to enable features for signed-in Google users on Safari who had opted to see personalized ads and other content&#8211;such as the ability to “+1” things that interest them.</p>
<p>To enable these features, we created a temporary communication link between Safari browsers and Google’s servers, so that we could ascertain whether Safari users were also signed into Google, and had opted for this type of personalization.  But we designed this so that the information passing between the user’s Safari browser and Google’s servers was anonymous&#8211;effectively creating a barrier between their personal information and the web content they browse.</p>
<p>However, the Safari browser contained functionality that then enabled other Google advertising cookies to be set on the browser.  We didn’t anticipate that this would happen, and we have now started removing these advertising cookies from Safari browsers.  It’s important to stress that, just as on other browsers, these advertising cookies do not collect personal information.</p>
<p>Users of Internet Explorer, Firefox and Chrome were not affected. Nor were users of any browser (including Safari) who have opted out of our interest-based advertising program using Google’s Ads Preferences Manager.</p>
</blockquote>
<hr />
<p>Thanks to <a href="http://randomwalker.info/">Arvind Narayanan</a>, <a href="http://ashkansoltani.org/">Ashkan Soltani</a>, <a href="https://www.eff.org/about/staff/lee-tien">Lee Tien</a>, and &#9733;&#9733;&#9733;&#9733;&#9733; for valuable input.</p>
<p><a name="google_safari_fn_1">1</a>. This discussion presumes Google would host its social advertising from <code>doubleclick.net</code> instead of <code>google.com</code>. If Google hosted social advertising from <code>google.com</code> there would have been no need to circumvent Safari&#8217;s cookie blocking.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2012/02/20/setting-the-record-straight-on-googles-safari-tracking/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Safari Trackers</title>
		<link>http://webpolicy.org/2012/02/17/safari-trackers/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=safari-trackers</link>
		<comments>http://webpolicy.org/2012/02/17/safari-trackers/#comments</comments>
		<pubDate>Fri, 17 Feb 2012 10:30:13 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Measurement]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.org/?p=233</guid>
		<description><![CDATA[Apple&#8217;s Safari web browser is configured to block third-party cookies by default. We identified four advertising companies that unexpectedly place trackable cookies in Safari. Google and Vibrant Media intentionally circumvent Safari&#8217;s privacy feature. Media Innovation Group and PointRoll serve scripts that appear to be derived from circumvention example code. In the interest of clearly establishing [...]]]></description>
				<content:encoded><![CDATA[<p>Apple&#8217;s Safari web browser is configured to block third-party cookies by default. We identified four advertising companies that unexpectedly place trackable cookies in Safari. Google and Vibrant Media intentionally circumvent Safari&#8217;s privacy feature. Media Innovation Group and PointRoll serve scripts that appear to be derived from circumvention example code.</p>
<p>In the interest of clearly establishing facts on the ground, this post provides technical analysis of Safari&#8217;s cookie blocking feature and the four companies&#8217; practices. It does not address policy or legal issues. (More on that soon.)</p>
<p>Before proceeding further, I want to thank the countless friends and colleagues who provided invaluable feedback on this project. In particular: &#9733;&#9733;&#9733;&#9733;&#9733;, whose insights have been vital at every step, and <a href="http://ashkansoltani.org/">Ashkan Soltani</a>, whose crawling data was instrumental in uncovering PointRoll&#8217;s practices and understanding the prevalence of cookie blocking circumvention.</p>
<p><span id="more-233"></span></p>
<p><strong>Third-Party Cookie Blocking in Safari</strong></p>
<p>Every popular web browser, save Opera Mini and the Android built-in browser, includes a &#8220;third-party cookie blocking&#8221; privacy feature.  (The remainder of this post uses the term &#8220;cookie blocking&#8221; for brevity.)  These options share a common high-level purpose: impose limits on cookies from &#8220;third-party domains,&#8221; that is, domains that differ from the &#8220;first-party domain&#8221; in the browser&#8217;s URL bar.  In practice, however, implementations vary substantially; for (slightly out-of-date) specifics, see the Center for Democracy and Technology&#8217;s 2010 <a href="https://www.cdt.org/files/pdfs/20101209_browser_rpt.pdf">Browser Privacy Features</a> report and Google&#8217;s <a href="http://code.google.com/p/browsersec/wiki/Part2#Third-party_cookie_rules">Browser Security Handbook</a>.</p>
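<p>As a rough illustration of the first-party/third-party distinction, here is a minimal sketch. The function names are hypothetical, and comparing only the last two labels of a hostname is a simplifying assumption: it is not how Safari or any other browser draws domain boundaries, and it misclassifies suffixes such as <code>co.uk</code>.</p>

```javascript
// Hypothetical helper: approximate a host's "site" by its last two labels.
// Assumption: ignores multi-label public suffixes such as "co.uk".
function approximateSite(hostname) {
  return hostname.toLowerCase().split('.').slice(-2).join('.');
}

// A request is treated as third-party when its site differs from the site
// shown in the browser's URL bar.
function isThirdParty(requestHost, urlBarHost) {
  return approximateSite(requestHost) !== approximateSite(urlBarHost);
}

console.log(isThirdParty('googleads.g.doubleclick.net', 'www.washingtonpost.com')); // true
console.log(isThirdParty('mail.google.com', 'www.google.com'));                     // false
```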
<p>Safari&#8217;s cookie blocking feature is unique in two ways: its default and its substantive policy.</p>
<p>Unlike every other browser vendor, Apple enables cookie blocking by default. Every iPhone, iPad, iPod Touch, and Mac ships with the privacy feature turned on. </p>
<div style="text-align:center;">
<p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/safari_mac_3p_cookies.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/safari_mac_3p_cookies.png" height="150px" /></a></p>
<p>Default Privacy Settings in Desktop Safari</p>
</div>
<div style="height:25px;">
</div>
<div style="text-align:center;">
<p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/safari_iphone4_3pcookies.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/safari_iphone4_3pcookies.png" width="150px" /></a><span style="width:25px;display:inline-block;"></span><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/safari_ipad2_3pcookies.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/safari_ipad2_3pcookies.png" width="200px" /></a></p>
<p>Default Privacy Settings in iPhone and iPad Safari</p>
</div>
<p>Apple <a href="http://www.apple.com/safari/what-is.html">advertises</a> cookie blocking by default as a <a href="http://www.apple.com/safari/features.html">benefit of choosing Safari</a>.</p>
<blockquote>
<p>Some companies track the cookies generated by the websites you visit, so they can gather and sell information about your web activity. Safari is the first browser that blocks these tracking cookies by default, better protecting your privacy. Safari accepts cookies only from the current domain.</p>
</blockquote>
<p>Apple&#8217;s cookie blocking policy is less restrictive than many competing browser vendors&#8217;.<sup><a href="#safari_trackers_fn_1">1</a></sup></p>
<ul>
<li><strong>Reading Cookies</strong> Safari allows third-party domains to read cookies.</li>
<li><strong>Modifying Cookies</strong> If an HTTP request to a third-party domain includes a cookie, Safari allows the response to write cookies.</li>
<li><strong>Form Submission</strong> If an HTTP request to a third-party domain is caused by the submission of an HTML <code>form</code>, Safari allows the response to write cookies. This component of the policy was <a href="http://trac.webkit.org/changeset/92142">removed from WebKit</a>, the open source browser engine behind Safari, seven months ago by Google engineers. Their rationale is not public; the bug is marked as a security problem. The change has not yet landed in Safari.</li>
</ul>
<p>These allowances in the Safari cookie blocking policy enable three potentially undesirable behaviors by advertising networks, analytics services, social widgets, and other &#8220;third-party websites.&#8221;</p>
<ul>
<li>If a company operates both a first-party website and a third-party website from the same domain, visitors to the first-party website will be open to cookie-based tracking by the third-party service. Yahoo! is an example: it hosts both first-party websites and third-party advertising services on the <code>yahoo.com</code> domain.</li>
<li>If a third-party website&#8217;s content ever manages to load in a full browser window, the website can set cookies on its domain. An advertising company, for example, could set tracking cookies on its domain with a pop-up, pop-under, or temporary redirect (e.g. <a href="http://www.rimmkaufman.com/blog/yes-advertisers-can-track-ios-sales/23032011/">when a user clicks an ad</a>).</li>
<li>A third-party website can use JavaScript to submit a <code>form</code> in an <code>iframe</code> without user interaction.</li>
</ul>
<p>This post focuses on the last behavior. We discovered four advertising companies that surreptitiously submit a <code>form</code> in an invisible <code>iframe</code> and place trackable cookies in Safari: Google, Vibrant Media, Media Innovation Group, and PointRoll. The balance of the post details each company&#8217;s business practices.</p>
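<p>The allowances listed earlier can be condensed into a single predicate. This is only a sketch of the policy as this post describes it, not Safari&#8217;s actual implementation; the function name and the fields of the <code>request</code> object are made up for illustration.</p>

```javascript
// Sketch of the cookie-write decision as described in this post.
// All field names are hypothetical.
function safariAllowsCookieWrite(request) {
  if (!request.isThirdParty) return true;           // content from the URL bar's domain may set cookies
  if (request.sentCookies) return true;             // the request already carried a cookie for the domain
  if (request.causedByFormSubmission) return true;  // the request came from an HTML form submission
  return false;                                     // otherwise the Set-Cookie header is ignored
}

// An ordinary third-party request with no existing cookie is blocked.
console.log(safariAllowsCookieWrite({ isThirdParty: true }));                               // false
// The same request caused by a form submission is allowed.
console.log(safariAllowsCookieWrite({ isThirdParty: true, causedByFormSubmission: true })); // true
```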
<p><strong>Google</strong></p>
<p>Google has, historically, operated most of its first-party websites on the <code>google.com</code> domain and most of its third-party services on other domains. For example: Google Analytics is served from <code>google-analytics.com</code>, Google software libraries are hosted at <code>googleapis.com</code>, Google static content is at <code>gstatic.com</code>, and Google&#8217;s advertising services are on <code>doubleclick.net</code>.</p>
<p>Separating first-party websites from third-party services improves security: interactions between <code>google.com</code> content and other websites could introduce vulnerabilities. The domain separation also benefits user privacy: Google associates user account information with <code>google.com</code> cookies. By serving its third-party services from other domains, Google ensures it will not receive <code>google.com</code> cookies, and therefore will not be able to trivially identify user activities on other websites.</p>
<p>But what about when Google <em>does</em> want to identify the user on a non-Google website? Social personalization requires<sup><a href="#safari_trackers_fn_2">2</a></sup> just that! Google has two design options.</p>
<p>First, Google could embed <code>google.com</code> content on non-Google websites. This is the approach it has taken with its social sharing widgets; both the (defunct) <a href="http://googleblog.blogspot.com/2010/04/google-buzz-buttons.html">Buzz button</a> and the <a href="http://googlewebmastercentral.blogspot.com/2011/06/add-1-to-help-your-site-stand-out.html">+1 button</a> load resources from <code>google.com</code>.</p>
<p>Second, Google could synchronize information from its <code>google.com</code> domain to another domain, a process called &#8220;cookie syncing&#8221; in online advertising lingo. This is the approach Google took with <code>youtube.com</code> after it acquired YouTube. And this is the approach Google settled on for social personalization of <code>doubleclick.net</code> display advertising. Google <a href="http://adsense.blogspot.com/2011/09/1-now-making-display-ads-more-relevant.html">announced</a> the +1 button for display ads last September; here are the steps in the underlying cookie syncing technology, based on conversations with Googlers and an <a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_social_syncing_documentation.pdf">explanatory document</a> that Google provided.</p>
<ol>
<li>A display advertisement includes the cookie syncing mechanism&#8217;s <code>iframe</code>, located at <code><a href="http://googleads.g.doubleclick.net/pagead/drt/s">http://googleads.g.doubleclick.net/pagead/drt/s</a></code>. We observed the <code>iframe</code> load in both desktop and mobile display ads. Here are example ads that included the mechanism, from the Washington Post (Safari on Mac OS X) and MSNBC (Safari on iPhone).
<div style="height:25px;"></div>
<div style="text-align:center;">
<p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_mac_syncing_ad.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_mac_syncing_ad.png" width="250px" /></a><span style="width:25px;display:inline-block;"></span><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_iphone_syncing_ad.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_iphone_syncing_ad.png" width="100px" /></a></p>
<p>Google Ads Including the <code>google.com</code> &rarr; <code>doubleclick.net</code> Cookie Syncing Mechanism</p>
</div>
<div style="height:25px;"></div>
<p>We also saw what appeared to be a special use of the mechanism on <code>youtube.com</code>, where Google placed an invisible advertisement in the footer that included the cookie syncing <code>iframe</code>.</p>
<p>In a <a href="http://fourthparty.info/">FourthParty</a> crawl of the homepages of the Alexa U.S. top 500 websites, we <a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_cookie_syncing_pages.txt">detected the cookie syncing mechanism on 39 pages</a>. This figure likely underestimates the prevalence of Google&#8217;s cookie syncing code since many websites show less advertising on their homepage. We observed the mechanism on New York Times and MSNBC article pages, for example, but not on their homepages.</li>
<p></p>
<li>The <code>iframe</code> loads a page that contains a <a href="http://en.wikipedia.org/wiki/Meta_refresh"><code>meta</code> refresh</a> to <a href="http://google.com/pagead/drt/ui">http://google.com/pagead/drt/ui</a>.<br />
<blockquote>
<pre>

&lt;!DOCTYPE HTML PUBLIC&gt;

&lt;html&gt;

  &lt;head&gt;

    &lt;meta http-equiv=&quot;refresh&quot; content=&quot;0;url=http://google.com/pagead/drt/ui&quot; /&gt;

  &lt;/head&gt;

&lt;/html&gt;

</pre>
</blockquote>
<p>Apologies for any overflow; here and throughout the post I err on the side of preserving original formatting.</p>
</li>
<li>The response at <a href="http://google.com/pagead/drt/ui">http://google.com/pagead/drt/ui</a> depends on whether the user is logged into Google. If the user is not logged in, the response includes a <code>Location</code> header that directs the browser back to <code>googleads.g.doubleclick.net</code> with some information in the <code>p</code> and <code>ut</code> parameters of the <code>Request-URI</code>.<br />
<blockquote>
<pre>

Location: https://googleads.g.doubleclick.net/pagead/drt/si?p=CAA&amp;ut=AFAKxlQAAAAATzuSTM-wZva6TmRV_FF7YdF2nggZfnlI

</pre>
</blockquote>
<p>If the user is logged in, the response directs the user to Google&#8217;s authentication service.</p>
<blockquote>
<pre>

Location: https://accounts.google.com/ServiceLogin?service=doritos&amp;passive=true&amp;go=true&amp;continue=https://googleads.g.doubleclick.net/pagead/drt/si?p=CAEY9cLA-gQ&amp;ut=AFAKxlQAAAAATz2v-fyl5V0PcBdEsvg95beKTozmJSql

</pre>
</blockquote>
<p>The authentication service then directs the user back to <code>googleads.g.doubleclick.net</code>.</p>
<p>(A quick explanation of the &#8220;Doritos&#8221; reference—as I understand it, that&#8217;s Google&#8217;s internal codename for social personalization of third-party display advertising.)</p>
<blockquote>
<pre>

location:https://googleads.g.doubleclick.net/pagead/drt/si?p=CAEY9cLA-gQ&amp;ut=AFAKxlQAAAAATz2v-fyl5V0PcBdEsvg95beKTozmJSql&amp;pli=1&amp;auth=DQAAAIUAAAAPNIlph4K8ZDuUPlslr38CnSgqvc7E26I5RwkOrDDU7r81Q8H6iVYltrf4TEcE1haR9gSXQuARTXXHSWIW6EnmOyb2inWlPm28lprT6Hmkhn_PzhpuYlNUrSFZ9RdOAdro-hHVwMHQojKjOSSkQxQIIvetbMiMIOTcK88Ltq7Td9rQHLHJ_QrNb7EDz727XUM

</pre>
</blockquote>
<p>Google&#8217;s documentation suggests that the <code>p</code> and <code>ut</code> parameters contain an encrypted form of the user&#8217;s login state and, if the user is logged in, the account ID. (The Google design document makes a number of technical claims about how the syncing mechanism preserves user privacy. This post does not address those claims.)</p>
</li>
<li>In a browser other than Safari, the response sets a &#8220;_drt_&#8221; social personalization cookie on <code>.doubleclick.net</code>. If the user is not logged into Google, the cookie&#8217;s value is &#8220;NO_DATA&#8221;, and the cookie is set to expire in 12 hours.<br />
<blockquote>
<pre>

set-cookie:_drt_=NO_DATA; expires=Fri, 17-Feb-2012 13:37:41 GMT; path=/; domain=.doubleclick.net; HttpOnly

</pre>
</blockquote>
<p>If the user is logged into Google, the &#8220;_drt_&#8221; cookie contains an encryption of the user&#8217;s Google account ID, set to expire in 24 hours.</p>
<blockquote>
<pre>

set-cookie:_drt_=AFkicjesF-jVECSOLRa1a-hf14FYVKIPEu4goDlxZZdVaxh1D4gDfJ6dvZg7Evnr2C8MluBSk6Nkr8TfL1ksojSb8qsjYHSNMQ; expires=Sat, 18-Feb-2012 01:25:10 GMT; path=/; domain=.doubleclick.net; HttpOnly

</pre>
</blockquote>
<p>My understanding is that the cookie expirations are set to limit syncing frequency. If the user was not logged in at last sync, a sync will not be attempted for at least 12 hours; if the user was logged in, a sync will not be attempted for at least 24 hours.</p>
<p>In early testing, we noticed that a &#8220;PREF&#8221; cookie was sometimes set at the same time as the &#8220;_drt_&#8221; cookie.</p>
<blockquote>
<pre>

set-cookie:PREF=ID=5a7be344032983bc:TM=1325643281:LM=1325643281:S=-BWpqDzbE7gq8rg-; expires=Fri, 03-Jan-2014 02:14:41 GMT; path=/; domain=googleads.g.doubleclick.net

</pre>
</blockquote>
<p>The behavior appeared to stop before we contacted Google about our findings. We have not received information from Google explaining the &#8220;PREF&#8221; cookie on <code>googleads.g.doubleclick.net</code>. It appears to have the same format as the <a href="http://repository.cmu.edu/cgi/viewcontent.cgi?article=1058&amp;context=jpc">&#8220;PREF&#8221; cookie on <code>google.com</code></a> and the same <a href="http://googleblog.blogspot.com/2007/07/cookies-expiring-sooner-to-improve.html">two-year expiration</a>.</p>
</li>
</ol>
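<p>For readers who want to pick apart the <code>Set-Cookie</code> headers quoted in the steps above, here is a simplified sketch of a parser. The function name is hypothetical, and the parser ignores quoted values and other edge cases. Reading &#8220;NO_DATA&#8221; as &#8220;not logged in&#8221; follows the description in the last step.</p>

```javascript
// Split a Set-Cookie header value into its name, value, and attributes.
// Simplified sketch: no support for quoted values or other edge cases.
function parseSetCookie(headerValue) {
  const [pair, ...attributes] = headerValue.split(';').map((part) => part.trim());
  const separator = pair.indexOf('=');
  const cookie = {
    name: pair.slice(0, separator),
    value: pair.slice(separator + 1),
    attributes: {},
  };
  for (const attribute of attributes) {
    if (attribute === '') continue;
    const eq = attribute.indexOf('=');
    // Attribute names are case-insensitive; flags such as HttpOnly have no value.
    const key = (eq === -1 ? attribute : attribute.slice(0, eq)).toLowerCase();
    cookie.attributes[key] = eq === -1 ? true : attribute.slice(eq + 1);
  }
  return cookie;
}

const drt = parseSetCookie(
  '_drt_=NO_DATA; expires=Fri, 17-Feb-2012 13:37:41 GMT; path=/; domain=.doubleclick.net; HttpOnly'
);
console.log(drt.name, drt.value);    // _drt_ NO_DATA
console.log(drt.attributes.domain);  // .doubleclick.net
console.log(drt.value === 'NO_DATA' ? 'not logged in' : 'logged in'); // not logged in
```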
<p>So far, a (relatively) straightforward cookie syncing mechanism. But we noticed a special response at the last step for Safari browsers. We tested 400 <code>User-Agent</code> strings to verify that this is a special case; a <a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/google_user-agent_tests.xlsx">spreadsheet of results</a> is available.</p>
<p>Instead of responding with the &#8220;_drt_&#8221; cookie, the server sends back a page that includes a <code>form</code> and JavaScript to submit the <code>form</code> (using POST) to its own URL.</p>
<blockquote>
<pre>

&lt;html&gt;

&lt;head&gt;&lt;/head&gt;

&lt;body&gt;

  &lt;form id=&quot;drt_form&quot; method=post action=&quot;/pagead/drt/si?p=CAA&amp;amp;ut=AFAKxlQAAAAATzuSTM-wZva6TmRV_FF7YdF2nggZfnlI&quot;&gt;&lt;/form&gt;

  &lt;script&gt;

    document.getElementById('drt_form').submit();

  &lt;/script&gt;

&lt;/body&gt;

&lt;/html&gt;

</pre>
</blockquote>
<p>The response to the form submission then includes the <code>Set-Cookie</code> header for the &#8220;_drt_&#8221; cookie.</p>
<p>Recall that if a cookie is sent with an HTTP request, Safari&#8217;s blocking policy will allow the response to write cookies. Owing to the &#8220;_drt_&#8221; cookie, all <code>doubleclick.net</code> content is now immunized from Safari&#8217;s cookie blocking policy. The next time Google advertising content attempts to install the &#8220;id&#8221; tracking cookie for <code>.doubleclick.net</code>, the cookie will be set successfully. That next attempt may not even require that the user visit another page: we noticed that many Google ads periodically send requests to <code>doubleclick.net</code>, especially to a URL with the base <code>http://ad.doubleclick.net/activity</code>. A response to one of these requests can include a <code>Set-Cookie</code> header for the &#8220;test_cookie&#8221; cookie, which Google uses to check that cookies can be set (presumably to avoid wasting IDs and associated resources). A response to a subsequent request may then include a <code>Set-Cookie</code> header for the &#8220;id&#8221; cookie.</p>
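<p>The sequence just described can be replayed in a toy model. This sketch assumes every request is a third-party request to <code>doubleclick.net</code>; the cookie jar, the request objects, and the function name are hypothetical, and the rule applied is this post&#8217;s description of Safari&#8217;s policy rather than Safari&#8217;s code.</p>

```javascript
// Toy model: replay third-party requests against a cookie jar for one domain.
function simulate(requests) {
  const jar = new Set();  // names of cookies currently stored for doubleclick.net
  const log = [];
  for (const request of requests) {
    // A write is allowed if a cookie is already sent, or the request is a form submission.
    const allowed = jar.size !== 0 || request.formSubmission === true;
    if (allowed) {
      for (const name of request.setCookie) jar.add(name);
    }
    log.push({ url: request.url, allowed, jar: [...jar] });
  }
  return log;
}

const log = simulate([
  // An ordinary ad request before the workaround: the "id" cookie is blocked.
  { url: 'http://ad.doubleclick.net/activity', setCookie: ['id'] },
  // The form POST served to Safari: "_drt_" is accepted.
  { url: 'https://googleads.g.doubleclick.net/pagead/drt/si', formSubmission: true, setCookie: ['_drt_'] },
  // Later requests carry "_drt_", so their cookies are accepted too.
  { url: 'http://ad.doubleclick.net/activity', setCookie: ['test_cookie'] },
  { url: 'http://ad.doubleclick.net/activity', setCookie: ['id'] },
]);

for (const entry of log) {
  console.log(entry.allowed, entry.jar.join(','));
}
```

<p>In this model only the first request is refused; once &#8220;_drt_&#8221; is in the jar, every later response can write cookies.</p>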
<p>We confirmed that Google&#8217;s <code>doubleclick.net</code> &#8220;id&#8221; cookie was functioning in Safari by observing behavioral interest categories appear in <a href="http://www.google.com/ads/preferences/">Google Ads Preferences</a>. Here is an example set of inferred interests after browsing the New York Times website.</p>
<div style="text-align:center;">
<p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/g_ads_preferences.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/g_ads_preferences.png" height="200px" /></a></p>
<p>Google Ads Preferences in a Fresh Instance of Safari After Browsing the New York Times</p>
</div>
<p><strong>Vibrant Media</strong></p>
<p><a href="http://www.vibrantmedia.com/">Vibrant Media</a> is a contextual advertising network that primarily offers in-text and display advertising. We found conclusive evidence that Vibrant deliberately circumvents Safari&#8217;s third-party cookie blocking feature: one of the URLs involved in the circumvention is for the resource <code>/safari.jsp</code>. The following steps describe the circumvention technology as deployed on <code>answers.com</code>.  We observed identical behavior at the various region-specific subdomains of <code>cbslocal.com</code>.</p>
<ol>
<li>Vibrant&#8217;s main advertising script loads from <a href="http://answers.us.intellitxt.com/intellitxt/front.asp?ipid=31690">http://answers.us.intellitxt.com/intellitxt/front.asp?ipid=31690</a>.  When the browser has a Safari <code>User-Agent</code> string and no Vibrant cookie, the script includes this code:<br />
<blockquote>
<pre>

(function()

{try

{var e=document.createElement('iframe');e.style.display='none';e.src='http://answers.us.intellitxt.com/safari.jsp?t='+(new Date()).getTime();var b=document.getElementsByTagName('body')[0];b.insertBefore(e,b.firstChild);}catch(x){}})();

</pre>
</blockquote>
</li>
<li>The Safari-specific code executes, adding an invisible <code>iframe</code> to the page.<br />
<blockquote>
<pre>

&lt;iframe style=&quot;display: none;&quot; src=&quot;http://answers.us.intellitxt.com/safari.jsp?t=1328762901524&quot;&gt;&lt;/iframe&gt;

</pre>
</blockquote>
<p>The <code>Request-URI</code> parameter <code>t</code> is the current time in milliseconds, presumably used to prevent caching.</p>
</li>
<li>The <code>iframe</code> contains a <code>form</code> and a <code>body</code> <code>onload</code> handler that submits the <code>form</code>.<br />
<blockquote>
<pre>

&lt;html&gt;&lt;head&gt;&lt;/head&gt;&lt;body onload=&quot;document.getElementById('myform').submit();&quot;&gt;&lt;form method=&quot;post&quot; action=&quot;http://answers.us.intellitxt.com/safari.jsp?x=1&amp;t=1328762901524&quot; id=&quot;myform&quot;&gt;&lt;/form&gt;&lt;/body&gt;&lt;/html&gt;

</pre>
</blockquote>
</li>
<li>The response to the form contains no content and an instruction to set a Vibrant ID cookie.<br />
<blockquote>
<pre>

Set-Cookie: VM_USR=AG75nlrejUwdiE6n3+naS1YAADwZAAA8VQEAAAE1gRigFAA-; Domain=.intellitxt.com; Expires=[now + 2 months]; Path=/

</pre>
</blockquote>
<p>The <code>Request-URI</code> parameter <code>x=1</code> appears to control whether the response includes the form page or a <code>Set-Cookie</code> header.</p>
<p>We confirmed that the &#8220;VM_USR&#8221; cookie is a Vibrant ID by checking the <a href="http://networkadvertising.org/">Network Advertising Initiative</a>&#8217;s <a href="http://www.networkadvertising.org/managing/opt_out.asp">cookie status page</a>.</p>
<div style="text-align:center;">
<p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/vibrant_active_cookie.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/vibrant_active_cookie.png" height="50px" /></a></p>
<p>Active Vibrant Media Cookie Status Indicator</p>
</div>
<p>We verified that the NAI indicator is based on the presence of a valid &#8220;VM_USR&#8221; cookie, not the presence of any cookie or any &#8220;VM_USR&#8221; cookie.</p>
</li>
</ol>
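<p>To make the <code>t</code> parameter concrete, the sketch below rebuilds the <code>iframe</code> URL the way Vibrant&#8217;s snippet does and decodes the value back into a date. The helper names are hypothetical, and a fixed timestamp (the one from the example above) stands in for <code>(new Date()).getTime()</code> so the result is reproducible.</p>

```javascript
// Build the cache-busting URL from a millisecond timestamp.
function safariJspUrl(timestampMs) {
  return 'http://answers.us.intellitxt.com/safari.jsp?t=' + timestampMs;
}

// Read the "t" parameter back and interpret it as milliseconds since the Unix epoch.
function decodeT(url) {
  const t = Number(new URL(url).searchParams.get('t'));
  return new Date(t);
}

const url = safariJspUrl(1328762901524);
console.log(url);                         // http://answers.us.intellitxt.com/safari.jsp?t=1328762901524
console.log(decodeT(url).toISOString());  // 2012-02-09T04:48:21.524Z
```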
<p><strong>Media Innovation Group</strong></p>
<p><a href="http://www.themig.com/">Media Innovation Group</a> (MIG) is an advertising technology provider within the <a href="http://en.wikipedia.org/wiki/WPP_plc">WPP family of companies</a>. MIG&#8217;s &#8220;Zeus Advertising Platform&#8221; (ZAP) is WPP&#8217;s <a href="http://www.themig.co.uk/mobile/zap.php">&#8220;integrated advertising and analytics platform&#8221;</a>. According to a <a href="http://www.netezza.com/documents/MIG_CaseStudy.pdf">report from a vendor</a>, ZAP &#8220;is one of the cornerstone products created by MIG&#8221; that &#8220;provides a holistic view of site analytics and campaign data for a comprehensive understanding of every individual consumer.&#8221; ZAP &#8220;collects and stores over 13 months of historical user-level data and draws from it to provide complex and robust analysis.&#8221; With ZAP, &#8220;MIG is currently tracking the effectiveness of every single advertising element within many live campaigns that reach hundreds of millions of unique users per month . . . .&#8221;</p>
<p>We found that some MIG advertising content included a script that circumvents Safari&#8217;s cookie blocking feature. Here is the relevant part of one such script we discovered at <a href="http://b3.mookie1.com/2/B3DM/DLX/1672705484@x71">http://b3.mookie1.com/2/B3DM/DLX/1672705484@x71</a>. A few clarifying notes: <code>mookie1.com</code> is a MIG domain (go figure), <code>is_http</code> stores whether the content is loaded over HTTP, and <code>ZAP_id</code> stores the &#8220;id&#8221; cookie.</p>
<blockquote>
<pre>

if(is_http) {

    if(ZAP_id.indexOf(':') != -1 || ZAP_id == '') {

        var firstTimeSession = 0;



        function submitSessionForm() {

            if (firstTimeSession == 0) {

                firstTimeSession = 1;

                $(&quot;#sessionform&quot;).submit();

                //setTimeout(processApplication(),2000);

            }

        }



        $(&quot;body&quot;).append('&lt;iframe id=&quot;sessionframe&quot; name=&quot;sessionframe&quot; onload=&quot;submitSessionForm()&quot; src=&quot;http://t.mookie1.com/t/v1/imp?&quot; style=&quot;display:none;&quot;&gt;&lt;/iframe&gt;&lt;form id=&quot;sessionform&quot; enctype=&quot;application/x-www-form-urlencoded&quot; action=&quot;http://t.mookie1.com/t/v1/imp?&quot; target=&quot;sessionframe&quot; method=&quot;POST&quot;&gt;&lt;/form&gt;');



        function processApplication() {

        }

    }

</pre>
</blockquote>
<p>The script creates an invisible <code>iframe</code> and <code>form</code>, then submits the <code>form</code> into the <code>iframe</code> during the <code>onload</code> handler for the <code>iframe</code>.</p>
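<p>A note on the <code>firstTimeSession</code> flag: an <code>iframe</code>&#8217;s <code>onload</code> handler runs each time a document finishes loading in the frame, including the response to the form submission. The flag presumably exists so the form is submitted only once. The toy model below illustrates that reasoning without a real DOM; every name in it other than <code>firstTimeSession</code> is hypothetical.</p>

```javascript
// Toy model: a fake frame whose load handler runs after every navigation.
// maxLoads caps the loop so the unguarded case terminates.
function runFrame(useGuard, maxLoads) {
  let firstTimeSession = 0;
  let submissions = 0;
  let loads = 0;
  let pendingLoad = true;  // the initial load of the iframe's src

  while (pendingLoad && loads !== maxLoads) {
    pendingLoad = false;
    loads += 1;
    // Load handler, mirroring submitSessionForm().
    if (!useGuard || firstTimeSession === 0) {
      firstTimeSession = 1;
      submissions += 1;
      pendingLoad = true;  // the form response then loads into the frame
    }
  }
  return { loads, submissions };
}

console.log(runFrame(true, 10));   // { loads: 2, submissions: 1 }
console.log(runFrame(false, 10));  // { loads: 10, submissions: 10 }
```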
<p>In response to the form submission, MIG sets cookies and redirects to a 1&#215;1 GIF.</p>
<blockquote>
<pre>

$ curl -i -L -X POST "http://t.mookie1.com/t/v1/imp?"

HTTP/1.1 302 Found

Date: Fri, 17 Feb 2012 09:48:03 GMT

Server: Apache/2.0.52 (Red Hat)

Cache-Control: no-cache

Pragma: no-cache

P3P: CP="NOI DSP COR NID CUR OUR NOR"

Set-Cookie: id=3025894295853070; path=/; expires=Mon, 18-Mar-13 09:48:03 GMT; path=/; domain=.mookie1.com

Set-Cookie: mdata=1|3025894295853070|1329472083; path=/; expires=Mon, 18-Mar-13 09:48:03 GMT; path=/; domain=.mookie1.com

Set-Cookie: OAX=nVuS508+IlMACEDl; path=/; expires=Mon, 18-Mar-13 09:48:03 GMT; path=/; domain=.mookie1.com

Location: /t/v1/imp/cc?

Content-Length: 277

Connection: close

Content-Type: text/html; charset=iso-8859-1



HTTP/1.1 200 OK

Date: Fri, 17 Feb 2012 09:48:03 GMT

Server: Apache/2.0.52 (Red Hat)

Cache-Control: no-cache

Pragma: no-cache

P3P: CP="NOI DSP COR NID CUR OUR NOR"

Set-Cookie: id=914844815541839; path=/; expires=Mon, 18-Mar-13 09:48:03 GMT; path=/; domain=.mookie1.com

Set-Cookie: mdata=1|914844815541839|1329472083; path=/; expires=Mon, 18-Mar-13 09:48:03 GMT; path=/; domain=.mookie1.com

Set-Cookie: OAX=T6AK5U8+IlMACoIB; path=/; expires=Mon, 18-Mar-13 09:48:03 GMT; path=/; domain=.mookie1.com

Content-Length: 35

Connection: close

Content-Type: image/gif



GIF87a???????,D;

</pre>
</blockquote>
<p>Comments in MIG&#8217;s script indicate that &#8220;id&#8221; is the ZAP ID cookie and &#8220;OAX&#8221; is the ID cookie for WPP&#8217;s <a href="http://www.wpp.com/wpp/press/press/default.htm?guid=%7Bbcf57ca0-dbc0-4329-8410-2a0c876adea0%7D">B3 advertising optimization and custom marketplace product</a>. We verified that the script sets a tracking cookie with MIG&#8217;s NAI status indicator.</p>
<p>MIG&#8217;s circumvention code appeared (relatively) infrequently; our crawl of the Alexa U.S. top 500 homepages <a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/mig_pages.txt">located it on five sites</a>. It is unclear whether MIG served this script only to Safari users. While we did not see the MIG code in any non-Safari browsers, that may have been due to insufficient sample size; we were not able to reliably cause the MIG code to appear.</p>
<p>That said, we believe it is nevertheless reasonable to infer that MIG&#8217;s circumvention was intentional. MIG&#8217;s code appears to be based on widely-cited <a href="http://anantgarg.com/2010/02/18/cross-domain-cookies-in-safari/">sample code</a> by web developer <a href="http://anantgarg.com/">Anant Garg</a>. Even Facebook&#8217;s <a href="http://developers.facebook.com/docs/best-practices/#miscellaneous-issues">developer documentation</a> points to the sample.</p>
<blockquote>
<pre>

var isSafari = (/Safari/.test(navigator.userAgent));

var firstTimeSession = 0;



function submitSessionForm() {

	if (firstTimeSession == 0) {

		firstTimeSession = 1;

		$(&quot;#sessionform&quot;).submit();

		setTimeout(processApplication(),2000);

  	}

}



if (isSafari) {

	$(&quot;body&quot;).append('&lt;iframe id=&quot;sessionframe&quot; name=&quot;sessionframe&quot; onload=&quot;submitSessionForm()&quot; src=&quot;http://www.yourdomain.com/blank.php&quot; style=&quot;display:none;&quot;&gt;&lt;/iframe&gt;&lt;form id=&quot;sessionform&quot; enctype=&quot;application/x-www-form-urlencoded&quot; action=&quot;http://www.yourdomain.com/startsession.php&quot; target=&quot;sessionframe&quot; action=&quot;post&quot;&gt;&lt;/form&gt;');

} else {

	processApplication();

}



function processApplication() {

	alert('Session has been set. Now you can start your application!');

}

</pre>
</blockquote>
<p>The resemblance is uncanny. The scripts share the <em>exact same</em> variable names, structure, logic, and library dependency (jQuery 1.3.2 on Google Libraries). Even more compelling, MIG commented out a line that it didn&#8217;t need!</p>
<p><strong>PointRoll</strong></p>
<p><a href="http://www.pointroll.com/">PointRoll</a> is a rich media advertising company owned by Gannett. PointRoll&#8217;s corporate website claims that it &#8220;[p]ower[s] 55% of all rich media campaigns online&#8221; and &#8220;serv[es] over 450 billion impressions for more than two-thirds of the Fortune 500 brands . . . .&#8221;</p>
<p>We found that a PointRoll cookie helper script circumvents Safari&#8217;s cookie blocking. One instance of the script we studied is at <a href="http://ads.pointroll.com/PortalServe/?pid=1574300Y14520120126002933&amp;flash=11&amp;time=4|13:53|-8&amp;redir=http://at.atwola.com/adlink/5113/2209587/0/2392/AdId=2327012;BnId=1;itime=219618215;nodecode=yes;link=$CTURL$&amp;pos=s&amp;postal=94305&amp;r=0.18630846054557448">http://ads.pointroll.com/PortalServe/?pid=1574300Y14520120126002933&amp;flash=11&amp;time=4|13:53|-8&amp;redir=http://at.atwola.com/adlink/5113/2209587/0/2392/AdId=2327012;BnId=1;itime=219618215;nodecode=yes;link=$CTURL$&amp;pos=s&amp;postal=94305&amp;r=0.18630846054557448</a>.</p>
<p>Here&#8217;s the relevant part of the script. Unlike the other examples, this code was passed through a formatter—it&#8217;s otherwise unreadable.</p>
<blockquote>
<pre>

function submitSessionForm(name) {

    var sessionForm = document.getElementById(name);

    if (typeof (sessionForm) != 'undefined') {

        var txtStatus = document.getElementById('txt_' + name);

        if (txtStatus.value == 'UNSUBMITTED') {

            txtStatus.value = 'SUBMITTED';

            console.log(&quot;form &quot; + name + &quot; Submitted&quot;);

            sessionForm.submit();

        }

    }

}

function prCook(name, value, date, dom) {

    console.log(&quot;add form: name=&quot; + name + &quot;: value=&quot; + value + &quot;: date=&quot; + date + &quot;: dom=&quot; + dom);

    var date = (typeof (date) != &quot;undefined&quot;) ? date : &quot;Fri, 14-Feb-2014 14:47:07 GMT&quot;;

    var dom = (typeof (dom) != &quot;undefined&quot;) ? dom : &quot;ads.pointroll.com&quot;;

    var sCook = '&lt;iframe id=&quot;' + name + '_frame&quot; name=&quot;' + name + '_frame&quot; onload=&quot;submitSessionForm(\'';

    sCook += name;

    sCook += '\')&quot; src=&quot;http://ads.pointroll.com/clients/pointroll/cookie/blank.aspx&quot; style=&quot;display:none;&quot;&gt;&lt;/iframe&gt;';

    sCook += '&lt;form id=&quot;';

    sCook += name;

    sCook += '&quot; style=&quot;display:none;&quot; enctype=&quot;application/x-www-form-urlencoded&quot; action=&quot;http://ads.pointroll.com/clients/pointroll/cookie/drop.ashx&quot; target=&quot;' + name + '_frame&quot; action=&quot;post&quot;&gt;';

    sCook += '&lt;input type=&quot;text&quot; name=&quot;name&quot; value=&quot;' + name + '&quot; /&gt;';

    sCook += '&lt;input type=&quot;text&quot; name=&quot;date&quot; value=&quot;' + date + '&quot; /&gt;';

    sCook += '&lt;input type=&quot;text&quot; name=&quot;value&quot; value=&quot;' + value + '&quot; /&gt;';

    sCook += '&lt;input type=&quot;text&quot; name=&quot;domain&quot; value=&quot;' + dom + '&quot; /&gt;';

    sCook += '&lt;input type=&quot;text&quot; id=&quot;txt_';

    sCook += name;

    sCook += '&quot; status&quot; value=&quot;UNSUBMITTED&quot; /&gt;';

    sCook += '&lt;/form&gt;';

    var d = document.createElement('DIV'),

        p = document.getElementsByTagName('BODY');

    d.innerHTML = sCook;

    if (p.length &lt; 1) {

        p = document.getElementsByTagName('HTML');

    }

    p[0].appendChild(d);

</pre>
</blockquote>
<p>The script provides a cookie-setting function, <code>prCook</code>. When called, the function creates a new <code>div</code> and places within it an invisible <code>iframe</code> and <code>form</code> with the cookie fields specified by the input parameters. An <code>onload</code> handler on the <code>iframe</code> submits the <code>form</code> into the <code>iframe</code>. For example, the call</p>
<blockquote>
<pre>

prCook('example_cookie_name', 'example_cookie_value', 'Fri, 14-Feb-2014 14:47:07 GMT', 'exampledomain.com')

</pre>
</blockquote>
<p>would result in this code being added in a new <code>div</code> element:</p>
<blockquote>
<pre>

&lt;iframe id=&quot;example_cookie_name_frame&quot; name=&quot;example_cookie_name_frame&quot; onload=&quot;submitSessionForm('example_cookie_name')&quot; src=&quot;http://ads.pointroll.com/clients/pointroll/cookie/blank.aspx&quot; style=&quot;display:none;&quot;&gt;&lt;/iframe&gt;

&lt;form id=&quot;example_cookie_name&quot; style=&quot;display:none;&quot; enctype=&quot;application/x-www-form-urlencoded&quot; action=&quot;http://ads.pointroll.com/clients/pointroll/cookie/drop.ashx&quot; target=&quot;example_cookie_name_frame&quot; action=&quot;post&quot;&gt;

   &lt;input type=&quot;text&quot; name=&quot;name&quot; value=&quot;example_cookie_name&quot; /&gt;

   &lt;input type=&quot;text&quot; name=&quot;date&quot; value=&quot;Fri, 14-Feb-2014 14:47:07 GMT&quot; /&gt;

   &lt;input type=&quot;text&quot; name=&quot;value&quot; value=&quot;example_cookie_value&quot; /&gt;

   &lt;input type=&quot;text&quot; name=&quot;domain&quot; value=&quot;exampledomain.com&quot; /&gt;

   &lt;input type=&quot;text&quot; id=&quot;txt_example_cookie_name&quot; status&quot; value=&quot;UNSUBMITTED&quot; /&gt;

&lt;/form&gt;

</pre>
</blockquote>
<p>Here is the response when the <code>form</code> is submitted.</p>
<blockquote>
<pre>

$ curl -i &quot;http://ads.pointroll.com/clients/pointroll/cookie/drop.ashx?name=example_cookie_name&amp;date=Fri%2C%2014-Feb-2014%2014%3A47%3A07%20GMT&amp;value=example_cookie_value&amp;domain=exampledomain.com&quot;

HTTP/1.1 200 OK

Connection: close

Date: Fri, 17 Feb 2012 07:41:13 GMT

Server: Microsoft-IIS/6.0

P3P: CP=&quot;NOI DSP COR PSAo PSDo OUR BUS OTC&quot;

Access-Control-Allow-Origin: *

X-AspNet-Version: 2.0.50727

Pragma: no-cache

p3p: CP=&quot;IDC DSP COR ADM DEVi TAIi PSA PSD IVAi IVDi CONi HIS OUR IND CNT&quot;

Set-Cookie: example_cookie_name=example_cookie_value; domain=exampledomain.com; expires=Fri, 14-Feb-2014 14:47:07 GMT; path=/

Cache-Control: private

Content-Type: text/html; charset=utf-8

Content-Length: 145



&lt;script&gt;console.log('drop details: cookie name=example_cookie_name; cookie value=example_cookie_value; cookie domain=exampledomain.com')

</pre>
</blockquote>
<p>The response includes a <code>Set-Cookie</code> header with the values from the form. (It also echoes those values into a <code>script</code> element in the response body, which introduces a cross-site scripting vulnerability.)</p>
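<p>To make the mapping from form fields to the response concrete, here is a minimal sketch that reproduces the request URL and the <code>Set-Cookie</code> header shown above. The names <code>buildDropUrl</code> and <code>buildSetCookie</code> are hypothetical, and this is a reconstruction from the observed traffic, not PointRoll&#8217;s server code. Note that <code>encodeURIComponent</code> matches the hand-encoded <code>curl</code> URL; a browser serializing the form would typically encode spaces as <code>+</code> rather than <code>%20</code>.</p>
<blockquote>
<pre>
```javascript
// Sketch only: reconstructs the observed request and response header.
// buildDropUrl and buildSetCookie are hypothetical names for illustration.
var DROP_URL = 'http://ads.pointroll.com/clients/pointroll/cookie/drop.ashx';

// Serialize the form fields in the same order as the form's inputs.
function buildDropUrl(fields) {
    var order = ['name', 'date', 'value', 'domain'];
    var pairs = order.map(function (key) {
        return key + '=' + encodeURIComponent(fields[key]);
    });
    return DROP_URL + '?' + pairs.join('&');
}

// Assemble the header in the same layout as the observed response.
function buildSetCookie(fields) {
    return 'Set-Cookie: ' + fields.name + '=' + fields.value +
        '; domain=' + fields.domain +
        '; expires=' + fields.date +
        '; path=/';
}

var fields = {
    name: 'example_cookie_name',
    value: 'example_cookie_value',
    date: 'Fri, 14-Feb-2014 14:47:07 GMT',
    domain: 'exampledomain.com'
};

console.log(buildDropUrl(fields));
console.log(buildSetCookie(fields));
```
</pre>
</blockquote>
<p>With the example inputs, the sketch yields the same URL as the <code>curl</code> command and the same <code>Set-Cookie</code> line as the response: every cookie attribute is taken directly from a form field.</p>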
<p>PointRoll&#8217;s script includes code for setting 9 cookies.</p>
<blockquote>
<pre>

prCook('PRID', 'EC7B3EB9-DE19-4A32-AC07-83856AB7AE98', 'Fri, 14-Feb-2014 14:47:07 GMT', 'ads.pointroll.com');

prCook('PRbu', 'EuQHJKn7z', 'Fri, 14-Feb-2014 14:47:07 GMT', '.pointroll.com');

prCook('PRgo', 'BAA', 'Fri, 14-Feb-2014 14:47:07 GMT', '.pointroll.com');

prCook('PRimp', '01B90400-2AC9-F37E-0209-DAC000190101', 'Fri, 14-Feb-2014 14:47:07 GMT', 'ads.pointroll.com');

prCook('PRca', '|AK5C*37611:1|#', 'Fri, 14-Feb-2014 14:47:07 GMT', 'ads.pointroll.com');

prCook('PRcp', '|AK5CAJmd:1|#', 'Fri, 14-Feb-2014 14:47:07 GMT', 'ads.pointroll.com');

prCook('PRpl', '|Fhtv:1|#', 'Fri, 14-Feb-2014 14:47:07 GMT', 'ads.pointroll.com');

prCook('PRcr', '|GTnd:1|#', 'Fri, 14-Feb-2014 14:47:07 GMT', 'ads.pointroll.com');

prCook('PRpc', '|FhtvGTnd:1|#', 'Fri, 14-Feb-2014 14:47:07 GMT', 'ads.pointroll.com');

</pre>
</blockquote>
<p>It would seem reasonable to infer that the two-year &#8220;PRID&#8221; cookie is PointRoll&#8217;s unique ID cookie.</p>
<p>As with the MIG code, we did not have a large enough sample size to conclusively determine that this script was sent only to Safari browsers. But, again, we were able to deduce that the cookie blocking circumvention was intentional by comparison to Anant Garg&#8217;s example code. There are several telltale signs of copying.</p>
<ul>
<li><strong>Structure</strong> The script is structured in the same way: an <code>onload</code> handler function followed by the <code>iframe</code> and <code>form</code>.</li>
<li><strong>Variable Names</strong> Both use the variable <code>sessionForm</code> and the handler function <code>submitSessionForm</code>.</li>
<li><strong>HTML Elements</strong> Both use the same attributes, attribute ordering, and coding style in their <code>iframe</code> and <code>form</code> elements. (The PointRoll code includes an additional <code>style</code> attribute in its <code>form</code> element.)</li>
<li><strong>POST Bug</strong> If you&#8217;re still unconvinced, here&#8217;s the giveaway: the scripts have the same bug. Both say <code>action="post"</code> (which does nothing, apparently) instead of <code>method="post"</code> (which sets the form to use the POST method instead of the GET method). The author of the example code corrected himself in a comment.<br />
<blockquote>
<p>“action” is correct for the target script in an HTML form. However, there is a second “action” in the code that should be “method” instead: method=”post” (not action=”post”).</p>
</blockquote>
</li>
</ul>
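<p>A rough way to see why the stray attribute is harmless: under the HTML parsing rules, when an attribute name appears twice on one tag, the first occurrence is kept and the duplicate is dropped. The sketch below models that rule with a simplified, hypothetical parser (<code>parseAttributes</code> is not a browser API and only handles double-quoted attributes). The URL-valued <code>action</code> survives, no <code>method</code> is ever set, and the form falls back to the default GET, which is consistent with the query-string request shown earlier.</p>
<blockquote>
<pre>
```javascript
// Sketch only: models "first duplicate attribute wins" for one tag.
// parseAttributes is a hypothetical helper, not a browser API.
function parseAttributes(tagSource) {
    var attrs = {};
    var pattern = /([a-z]+)="([^"]*)"/g;
    var match;
    while ((match = pattern.exec(tagSource)) !== null) {
        var name = match[1];
        // Keep the first occurrence; ignore any later duplicate.
        if (!Object.prototype.hasOwnProperty.call(attrs, name)) {
            attrs[name] = match[2];
        }
    }
    return attrs;
}

// The form tag from the example above, without the angle brackets.
var formTag = 'form id="example_cookie_name" style="display:none;" ' +
    'enctype="application/x-www-form-urlencoded" ' +
    'action="http://ads.pointroll.com/clients/pointroll/cookie/drop.ashx" ' +
    'target="example_cookie_name_frame" action="post"';

var attrs = parseAttributes(formTag);

// With no method attribute, a form defaults to GET.
var method = attrs.method || 'get';

console.log(attrs.action);
console.log(method);
```
</pre>
</blockquote>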
<p>We found PointRoll&#8217;s code on <a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/safari_study/pointroll_pages.txt">19 homepages</a> in the Alexa U.S. top 500.</p>
<p><strong>Conclusion</strong></p>
<p>When Apple&#8217;s developers implemented Safari&#8217;s cookie blocking feature, they were balancing several conflicting design priorities. But one decision was clear: it should prevent advertising companies from tracking the user. As a lead developer <a href="https://bugs.webkit.org/show_bug.cgi?id=35824#c19">noted</a>, whatever the implications for other websites, &#8220;I do think it would typically stop, say, doubleclick.net from tracking you . . . .&#8221;</p>
<p>Four advertising companies circumvented Apple&#8217;s protection. Some privacy researchers and advocates have characterized the interplay between third-party web trackers and browser privacy measures as a &#8220;cat and mouse game&#8221; or &#8220;arms race.&#8221; This research result regrettably affirms that view as reality—for, quite possibly, millions of users.</p>
<hr />
<p><a name="safari_trackers_fn_1">1</a>. This post does not address the merits of Safari&#8217;s cookie blocking policy.</p>
<p><a name="safari_trackers_fn_2">2</a>. There are some options for privacy-preserving social personalization. But they are far beyond the scope of this post.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2012/02/17/safari-trackers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Brief Overview of the Supplementary DAA Principles</title>
		<link>http://webpolicy.org/2011/11/08/a-brief-overview-of-the-supplementary-daa-principles/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=a-brief-overview-of-the-supplementary-daa-principles</link>
		<comments>http://webpolicy.org/2011/11/08/a-brief-overview-of-the-supplementary-daa-principles/#comments</comments>
		<pubDate>Wed, 09 Nov 2011 07:51:30 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=114</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Yesterday the Digital Advertising Alliance (DAA) announced a supplementary set of self-regulatory principles for third parties on the web (pdf, press release). This post is a brief &#8212; and far from comprehensive &#8212; overview of improvements, continued deficiencies, and procedural issues. Improvements 1. Several sensitive uses [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6755">Stanford Center for Internet and Society</a>.</em></p>
<p>Yesterday the <a href="http://aboutads.info" rel="nofollow">Digital Advertising Alliance</a> (DAA) announced a supplementary set of self-regulatory principles for third parties on the web (<a href="http://www.aboutads.info/resource/download/Multi-Site-Data-Principles.pdf" rel="nofollow">pdf</a>, <a href="http://www.aboutads.info/resource/download/DAA_MSD-Principles-Release_FINAL.pdf" rel="nofollow">press release</a>). This post is a brief &mdash; and far from comprehensive &mdash; overview of improvements, continued deficiencies, and procedural issues.</p>
<p><span id="more-114"></span></p>
<p><b>Improvements</b></p>
<p><b>1.</b> Several sensitive uses of third-party web tracking data are now completely prohibited: adverse terms or ineligibility for employment, credit, medical treatment, and insurance. The principles do not, however, prohibit offering favorable terms or determining eligibility from third-party web tracking data.</p>
<p><b>2.</b> Transparency and consumer control are now required for many forms of per-device content personalization, not just behaviorally targeted advertising. The principles are ambiguous about whether per-user content personalization, such as Facebook social widgets, also requires transparency and consumer control.</p>
<p><b>Continuing Substantive Deficiencies</b></p>
<p><b>1.</b> Many stakeholders on online privacy, including U.S. and EU regulators, have repeatedly emphasized that effective consumer control necessitates restrictions on the <i>collection</i> of information, not just prohibitions on specific <i>uses</i> of information. The very existence of third-party web tracking data gives rise to numerous privacy risks, including data breach, employee misconduct, government access, and more. The DAA principles nevertheless remain a set of limitations on data use, not data collection. While the supplementary principles begin with broad language about collection limits, they incorporate vast exceptions that wholly swallow the rule. Consider, for example, the exceptions for &#8220;market research,&#8221; &#8220;product development,&#8221; and &#8220;reporting.&#8221;</p>
<blockquote>
<p>Market Research means the analysis of: market segmentation or trends; consumer preferences and behaviors; research about consumers, products, or services; or the effectiveness of marketing or advertising.</p>
</blockquote>
<blockquote>
<p>Product Development means the analysis of: (i) the characteristics of a market or group of consumers; or (ii) the performance of a product, service or feature, in order to improve existing products or services or to develop new products or services.</p>
</blockquote>
<blockquote>
<p>Reporting is the logging of Multi-Site Data on a Web site(s) . . . for:</p>
<p>   • Statistical reporting in connection with the activity on a Web site(s);</p>
<p>. . .</p>
</blockquote>
<p>In a plain reading, every third-party web tracking practice would come within these exceptions to mandatory consumer control. (A simple thought experiment: name a third-party web tracking practice that is not encompassed by the provisions above.) Per-device personalization uses of data are not excepted from consumer control only because the principles explicitly add exceptions to the exceptions. Here is the language on &#8220;market research&#8221; and &#8220;product development.&#8221;</p>
<blockquote>
<p>A key characteristic of market research is that the data is not re-identified to market directly back to, or otherwise re-contact a specific computer or device. Thus, the term &#8220;market research&#8221; does not include sales, promotional, or marketing activities directed at a specific computer or device.</p>
</blockquote>
<blockquote>
<p>Like Multi-Site Data used for Market Research, such data used for product development is not re-identified to market directly back to, or otherwise re-contact a specific computer or device.</p>
</blockquote>
<p><b>2.</b> Practices that do not include per-device personalization are not only exempted from consumer control, but are also exempted from any transparency requirement.</p>
<p><b>3.</b> The DAA has not closed the <a href="http://lists.w3.org/Archives/Public/public-tracking/2011Oct/0349.html" rel="nofollow">loophole</a> in its principles for data sharing among corporate affiliates.</p>
<p><b>4.</b> Despite lack of adoption and widespread criticism, the DAA continues to advocate its opt-out cookie and icon mechanisms.</p>
<p><b>Procedural Issues</b></p>
<p><b>1.</b> The supplementary principles were developed through an opaque process, with limited input from policymakers, researchers, and civil society organizations. Legitimate self-regulation transparently and inclusively addresses consumer concerns; it does not present a fait accompli.</p>
<p><b>2.</b> It is unclear why the DAA, as a consortium of organizations in the online advertising space, would have a legitimate claim to regulate third-party web tracking that is not related to advertising. The new principles may, in fact, run contrary to the current policy positions of several companies, including Facebook. It remains to be seen how many non-advertising third parties will accept the DAA&#8217;s principles.</p>
<p>In sum: It&#8217;s great to see the online advertising industry taking steps in the right direction. There are a few real improvements in the supplementary principles. But they do not address the core privacy issues consistently raised by regulators, legislators, researchers, and advocates, and it is far from clear that the online advertising industry will be able to expand the scope of its program beyond advertising practices.</p>
<hr />
<p>Thanks to <a href="https://www.eff.org/about/staff/peter-eckersley" rel="nofollow">Peter Eckersley</a> at the <a href="https://www.eff.org/" rel="nofollow">Electronic Frontier Foundation</a> for reviewing a draft.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/11/08/a-brief-overview-of-the-supplementary-daa-principles/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tracking the Trackers: Where Everybody Knows Your Username</title>
		<link>http://webpolicy.org/2011/10/11/tracking-the-trackers-where-everybody-knows-your-username/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=tracking-the-trackers-where-everybody-knows-your-username</link>
		<comments>http://webpolicy.org/2011/10/11/tracking-the-trackers-where-everybody-knows-your-username/#comments</comments>
		<pubDate>Tue, 11 Oct 2011 14:06:25 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Measurement]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=90</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Click the local Home Depot ad and your email address gets handed to a dozen companies monitoring you. Your web browsing, past, present, and future, is now associated with your identity. Swap photos with friends on Photobucket and clue a couple dozen more into your username. [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6740">Stanford Center for Internet and Society</a>.</em></p>
<p>Click the local Home Depot ad and your email address gets handed to a dozen companies monitoring you. Your web browsing, past, present, and future, is now associated with your identity. Swap photos with friends on Photobucket and clue a couple dozen more into your username. Keep tabs on your favorite teams with Bleacher Report and you pass your full name to a dozen again. This isn&#8217;t a 1984-esque scaremongering hypothetical. This is what&#8217;s happening <i>today</i>.</p>
<p>[<b>Update 10/11</b>: Since several readers have asked &ndash; this study was funded exclusively by Stanford University and research grants to the Stanford Security Lab. It was not supported by any advocacy organization.]</p>
<p><span id="more-90"></span></p>
<p><b>Background on Third-Party Web Tracking and Anonymity</b> </p>
<p>In <a href="http://cyberlaw.stanford.edu/node/6701">a post on the Stanford CIS blog</a> two months ago, <a href="http://randomwalker.info">Arvind Narayanan</a> explained how third-party web tracking is not at all anonymous.</p>
<blockquote>
<p>In the language of computer science, clickstreams &ndash; browsing histories that companies collect &ndash; are not anonymous at all; rather, they are pseudonymous. The latter term is not only more technically appropriate, it is much more reflective of the fact that at any point after the data has been collected, the tracking company might try to attach an identity to the pseudonym (unique ID) that your data is labeled with. Thus, identification of a user affects not only future tracking, but also retroactively affects the data that&#8217;s already been collected. Identification needs to happen only once, ever, per user.</p>
</blockquote>
<p>Arvind noted five ways in which a user&#8217;s identity may be associated with third-party web tracking data.</p>
<ul>
<li>A third party is also a first party, e.g. Facebook, Twitter, or Google+.</li>
<li>A first party hands off (&#8220;leaks&#8221;) identifying information to a third party.</li>
<li>A third party buys identifying information from a &#8220;matching service.&#8221;</li>
<li>A third party exploits a security vulnerability to learn a user&#8217;s identity.</li>
<li>A third party &#8220;deanonymizes&#8221; its data by matching it against identified data.</li>
</ul>
<p>This post is an empirical study of identifying information leakage from first-party websites to third-party websites.<sup><a href="#pii_leakage_footnote_1">1</a></sup></p>
<p><b>Web Information Leakage</b></p>
<p>Leakage most often occurs when a first-party website stuffs information into a URL. For example, suppose Example Website redirects users who have just registered to:</p>
<blockquote>
<p>http://example.com/register?</p>
<p>username=GoCardinal</p>
<p>&amp;name=Leland%20Stanford</p>
<p>&amp;email=leland%40stanford.edu</p>
<p>&amp;&#8230;</p>
</blockquote>
<p>Third parties embedded in the page will receive the URL in a referrer header or equivalent<sup><a href="#pii_leakage_footnote_2">2</a></sup> &ndash; and therefore Leland Stanford&#8217;s username, name, and email.</p>
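<p>As a minimal sketch of how little work this takes on the receiving end, the snippet below parses the example URL the way a third party could after reading it from a referrer header. The helper name <code>extractIdentifiers</code> is hypothetical, and the elided trailing parameters are simply left out.</p>
<blockquote>
<pre>
```javascript
// The example URL from above, as a third party might receive it in a
// referrer header. The elided trailing parameters are left out.
var referrer = 'http://example.com/register' +
    '?username=GoCardinal' +
    '&name=Leland%20Stanford' +
    '&email=leland%40stanford.edu';

// extractIdentifiers is a hypothetical helper, not any tracker's real code.
function extractIdentifiers(url) {
    // searchParams percent-decodes each query-string value.
    var params = new URL(url).searchParams;
    return {
        username: params.get('username'),
        name: params.get('name'),
        email: params.get('email')
    };
}

var leaked = extractIdentifiers(referrer);
console.log(leaked);
```
</pre>
</blockquote>
<p>The result contains the username <code>GoCardinal</code>, the name <code>Leland Stanford</code>, and the email address <code>leland@stanford.edu</code>, all recovered from a single header.</p>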
<p>Another common form of leakage is through the page title. Suppose a website&#8217;s landing page includes a title tag of:</p>
<blockquote>
<p>Welcome, Leland Stanford!</p>
</blockquote>
<p>Embedded third-party scripts often report back with the page title; in this case, they&#8217;d include Leland Stanford&#8217;s name.</p>
<p>[<b>Update 10/11</b>: The original version of this post conflated the information OkCupid provides to Lotame and BlueKai. In the interest of complete accuracy, and in response to both a deluge of questions on OkCupid's intentional leakage and a note from BlueKai seeking clarification, I have updated this section with per-company intentional leakage. I have also included the results of a leakage test (with the methodology described below) on OkCupid. My apologies to BlueKai for the incorrect implication that it collects the same sensitive profile data that Lotame does. The ambiguous discussion was solely my error.]</p>
<p>Leakage, in common parlance, implies unintentionality. In computer security, leakage is a term of art for an information flow &ndash; some instances of leakage are entirely intentional. For example, <a href="http://www.okcupid.com">OkCupid</a>, a free online dating website, appears to sell user profile information to the data providers <a href="http://www.bluekai.com">BlueKai</a> and <a href="http://www.lotame.com">Lotame</a>. <strike>, including gender, age, ZIP code, relationship status, and drug use frequency.</strike> To learn which profile information OkCupid leaks, I modified each field of a profile and observed how values sent to the two companies changed. Here&#8217;s what the companies appeared to receive:</p>
<blockquote>
<p>Age &#8211; Both</p>
<p>Cats &#8211; Both</p>
<p>Children &#8211; Both</p>
<p>Country &#8211; Both</p>
<p>Dogs &#8211; Both</p>
<p>Drinking Frequency &#8211; Lotame</p>
<p>Drug Use Frequency &#8211; Lotame</p>
<p>Education &#8211; Both</p>
<p>Ethnicity &#8211; Lotame</p>
<p>Gender &#8211; Both</p>
<p>Income &#8211; Both</p>
<p>Job Sector &#8211; Both</p>
<p>Language Proficiencies &#8211; BlueKai</p>
<p>Relationship Status &#8211; Lotame</p>
<p>Religion &#8211; Lotame</p>
<p>Smoking Frequency &#8211; Lotame</p>
<p>State &#8211; Both</p>
<p>ZIP Code &#8211; Both</p>
</blockquote>
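<p>The field-by-field test behind the list above amounts to a simple diff: change one profile field, capture the parameters sent to a company before and after, and see which ones moved. Here is a minimal sketch of that comparison. The parameter names and values are made up for illustration and are not the ones actually observed, and <code>changedKeys</code> is a hypothetical helper.</p>
<blockquote>
<pre>
```javascript
// Sketch only: report which captured parameters differ between two requests.
// changedKeys is a hypothetical helper; the sample data below is invented.
function changedKeys(before, after) {
    var keys = Object.keys(before).concat(Object.keys(after));
    return keys.filter(function (key, index) {
        // Keep each key once, and only if its value changed.
        return keys.indexOf(key) === index && before[key] !== after[key];
    }).sort();
}

// Hypothetical parameters captured before and after editing the age field.
var before = { age: '25', gender: 'm', zip: '94305' };
var after = { age: '26', gender: 'm', zip: '94305' };

var changed = changedKeys(before, after);
console.log(changed);
```
</pre>
</blockquote>
<p>If editing only the age field changes only one outgoing parameter, that parameter very likely carries the age.</p>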
<p>(I also ran the leakage test described below on OkCupid. The username was sent to 27 third-party PS+1s (defined below), including crwdcntrl.net (Lotame) and bluekai.com (BlueKai). Since OkCupid does not limit who can see a profile &ndash; a user can only require that visitors be logged in &ndash; a username provides access to a user&#8217;s entire profile.)</p>
<p>In a series of groundbreaking studies <a href="http://www2.research.att.com/~bala/papers/">Balachander Krishnamurthy</a>, <a href="http://web.cs.wpi.edu/~cew/">Craig Wills</a>, and Konstantin Naryshkin have demonstrated that information leakage is a pervasive problem (<a href="http://www2.research.att.com/~bala/papers/w2sp11.pdf">1</a>, <a href="http://www2.research.att.com/~bala/papers/pmob.pdf">2</a>, <a href="http://www2.research.att.com/~bala/papers/wosn09.pdf">3</a>). In their <a href="http://www2.research.att.com/~bala/papers/w2sp11.pdf">most recent paper</a>, the authors examined signup and interaction with 120 popular sites for information leakage to third parties. They found that 56% leaked some form of private information, and 48% leaked a user identifier.</p>
<p>We roughly followed the same methodology as Krishnamurthy, Wills, and Naryshkin, with 1) a focus on identifying information leakage, 2) a greater number of sites, and 3) a public dataset.</p>
<p><b>Usernames as Identifying Information</b></p>
<p>Given the sizeable role usernames play in web information leakage, it&#8217;s worth taking a moment to note how a username is identifying information. In some cases a username is just a user&#8217;s name &ndash; for example, @<a href="http://www.twitter.com/jonathanmayer">jonathanmayer</a> on Twitter. Even when it isn&#8217;t the user&#8217;s name, a username is often more than adequate for identifying a user.</p>
<p>First, a username is likely sufficient to link accounts across websites. Users routinely reuse their usernames &ndash; after all, who&#8217;s going to remember a new login for each site they use? In <a href="http://planete.inrialpes.fr/papers/high_entropy.pdf">a paper at PETS 2011</a>, <a href="http://planete.inrialpes.fr/~perito/index.php">Daniele Perito</a> et al. examined a sample of public data from Google, eBay, and other sites to estimate how linkable usernames are. They found that the vast majority of usernames in their sample had high entropy, and that simple algorithms for linking usernames could achieve pairwise precision and recall of over 70%. (For further discussion of using usernames to link social profiles, see Arvind&#8217;s blog posts &#8220;<a href="http://33bits.org/2011/02/16/usernames-linkability-uber-profiles/">The Linkability of Usernames</a>&#8221; and &#8220;<a href="http://33bits.org/2008/11/12/57/">Lendingclub.com: A De-anonymization Walkthrough</a>,&#8221; as well as &#8220;<a href="http://www.cc.gatech.edu/~danesh/download/Dirani_InternetComp_2011.pdf">Modeling Unintended Personal-Information Leakage from Multiple Online Social Networks</a>&#8221; and &#8220;<a href="http://www.cc.gatech.edu/~danesh/download/DIrani_SecureCom_2009.pdf">Large Online Social Footprints &#8211; An Emerging Threat</a>&#8221; by <a href="http://www.cc.gatech.edu/~danesh/">Danesh Irani</a> et al.) Some companies are already linking usernames in their products, including social matching services (e.g. <a href="http://www.infochimps.com/datasets/social-network-identity-mapping-api">Infochimps</a>), scraped profiles (e.g. <a href="http://www.spokeo.com/">Spokeo</a>), and automated social network linkage (e.g. <a href="http://www.google.com/support/websearch/bin/answer.py?answer=1142745">Google Social Search</a>).<sup><a href="#pii_leakage_footnote_3">3</a></sup></p>
<p>Second, combining data from multiple accounts often provides a sufficiently comprehensive mosaic to identify an individual.<sup><a href="#pii_leakage_footnote_4">4</a></sup> Arvind, for example, usually goes by the username &#8220;randomwalker.&#8221; The first page of a Google search turned up his <a href="http://news.ycombinator.com/user?id=randomwalker">Y Combinator Hacker News account</a>, which includes his job and links to his personal website, blog, and Twitter account.</p>
<p>Some websites (e.g. <a href="http://www.quantcast.com/how-we-do-it/consumer-choice/privacy-policy/">Quantcast</a>) have responsibly recognized that a username is identifying information and have included username in their legal definition of &#8220;personally identifiable information&#8221; (PII).</p>
<p><b>Methodology</b></p>
<p>We examined each website in the <a href="http://www.quantcast.com/top-sites-1">Quantcast top 250</a>, checking whether it</p>
<ul>
<li>offered a sign up,</li>
<li>did not require a purchase or other qualification to sign up, and</li>
<li>did not include so many features as to be impractical for study.</li>
</ul>
<p>For each of the 185 websites that met all three criteria, we used the <a href="http://fourthparty.info">FourthParty web measurement platform</a> to create an account and interact with the site.<sup><a href="#pii_leakage_footnote_5">5</a></sup> We emphasized exploring content that dealt with a user&#8217;s identity, such as profile and settings pages. After collecting data, we searched <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html">Request-URIs</a> and <a href="http://en.wikipedia.org/wiki/HTTP_referrer">Referrer headers</a> for known personal information. We treated each <a href="http://publicsuffix.org/">public suffix + 1 (PS+1)</a> as an independent entity, and we considered any PS+1 different from a first party&#8217;s to be a third party.<sup><a href="#pii_leakage_footnote_6">6</a></sup></p>
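<p>The search step can be sketched roughly as follows. This is an illustration of the approach, not the FourthParty analysis code: the suffix list is a tiny stand-in for the real Public Suffix List, the helper names <code>psPlusOne</code> and <code>findLeaks</code> are hypothetical, and the sample requests are invented.</p>
<blockquote>
<pre>
```javascript
// Tiny illustrative suffix list; the real Public Suffix List is far larger.
var SUFFIXES = ['com', 'net', 'org', 'co.uk'];

// Return the public suffix plus one label, e.g. a.b.example.com to example.com.
function psPlusOne(hostname) {
    var labels = hostname.toLowerCase().split('.');
    // Try the longest candidate suffix first.
    for (var i = 0; i !== labels.length; i++) {
        var candidate = labels.slice(i).join('.');
        if (SUFFIXES.indexOf(candidate) !== -1 && i !== 0) {
            return labels.slice(i - 1).join('.');
        }
    }
    return hostname.toLowerCase();
}

// Search third-party request URLs and referrers for known personal values.
function findLeaks(firstPartyHost, requests, knownValues) {
    var firstParty = psPlusOne(firstPartyHost);
    var leaks = [];
    requests.forEach(function (request) {
        var recipient = psPlusOne(new URL(request.url).hostname);
        if (recipient === firstParty) {
            return; // same PS+1, so not a third party
        }
        var haystack = decodeURIComponent(request.url + ' ' + request.referrer);
        knownValues.forEach(function (value) {
            if (haystack.indexOf(value) !== -1) {
                leaks.push({ recipient: recipient, value: value });
            }
        });
    });
    return leaks;
}

// Invented sample log: one third-party request, one first-party request.
var leaks = findLeaks('www.example.com', [
    { url: 'http://b.scorecardresearch.com/b?c1=2',
      referrer: 'http://www.example.com/user/GoCardinal' },
    { url: 'http://static.example.com/site.css',
      referrer: 'http://www.example.com/user/GoCardinal' }
], ['GoCardinal', 'leland@stanford.edu']);

console.log(leaks);
```
</pre>
</blockquote>
<p>In this invented log, only the request to <code>scorecardresearch.com</code> is reported: it crosses a PS+1 boundary and its referrer carries the username. The request to <code>static.example.com</code> shares the first party&#8217;s PS+1 and is skipped.</p>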
<p><b>Results</b></p>
<p>A complete spreadsheet of results is <a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/leakage_study/identifying_information_leakage.xlsx">available in Excel format</a>. We encourage interested readers to examine the results for themselves. [<b>Update 10/22</b>: Before consulting the spreadsheet, please be sure to read <a href="#pii_leakage_footnote_6">Footnote 6</a> to understand the limitations of our methodology.] Please email if you would like FourthParty logs for a specific site.</p>
<p>The most frequent type of leakage was a username or user ID.<sup><a href="#pii_leakage_footnote_7">7</a></sup> We identified username or user ID leakage to a third party on 113 websites, 61% of the websites in our sample. The top five PS+1 recipients of username and user ID leakage were:</p>
<ol>
<li>scorecardresearch.com (comScore), on 81 (44%) of the websites in our sample</li>
<li>google-analytics.com (Google Analytics), on 78 (42%) of the websites in our sample</li>
<li>quantserve.com (Quantcast), on 63 (34%) of the websites in our sample</li>
<li>doubleclick.net (Google Advertising), on 62 (34%) of the websites in our sample</li>
<li>facebook.com (Facebook), on 45 (24%) of the websites in our sample</li>
</ol>
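<p>For readers checking the arithmetic, the percentages above are each site count divided by the 185 websites in the sample, rounded to the nearest whole percent. A short sketch (the variable and function names are only for illustration):</p>
<blockquote>
<pre>
```javascript
// Site counts reported above, out of the 185 websites in the sample.
var sampleSize = 185;
var counts = {
    'any third party': 113,
    'scorecardresearch.com': 81,
    'google-analytics.com': 78,
    'quantserve.com': 63,
    'doubleclick.net': 62,
    'facebook.com': 45
};

// Round to the nearest whole percent.
function percent(count) {
    return Math.round(100 * count / sampleSize);
}

Object.keys(counts).forEach(function (recipient) {
    console.log(recipient + ': ' + percent(counts[recipient]) + '%');
});
```
</pre>
</blockquote>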
<p>Some websites leaked the username or user ID to dozens of third parties. For example, popular photo sharing website <a href="http://www.photobucket.com">Photobucket</a> embeds the username in many of its URLs and includes advertising on most of its pages; we observed the username being sent to 31 third-party PS+1s.</p>
<p>Other identifying information leaked in a number of instances. A sample:</p>
<ul>
<li>Viewing a local ad on the <a href="http://www.homedepot.com">Home Depot website</a> sent the user&#8217;s first name and email address to 13 companies.
</li>
<li>Entering the wrong password on the <a href="http://www.wsj.com">Wall Street Journal website</a> sent the user&#8217;s email address to 7 companies.
<p>[<b>Update 10/11</b>: A number of readers have written in noting that the Wall Street Journal leak is not in our spreadsheet. We identified the Wall Street Journal leak in a different browsing session from the one reported in the spreadsheet &ndash; and by accident. In the interest of consistency &ndash; we did not test logging out and logging back in on other sites, nor logging in with the wrong password &ndash; we decided to discuss the leak in our post but not our spreadsheet.]</p>
</li>
<li>Changing user settings on the video sharing site <a href="http://www.metacafe.com">Metacafe</a> sent first name, last name, birthday, email address, physical address, and phone numbers to 2 companies.
</li>
<li>Signing up on the <a href="http://www.nbc.com">NBC website</a> sent the user&#8217;s email address to 7 companies.
</li>
<li>Signing up on <a href="http://www.wunderground.com">Weather Underground</a> sent the user&#8217;s email address to 22 companies.
</li>
<li>The mandatory mailing list page during <a href="http://www.cnbc.com">CNBC</a> signup sent the user&#8217;s email address to 2 companies.
</li>
<li>Clicking the validation link in the <a href="http://www.reuters.com">Reuters</a> signup email sent the user&#8217;s email address to 5 companies.
</li>
<li>Interacting with <a href="http://www.bleacherreport.com">Bleacher Report</a> sent the user&#8217;s first and last names to 15 companies.
</li>
<li>Interacting with <a href="http://www.classmates.com">classmates.com</a> sent the user&#8217;s first and last names to 22 companies.
</li>
</ul>
<p><b>Implications</b></p>
<p>From a legal perspective, identifying information leakage is a debacle. Many first-party websites make what would appear to be incorrect, or at minimum misleading, representations about not sharing PII. Here are some examples.</p>
<p><a href="http://www.homedepot.com/webapp/wcs/stores/servlet/ContentView?pn=Privacy_Security&amp;langId=-1&amp;storeId=10051&amp;catalogId=10053">The Home Depot</a>:</p>
<blockquote>
<p>Personal Information Disclosure: The Home Depot will not trade, rent or sell your personal information, without your prior consent, except as otherwise set out herein. [Does not describe sharing with third-parties for advertising or analytics.]</p>
</blockquote>
<p><a href="http://online.wsj.com/public/page/privacy-policy.html">The Wall Street Journal</a>:</p>
<blockquote>
<p>We will not sell, rent, or share your Personal Information with these third parties for such parties&#8217; own marketing purposes, unless you choose in advance to have your Personal Information shared for this purpose. Information about your activities on our Online Services and other non-personally identifiable information about you may be used to limit the online ads you encounter to those we believe are consistent with your interests. Third-party advertising networks and advertisers may also use cookies and similar technologies to collect and track non-personally identifiable information such as demographic information, aggregated information, and Internet activity to assist them in delivering advertising on our Online Services that is more relevant to your interests.</p>
</blockquote>
<p><a href="http://www.metacafe.com/privacy/">Metacafe</a>:</p>
<blockquote>
<p>Metacafe&#8217;s Privacy Policy is to share personal information only with the owner&#8217;s informed consent.</p>
</blockquote>
<p>Likewise, a number of third-party trackers disclaim collection of personally identifiable information.<sup><a href="#pii_leakage_footnote_8">8</a></sup></p>
<p><a href="http://www.scorecardresearch.com/privacy.aspx">Scorecard Research (comScore)</a>:</p>
<blockquote>
<p><b>Does your beacon collect or store any personally identifiable information about me?</b></p>
<p>The tagging used by ScorecardResearch is unable to identify the user visiting a page.</p>
</blockquote>
<p><a href="http://www.quantcast.com/privacy">Quantcast</a>:</p>
<blockquote>
<p>We do not tie the information gathered by Quantcast Tags to the personally identifiable information of visitors to a Web site.</p>
<p>. . .</p>
<p>We do not link Log Data to any other Personally Identifiable Information about you or otherwise attempt to discover your identity.</p>
</blockquote>
<p><a href="http://www.google.com/privacy/ads/">Google Advertising</a>:</p>
<blockquote>
<p>We don&#8217;t collect or serve ads based on personally identifying information without your permission.</p>
</blockquote>
<p>The better practice for all first-party and third-party websites would be to acknowledge that identifying information leakage is a fact of life on the web, and that identifying information may be shared with third parties.</p>
<p>As for policy, some strands of the Do Not Track debate echo a sentiment of &#8220;it&#8217;s all anonymous,&#8221; and so, &#8220;where&#8217;s the harm?&#8221; We believe there is now overwhelming evidence that third-party web tracking is not anonymous. It is a legitimate policy question whether, on balance, Do Not Track should be enforced by law. But the difficult weighing of competing privacy risks and economics can&#8217;t be short-circuited by claims of anonymity.</p>
<hr />
<p>Thanks to Arvind Narayanan for comments on a draft.</p>
<p><a name="pii_leakage_footnote_1">[1]</a> For purposes of this post, &#8220;identifying information&#8221; is information that with moderate probability and moderate effort can be used to identify a user. This post does not use a formulaic legal definition of &#8220;personally identifiable information&#8221; (PII), an approach that has been discredited by a growing body of computer science research. The Federal Trade Commission staff notably rejected the notion of PII in its <a href="http://www.ftc.gov/os/2010/12/101201privacyreport.pdf">draft privacy report</a> last year.</p>
<p><a name="pii_leakage_footnote_2">[2]</a> Some third parties encode the referring URL into their Request-URI.</p>
<p><a name="pii_leakage_footnote_3">[3]</a> A username isn&#8217;t, of course, all a third party has to go on. IP geolocation is another trivial source of information, and can help disambiguate when several individuals use similar usernames. How many Jonathan Mayers are there in Palo Alto, CA? Using the Stanford University network? This is a possible area for future research.</p>
<p><a name="pii_leakage_footnote_4">[4]</a> While it is quite clear that in practice a username can often be used to discern a user&#8217;s identity, confirmatory empirical research would be valuable.</p>
<p><a name="pii_leakage_footnote_5">[5]</a> We used a fictional persona with unique biographical traits to minimize false positives.</p>
<p><a name="pii_leakage_footnote_6">[6]</a> For readers who engage in detail with our data, we wish to emphasize several caveats to our methodology.</p>
<ul>
<li>We did not study &ndash; and cannot study &ndash; what companies do when they receive personal information. It is likely that many of the information leaks we identified were logged. Some third parties may take precautions to prevent logging of identifying information, and we certainly laud such efforts. But for policy purposes, there is a tremendous difference between a tracking ecosystem that is anonymous and a tracking ecosystem that is suffused with identity but promises to ignore it.</li>
<li>Since some websites host content from multiple PS+1s (e.g. amazon.com and amazonaws.com), our definition of a third party introduces some false positives. That said, our findings appear to be quite robust. For example, thresholding for leakage at more than three third parties still leaves 84 websites (45%) leaking a username or user ID.</li>
<li>We did not examine <a href="http://en.wikipedia.org/wiki/POST_(HTTP)">POST request</a> bodies or cookies, nor did we attempt to identify obfuscated or encrypted personal information.</li>
<li>Our interaction with websites was neither comprehensive nor representative of what the average user might do. We may have missed information leaks, and some of the information leaks we identified may have affected only a minority of users.</li>
<li>In the course of a user&#8217;s browsing, identifying information for other users might leak. We did not gauge how easily a third party could identify which information was the user&#8217;s. In most cases it appeared such a determination would be straightforward.</li>
<li>The regular expressions we used for matching birth year, birthday, gender, and last name had a not insignificant number of false positives. We recommend against relying solely upon those fields.</li>
<li>We did not explicitly take note of which stage of signup a leak occurred at.</li>
<li>We did not use a <a href="http://en.wikipedia.org/wiki/Single_sign-on">single sign-on (SSO) provider</a> unless required. Where an SSO was mandatory, we manually labeled PS+1s associated with the SSO provider as first-party. Measuring information leakage when SSOs are used is a promising avenue for future research.</li>
<li>We did not attempt to discover third parties that have been <a href="http://en.wikipedia.org/wiki/CNAME_record">CNAME</a>d into a first-party PS+1 (dubbed &#8220;hidden third-parties&#8221; in some papers).</li>
</ul>
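<p>As an illustration only (this is not the code used for the study), the kind of check these caveats describe can be sketched in a few lines of Python. The function names, the two-entry suffix set standing in for the full Public Suffix List, and the plain substring match are all assumptions made for brevity.</p>

```python
from urllib.parse import urlsplit, unquote

# Hypothetical stand-in for the Public Suffix List; a real analysis
# would load the full list published at publicsuffix.org.
MULTI_LABEL_SUFFIXES = {"co.uk", "com.au"}


def ps_plus_one(host):
    """Return the public suffix plus one label, e.g. 'example.com'."""
    labels = host.lower().rstrip(".").split(".")
    suffix_len = 2 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 1
    return ".".join(labels[-(suffix_len + 1):])


def leaks_identifier(page_url, request_url, referer, identifier):
    """True if a third-party request exposes the identifier in its
    Request-URI or Referer header. POST bodies and cookies are ignored."""
    first_party = ps_plus_one(urlsplit(page_url).hostname)
    third_party = ps_plus_one(urlsplit(request_url).hostname)
    if first_party == third_party:
        return False  # same PS+1, so treated as first-party content
    haystack = (unquote(request_url) + " " + unquote(referer)).lower()
    return identifier.lower() in haystack
```

<p>A substring match like this one is prone to the same false positives noted above for short or common values, and it cannot see obfuscated or encrypted identifiers.</p>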
<p><a name="pii_leakage_footnote_7">[7]</a> User IDs were, in our testing, almost always sufficient to locate at least a username, and sometimes additional identifying information. For example, with a <a href="http://www.causes.com">Causes.com</a> user ID, anyone can attain a link to a user&#8217;s Facebook profile &ndash; which in turn provides a name, photo, and possibly more.</p>
<p><a name="pii_leakage_footnote_8">[8]</a> Please note: we are not claiming any company has breached its self-regulatory commitments. The <a href="http://aboutads.info">Digital Advertising Alliance (DAA)</a> online advertising self-regulation imposes lax restrictions on personally identifiable information. First, personally identifiable information is defined to only include information that is <i>used</i> to identify a user.</p>
<blockquote>
<p>Personally Identifiable Information is information about a specific individual including name, address, telephone number, and email address&mdash;when used to identify a particular individual.</p>
</blockquote>
<p>Second, the DAA principles require only that a company note its use of PII in a privacy policy and obtain consent before retroactively applying a changed policy to PII collected before the change.</p>
<blockquote>
<p>PII is a term used primarily in two areas in the Principles and Commentary. First, PII is used in the Transparency principle so that consumers are informed specifically about the collection and use of PII for Online Behavioral Advertising purposes. Second, PII is used in this Commentary to describe a specific example of a &#8220;material&#8221; change that would require Consent from the consumer under Principle V.</p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/10/11/tracking-the-trackers-where-everybody-knows-your-username/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tracking the Trackers: Self-Help Tools</title>
		<link>http://webpolicy.org/2011/09/13/tracking-the-trackers-self-help-tools/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=tracking-the-trackers-self-help-tools</link>
		<comments>http://webpolicy.org/2011/09/13/tracking-the-trackers-self-help-tools/#comments</comments>
		<pubDate>Tue, 13 Sep 2011 10:35:29 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Measurement]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=87</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. A number of technologies have been touted to offer consumers control over third-party web tracking. This post reviews the tools that are available and presents empirical evidence on their effectiveness. Here are the key takeaways: Most desktop browsers currently do not support effective self-help tools. Mobile [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6730">Stanford Center for Internet and Society</a>.</em></p>
<p>A number of technologies have been touted to offer consumers control over third-party web tracking. This post reviews the tools that are available and presents empirical evidence on their effectiveness. Here are the key takeaways:</p>
<ol>
<li>Most desktop browsers currently do not support effective self-help tools. Mobile users are almost completely out of luck.</li>
<li>Self-help tools vary substantially in performance.</li>
<li>The most effective self-help tools block third-party advertising.</li>
</ol>
<p>Following the usage model in the FTC staff&#8217;s <a href="http://www.ftc.gov/os/2010/12/101201privacyreport.pdf">2010 preliminary online privacy report</a>, this post is oriented towards the user who wants a simple, persistent, comprehensive solution such that with high confidence no third party collects her browsing history. We assume that some third-party trackers will use non-cookie tracking methods including <a href="http://samy.pl/evercookie/">supercookies</a> and <a href="http://panopticlick.eff.org/">fingerprinting</a> (e.g. <a href="http://cyberlaw.stanford.edu/node/6715">Microsoft</a>, <a href="http://ashkansoltani.org/docs/respawn_redux.html">KISSmetrics</a>, <a href="http://cyberlaw.stanford.edu/node/6695">Epic Marketplace</a>, <a href="http://www.bluecava.com/">BlueCava</a>, <a href="http://cseweb.ucsd.edu/~hovav/dist/history.pdf">Interclick</a>, <a href="http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1446862">Quantcast</a>).</p>
<p>Thanks to Jovanni Hernandez and Akshay Jagadeesh for assisting with data collection, and to <a href="http://randomwalker.info">Arvind Narayanan</a> and <a href="https://www.eff.org/about/staff/peter-eckersley">Peter Eckersley</a> for input on drafts.</p>
<p><span id="more-87"></span></p>
<p><b>Self-Regulatory &#8220;Opt-Out&#8221; Cookies</b></p>
<p>For over a decade a minority of third-party trackers, most prominently the members of the self-regulatory <a href="http://networkadvertising.org/">Network Advertising Initiative</a> (NAI) and <a href="http://aboutads.info">Digital Advertising Alliance</a> (DAA), have offered users the ability to set an &#8220;opt-out&#8221; cookie.</p>
<p>As a technological matter, cookies are a poor mechanism for storing persistent and comprehensive user preferences. Users often delete their cookies, wiping out &#8220;opt outs.&#8221; Cookies can expire (e.g. <a href="http://www.ftc.gov/opa/2011/03/chitika.shtm">Chitika&#8217;s &#8220;opt-out&#8221; cookie</a>). And because each &#8220;opt-out&#8221; cookie is scoped to a specific domain, a user has to periodically install cookies for companies that have newly begun to offer an &#8220;opt out.&#8221; The user experience for setting &#8220;opt-out&#8221; cookies is unnecessarily arduous (see <a href="http://www.aboutads.info/">the DAA &#8220;opt-out&#8221; page</a>), and in some cases &#8220;opt-out&#8221; cookies <a href="http://www.truste.com/blog/2011/08/22/what%E2%80%99s-under-the-hood-not-all-opt-outs-are-created-equal/">aren&#8217;t even set correctly</a>. That said, there are several browser extensions that significantly improve the usability and persistence of &#8220;opt-out&#8221; cookies (e.g. <a href="https://addons.mozilla.org/en-US/firefox/addon/targeted-advertising-cookie-op/">Abine Taco</a>, <a href="https://addons.mozilla.org/en-US/firefox/addon/beef-taco-targeted-advertising/">Beef Taco</a>, <a href="https://chrome.google.com/webstore/detail/hhnjdplhmcnkiecampfdgfjilccfpfoe">Google Keep My Opt Outs</a>, <a href="http://www.networkadvertising.org/managing/protector_license.asp">NAI Consumer Opt Out Protector</a>, and <a href="https://chrome.google.com/webstore/detail/eoibfeagdaaoimfpfalgbmmegagdconp">PrivacyChoice Keep More Opt Outs</a>).</p>
<p>You may have noticed that &#8220;opt out&#8221; is scrupulously placed in quotes throughout this discussion. That&#8217;s because, setting aside technical issues, <b>&#8220;opt-out&#8221; cookies don&#8217;t actually opt users out of tracking</b>. As we explained in <a href="http://cyberlaw.stanford.edu/node/6694">an earlier post</a>, &#8220;opt-out&#8221; cookies only opt users out of seeing ads based on tracking&mdash;not tracking itself. And as we showed in <a href="http://cyberlaw.stanford.edu/node/6714">a later post on the DAA&#8217;s self-regulatory icon initiative</a>, both the NAI and DAA use slippery, deceptive language in describing their &#8220;opt-out&#8221; programs.<sup><a href="#blocking_footnote_1">1</a></sup></p>
<p><b>Do Not Track</b></p>
<p><a href="http://donottrack.us">Do Not Track</a> uses an HTTP header to signal a user&#8217;s preference to opt out of third-party tracking. Browsers have been <a href="http://online.wsj.com/article/SB10001424052748703551304576261272308358858.html">quick to adopt the proposal</a>, user adoption is skyrocketing (<a href="http://blog.mozilla.com/metrics/2011/09/08/understanding-dnt-adoption-within-firefox/">1</a>, <a href="http://paidcontent.org/article/419-new-study-shows-use-of-do-not-track-is-on-the-rise/">2</a>), and <a href="http://fourthparty.info">tools are under development for detecting violations</a>. But, for the moment, most tracking companies steadfastly refuse to comply. We believe Do Not Track is the right way to provide consumer choice on third-party tracking (learn more at <a href="http://donottrack.us">DoNotTrack.Us</a>), and we recommend users enable the feature to send a signal to regulators, legislators, and tracking companies. While we are pleased with Do Not Track&#8217;s progress, convincing stakeholders to adopt the proposal is a lengthy process. In the interim, users must look elsewhere for effective protection against third-party tracking.</p>
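<p>For readers curious about the mechanics, the signal itself is just a request header, DNT: 1. The short Python sketch below shows a client attaching the header and a server-side check for it. The function names are hypothetical, and the example deliberately makes no network request.</p>

```python
from urllib.request import Request


def with_do_not_track(url):
    """Build a request that carries the Do Not Track header (DNT: 1)."""
    return Request(url, headers={"DNT": "1"})


def prefers_no_tracking(headers):
    """Server-side check: did the client send DNT: 1?

    Header names are case-insensitive, so normalize before looking up.
    """
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


request = with_do_not_track("http://example.com/")
print(prefers_no_tracking(dict(request.header_items())))
```

<p>Detecting the header is the easy part. Whether a third party changes its behavior on receiving it is a matter of policy, not technology.</p>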
<p><b>Browser Profile Clearing</b></p>
<p>Users are often advised to regularly clear their cookies, cache, history, and other browser profile settings to prevent third-party tracking. There are several reasons this approach does not adequately protect users.</p>
<p>First, many third-party tracking methods continue to work. Tracking techniques that do not require storing state in the browser are wholly unaffected. As for stateful tracking, the user must play Whac-A-Mole with third parties. To remove <a href="http://en.wikipedia.org/wiki/HTTP_ETag#Tracking_using_ETags">ETag cookies</a>, the user must clear the browser&#8217;s cache. To remove <a href="http://en.wikipedia.org/wiki/Flash_cookie">Flash cookies</a>, she has to independently clear her Flash plugin data. In short: the user has to scrub <i>anyplace</i> the browser or a plugin can store state.</p>
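<p>To see why clearing cookies alone is not enough, consider a toy simulation of ETag tracking. It is a simplified model rather than any tracker&#8217;s actual code, and the class and method names are hypothetical: the server hands out a unique ETag as a cache validator, and the browser dutifully echoes it back in If-None-Match on later requests.</p>

```python
import uuid


class EtagTracker:
    """Toy third-party server that uses the ETag validator as an identifier."""

    def __init__(self):
        self.visits = {}  # identifier -> pages on which the resource loaded

    def handle(self, request_headers, page):
        # A browser holding the resource in cache echoes its ETag here.
        tag = request_headers.get("If-None-Match")
        if tag is None:
            tag = '"' + uuid.uuid4().hex + '"'  # first sight: mint an identifier
        self.visits.setdefault(tag, []).append(page)
        return {"ETag": tag}


class Browser:
    """Toy browser with no cookie store at all, only an HTTP cache."""

    def __init__(self):
        self.cached_etag = None

    def load(self, tracker, page):
        headers = {}
        if self.cached_etag is not None:
            headers["If-None-Match"] = self.cached_etag
        self.cached_etag = tracker.handle(headers, page)["ETag"]

    def clear_cache(self):
        self.cached_etag = None
```

<p>Until clear_cache is called, every page load in this model is linked to the same identifier, even though the browser never stores a cookie.</p>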
<p>Second, clearing the browser profile only provides periodic protection. In the intervals between when a user clears his settings, every tracking method works.</p>
<p>Third, clearing the browser profile undermines beneficial functionality. Many of the lost features result in significant annoyances (e.g. stored logins and browsing history). Some even introduce security vulnerabilities (e.g. stored authentication tokens and <a href="http://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security">HTTP Strict Transport Security</a>).</p>
<p>Last, as a practical matter, clearing the browser profile is an unworkable solution. The average user cannot, and should not, reasonably be expected to diligently vacuum her browser on a monthly basis&mdash;let alone every week or every day.<sup><a href="#blocking_footnote_2">2</a></sup></p>
<p><b>Private Browsing Mode</b></p>
<p>While <a href="http://crypto.stanford.edu/~dabo/pubs/abstracts/privatebrowsing.html">implementation specifics vary by browser</a>, private browsing modes share a common goal: eliminate evidence of browsing that resides on the computer. To a first approximation private browsing modes function the same as clearing the browser profile, except the user proactively declares a session to be private (automatically clearing profile changes when the session ends) instead of retroactively clearing the profile. Private browsing has the very same shortcomings as clearing the browser profile: it does not stop all tracking methods, it provides only periodic protection (the user can be tracked within a private browsing session), beneficial web functionality breaks, and as a practical matter a user will not adjust a setting every time his browser launches.</p>
<p><b>Third-Party Cookie Blocking</b></p>
<p>All the major web browsers include an option to prevent third-party domains from setting cookies. Because cookies are just one of many ways third parties track users, third-party cookie blocking provides limited protection. And unless a browser blocks third-party cookies from being read,<sup><a href="#blocking_footnote_3">3</a></sup> clicking a tracker&#8217;s ad or visiting a tracker&#8217;s website (e.g. Facebook or Google) once is enough to set an indefinite tracking cookie.<sup><a href="#blocking_footnote_4">4</a></sup></p>
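<p>The distinction between blocking cookies from being set and blocking them from being read can be illustrated with a toy cookie store. This is a simplified model, not any browser&#8217;s implementation, and the class and parameter names are hypothetical.</p>

```python
class CookieJar:
    """Toy cookie store with separate third-party 'set' and 'read' switches."""

    def __init__(self, block_set, block_read):
        self.block_set = block_set
        self.block_read = block_read
        self.cookies = {}  # domain -> cookie value

    def set_cookie(self, domain, value, third_party):
        if third_party and self.block_set:
            return False  # refused: third parties may not set cookies
        self.cookies[domain] = value
        return True

    def read_cookie(self, domain, third_party):
        if third_party and self.block_read:
            return None  # withheld: third parties may not read cookies
        return self.cookies.get(domain)


# Blocking only the setting of third-party cookies:
jar = CookieJar(block_set=True, block_read=False)
jar.set_cookie("tracker.example", "id-123", third_party=True)   # refused
jar.set_cookie("tracker.example", "id-123", third_party=False)  # one first-party visit
print(jar.read_cookie("tracker.example", third_party=True))     # prints id-123
```

<p>With block_read also set to True, the final read returns None: the cookie set during the first-party visit is never exposed in a third-party context.</p>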
<p><b>Targeted Cookie Blocking</b></p>
<p>Internet Explorer, Firefox, Chrome, and several browser extensions offer the ability to prevent cookies from certain domains from being read or set. Just like third-party cookie blocking, this approach does not mitigate non-cookie tracking technologies. It also largely eliminates interactive functionality on websites that are both a first party and a third-party tracker (e.g. Facebook or Google).</p>
<p><b>Execution Blocking</b></p>
<p>A number of tools are available for preventing the execution of JavaScript (e.g. <a href="https://addons.mozilla.org/en-US/firefox/addon/noscript/">NoScript</a>), Flash (e.g. <a href="https://addons.mozilla.org/en-US/firefox/addon/flashblock/">Flashblock</a>), and other script content that could be used for tracking. While there are many other reasons to use these tools (including security, speed, and power consumption), they only mitigate a subset of tracking mechanisms.</p>
<p><b>Content Blocking</b></p>
<p>[Updated 9/14 to include a note on Request Policy. Thanks to <a href="http://josephhall.org/">Joe Hall</a> for the suggestion.]</p>
<p>Because of the myriad methods for tracking, many privacy tools focus on preventing the browser from even requesting certain third-party content. While content blocking can effectively prevent third-party tracking, a content blocking tool is only as effective as its list of rules on what to block (often called a &#8220;blocklist&#8221;). Most content blocking tools consist of nothing more than a regularly updated blocklist (or family of blocklists), in either <a href="http://adblockplus.org/en/filters">Adblock Plus</a> or <a href="http://www.w3.org/Submission/web-tracking-protection/">Tracking Protection List</a> format. <a href="https://www.requestpolicy.com/">Request Policy</a>, a Firefox extension, takes the opposite approach: all requests to third-party domains are blocked, save those the user explicitly allows. While Request Policy offers nearly comprehensive protection from third-party tracking, properly configuring it requires substantially greater patience and expertise than the average user can reasonably be expected to possess.</p>
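<p>The two approaches differ only in their default. The Python sketch below reduces both to simple domain matching. That is a simplifying assumption: real Adblock Plus filters and Tracking Protection Lists support much richer rules, and the function names here are hypothetical.</p>

```python
def host_matches(host, domain):
    """True if host equals domain or is a subdomain of it."""
    return host == domain or host.endswith("." + domain)


def blocklist_allows(host, blocklist):
    """Blocklist model: allow every request except those to listed domains."""
    return not any(host_matches(host, d) for d in blocklist)


def default_deny_allows(host, first_party, approved):
    """Default-deny model: block every third-party host the user has not approved."""
    if host_matches(host, first_party):
        return True
    return any(host_matches(host, d) for d in approved)
```

<p>A tracker missing from the blocklist slips through the first model but not the second, which is why a blocklist is only as good as its maintenance, and why the default-deny model demands so much more of the user.</p>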
<p><b>Please note: Chrome, Safari, Mobile Safari, and the Android browser DO NOT presently support content blocking.</b><sup><a href="#blocking_footnote_5">5</a></sup> Firefox extensions are able to block content, and users can install blocklists in Internet Explorer 9.</p>
<p><b>Effectiveness Measurement</b></p>
<p>We conducted a study of the effectiveness of twelve web privacy tools at mitigating third-party web tracking. <b>Please note: several of the blocklists we studied in Adblock Plus format are also available in the less expressive Tracking Protection List format. The change in formats may impact performance.</b></p>
<ul>
<li><a href="http://ie.microsoft.com/testdrive/Browser/TrackingProtectionLists/">Abine Tracking Protection List</a></li>
<blockquote>
<p>Abine&#8217;s Tracking Protection List blocks many online advertising and marketing technologies that can track and profile you as you browse the Web. This list is updated weekly to keep you safer and more private.</p>
</blockquote>
<p>In our initial testing, the Abine list performed very poorly; manually inspecting the list we identified several typos. We called our findings to Abine&#8217;s attention, and the company responded with an updated list. We present below our findings on both the original and updated Abine lists.</p>
<li><a href="http://adversity.uk.to/">Adversity Ads + Privacy + Antisocial Adblock Plus lists</a></li>
<li><a href="https://easylist.adblockplus.org/en/">EasyList Adblock Plus list</a></li>
<blockquote>
<p>EasyList is the primary subscription that removes adverts from English webpages, including unwanted frames, images and objects. It is the most popular list for Adblock Plus, with over 7 million daily users, and forms the basis of over a dozen combination and supplementary subscriptions.</p>
</blockquote>
<li><a href="https://easylist.adblockplus.org/en/">EasyPrivacy Adblock Plus list</a></li>
<blockquote>
<p>EasyPrivacy is an optional supplementary subscription that completely removes all forms of tracking from the internet, including web bugs, tracking scripts and information collectors, thereby protecting your personal data.</p>
</blockquote>
<li><a href="https://easylist.adblockplus.org/en/">EasyList + EasyPrivacy Adblock Plus lists</a></li>
<li><a href="http://www.fanboy.co.nz/">Fanboy&#8217;s List Ads + Tracking + Annoyance Adblock Plus lists</a></li>
<li><a href="http://www.ghostery.com/">Ghostery browser extension</a> (configured to block all trackers, &#8220;experimental&#8221; cookie blocking not enabled)</li>
<blockquote>
<p>Ghostery allows you to block scripts from companies that you don&#8217;t trust, delete local shared objects, and even block images and iframes. Ghostery puts your web privacy back in your hands.</p>
</blockquote>
<li><a href="http://ie.microsoft.com/testdrive/Browser/TrackingProtectionLists/">PrivacyChoice 1 Tracking Protection List</a></li>
<li><a href="http://ie.microsoft.com/testdrive/Browser/TrackingProtectionLists/">PrivacyChoice 2 Tracking Protection List</a></li>
<blockquote>
<p>PrivacyChoice maintains a comprehensive database of tracking companies, including domains used by nearly 300 ad networks and platforms, tracking methods, summaries of key policies, oversight, and opt-out and opt-in processes. PrivacyChoice has created Tracking Protection Lists based on this data. You have the option of installing two lists. The first list blocks companies that are not subject to oversight by the NAI and the second list blocks all tracking company domains in the PrivacyChoice database. These lists will be automatically updated with new tracking domains discovered through continuous website scanning and user panels.</p>
</blockquote>
<li><a href="https://addons.mozilla.org/en-US/firefox/addon/trackerblock/">PrivacyChoice TrackerBlock browser extension</a> (configured to block all trackers, opt-out cookies not enabled)</li>
<blockquote>
<p>Complete control over online tracking using multiple methods, including cookie blocking, persistent opt-out cookies, Flash and HTML5 control, and Do Not Track signals.</p>
</blockquote>
<li><a href="http://ie.microsoft.com/testdrive/Browser/TrackingProtectionLists/">TRUSTe Tracking Protection List</a></li>
<blockquote>
<p>TRUSTe is the leading online privacy certification and services provider. TRUSTe&#8217;s TRUSTed Tracking Protection List enables relevant and targeted ads from companies that demonstrate respectful consumer privacy practices and comply with TRUSTe&#8217;s high standards and direct oversight. TRUSTe helps users get good ads, without compromising personal privacy.</p>
</blockquote>
</ul>
<p><b>Effectiveness Measurement &#8211; Methodology</b></p>
<p>For each blocking tool we conducted a crawl of the <a href="http://www.alexa.com/topsites/countries/US">Alexa U.S. top five hundred websites</a> using the <a href="http://fourthparty.info/">FourthParty web measurement platform</a>. To ensure broad coverage of third parties we crawled the list three times in series, and to provide fresh browser state for each page load we cycled private browsing mode off and on. We also conducted a baseline crawl for comparison. Our crawl data is available on request.</p>
<p>We compiled three measurements with each blocking tool:</p>
<p><b>HTTP Requests.</b> The number of crawled pages on which each domain (<a href="http://publicsuffix.org/">public suffix + 1</a>) receives at least one HTTP request. Almost all third-party web content is served using HTTP, so there are likely few if any false negatives. But this measurement includes false positives: some resources are served from a third party that does not track. For example, the <a href="http://code.google.com/apis/libraries/devguide.html">Google Libraries API</a> (googleapis.com) serves static content and instructs the browser to cache it for a year.</p>
<p><b>HTTP Set-Cookie Responses.</b> The number of crawled pages on which each domain (<a href="http://publicsuffix.org/">public suffix + 1</a>) sends at least one HTTP response that includes a Set-Cookie header. This metric has some false negatives since it includes neither trackers that do not set cookies over HTTP nor trackers that set their cookies in a first-party context (e.g. Twitter). There are few false positives since in almost all cases if a web service wants to preserve state across multiple sites it will just use a unique identifier.</p>
<p><b>Cookies Added &#8211; Cookies Deleted.</b> The number of cookies added less the number of cookies deleted by each domain (<a href="http://en.wikipedia.org/wiki/Fully_qualified_domain_name">fully qualified domain name</a>). Measuring the difference between cookies added and deleted neglects trackers that do not use cookies or set cookies only as a first party, and is overinclusive of first-party sites that set a large number of cookies. Scripts can behave erratically when a browser blocks content, introducing significant noise into this measurement. We include it as a rough benchmark for cookie blocking tools.</p>
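<p>The first two measurements share one shape: for each domain, count the crawled pages with at least one matching event. A minimal Python sketch follows. It assumes the crawl log has already been reduced to (page, domain) pairs with each domain expressed as public suffix + 1; the function name and record layout are hypothetical and do not reflect FourthParty&#8217;s actual schema.</p>

```python
from collections import defaultdict


def pages_per_domain(records):
    """Count, for each domain, the crawled pages with at least one record.

    records: iterable of (page, domain) pairs, one per observed HTTP request
    (or per response carrying a Set-Cookie header).
    """
    pages = defaultdict(set)
    for page, domain in records:
        pages[domain].add(page)  # a set, so repeats on one page count once
    return {domain: len(seen) for domain, seen in pages.items()}
```

<p>Counting pages rather than raw requests keeps a single chatty page from inflating a domain&#8217;s score.</p>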
<p><b>Effectiveness Measurement &#8211; Results</b></p>
<p>The following graph reports the average across all tracking domains of the relative difference in each measurement.<sup><a href="#blocking_footnote_6">6</a></sup> We encourage interested users to examine <a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/tpl_study/results.xlsx">the complete spreadsheet of measurements</a>.</p>
<p><a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/tpl_study/results_graph.png"><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/tpl_study/results_graph.png" width="700px" /></a></p>
<p>Some observations from inspecting the tools and analyzing the crawl data:</p>
<ul>
<li>Self-help tools vary significantly in their effectiveness. Some (especially the Tracking Protection Lists from Abine before 9/6 and from TRUSTe) offer very little protection.<sup><a href="#blocking_footnote_7">7</a></sup> No tool is comprehensive.</li>
<li>Installing multiple self-help tools can <i>decrease</i> user privacy. Despite receiving <a href="http://www.which.co.uk/news/2011/03/ie-9s-anti-tracking-feature-flawed-study-finds-247480/">widespread negative press</a> for the practice, TRUSTe continues to serve a Tracking Protection List that overrides other lists to <i>allow</i> tracking by BlueKai, comScore, Scorecard Research, and others.</li>
<li>Some websites depend on the presence of certain third-party scripts (e.g. <a href="https://adblockplus.org/forum/viewtopic.php?t=4160">the Google Analytics ga.js or urchin.js</a>). Ghostery cleverly circumvents this issue by replacing several popular scripts with dummy stand-ins. (See also <a href="http://hackademix.net/2009/01/25/surrogate-scripts-vs-google-analytics/">NoScript surrogate scripts</a>.) It may be worthwhile to add support for dummy scripts to blocklist formats.</li>
<li>The top performers (EasyList + EasyPrivacy and Fanboy&#8217;s List Ads + Tracking + Annoyance) are community-maintained blocklists.</li>
<li>Both top performers require installing more than one blocklist.</li>
<li>All the top and near-top performers (EasyList + EasyPrivacy, Fanboy&#8217;s List Ads + Tracking + Annoyance, Ghostery, and Adversity Ads + Privacy + Antisocial) block third-party advertising. This result should come as little surprise: third-party tracking is often inextricably commingled with third-party advertising.</li>
<li>Most self-help tools do a poor job of blocking social plugins, even from the most popular social networks and sharing platforms.</li>
</ul>
<p><b>Policy Implications</b></p>
<p>In the debates surrounding online privacy, many tracking companies have assumed that if they can hold out against Do Not Track, their business practices will continue. That&#8217;s not necessarily the case. Some users will turn to the next-best alternative, and we now know what that is: ad blocking. Internet Explorer 9 already supports ad blocking with two clicks. Representatives from Mozilla have repeatedly delivered the ultimatum that if effective regulation or self-regulation does not occur, Firefox will provide users with self-help tools. W3C is working to standardize a blocklist format. The extent to which users adopt ad blocking will, of course, depend on usability, advocacy, and much more. But it likely won&#8217;t take much persuading: <a href="http://www.forrester.com/rb/Research/consumer_ad-itudes_stay_strong/q/id/58875/t/2">users dislike advertising</a>, and ad blockers are already the most popular extensions for Firefox, Chrome, and Safari. Third parties should not be so hasty to play Russian roulette with the Internet economy. And publishers should not be so willing to let them.</p>
<hr />
<p><a name="blocking_footnote_1">[1]</a> Sometimes even the NAI and DAA member companies misunderstand what the self-regulatory programs require. Here are two examples from Google&#8217;s Keep My Opt Outs tool (<a href="http://googlepublicpolicy.blogspot.com/2011/01/keep-your-opt-outs.html">1</a>, <a href="https://chrome.google.com/webstore/detail/hhnjdplhmcnkiecampfdgfjilccfpfoe">2</a>):</p>
<blockquote>
<p>Today we&#8217;re making available Keep My Opt-Outs, which enables you to opt out permanently from ad tracking cookies.</p>
</blockquote>
<blockquote>
<p><b>Will this persistently opt me out of every cookie on the web?</b></p>
<p>No, this will not opt you out of cookies that are not related to personalized online ads.</p>
</blockquote>
<p><a name="blocking_footnote_2">[2]</a> Some browsers offer options for clearing components of the browser profile on exit. These options may somewhat mitigate usability issues with regularly cleaning the profile.</p>
<p><a name="blocking_footnote_3">[3]</a> Third-party cookie blocking in Internet Explorer, Chrome, and Safari only prevents cookies from being set, not read. Chrome does provide a separate &#8220;experimental&#8221; option in about:flags that prevents third parties from reading cookies. Firefox&#8217;s third-party cookie blocking <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=417800">prevents both setting and reading cookies</a>.</p>
<p><a name="blocking_footnote_4">[4]</a> There may also be trivial ways to circumvent third-party cookie blocking. In Safari, for example, a redirect through a domain or a POST to a domain will allow setting cookies.</p>
<p><a name="blocking_footnote_5">[5]</a> The <a href="http://developer.apple.com/library/safari/#documentation/Tools/Conceptual/SafariExtensionGuide/MessagesandProxies/MessagesandProxies.html#//apple_ref/doc/uid/TP40009977-CH14-SW1">blocking API</a> in WebKit (used in Chrome and Safari extensions) has a number of shortcomings. First, <a href="https://bugs.webkit.org/show_bug.cgi?id=52581">it doesn&#8217;t prevent network requests, just loading content into the DOM</a>. Second, <a href="https://bugs.webkit.org/show_bug.cgi?id=52577">the API doesn&#8217;t allow blocking for all HTTP requests</a>. Last, to support even modestly comprehensive blocking without a significant performance impact, <a href="http://code.google.com/p/chromium/issues/detail?id=54257">it requires a synchronous message passing feature that Chrome lacks</a>. A more comprehensive blocking API for Chrome <a href="http://code.google.com/p/chromium/issues/detail?id=60101">is currently under development</a> with no set release date. A <a href="http://code.google.com/p/chromium/issues/detail?id=16932">previous effort towards a Chrome blocklist feature</a> was cancelled after six months of development.</p>
<p>Android users can block third-party web (but not app) content by running Firefox with Adblock Plus.</p>
<p><a name="blocking_footnote_6">[6]</a> We treated a domain as a third-party tracking domain if its metric value was greater than six in the baseline crawl. In other words, we considered a domain to be a third-party tracker if it, to a first approximation, consistently appeared on more than two sites. We found our results quite robust against changing the threshold value for considering a domain a third-party tracker.</p>
<p>To conserve space, the graph above does not show values below zero.</p>
<p><a name="blocking_footnote_7">[7]</a> We did not conduct a crawl with the <a href="http://www.enhancedprivacy.eu/">Enhanced Privacy Tracking Protection List</a>, though a cursory inspection of <a href="http://www.enhancedprivacy.eu/tpl/enhancedprivacy.tpl">the list</a> revealed that it blocks very few third-party trackers.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/09/13/tracking-the-trackers-self-help-tools/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tracking the Trackers: Microsoft Advertising</title>
		<link>http://webpolicy.org/2011/08/18/tracking-the-trackers-microsoft-advertising/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=tracking-the-trackers-microsoft-advertising</link>
		<comments>http://webpolicy.org/2011/08/18/tracking-the-trackers-microsoft-advertising/#comments</comments>
		<pubDate>Thu, 18 Aug 2011 09:56:06 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Measurement]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=84</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Despite all the attention they&#8217;ve received in the debates around online privacy, cookies are far from the only way to track a user. Broadly speaking, a website can either stash a unique identifier anyplace in the browser (&#8220;tagging&#8221;)1 or explore features of the browser until it [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6715">Stanford Center for Internet and Society</a>.</em></p>
<p>Despite all the attention they&#8217;ve received in the debates around online privacy, cookies are far from the only way to track a user. Broadly speaking, a website can either stash a unique identifier anyplace in the browser (&#8220;tagging&#8221;)<sup><a href="#microsoft_footnote_1">1</a></sup> or explore features of the browser until it becomes unique (&#8220;fingerprinting&#8221;).<sup><a href="#microsoft_footnote_2">2</a></sup> Tracking technologies that do not rely on cookies are often referred to as &#8220;supercookies,&#8221; and they are widely viewed as unsavory in the computer security community because they continue tracking even when a user clears her cookies to preserve privacy. Sometimes a site will use a supercookie to &#8220;respawn&#8221; its original identifier cookie, creating a &#8220;zombie cookie&#8221; &mdash; the basis of <a href="http://www.wired.com/epicenter/2010/12/zombie-cookie-settlement/">several</a> <a href="http://www.mediapost.com/publications/?fa=Articles.showArticle&amp;art_aid=155032">lawsuits</a>.</p>
<p>In one of our recent <a href="http://fourthparty.info">FourthParty</a> web measurement <a href="http://fourthparty.info/data">crawls</a> we included a cookie clearing step to emulate a user&#8217;s privacy choice. We observed that after clearing the browser&#8217;s cookies an identifier cookie (named &#8220;MUID&#8221; for &#8220;machine unique identifier&#8221;) respawned on <span style="font-family:courier;">live.com</span>, a Microsoft domain. We dug into Microsoft&#8217;s cross-domain cookie syncing code and discovered two independent supercookie mechanisms, one of which was respawning cookies. We contacted Microsoft with our observations, and we have collaborated to assist in rectifying the issues we uncovered. Here is what we know.</p>
<p>Thanks, once again, to Jovanni Hernandez and Akshay Jagadeesh for their indispensable research assistance.</p>
<p><span id="more-84"></span></p>
<p><b>Microsoft&#8217;s cookie syncing script would, in some cases, function as a cache cookie and respawn the MUID cookie.</b></p>
<p>One of the foundational concepts in web security is the cookie <a href="http://en.wikipedia.org/wiki/Same_origin_policy">same-origin policy</a>: cookies can only be read and modified by the domain that set them. If domains collaborate they can trivially circumvent the same-origin policy and share cookies with each other; this practice is called &#8220;cookie syncing.&#8221; Cookie syncing often raises privacy concerns. For example, in online advertising real-time bidding, <a href="http://www.adopsinsider.com/ad-exchanges/ssp-to-dsp-cookie-synching-explained/">cookie syncing allows a single advertising exchange to notify many advertising networks and data aggregators whenever a user visits a website</a>. That said, there are some unequivocally legitimate use cases for cookie syncing, such as when a company has spread its business across multiple domains (e.g. <span style="font-family:courier;">amazon.com</span> and <span style="font-family:courier;">amazonaws.com</span>).</p>
<p>Microsoft uses cookie syncing to share identifiers across many of its web properties, including <span style="font-family:courier;">bing.com</span>, <span style="font-family:courier;">microsoft.com</span>, <span style="font-family:courier;">msn.com</span>, <span style="font-family:courier;">live.com</span>, and <span style="font-family:courier;">xbox.com</span>. Microsoft also syncs its MUID cookie to <span style="font-family:courier;">atdmt.com</span>, the domain for its <a href="http://www.atlassolutions.com/">Atlas third-party advertising network</a>. We found that one of Microsoft&#8217;s cookie syncing scripts (<span style="font-family:courier;">wlHelper.js</span>) included an instruction to set the MUID cookie, and the script would get cached indefinitely.<sup><a href="#microsoft_footnote_3">3</a></sup> If the cached script ran and no MUID cookie was present, the script would set a cookie with its stored MUID. Here is a slightly simplified example snippet of the relevant code:<sup><a href="#microsoft_footnote_4">4</a></sup></p>
<blockquote><p><span style="font-family:courier;">var id_muid = &#8220;<b>5CBC2F2396F14F4EBA255A695D313CD1</b>&#8221;;</p>
<p>var muidValue = null;</p>
<p>// the MUID cookie value is read into muidValue</p>
<p>&#8230;</p>
<p>if (muidValue == null &amp;&amp; id_muid != null) {</p>
<p>&nbsp;&nbsp;&nbsp;&#8230;</p>
<p>&nbsp;&nbsp;&nbsp;// cookieDomain is set to &#8220;; domain=&#8221; + the current domain</p>
<p>&nbsp;&nbsp;&nbsp;var cookieSettings = cookieDomain + &#8220;; expires=Fri, 01 Jan 2021 00:00:00 GMT; path=/;&#8221;;</p>
<p>&nbsp;&nbsp;&nbsp;document.cookie = &#8220;MUID=&#8221; + id_muid + cookieSettings;</p>
<p>}</span></p>
</blockquote>
<p>We identified <span style="font-family:courier;">wlHelper.js</span> scripts on several Microsoft domains:</p>
<blockquote><p><span style="font-family:courier;">http://analytics.atdmt.com/Scripts/wlHelper.js?i=MUID</p>
<p>http://analytics.live.com/Scripts/wlHelper.js?i=MUID</p>
<p>http://analytics.microsoft.com/Scripts/wlHelper.js?i=MUID</p>
<p>http://analytics.msn.com/Scripts/wlHelper.js?i=MUID</span></p>
</blockquote>
<p>In our <a href="http://fourthparty.info/data">crawling data</a> from the Alexa world top 10,000 sites we found that one or more <span style="font-family:courier;">wlHelper.js</span> scripts were loaded when the browser visited:</p>
<blockquote><p><span style="font-family:courier;">http://www.microsoft.com/en-us/default.aspx</p>
<p>http://www.microsoftstore.com/store/msstore/DisplayHomePage</p>
<p>http://www.msn.com/</p>
<p>http://ca.msn.com/</p>
<p>http://es.msn.com/</span></p>
</blockquote>
<p>A user would have her MUID respawned if she: 1) ever visited a site with a <span style="font-family:courier;">wlHelper.js</span> embedded, 2) cleared her cookies but not her cache, and 3) visited a site with the same <span style="font-family:courier;">wlHelper.js</span> embedded and no MUID. It is difficult to estimate the number of users affected by Microsoft&#8217;s respawning without knowing more about traffic to Microsoft&#8217;s web properties and the conditions under which it would set an MUID. We would note that Microsoft&#8217;s web properties are popular destinations with tens of millions of visitors per day.</p>
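<p>The three conditions above can be modeled with a small simulation. This is only a sketch of the cache cookie behavior we observed, not Microsoft&#8217;s code: the <span style="font-family:courier;">makeBrowser</span> and <span style="font-family:courier;">loadSyncScript</span> helpers, the in-memory cookie jar and cache, and the short identifier values are all hypothetical.</p>

```javascript
// Hypothetical model of a browser profile: a cookie jar plus an HTTP cache.
function makeBrowser() {
  return { cookies: {}, cache: {} };
}

// Simulates loading wlHelper.js on a page. On a cache miss, the (assumed)
// server bakes the current MUID into the script body, as in the snippet above.
function loadSyncScript(browser, serverIssuedMuid) {
  const url = "wlHelper.js?i=MUID";
  if (!(url in browser.cache)) {
    if (!("MUID" in browser.cookies)) {
      browser.cookies.MUID = serverIssuedMuid;
    }
    browser.cache[url] = { id_muid: browser.cookies.MUID };
  }
  // The cached script runs: if no MUID cookie is present, restore the stored one.
  const script = browser.cache[url];
  if (!("MUID" in browser.cookies)) {
    browser.cookies.MUID = script.id_muid;
  }
  return browser.cookies.MUID;
}

const simulatedBrowser = makeBrowser();
const muidBeforeClearing = loadSyncScript(simulatedBrowser, "AAAA"); // 1) first visit
simulatedBrowser.cookies = {};                                       // 2) cookies cleared, cache kept
const muidAfterClearing = loadSyncScript(simulatedBrowser, "BBBB");  // 3) later visit
console.log(muidBeforeClearing === muidAfterClearing);               // true: the old MUID respawned
```

<p>In this model, clearing the cache along with the cookies breaks the link, since the next visit fetches a fresh script carrying a fresh identifier.</p>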
<p>Once a cookie respawned, we often saw it get sent to other Microsoft domains. In the FourthParty data above, for example, the old MUID was passed to <span style="font-family:courier;">atdmt.com</span>.</p>
<blockquote><p><span style="font-family:courier;">http://c.atdmt.com/c.gif?&#8230;&amp;<b>MXFR=5CBC2F2396F14F4EBA255A695D313CD1</b></span></p>
</blockquote>
<p>Microsoft therefore had, in at least this case, sufficient information to trivially associate the user&#8217;s interactions with <span style="font-family:courier;">msn.com</span>, <span style="font-family:courier;">live.com</span>, and <span style="font-family:courier;">atdmt.com</span> from before and after cookie clearing.<sup><a href="#microsoft_footnote_6">6</a></sup></p>
<p><b>Microsoft&#8217;s cookie syncing script included an ETag cookie.</b></p>
<p><a href="http://en.wikipedia.org/wiki/HTTP_ETag">ETags</a> are a simple cache control mechanism built into HTTP. A website can assign a version number to a resource; when the browser requests the resource again and the version hasn&#8217;t changed, the website can just tell the browser to use its cached copy. It had long been known that, instead of storing a version number, an ETag could be used to store a user identifier (an &#8220;ETag cookie&#8221;). Two weeks ago a research team at the University of California, Berkeley discovered <a href="http://ashkansoltani.org/docs/respawn_redux.html">the first instance of ETag cookies in use</a>.</p>
<p>We found that, in addition to functioning as a cache cookie, Microsoft&#8217;s <span style="font-family:courier;">wlHelper.js</span> script was associated with an ETag cookie containing the MUID.</p>
<blockquote><p><span style="font-family:courier;">sqlite&gt; select name, value from cookies where host=&#8217;.live.com&#8217; and name=&#8217;MUID&#8217; limit 1;</p>
<p>MUID	<b>5CBC2F2396F14F4EBA255A695D313CD1</b></span></p>
<p><span style="font-family:courier;">sqlite&gt; select http_response_headers.name, http_response_headers.value from http_responses, http_response_headers where http_responses.id = http_response_headers.http_response_id and http_responses.url=&#8217;http://analytics.atdmt.com/Scripts/wlHelper.js?i=MUID&#8217; and http_response_headers.name=&#8217;Etag&#8217; limit 1;</p>
<p>Etag	&#8220;<b>5CBC2F2396F14F4EBA255A695D313CD1</b>3698&#8243;</span></p>
</blockquote>
<p>The practical effect was that if a user cleared her cookies but not her cache, subsequent requests for <span style="font-family:courier;">wlHelper.js</span> would be accompanied by both the new MUID value (in a cookie) and the old MUID value (in the ETag). This pairing of old and new MUIDs gave Microsoft sufficient information to associate user interactions with its domains from before and after cookie clearing.<sup><a href="#microsoft_footnote_7">7</a></sup></p>
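<p>To illustrate why this pairing matters, here is a minimal sketch of what a server <i>could</i> do with it. We cannot observe Microsoft&#8217;s backend, so the <span style="font-family:courier;">handleRequest</span> function, the request object shape, and the <span style="font-family:courier;">linkLog</span> array are hypothetical. The only detail taken from our data is the ETag layout: the 32-character MUID followed by a short suffix.</p>

```javascript
// Hypothetical server-side log of identifier pairs seen on the same request.
const linkLog = [];

// Assumed request shape: cookieMuid is the current MUID cookie value, and
// ifNoneMatch is the ETag from the cached copy (null when nothing is cached).
function handleRequest(request) {
  const currentMuid = request.cookieMuid;
  const cachedTag = request.ifNoneMatch;
  if (cachedTag !== null) {
    // The observed ETag was the MUID followed by a suffix, so the first
    // 32 characters recover the identifier that was current at caching time.
    const oldMuid = cachedTag.slice(0, 32);
    if (oldMuid !== currentMuid) {
      linkLog.push({ oldMuid: oldMuid, newMuid: currentMuid });
    }
    return { status: 304, etag: cachedTag }; // browser keeps its cached copy
  }
  return { status: 200, etag: currentMuid + "3698" };
}

const observedMuid = "5CBC2F2396F14F4EBA255A695D313CD1"; // value from our crawl
const freshMuid = "0123456789ABCDEF0123456789ABCDEF";    // made-up post-clearing value

const firstResponse = handleRequest({ cookieMuid: observedMuid, ifNoneMatch: null });
// The user clears cookies but not the cache, then requests the script again.
const revisitResponse = handleRequest({ cookieMuid: freshMuid, ifNoneMatch: firstResponse.etag });
console.log(revisitResponse.status, linkLog.length); // 304 1
```

<p>Because the 304 response leaves the cached copy and its ETag in place, the old identifier keeps arriving on every later request until the user clears her cache.</p>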
<p>In addition to supercookies, we spotted two other issues with Microsoft&#8217;s advertising practices.</p>
<p><b>Microsoft&#8217;s targeted advertising opt-out button was invisible in Chrome and Safari.</b></p>
<p>Microsoft operates <a href="http://choice.live.com/Default.aspx">its own advertising choice page</a>. (Note that Microsoft only allows users to opt out of seeing behaviorally targeted advertising; it does not remove its identifier cookies after a user has opted out, nor does it make any promise to stop tracking.) We observed that the opt-out link on Microsoft&#8217;s advertising choice page was invisible for Chrome and Safari users.</p>
<p><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/microsoft_optout.png" /></p>
<p>Microsoft fixed their opt-out button after we called the issue to their attention.</p>
<p><b>Microsoft&#8217;s approach to segregating advertising data does not meaningfully protect user privacy.</b></p>
<p>In a 2007 report entitled &#8220;<a href="http://go.microsoft.com/?linkid=9702232">Privacy Protections in Microsoft&#8217;s Ad Serving System and the Process of &#8216;De-identification</a>,&#8217;&#8221; Microsoft&#8217;s privacy team explains how the company segregates identified Windows Live user information from de-identified third-party advertising data. </p>
<blockquote><p>One of Microsoft&#8217;s goals is to serve targeted ads in a manner that protects user privacy. To avoid using the LiveID cookie to serve per-user ads&mdash;because, as described earlier, it is directly associated with information that could personally identify the user&mdash;Microsoft has created an &#8220;Anonymous&#8221; ID, called the ANID, on which its ad serving capabilities are based.</p>
<p>When a user first registers with Windows Live or MSN, a LiveID and an ANID are created simultaneously. The ANID is derived by applying a one-way cryptographic hash function to the LiveID. A one-way cryptographic hash function ensures that there is no practical way of deriving the original value from the resulting hash value&mdash;that is, the process cannot be reversed to obtain the original number.</p>
</blockquote>
<p>Microsoft makes several expansive claims about its advertising system&#8217;s privacy guarantees.</p>
<blockquote><p>Because all personally and directly identifying information about a user is stored on servers in association with a LiveID rather than an ANID, there is no practical way to link data stored in association with an ANID back to any data on Microsoft servers that could personally and directly identify an individual user.</p>
</blockquote>
<blockquote><p>Furthermore, to associate any of the ANID-based data in the Microsoft ad system with an individual user, an internal or external attacker would not only need access to the ad serving system (to access the data), the Windows Live ID system (to access all LiveIDs ever issued) and the hashing algorithm but would also need a massive computing infrastructure to run the algorithm on each and every LiveID ever created to try to find the ANID in question.</p>
</blockquote>
<p>And in 2008 testimony before the Senate Commerce Committee, attached to a <a href="http://www.ftc.gov/os/comments/privacyroundtable/544506-00020.pdf">2009 comment to the Federal Trade Commission</a>:</p>
<blockquote><p>As a result of this &#8220;deidentification&#8221; process, search query data and data about Web surfing behavior used for ad targeting is associated with an anonymized identifier rather than an account identifier that could be used to personally and directly identify a consumer.</p>
</blockquote>
<p>Microsoft&#8217;s attempt at &#8220;anonymous&#8221; advertising data does not achieve nearly so much. First, as <a href="http://randomwalker.info">Arvind Narayanan</a> noted in <a href="http://cyberlaw.stanford.edu/node/6701">a recent blog post</a>, de-identified online tracking data is far from anonymous. Even using completely unassociated identifiers for Windows Live user information and advertising data would not provide much of a privacy guarantee.<sup><a href="#microsoft_footnote_8">8</a></sup></p>
<p>Second, Microsoft&#8217;s use of a cryptographic hash to generate its ANID cookie contributes little privacy protection. The privacy threat that Microsoft is attempting to mitigate is a comparison between a user&#8217;s ANID and LiveID. Cryptographic hashing does not make comparison of two known values a challenge: on the contrary, comparison is a <i><a href="http://en.wikipedia.org/wiki/Cryptographic_hash_function#Applications">core use case</a></i> for cryptographic hashing. With knowledge of Microsoft&#8217;s hash function, an employee or outsider could trivially compare any LiveID to any ANID.</p>
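<p>A short sketch makes the point concrete. Microsoft has not disclosed its hash function, so SHA-256 (via the Node.js <span style="font-family:courier;">crypto</span> module) is purely a stand-in here, and the <span style="font-family:courier;">deriveAnid</span> and <span style="font-family:courier;">matchesAnid</span> helpers and the example LiveID strings are hypothetical.</p>

```javascript
const crypto = require("crypto");

// Stand-in for the undisclosed hash function (assumption: SHA-256, no secret key).
function deriveAnid(liveId) {
  return crypto.createHash("sha256").update(liveId).digest("hex");
}

// Testing whether an ANID belongs to a known LiveID takes one hash and one comparison.
function matchesAnid(liveId, anid) {
  return deriveAnid(liveId) === anid;
}

const anidFromAdLogs = deriveAnid("alice@example.com"); // hypothetical LiveID
console.log(matchesAnid("alice@example.com", anidFromAdLogs)); // true
console.log(matchesAnid("bob@example.com", anidFromAdLogs));   // false
```

<p>One-wayness only prevents recovering an unknown LiveID from an ANID alone. It does nothing to stop someone who already holds a candidate LiveID from confirming the match.</p>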
<p><b>Closing Thoughts</b></p>
<p>The online advertising industry is currently locking horns in Washington to prove it can regulate itself. Several trade organizations and private firms have billed themselves as rigorous watchdogs. And yet, in our analysis of one of the most prolific online advertising networks, we found significant privacy shortcomings that even a cursory privacy audit would have uncovered. It is increasingly difficult to accept industry claims that recent negative discoveries reflect &#8220;just a few bad apples.&#8221; And it is more than a little troubling that a few research groups and occasional press coverage seem to be the only present checks on one of the most privacy-invasive industries in history.</p>
<hr />
<p>Thanks to <a href="http://ashkansoltani.org">Ashkan Soltani</a> for providing feedback on a draft.</p>
<p><a name="microsoft_footnote_1">[1]</a> See <a href="http://samy.pl/evercookie/">Evercookie</a>, <a href="http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1898390">Flash Cookies and Privacy II</a>, and <a href="http://crypto.stanford.edu/~dabo/pubs/papers/privatebrowsing.pdf">An Analysis of Private Browsing Modes in Modern Browsers</a>.</p>
<p><a name="microsoft_footnote_2">[2]</a> See <a href="http://panopticlick.eff.org/browser-uniqueness.pdf">How Unique Is Your Web Browser?</a> and <a href="http://www.stanford.edu/~jmayer/papers/thesis09.pdf">Any person&#8230; a pamphleteer</a>.</p>
<p><a name="microsoft_footnote_3">[3]</a> The <span style="font-family:courier;">wlHelper.js</span> script was served with a two-day cache expiry. Subsequent requests after the cache expired received a 304 response to keep the cached version with another two-day expiry.</p>
<p><a name="microsoft_footnote_4">[4]</a> Microsoft replaced its cookie syncing system, including <span style="font-family:courier;">wlHelper.js</span>, after we alerted them to our findings. An example of the old script is <a href="http://code.google.com/p/chromium/issues/attachmentText?id=56273&amp;aid=2761828953879496656&amp;name=wlHelper.js&amp;token=f1dcc72833e25da7166ad875d9a4e534">available on Google Code</a>.</p>
<p><a name="microsoft_footnote_5">[5]</a> We also saw the <span style="font-family:courier;">microsoft.com</span> MUID cookie respawn, but not through <span style="font-family:courier;">wlHelper.js</span>. We are still working to discover the additional supercookie mechanism on <span style="font-family:courier;">microsoft.com</span>.</p>
<p><a name="microsoft_footnote_6">[6]</a> Web measurement provides limited insight into a website&#8217;s backend. In cases where a domain&#8217;s cookie respawned, it is quite likely that new tracking information was associated with old tracking information. We cannot say how Microsoft used its data in cases where a domain&#8217;s MUID didn&#8217;t respawn but it received an old MUID from another domain that did respawn. At minimum it seems unlikely Microsoft discarded this information from all logs.</p>
<p><a name="microsoft_footnote_7">[7]</a> As above, we cannot say what Microsoft did with its ETag cookie data. It again would be unlikely Microsoft discarded this information from all logs.</p>
<p><a name="microsoft_footnote_8">[8]</a> Google follows this approach: it serves its third-party advertising content from <span style="font-family:courier;">doubleclick.net</span>, and uses a Doubleclick identifier, instead of serving from <span style="font-family:courier;">google.com</span> and using a Google identifier.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/08/18/tracking-the-trackers-microsoft-advertising/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tracking the Trackers: The AdChoices Icon</title>
		<link>http://webpolicy.org/2011/08/17/tracking-the-trackers-the-adchoices-icon/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=tracking-the-trackers-the-adchoices-icon</link>
		<comments>http://webpolicy.org/2011/08/17/tracking-the-trackers-the-adchoices-icon/#comments</comments>
		<pubDate>Thu, 18 Aug 2011 06:57:52 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Measurement]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=79</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Jovanni Hernandez and Akshay Jagadeesh are the first authors of this study. Responding to pressure from the Federal Trade Commission, in mid-2009 the largest advertising industry trade groups joined forces to develop a new self-regulatory program for behavioral advertising: the Digital Advertising Alliance (DAA). Like the [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6714">Stanford Center for Internet and Society</a>.</em></p>
<p><em>Jovanni Hernandez and Akshay Jagadeesh are the first authors of this study.</em></p>
<p>Responding to pressure from the Federal Trade Commission, in mid-2009 the largest advertising industry trade groups joined forces to develop a new self-regulatory program for behavioral advertising: the <a href="http://aboutads.info">Digital Advertising Alliance</a> (DAA). Like the parallel self-regulatory program for advertising networks, the <a href="http://networkadvertising.org">Network Advertising Initiative</a> (NAI), the DAA makes no promises about providing privacy choices: DAA members must only provide an opt out of seeing advertising that is based on tracking, not an opt out of tracking itself.<sup><a href="#adchoices_footnote_1">1</a></sup> As Chris Hoofnagle at Berkeley Law has noted on several occasions, the word &#8220;privacy&#8221; scarcely even appears in the DAA&#8217;s documents.</p>
<p><span id="more-79"></span></p>
<p>The centerpiece of the DAA self-regulatory program is &#8220;enhanced notice,&#8221; a common text and icon for linking a consumer to information about behavioral advertising and an opt out of behavioral targeting (again, not tracking). The <a href="http://www.nytimes.com/2010/01/27/business/media/27adco.html">initial proposal</a> for &#8220;enhanced notice&#8221; was a large insert for third-party behavioral ads consisting of a &#8220;Power I&#8221; icon alongside descriptive phrases such as &#8220;Interest based ads&#8221; and &#8220;Why did I get this ad?&#8221;</p>
<p><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/adchoices_original.jpeg" /></p>
<p>In its <a href="http://www.aboutads.info/resource/download/DAA-Icon_Ad-Creative-Primer.pdf">final consensus form</a>, &#8220;enhanced notice&#8221; consists of a small &#8220;Forward I&#8221; icon (aka the &#8220;Advertising Option Icon&#8221;) in or around a third-party behaviorally targeted ad and, in some cases, the text &#8220;AdChoices.&#8221;</p>
<p><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/adchoices_icon_text.png" /></p>
<p><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/adchoices_icon.png" /></p>
<p>&#8220;Enhanced notice&#8221; can also appear in the footer of a page.</p>
<p><img src="http://dl.dropbox.com/u/37533397/tracking_the_trackers/adchoices_footer.png" /></p>
<p>Usability of an advertising choice mechanism, like any software feature, will be a key driver of adoption. We note that at no point did the DAA conduct testing to determine how &#8220;enhanced notice&#8221; compares in usability to alternative choice mechanism designs, such as an in-browser option.<sup><a href="#adchoices_footnote_2">2</a></sup> In fact, web browser <a href="http://donottrack.us">Do Not Track</a> features &mdash; which aren&#8217;t yet fully backed by industry or regulation &mdash; are already seeing <a href="http://paidcontent.org/article/419-new-study-shows-use-of-do-not-track-is-on-the-rise/">significantly greater usage</a> than the DAA&#8217;s &#8220;enhanced notice&#8221; (<a href="http://www.clickz.com/clickz/news/1933561/opt-behavioral-ads">1</a>, <a href="http://media6degrees.com/blog/ad-choices-update/">2</a>).</p>
<p>In recent <a href="http://naiblog.org/wp-content/uploads/2011/06/DAA-Companies-Release_FINAL.pdf">press materials</a> and <a href="http://www.iab.net/media/file/DC1DOCS1-432016-v1-John_Montgomery_-_Written_Testimony.pdf">congressional testimony</a> members of the online advertising industry have touted that trillions of &#8220;enhanced notice&#8221; icons are being served. We conducted this study to empirically examine the DAA&#8217;s &#8220;enhanced notice&#8221; icons.<sup><a href="#adchoices_footnote_3">3</a></sup> For simplicity, and following convention in the privacy community, we refer to any iteration of the DAA&#8217;s icon-based program as &#8220;AdChoices.&#8221;</p>
<p><b>Methodology</b></p>
<p>We began with the <a href="http://www.alexa.com/topsites/countries/US">Alexa list of top 500 U.S. websites</a> as of August 4th, 2011. For each site on the list we manually inspected<sup><a href="#adchoices_footnote_4">4</a></sup> the homepage for third-party advertisements. We labeled content as a third-party ad if it appeared in a standard ad box size and was served by a third-party advertising network. For each third-party ad, we noted whether an AdChoices icon or text link was present. We also manually examined page footers for AdChoices icons and text. Please email if you would like to examine the screenshots from our study.<sup><a href="#adchoices_footnote_5">5</a></sup></p>
<p><b>Results</b></p>
<p>We identified 627 third-party ads. Only 62 (9.9%) included an AdChoices icon in or around the ad. And only 32 (5.1%) had an &#8220;AdChoices&#8221; text link. We found an AdChoices icon and text link in the footer of only 13 (2.6%) of the pages we examined.</p>
<p>Restricting our dataset to the 449 non-explicit, domestic websites in the Alexa U.S. top 500, we spotted 512 third-party ads. 58 (11.3%) had an AdChoices icon. 28 (5.5%) had an &#8220;AdChoices&#8221; text link. We identified the AdChoices icon and text in the footer of 13 pages (2.9%).</p>
<p>A full spreadsheet of results is <a href="http://dl.dropbox.com/u/37533397/tracking_the_trackers/adchoices.xlsx">available in Excel format</a>.</p>
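<p>For readers who want to check the proportions, the arithmetic can be reproduced as follows. The counts are the ones reported above; the <span style="font-family:courier;">percent</span> helper is ours, and the footer denominators (500 and 449 homepages) are inferred from the size of each site list.</p>

```javascript
// Returns count/total as a percentage string with one decimal place.
function percent(count, total) {
  return (100 * count / total).toFixed(1);
}

// [label, count, total]; footer totals assume one homepage per listed site.
const reportedCounts = [
  ["all sites: ads with icon", 62, 627],
  ["all sites: ads with text link", 32, 627],
  ["all sites: footer notice", 13, 500],
  ["restricted: ads with icon", 58, 512],
  ["restricted: ads with text link", 28, 512],
  ["restricted: footer notice", 13, 449]
];

for (const row of reportedCounts) {
  console.log(row[0] + ": " + percent(row[1], row[2]) + "%");
}
```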
<p><b>Concluding Thoughts</b></p>
<p>We, and many other researchers, already had grave doubts about the AdChoices program&#8217;s efficacy. The icon is only approximately 13&#215;13 pixels<sup><a href="#adchoices_footnote_6">6</a></sup> and nondescript. The accompanying text is in small font and reads, ambiguously, &#8220;AdChoices.&#8221; Now we learn that the icon rarely shows up and, half the time, doesn&#8217;t even include <i>any</i> text.</p>
<p>Beyond demonstrating shortcomings in the DAA&#8217;s AdChoices program, our findings also run contrary to two common claims from members of the online advertising industry: that <a href="http://www.mediapost.com/publications/?fa=Articles.showArticle&amp;art_aid=145077">the vast majority of third-party ads are behaviorally targeted</a><sup><a href="#adchoices_footnote_7">7</a></sup> and that the largest players in behavioral targeting have embraced the AdChoices icon (<a href="http://naiblog.org/nai-companies-ready-to-serve-the-icon/">1</a>, <a href="http://naiblog.org/wp-content/uploads/2011/06/DAA-Companies-Release_FINAL.pdf">2</a>). Given our results, the two claims cannot both be true. We call upon the online advertising industry to share its statistics on what proportion of third-party ads are behaviorally targeted and what proportion of third-party ads bear an AdChoices icon.</p>
<hr />
<p><a name="adchoices_footnote_1">[1]</a> The NAI and DAA use slightly different language to describe their opt-out commitments. An attorney for Venable, DAA&#8217;s counsel, confirmed that there is no practical effect to the difference at a <a href="http://www.law.yale.edu/intellectuallife/madbotsconference.htm">recent symposium at Yale Law School</a>. The NAI is now a participant in the DAA consortium.</p>
<p><a href="http://networkadvertising.org/networks/2008%20NAI%20Principles_final%20for%20Website.pdf">NAI</a>: &#8220;Opt out of OBA [online behavioral advertising] means that a consumer is provided an opportunity to exercise a choice to <i>disallow OBA</i> . . . . [C]ollection of non-PII data regarding that consumer&#8217;s browser may only <i>continue for non-OBA purposes</i> . . . .&#8221; (emphasis added).</p>
<p><a href="http://www.aboutads.info/resource/download/seven-principles-07-01-09.pdf">DAA</a>: &#8220;A Third Party should provide consumers with the ability to exercise choice with respect to the collection and use of data <i>for Online Behavioral Advertising purposes</i> . . . .&#8221; (emphasis added).</p>
<p><a name="adchoices_footnote_2">[2]</a> <a href="http://www.aleecia.com/authors-drafts/tprc-behav-AV.pdf">Research by McDonald and Cranor</a> found that the NAI&#8217;s choice mechanism, which is built around an information page and an opt-out page much like the DAA&#8217;s, created substantial consumer confusion. A <a href="http://futureofprivacy.org/final_report.pdf">paper</a> by the Future of Privacy Forum, an industry-funded advocacy group, found that the icon approach fell flat in conveying information about behavioral targeting (&#8220;substantial repetition and consumer education may be needed to improve [the icon's] communication effectiveness over time&#8221;) and that the &#8220;Power I&#8221; icon and &#8220;AdChoice&#8221; text performed worse than an alternative icon and several alternative texts.</p>
<p><a name="adchoices_footnote_3">[3]</a> This study is not an examination of legal compliance with the DAA self-regulatory program. The DAA&#8217;s formal third-party &#8220;enhanced notice&#8221; requirement is very lax, and can be satisfied by as little as a link to the DAA in the website&#8217;s privacy policy. For a study examining compliance with the DAA&#8217;s self-regulatory program, see &#8220;<a href="http://www.cylab.cmu.edu/research/techreports/2011/tr_cylab11005.html">AdChoices? Compliance with Online Behavioral Advertising Notice and Choice Requirements</a>&#8221; by Komanduri et al.</p>
<p><a name="adchoices_footnote_4">[4]</a> As with any study that relies on manual labeling, there is a possibility we made labeling errors. At least two different researchers examined each page to minimize the possibility of mislabeling. Moreover, the proportions we are reporting are very robust against individual labeling errors owing to the size of the dataset.</p>
<p><a name="adchoices_footnote_5">[5]</a> Because a number of the Alexa top websites contain explicit content, we are not able to publicly post our screenshot dataset.</p>
<p><a name="adchoices_footnote_6">[6]</a> Compare to the smallest <a href="http://www.iab.net/iab_products_and_industry_services/1421/1443/1452">standard display ad size</a> at 88&#215;31 pixels.</p>
<p><a name="adchoices_footnote_7">[7]</a> Industry metrics imply only a very small proportion of third-party advertising is behaviorally targeted. See &#8220;<a href="http://cyberlaw.stanford.edu/node/6592">Do Not Track Is No Threat to Ad-Supported Businesses</a>.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/08/17/tracking-the-trackers-the-adchoices-icon/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>FourthParty: A New Approach to Web Measurement</title>
		<link>http://webpolicy.org/2011/08/08/fourthparty-a-new-approach-to-web-measurement/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=fourthparty-a-new-approach-to-web-measurement</link>
		<comments>http://webpolicy.org/2011/08/08/fourthparty-a-new-approach-to-web-measurement/#comments</comments>
		<pubDate>Tue, 09 Aug 2011 06:21:33 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Measurement]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=76</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Last week marked the twentieth anniversary of the public World Wide Web, and there is much to celebrate. The early web consisted of a few text pages linked together; the modern web supports audio, video, interactivity, complex storage, and even native applications. Both Microsoft and Google [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6706">Stanford Center for Internet and Society</a>.</em></p>
<p>Last week marked the <a href="http://groups.google.com/group/alt.hypertext/msg/395f282a67a1916c" rel="nofollow">twentieth anniversary of the public World Wide Web</a>, and there is much to celebrate. The early web consisted of a few text pages linked together; the modern web supports <a href="http://dev.w3.org/html5/spec/Overview.html#the-audio-element" rel="nofollow">audio</a>, <a href="http://dev.w3.org/html5/spec/Overview.html#the-video-element" rel="nofollow">video</a>, <a href="http://dev.w3.org/html5/spec/Overview.html#the-canvas-element" rel="nofollow">interactivity</a>, <a href="http://www.w3.org/TR/IndexedDB/" rel="nofollow">complex storage</a>, and even <a href="http://code.google.com/p/nativeclient/" rel="nofollow">native applications</a>. Both <a href="http://arstechnica.com/microsoft/news/2011/06/html5-centric-windows-8-leaves-microsoft-developers-horrified.ars" rel="nofollow">Microsoft</a> and <a href="http://en.wikipedia.org/wiki/Google_Chrome_OS" rel="nofollow">Google</a> are now developing entire operating systems around web technologies.</p>
<p>Tools for measuring the web have not kept pace. Many studies still rely on HTTP header logging and static analysis of HTML, CSS, and JavaScript. Researchers who want to go beyond these simple tools are often forced to develop purpose-built software from scratch.</p>
<p>Today we&#8217;re releasing <a href="http://fourthparty.info" rel="nofollow">FourthParty</a>, an open-source platform for web measurement. FourthParty is built on <a href="http://www.mozilla.com/en-US/firefox/new/" rel="nofollow">Mozilla Firefox</a> and the <a href="https://addons.mozilla.org/en-US/developers/docs/sdk/latest/" rel="nofollow">Add-on SDK</a>, making it fast, modular, easy to use, multi-platform, and up-to-date with the latest web technologies. And FourthParty is already generating research results: it&#8217;s the tool we&#8217;ve been using in our Tracking the Trackers studies (<a href="http://cyberlaw.stanford.edu/node/6694" rel="nofollow">1</a>, <a href="http://cyberlaw.stanford.edu/node/6695" rel="nofollow">2</a>). To learn more and get started, visit <a href="http://fourthparty.info" rel="nofollow">fourthparty.info</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/08/08/fourthparty-a-new-approach-to-web-measurement/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tracking the Trackers: To Catch a History Thief</title>
		<link>http://webpolicy.org/2011/07/19/tracking-the-trackers-to-catch-a-history-thief/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=tracking-the-trackers-to-catch-a-history-thief</link>
		<comments>http://webpolicy.org/2011/07/19/tracking-the-trackers-to-catch-a-history-thief/#comments</comments>
		<pubDate>Tue, 19 Jul 2011 10:20:55 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=74</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Last week we reported some early results from the Stanford Security Lab&#8216;s new web measurement platform on how advertising networks respond to opt outs and Do Not Track. This week we&#8217;re back with a new discovery in the online advertising ecosystem: Epic Marketplace, a member of [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6695">Stanford Center for Internet and Society</a>.</em></p>
<p>Last week we reported some <a href="http://cyberlaw.stanford.edu/node/6694">early results</a> from the <a href="http://seclab.stanford.edu/">Stanford Security Lab</a>&#8216;s new web measurement platform on how advertising networks respond to opt outs and Do Not Track. This week we&#8217;re back with a new discovery in the online advertising ecosystem: <a href="http://www.epicmarketplace.com/">Epic Marketplace</a>,<sup><a href="#footnote-1">1</a></sup> a member of the self-regulatory <a href="http://networkadvertising.org/">Network Advertising Initiative</a> (NAI), is history stealing.</p>
<p>Many thanks once again to research assistants Akshay Jagadeesh and Jovanni Hernandez.</p>
<p><span id="more-74"></span></p>
<p><b>Background</b></p>
<p>A link can be styled differently based on whether you&#8217;ve been to the page it points to. You may recall, for example, that in the early days of the web, links you hadn&#8217;t visited were <span style="color:blue;text-decoration:underline;">blue</span> and links you had visited were <span style="color:purple;text-decoration:underline;">purple</span>. History stealing is a practice that exploits link styling to learn a user&#8217;s web browsing history. The approach is simple: to test whether the user has visited a link, add it to a page and check how it&#8217;s styled.<sup><a href="#footnote-2">2</a></sup></p>
<p>Members of the computer security community have <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=147777">long considered</a> history stealing a serious privacy vulnerability. The risk goes beyond leaking individual tidbits about past browsing; history stealing can be used to <a href="http://33bits.org/2010/02/18/cookies-supercookies-and-ubercookies-stealing-the-identity-of-web-visitors/">track or even identify</a> a user. Mozilla finally <a href="http://blog.mozilla.com/security/2010/03/31/plugging-the-css-history-leak/">implemented a fix</a> in Firefox 4, and the other major browser vendors quickly followed. According to <a href="http://en.wikipedia.org/wiki/Usage_share_of_web_browsers">browser usage statistics</a>, roughly half of users remain vulnerable to history stealing.</p>
<p>About a year ago researchers at UCSD conducted the first <a href="http://cseweb.ucsd.edu/~hovav/dist/history.pdf">comprehensive study</a> of history stealing in practice. They found that a few popular adult sites were history stealing to learn whether users had visited their competitors. The UCSD team also discovered history stealing by several advertising networks, including <a href="http://interclick.com/">Interclick</a> (another NAI member). Class action litigation <a href="http://www.scribd.com/doc/44635414/Pitner-Versus-Youporn">is</a> <a href="http://www.infolawgroup.com/uploads/file/Bose%20v_%20Interclick%20(History%20Sniffing).pdf">ongoing</a>.</p>
<p><b>Technical Findings &#8211; History Stealing</b></p>
<p>While testing the JavaScript instrumentation in our new web measurement platform, we stumbled across Epic Marketplace history stealing on <a href="http://www.flixster.com/">Flixster</a> and <a href="http://charter.net/">Charter.net</a>. We reverse engineered the Epic Marketplace history stealing script and found a number of features:</p>
<ul>
<li>The script is <i>fast</i>. Thousands of links are tested per second.
</li>
<li>Links are added in an invisible iframe; there is no apparent effect on the page layout.
</li>
<li>The script dynamically loads lists of URLs and associated interest segments using <a href="http://en.wikipedia.org/wiki/JSONP">JSONP</a>.
</li>
<li>Progress is stored in a cookie so the script can resume where it left off.
</li>
<li>The script sets a cookie indicating when it was last run; it will not history steal more than once every twenty-four hours.
</li>
<li>If history stealing is still in progress when the window is closed (e.g. the user navigates to another page) the script sends its findings before ending execution.
</li>
<li>The script slows down if a URL list takes over two seconds to process.
</li>
<li>To prevent multiple history stealing attempts in parallel, the script uses a <a href="http://en.wikipedia.org/wiki/Mutual_exclusion">mutex</a> cookie.
</li>
<li>The script does not directly report the URLs that it detects the user has visited; it sends a deduplicated list of the interest segments associated with the visited URLs.
</li>
</ul>
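<p>The reporting behavior in the last bullet can be sketched in a few lines of Python. This is an illustration only, not Epic Marketplace&#8217;s code: the function name <code>segments_to_report</code> and the dictionary layout are assumptions, and the sorting is added purely to make the output stable. The two segment numbers and their URLs are taken from the example segments listed in this post.</p>

```python
# Hypothetical URL list: each URL maps to an interest segment ID.
# (The layout is an assumption; the real lists were loaded via JSONP.)
URL_SEGMENTS = {
    "http://www.groupon.com/": 758,
    "http://deals.ebay.com/": 758,
    "http://www.dunkindonuts.com/": 876,
    "http://www.folgers.com/": 876,
    "http://www.starbucks.com/": 876,
}


def segments_to_report(visited_urls, url_segments=URL_SEGMENTS):
    """Return the sorted, deduplicated segment IDs for visited URLs.

    The URLs themselves are dropped; only segment IDs are reported.
    """
    segments = {url_segments[u] for u in visited_urls if u in url_segments}
    return sorted(segments)


if __name__ == "__main__":
    visited = [
        "http://www.groupon.com/",
        "http://deals.ebay.com/",
        "http://www.starbucks.com/",
    ]
    print(segments_to_report(visited))  # [758, 876]
```

<p>Two visits to discount sites collapse into a single segment ID, so the recipient learns the interest category but not which specific pages were detected.</p>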
<p>(For the technically inclined reader, here are an example <a href="http://cdn1.trafficmp.com/prod/ig/110701-130258_ig.html?pid=478&amp;plid=20282">iframe</a>, <a href="http://cdn1.trafficmp.com/prod/ig/110701-130258_ig-min.js">script</a>, and <a href="http://cdn1.trafficmp.com/prod/ig/110701-130258_adv_0.html">URL list</a>.)</p>
<p>We also examined a series of URL lists (<a href="http://donottrack.us/docs/epic_marketplace.xlsx">spreadsheet</a>) that contain 15,511 entries. The URLs and interest segments vary widely. Some URLs are for a landing page; others are for a specific page. Some interest segments are broad; others are fine-grained. A few example segments:</p>
<ul>
<li>Segment 758: discount sites including <a href="http://www.groupon.com/">Groupon</a> and <a href="http://deals.ebay.com/">eBay Daily Deals</a>
</li>
<li>Segment 876: sites about coffee, including <a href="http://www.dunkindonuts.com/">Dunkin&#8217; Donuts</a>, <a href="http://www.folgers.com/">Folgers</a>, and <a href="http://www.starbucks.com/">Starbucks</a>
</li>
<li>Segments 984-989: home improvement sites including <a href="http://www.homedepot.com/">Home Depot</a> and <a href="http://www.grainger.com/">Grainger</a>
</li>
<li>Segment 2701: pages about the <a href="http://www.fordvehicles.com/cars/fiesta/">Ford Fiesta</a>
</li>
</ul>
<p>Several interest segments are highly sensitive:</p>
<ul>
<li>Segment 760: pages about getting pregnant and fertility, including at the <a href="http://www.mayoclinic.com/health/how-to-get-pregnant/PR00103">Mayo Clinic</a>
</li>
<li>Segment 2640: pages about menopause, including at the <a href="http://www.nlm.nih.gov/medlineplus/menopause.html">NIH</a> and the <a href="http://www.umm.edu/altmed/articles/menopause-000107.htm">University of Maryland</a>
</li>
<li>Segment 2014: pages about repairing bad credit, including at the <a href="http://www.ftc.gov/bcp/edu/pubs/consumer/credit/cre13.shtm">FTC</a>
</li>
<li>Segment 2265: pages about debt relief, including at the <a href="http://www.ftc.gov/bcp/edu/pubs/consumer/alerts/alt015.shtm">FTC</a> and the <a href="http://www.irs.gov/individuals/article/0,,id=179414,00.html">IRS</a>
</li>
</ul>
<p>&nbsp;</p>
<p><b>Technical Findings &#8211; Opt Out</b></p>
<p>We applied the methodology from <a href="http://cyberlaw.stanford.edu/node/6694">last week&#8217;s study</a> to examine Epic Marketplace&#8217;s opt-out practices. (Epic Marketplace was one of the eleven NAI members not included in that study.) We found that Epic Marketplace leaves its tracking cookies in place whether a user opts out with the NAI mechanism or enables Do Not Track. We also found that history stealing continues after using either choice mechanism.</p>
<p><b>Privacy Representations</b></p>
<p>The 2008 <a href="http://www.networkadvertising.org/networks/2008%20NAI%20Principles_final%20for%20Website.pdf">NAI Code of Conduct</a> requires member companies to receive express consent from a user before collecting &#8220;Sensitive Consumer Information,&#8221; defined as:</p>
<blockquote>
<ul>
<li>Social Security Numbers or other Government-issued identifiers
</li>
<li>Insurance plan numbers
</li>
<li>Financial account numbers
</li>
<li>Information that describes the precise real-time geographic location of an individual derived through location-based services such as through GPS-enabled devices
</li>
<li>Precise information about past, present, or potential future health or medical conditions or treatments, including genetic, genomic, and family medical history
</li>
</ul>
</blockquote>
<p>(The Code of Conduct includes the unhelpful footnote, &#8220;[t]his provision is to be further developed in a distinct implementation guideline.&#8221;)</p>
<p>The Epic Marketplace <a href="http://www.epicmarketplace.com/policies.php">privacy policy</a> contains the following paragraph under the headings &#8220;Information We Collect&#8221; and &#8220;Non-Personally Identifiable Information&#8221;:</p>
<blockquote>
<p>Epic Marketplace also automatically receives and records anonymous information that your browser sends whenever you visit a website which is part of the Epic Marketplace Network. We use log files to collect Internet protocol (IP) addresses, browser type, Internet service provider (ISP), referring/exit pages, platform type, date/time stamp, one or more cookies that may uniquely identify your browser, and responses by a web surfer to an advertisement delivered by us. This information may be stored on our systems for about one year.</p>
</blockquote>
<p>The privacy policy also claims that:</p>
<blockquote>
<p>Web surfers may elect not to provide non-personally identifiable information by following the cookie opt-out procedures set forth below.</p>
</blockquote>
<p>As with our prior work, we leave it to the reader to assess whether Epic Marketplace is complying with its privacy representations.</p>
<p>&nbsp;</p>
<hr />
<p>Thanks to Gordon Franken for reviewing this post.</p>
<p><a name="footnote-1">1.</a> Epic Marketplace was, <a href="http://epicmediagroup.wordpress.com/2011/06/08/introducing-epic-marketplace/">until recently</a>, named Traffic Marketplace. It hosts its third-party content on <a href="http://trafficmp.com">trafficmp.com</a>.</p>
<p><a name="footnote-2">2.</a> Other forms of history stealing, beyond the scope of this post, rely on page layout, background images, and <a href="http://websec.sv.cmu.edu/visited/visited.pdf">user interaction</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/07/19/tracking-the-trackers-to-catch-a-history-thief/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tracking the Trackers: Early Results</title>
		<link>http://webpolicy.org/2011/07/11/tracking-the-trackers-early-results/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=tracking-the-trackers-early-results</link>
		<comments>http://webpolicy.org/2011/07/11/tracking-the-trackers-early-results/#comments</comments>
		<pubDate>Tue, 12 Jul 2011 06:12:43 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=72</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Over the past several months researchers at the Stanford Security Lab have been developing a platform for measuring dynamic web content. One of our chief applications is a system for automated enforcement of Do Not Track by detecting the myriad forms of third-party tracking, including cookies, [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6694">Stanford Center for Internet and Society</a>.</em></p>
<p>Over the past several months researchers at the <a href="http://seclab.stanford.edu/">Stanford Security Lab</a> have been developing a platform for measuring dynamic web content. One of our chief applications is a system for automated enforcement of <a href="http://donottrack.us">Do Not Track</a> by detecting the myriad forms of third-party tracking, including <a href="http://en.wikipedia.org/wiki/HTTP_cookie">cookies</a>, <a href="http://www.w3.org/TR/webstorage/">HTML5 storage</a>, <a href="http://panopticlick.eff.org/">fingerprinting</a>, and much more. While the software isn&#8217;t quite polished enough for public release, we&#8217;re eager to share some unexpected early results on the advertising ecosystem. Please bear in mind that these are preliminary findings from experimental software; our primary aims at this stage are developing the platform and validating the approach to third-party tracking detection. Many thanks to Jovanni Hernandez and Akshay Jagadeesh for their invaluable research assistance.</p>
<p><span id="more-72"></span></p>
<p><b>Methodology</b></p>
<p>We began with a list of advertising companies that participate in the self-regulatory <a href="http://networkadvertising.org/">Network Advertising Initiative</a> (NAI). By navigating popular websites we identified a piece of tracking content (primarily ads and <a href="http://en.wikipedia.org/wiki/Web_bug">beacons</a>) from 64 of the 75 NAI member companies. We performed the following tests on each company&#8217;s content:</p>
<p>1) Load the content.</p>
<p>2) Load the content, opt out of the company on the NAI website, and then reload the content.</p>
<p>3) Load the content, enable Do Not Track, and then reload the content.</p>
<p>We manually identified tracking cookies (cookies that appeared to contain a unique identifier or substantially unique information) and how they were altered throughout each test. A <a href="http://donottrack.us/docs/tracking_the_trackers_early_results.xlsx">spreadsheet</a> of results is available. Please email if you would like a copy of the data we logged while testing a particular company&#8217;s content.</p>
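<p>The comparison step in these tests can be sketched as a diff between two cookie snapshots. This is a minimal illustration, not the tooling we used: the function name <code>classify_cookies</code>, the labels it returns, and the sample cookie names and values are all made up for the example. Deciding which cookies count as tracking cookies remained a manual judgment.</p>

```python
def classify_cookies(before, after):
    """Compare two cookie snapshots (dicts of cookie name to value).

    Returns a dict mapping each cookie name seen in either snapshot to
    one of: "deleted", "added", "changed", "unchanged".
    """
    result = {}
    for name in sorted(set(before) | set(after)):
        if name not in after:
            result[name] = "deleted"
        elif name not in before:
            result[name] = "added"
        elif before[name] != after[name]:
            result[name] = "changed"
        else:
            result[name] = "unchanged"
    return result


if __name__ == "__main__":
    # Made-up cookie names and values, for illustration only.
    before_opt_out = {"uid": "a1b2c3d4", "interests": "12.57.103"}
    after_opt_out = {"interests": "12.57.103", "optout": "1"}
    for name, status in classify_cookies(before_opt_out, after_opt_out).items():
        print(name, status)
```

<p>In this made-up case the unique ID cookie is deleted and an opt-out cookie is added, but the interests cookie survives the opt out unchanged.</p>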
<p><b>1. At least two NAI members are taking overt steps to respect Do Not Track.</b></p>
<p><a href="http://media6degrees.com/">Media6Degrees</a>, an advertising data provider, deletes its tracking cookies and sets an opt-out cookie upon receiving a Do Not Track request.</p>
<p><a href="http://www.bluekai.com/">BlueKai</a>, a data provider and management platform, does not set tracking cookies in response to a Do Not Track request, but it does not delete any existing tracking cookies.</p>
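<p>For readers unfamiliar with the mechanism, a Do Not Track request is an ordinary HTTP request that carries the header <code>DNT: 1</code>. The sketch below builds such a request with Python&#8217;s standard library without sending it; the helper name <code>build_dnt_request</code> and the example URL are assumptions for illustration and are not part of our measurement platform.</p>

```python
import urllib.request


def build_dnt_request(url):
    """Build (but do not send) an HTTP request that carries DNT: 1."""
    return urllib.request.Request(url, headers={"DNT": "1"})


if __name__ == "__main__":
    req = build_dnt_request("http://example.com/")
    # urllib normalizes stored header names, so look it up as "Dnt".
    print(req.get_header("Dnt"))  # 1
```

<p>How a server responds to the header is entirely up to the company receiving it, which is why the behaviors above differ.</p>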
<p><b>2. <strike>Over half</strike> Half of the NAI members we tested did not remove their tracking cookies after opting out.</b></p>
<p>NAI member companies pledge only to allow opting out of behavioral ad targeting, not tracking. Of the 64 companies we studied, <strike>33</strike> 32 left tracking cookies in place after a user opted out.</p>
<p><b>3. At least eight NAI members promise to stop tracking after opting out, but nonetheless leave tracking cookies in place.</b></p>
<p>We compared our results to a <a href="http://www.cylab.cmu.edu/research/techreports/2011/tr_cylab11005.html">survey</a> of NAI member privacy and opt-out policies recently conducted by <a href="http://www.cylab.cmu.edu/index.html">Carnegie Mellon&#8217;s CyLab</a>. We identified seven companies that (in the study&#8217;s reading) promise to stop tracking when a user opts out, but nonetheless leave their tracking cookies in place.</p>
<p>The <a href="http://www.247realmedia.com/EN-US/">24/7 Real Media</a> <a href="http://www.247realmedia.com/EN-US/privacy-policy.html">privacy policy</a> claims that a user may &#8220;opt out of receiving our ad delivery, audience management and behavioral targeting cookies.&#8221; We found that opting out deleted the company&#8217;s tracking cookies, but reloading the content reinstalled the tracking cookies.</p>
<p><a href="http://www.adconion.com/">Adconion</a>&#8216;s <a href="http://www.adconion.com/us/privacy-policy.html">privacy policy</a> states that a user is &#8220;free to opt out of the Adconion Cookie.&#8221; Opting out deleted one of three tracking cookies but left the other two in place. Reloading the content did not update the remaining tracking cookies.</p>
<p>In its <a href="http://www.audiencescience.com/privacy">privacy policy</a>, <a href="http://www.audiencescience.com/">AudienceScience</a> describes its opt-out option as follows: &#8220;Should you choose to opt-out, we delete all previously collected information from the cookies, and put new information in the cookie which tells us to stop collecting information from that device.&#8221; We found that opting out of AudienceScience removes its unique tracking cookie but does not remove a highly unique cookie that represents the user&#8217;s interests. Subsequent loads of the content updated the interest cookie.</p>
<p>[See below for an update from AudienceScience.]</p>
<p><a href="http://www.netmining.com/">Netmining</a>&#8216;s <a href="http://www.netmining.com/en/privacy-policy-service.html">privacy policy</a> states that upon opting out &#8220;we will delete your existing ntmng.com or netmining.com cookie(s) and try to place a new cookie that instructs us not to track your future activities when we detect that cookie.&#8221; Opting out deleted the Netmining tracking cookie but did not delete a tracking cookie served from a retailer-specific subdomain of netmng.com (and presumably only used on that retailer&#8217;s site). Reloading the content refreshed the retailer-specific cookie.</p>
<p>The <a href="http://www.undertone.com/">Undertone</a> <a href="http://www.undertone.com/privacy/">privacy policy</a> notifies users: &#8220;If you would like to opt out of OBA, then we offer &#8216;opt-out cookies&#8217; to block the tracking and placement of future Undertone cookies for OBA purposes on your system for five (5) years.&#8221; Opting out removed a highly unique cookie that stores the user&#8217;s interests but did not remove a unique cookie. Subsequent loads of the content updated the unique cookie.</p>
<p><a href="http://www.vibrantmedia.com/">Vibrant Media</a>&#8216;s <a href="http://www.vibrantmedia.com/privacy.asp">privacy policy</a> provides: &#8220;If you&#8217;d like to opt-out from having Vibrant Media collect your Non-PII in connection with our Technology, please click here. When you opt out, we will place an opt-out cookie on your computer. The opt-out cookie tells us not to collect your Non-PII to tailor our online advertisement campaigns.&#8221; Opting out of Vibrant Media does not remove the network&#8217;s unique tracking cookie; the cookie remains in place and is updated with subsequent loads of the content.</p>
<p>The <a href="http://ad.wsod.com/?view=privacy">privacy policy</a> on <a href="http://www.wallst.com/business.asp">Wall Street on Demand</a>&#8216;s advertising platform claims: &#8220;By clicking here, the unique cookie used by this system/domain and stored locally by your browser will be changed to &#8216;OPT_OUT&#8217;. By creating a generic cookie id instead of a unique cookie id &#8211; it is even more impossible to track your history.&#8221; Opting out deleted Wall Street on Demand&#8217;s unique cookie, but left in place a seemingly highly unique cookie that appears to store user interests. Refreshing the content renewed the interests cookie.</p>
<p>We identified one additional company with a privacy policy that may be interpreted to prohibit its current business practices. The <a href="http://www.targusinfo.com/">TARGUSinfo</a> <a href="http://www.adadvisor.net/">AdAdvisor</a> <a href="http://www.adadvisor.net/optout.html">opt-out page</a> explains that &#8220;[t]he AdAdvisor opt-out works by replacing the existing AdAdvisor cookie with a new cookie that clearly indicates that the user has elected to opt-out of the Services.&#8221; Opting out left TARGUSinfo&#8217;s unique tracking cookie in place. Refreshing the content did not update the tracking cookie.</p>
<p><b>4. At least ten NAI members go beyond their privacy policies and remove their tracking cookies.</b></p>
<p>In comparing our results to the <a href="http://www.cylab.cmu.edu/research/techreports/2011/tr_cylab11005.html">Carnegie Mellon study</a> of privacy policies we found that ten NAI members remove their tracking cookies when a user opts out, even though they promise only to stop behavioral targeting of ads. The companies are: <a href="http://www.bluekai.com/">BlueKai</a> (retains city-level geolocation), <a href="http://advertising.yahoo.com/campaign/dapper.html">Dapper</a> (bought by <a href="http://advertising.yahoo.com/">Yahoo!</a>), <a href="http://www.fetchback.com/">FetchBack</a>, <a href="http://www.google.com/ads/">Google</a>, <a href="http://www.invitemedia.com/">Invite Media</a>, <a href="http://media6degrees.com/">Media6Degrees</a>, <a href="http://www.mediaplex.com/">Mediaplex</a>, <a href="http://www.quantcast.com/">Quantcast</a>, <a href="http://www.tidaltv.com/">TidalTV</a>, and <a href="http://www.yume.com/">YuMe</a>.</p>
<p><b>Concluding Thoughts</b></p>
<p>These early results scarcely scratch the surface of what we aim to learn with our new web measurement platform. We look forward to sharing new insights in the coming weeks and opening the software in the coming months. If you have experience in the web measurement field and would like to participate in testing the platform, please reach out. And please send web measurement questions &mdash; we&#8217;re looking for new ways to put the system through its paces!</p>
<p><b>Updates</b></p>
<p>[If you would like us to add a statement from your company, please reach out.]</p>
<p>24/7 Real Media has updated its privacy policy.</p>
<blockquote>
<p>You may also simply opt out of receiving interest-based advertising by clicking here.</p>
</blockquote>
<p>AddThis contacted us about our findings. After a reevaluation, we discovered we had mislabeled a unique session cookie associated with AddThis&#8217;s opt-out process as a tracking cookie. The post and spreadsheet have been updated. Our apologies to AddThis for the error.</p>
<p>AudienceScience reached out to clarify its practices. Its cookies store a compressed and encrypted data structure. When a user opts out, AudienceScience removes all interest segments and the unique ID from the data structure, but it continues to update the last time the browser contacted its servers. We have confirmed that AudienceScience now entirely removes its data structure after a user opts out.</p>
<p>BlueKai <a href="http://twitter.com/#!/BlueKai/status/91015681537081344">confirmed</a> it is taking steps to honor Do Not Track.</p>
<p>Media6Degrees <a href="http://twitter.com/#!/media6degrees/status/90786231268552705">confirmed</a> it is taking steps to honor Do Not Track.</p>
<p>Netmining has updated its privacy policy.</p>
<blockquote>
<p>If you select the &#8220;opt out&#8221; button there for Netmining, we will delete your existing netmng.com or netmining.com online behavioral advertising cookie(s) and try to place a new cookie that instructs us not to track your future activities for the purposes of serving online behavioral advertising when we detect that cookie.</p>
</blockquote>
<p>The Network Advertising Initiative has posted a <a href="http://naiblog.org/2011/07/moving-the-goal-posts-without-changing-the-rule-book/">response</a> to the study.</p>
<p>TARGUSinfo submitted the following statement.</p>
<blockquote>
<p>Immediately upon the publication of this study, we verified that our Opt-Out was fully functional both through our own www.adadvisor.net/optout.html site as well as through the NAI site. At no time was our opt-out not functioning, meaning that any consumer who had elected to opt out either through us or NAI or aboutads.info was indeed opted out, and no further activity was conducted on that user&#8217;s browser. We did identify a minor inconsistency between the opt-out running on our own site and that which was running on the NAI site. Specifically, a second cookie was deleted when the opt-out was set from our own site, but that cookie was left on the browser if the user opted out through the NAI. Despite this cookie remaining on the browser, it was rendered dormant because our opt-out prevents us from reading or accessing any other cookie. We updated the code running on NAI to ensure that this second cookie also gets deleted when a user opts-out through NAI, to ensure that there is no confusion with our actual opt-out functionality and what was stated in our privacy policy.</p>
</blockquote>
<p>Undertone has posted a <a href="http://www.undertone.com/temp/tempprivacy.php">statement</a> responding to the study.</p>
<p>Vibrant Media submitted the following statement.</p>
<blockquote>
<p>We drop a user ID cookie when a user initiates engagement with one of our ad units. This collects non-personally identifiable information on keywords a user has engaged with. If the user doesn&#8217;t visit a site in our network for 10 days, we delete this data. If someone opts out, we add a do-not-track cookie.</p>
<p>We had been deleting any data associated with the user ID, but had not been deleting the cookie itself (this is acceptable for NAI compliance). When we encounter someone with a do-not-track cookie, we completely ignore the user ID and therefore don&#8217;t use their information to serve ads. Although the cookie was remaining, we do not reference or use the ID in any way and we completely delete all data, be it in logs or storage devices for that particular user ID. Going forward, in order to prevent any misunderstanding we will also be deleting that cookie.</p>
<p>We have always been vigilant about adhering to industry best practices and NAI compliance policies.</p>
</blockquote>
<p>Wall Street on Demand has updated its privacy policy.</p>
<blockquote>
<p>Online Behavioral Advertising (OBA) is the process of targeting specific advertisements to each individual user, based on browsing history. If you opt out of OBA from our service by clicking the link below, the OBA cookie we use to contain this information will be emptied and changed to a placeholder signaling that you have done so. . . . Opting out does not necessarily delete or replace all cookies from our domain; others may remain which are used for aggregate reporting on the performance of the advertisements we serve.</p>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/07/11/tracking-the-trackers-early-results/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Do Not Fool Will Make the Internet Explode</title>
		<link>http://webpolicy.org/2011/04/01/do-not-fool-will-make-the-internet-explode/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=do-not-fool-will-make-the-internet-explode</link>
		<comments>http://webpolicy.org/2011/04/01/do-not-fool-will-make-the-internet-explode/#comments</comments>
		<pubDate>Fri, 01 Apr 2011 17:46:35 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=70</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Joint post with Arvind Narayanan. Earlier today Mozilla announced support for Do Not Fool, a proposed mechanism for opting out of April Fools&#8217; pranks. We cannot support this misguided effort. First, Do Not Fool would require fundamentally reengineering the Internet, the HTTP protocol, and countless websites. [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6649">Stanford Center for Internet and Society</a>.</em></p>
<p><em>Joint post with <a href="http://randomwalker.info">Arvind Narayanan</a>.</em></p>
<p>Earlier today Mozilla <a href="http://mozillalabs.com/blog/2011/04/protecting-users-from-an-age-old-threat/" rel="nofollow">announced</a> support for Do Not Fool, a proposed mechanism for opting out of April Fools&#8217; pranks.  We cannot support this misguided effort.</p>
<p>First, Do Not Fool would require fundamentally reengineering the Internet, the HTTP protocol, and countless websites.  Many of your favorite web destinations like <a href="http://theonion.com" rel="nofollow">The Onion</a> rely on fooling.</p>
<p>Second, fooling is integral to the American competitive landscape and to innovation. In fact, Do Not Fool would demolish the web&#8217;s revenue channels.  Don&#8217;t just take our word for it&mdash;industry-funded, non-peer-reviewed, quasi-relevant research <i>proves</i> that fooling accounts for over 99.9% of online revenues.</p>
<p>Third, self-regulation is working.  Every time you get fooled today, you have the opportunity to click a tiny icon—on sites that support it—to learn more about how you&#8217;ve been fooled.  And over fifty major pranksters already allow you to set a cookie to opt out of getting fooled by them, once you figure out who they are.  (Though roughly half are just fooling you with that opt out.)</p>
<p>Don&#8217;t enable this dangerous new feature.  Don&#8217;t be fooled by Do Not Fool.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/04/01/do-not-fool-will-make-the-internet-explode/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Response to Commissioner Rosch on Do Not Track</title>
		<link>http://webpolicy.org/2011/03/28/a-response-to-commissioner-rosch-on-do-not-track/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=a-response-to-commissioner-rosch-on-do-not-track</link>
		<comments>http://webpolicy.org/2011/03/28/a-response-to-commissioner-rosch-on-do-not-track/#comments</comments>
		<pubDate>Tue, 29 Mar 2011 06:37:38 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=60</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Late last week FTC Commissioner Rosch penned a column in which he repeated a number of hackneyed criticisms of Do Not Track. Senators McCaskill and Pryor articulated similar concerns at a recent hearing. This piece sequentially deconstructs Rosch&#8217;s column and replies to each of his substantive [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6646">Stanford Center for Internet and Society</a>.</em></p>
<p>Late last week FTC Commissioner Rosch penned <a href="http://adage.com/article/guest-columnists/ftc-commissioner-thinks-track-track/149558/" rel="nofollow">a column</a> in which he repeated a number of hackneyed criticisms of Do Not Track. Senators McCaskill and Pryor articulated similar concerns at a recent hearing. This piece sequentially deconstructs Rosch&#8217;s column and replies to each of his substantive critiques.</p>
<p><span id="more-60"></span></p>
<blockquote>
<p>I also have serious questions about the various do-not-track proposals. In my concurring statement to the preliminary staff report, I said I would support a do-not-track mechanism if it were &#8220;technically feasible.&#8221; By that I meant that it needed to have a number of attributes that had not yet been demonstrated. That is still true, in my judgment.</p>
</blockquote>
<p>Do Not Track raises issues of both technology and policy; it is essential to draw a sharp dividing line between the two. The concerns that Commissioner Rosch expresses relate to business impact. There is now widespread consensus that Do Not Track, implemented as an HTTP header, is &#8220;technically feasible.&#8221;</p>
<blockquote>
<p>First, there are a number of consequences if a consumer adopts a do-not-track mechanism. To begin with, a consumer may sacrifice being served relevant advertising.</p>
</blockquote>
<p>Users who enable Do Not Track will still see relevant advertising&mdash;just not based on their browsing history on other sites. Contextual advertising and non-tracking forms of interest-targeted advertising are unaffected.</p>
<blockquote>
<p>On a related note, there is academic research suggesting that in order to compensate for the loss of the ability to track consumer behavior and the associated ability to serve relevant advertising, advertisers may need to turn to advertising that is more &#8220;obtrusive&#8221; in order to attract consumers&#8217; attention.</p>
</blockquote>
<p>While a number of pundits have claimed Do Not Track will lead to more obtrusive advertising, I am unaware of any academic research on this point. Given layout constraints and the limited consumer tolerance for advertising, it is difficult to believe many sites will run additional advertising for Do Not Track users. And if sites could add more advertising, why wouldn&#8217;t they already?</p>
<blockquote>
<p>Consumers may also lose the free content they have taken for granted. Not only could consumers potentially lose access to free content on specific websites, I fear that the aggregate effect of widespread adoption by consumers of overly broad do-not-track mechanisms might be the reduction of free content, free applications and innovation across the entire internet economy.</p>
</blockquote>
<p>On the contrary, there is substantial reason to believe <a href="http://cyberlaw.stanford.edu/node/6592" rel="nofollow">Do Not Track is no threat to ad-supported businesses</a>. This conclusion is bolstered by the news that <a href="http://online.wsj.com/article/SB10001424052748704662604576202971768984598.html" rel="nofollow">thirty online advertising firms are willing to implement Do Not Track</a>.</p>
<blockquote>
<p>Beyond that, consumers may forgo the reported ability to earn commissions from &#8220;selling&#8221; the right to track their behavior or allow the use of their personal information.</p>
</blockquote>
<p>Do Not Track allows users to veto third-party tracking; third parties are welcome to respond by offering cash-for-data deals. One of the <i>advantages</i> of Do Not Track is that it creates a market mechanism for negotiating over privacy preferences.</p>
<blockquote>
<p>I also wonder whether an overly broad do-not-track mechanism would deprive consumers of some beneficial tracking, such as tracking performed to prevent fraud, to avoid being served the same advertising, or to conduct analytics that foster innovation.</p>
</blockquote>
<p>The <a href="http://datatracker.ietf.org/doc/draft-mayer-do-not-track/" rel="nofollow">Do Not Track Internet-Draft</a> accommodates fraud prevention, advertising frequency capping, and aggregate analytics.</p>
<blockquote>
<p>Concerns have been raised that do-not-track mechanisms also may have the unintended consequence of blocking tailored content, in addition to advertising.</p>
</blockquote>
<p>Do Not Track would not affect first-party personalization (e.g. New York Times recommended reading). It would disallow third-party tailored content, but of course a user could opt back into tracking and personalization by services she trusts.</p>
<blockquote>
<p>Second, another issue is potential consumer confusion about the terminology &#8220;do not track.&#8221; As some have pointed out, there is no consensus on what &#8220;tracking&#8221; means. In fact, I don&#8217;t know precisely what it means.</p>
</blockquote>
<p>Like any technical standard, Do Not Track will go through a number of draft iterations. The standards process provides a clear path to achieving a final, consensus definition of Do Not Track.</p>
<blockquote>
<p>Some tracking, for purposes unrelated to behavioral advertising, may always occur. When consumers are offered a do-not-track option, they may misunderstand the limited scope of that choice; and, in some instances, calling a mechanism &#8220;do not track&#8221; could arguably be deceptive.</p>
</blockquote>
<p>Just like Do Not Call, Do Not Track will necessarily have certain exceptions that do not completely align with consumer expectations. First, this is no basis for rebuffing the approach in its entirety. Do Not Call remains immensely popular despite not covering political and non-profit entities. Second, browser vendors are already taking steps to improve their privacy user interfaces. Mozilla has retained <a href="http://www.aleecia.com/" rel="nofollow">Aleecia McDonald</a>, a leading scholar on user-friendly privacy, to consult on its Do Not Track implementation.</p>
<blockquote>
<p>Third, I am concerned that the recent rush to adopt untested do-not-track mechanisms might be based, in part, on a reluctance to take on the harder task of examining more-nuanced methods of providing consumers with choice. It is always easier to just say &#8220;no&#8221; with a blunt instrument, rather than to take the time and effort to consider all of the ramifications of the different alternatives.</p>
</blockquote>
<p>A common misconception about Do Not Track is that it is a one-size-fits-all choice. This isn&#8217;t the case: as I noted in a <a href="http://www.law.yale.edu/documents/pdf/ISP/Jonathan_Mayer.pdf" rel="nofollow">thought piece</a> for Yale Law&#8217;s recent symposium on online advertising, nuanced privacy mechanisms can and should be built on top of Do Not Track.</p>
<blockquote>
<p>Finally, the implementation of do-not-track mechanisms must not jeopardize competition by injuring potential competitors. I am concerned that some firms with a monopoly or near-monopoly on a relevant market may use do-not-track mechanisms to cripple competitors from constraining their power.</p>
<p>More specifically, the browser market is heavily concentrated. Most &#8212; though not all &#8212; firms in the browser market operate for profit and those firms monetize some of their other businesses by advertising. There is nothing wrong with that as such. But we need to know: 1.) whether any of those firms enjoy monopoly or near-monopoly power in any online advertising market; 2.) whether there is any difference between the advertising in which those firms are invested (including the various kinds and combinations) and the advertising portfolio of competitors that may make the latter more vulnerable in the event do-not-track mechanisms are installed; and 3.) whether there is any other way that a firm that dominates the market may be able to disadvantage a rival if do-not-track mechanisms are adopted.</p>
</blockquote>
<p>Here&#8217;s the concern I believe Commissioner Rosch is attempting to convey: Google, one of the major browser vendors, earns most of its revenue from search advertising and other first-party advertising. If Google were to adopt Do Not Track, it would harm competitors supported by third-party advertising more than it would harm itself. And so Google has an incentive to push Do Not Track in Chrome.</p>
<p>I believe this concern is unfounded. First, as noted above, <a href="http://cyberlaw.stanford.edu/node/6592" rel="nofollow">Do Not Track will not significantly impact advertising revenues for websites</a>. Second, third-party advertising makes up an increasing share of Google&#8217;s revenue. Even if Do Not Track did significantly impact third-party advertising revenue, Google would not have an incentive to press its adoption unless it could gain a significant business advantage over its competitors that offset those losses&mdash;and there&#8217;s no reason to believe it could. Third, the concern is largely moot: Google has yet to implement Do Not Track in Chrome, let alone encourage the feature&#8217;s adoption.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/03/28/a-response-to-commissioner-rosch-on-do-not-track/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Do Not Track, Meet IETF</title>
		<link>http://webpolicy.org/2011/03/09/do-not-track-meet-ietf/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=do-not-track-meet-ietf</link>
		<comments>http://webpolicy.org/2011/03/09/do-not-track-meet-ietf/#comments</comments>
		<pubDate>Wed, 09 Mar 2011 08:44:15 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=58</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Do Not Track is on its way to becoming an Internet standard. In collaboration with Sid Stamm at Mozilla we&#8217;ve submitted an Internet-Draft to the IETF, specifying both the HTTP header syntax and the requirements for compliance. This is just the beginning of the IETF&#8217;s process [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6633">Stanford Center for Internet and Society</a>.</em></p>
<p>Do Not Track is on its way to becoming an Internet standard. In collaboration with <a href="http://www.sidstamm.com/" rel="nofollow">Sid Stamm</a> at Mozilla we&#8217;ve submitted an <a href="http://datatracker.ietf.org/doc/draft-mayer-do-not-track/" rel="nofollow">Internet-Draft</a> to the <a href="http://en.wikipedia.org/wiki/Internet_Engineering_Task_Force" rel="nofollow">IETF</a>, specifying both the HTTP header syntax and the requirements for compliance.</p>
<p>This is just the beginning of the IETF&#8217;s process and the evolution of the draft. But it&#8217;s a transformative moment for web privacy: Do Not Track is now a formal standards proposal. Every browser, advertising network, analytics service, and social plug-in provider has a clear instruction manual on how to implement Do Not Track.</p>
<p>We owe a tremendous debt of gratitude to the colleagues and friends whose efforts have made Do Not Track a reality: <a href="http://www.alissacooper.com/" rel="nofollow">Alissa Cooper</a>, <a href="https://www.eff.org/about/staff/peter-eckersley" rel="nofollow">Peter Eckersley</a>, <a href="http://firstpersoncookie.wordpress.com/" rel="nofollow">Alex Fowler</a>, <a href="http://theory.stanford.edu/people/jcm/" rel="nofollow">John Mitchell</a>, <a href="http://www.ashkansoltani.org/" rel="nofollow">Ashkan Soltani</a>, <a href="https://www.eff.org/about/staff/lee-tien" rel="nofollow">Lee Tien</a>, and <a href="http://www.cs.princeton.edu/~harlanyu/" rel="nofollow">Harlan Yu</a>. And we particularly thank <a href="http://www.dubfire.net/" rel="nofollow">Chris Soghoian</a>, Do Not Track&#8217;s unflagging champion for nearly two years.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/03/09/do-not-track-meet-ietf/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Do Not Track FTC Comment: What It Means, How to Enforce It, and More</title>
		<link>http://webpolicy.org/2011/02/24/do-not-track-ftc-comment-what-it-means-how-to-enforce-it-and-more/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=do-not-track-ftc-comment-what-it-means-how-to-enforce-it-and-more</link>
		<comments>http://webpolicy.org/2011/02/24/do-not-track-ftc-comment-what-it-means-how-to-enforce-it-and-more/#comments</comments>
		<pubDate>Thu, 24 Feb 2011 22:51:51 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=56</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Last Friday we submitted a comment to the FTC articulating our vision for Do Not Track. We expanded on a number of views already expressed on this blog: Do Not Track is about much more than behavioral advertising, an HTTP header is the right implementation, and [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6621">Stanford Center for Internet and Society</a>.</em></p>
<p>Last Friday we submitted a <a href="http://donottrack.us/docs/FTC_Privacy_Comment_Stanford.pdf" rel="nofollow">comment to the FTC</a> articulating our vision for Do Not Track.  We expanded on a number of views already expressed on this blog: <a href="http://cyberlaw.stanford.edu/node/6573" rel="nofollow">Do Not Track is about much more than behavioral advertising</a>, <a href="http://cyberlaw.stanford.edu/node/6559" rel="nofollow">an HTTP header is the right implementation</a>, and <a href="http://cyberlaw.stanford.edu/node/6592" rel="nofollow">Do Not Track is no threat to ad-supported businesses</a>.  Here are the new highlights.  (For a fuller exposition of each, please see our <a href="http://donottrack.us/docs/FTC_Privacy_Comment_Stanford.pdf" rel="nofollow">comment</a>.)</p>
<p><span id="more-56"></span></p>
<p><b>Defining Do Not Track</b></p>
<p>Beginning with process, there is a strong temptation to leap into precise definitions of Do Not Track. But any set of bright-line rules will incorporate distinctions that are controversial. The FTC should be vested with guided rulemaking authority to develop bright-line rules in collaboration with stakeholders. This is the standard model for regulation; Do Not Track should be no different.</p>
<p>Turning to substance, Do Not Track should be coextensive with the privacy concern it addresses: third-party tracking. Providing rulemaking guidance for Do Not Track thus requires establishing standards for &#8220;third party,&#8221; &#8220;tracking,&#8221; and exceptions.</p>
<p><u>Defining Third Party</u></p>
<p>In our view, the privacy distinction between first parties and third parties is shorthand for user expectations. <b>An entity acts in a first-party capacity if a user reasonably expects to interact with it; it acts in a third-party capacity if a user does not.</b> Relevant factors for user expectations include domain names, branding, and business relationships. In most cases resolving the standard is straightforward: The website the user visits is a first party; an advertising network or analytics provider is a third party.</p>
<p><u>Defining Tracking</u></p>
<p>The user&#8217;s remedy against third parties should, in the first instance, be coextensive with the privacy violation: <b>Do Not Track should prohibit all data collection, retention, and use.</b> Of course, the online ecosystem is quite complex, and we recognize that there will be a need for well-delineated exceptions where privacy concerns must reasonably give way to greater interests.</p>
<p><u>Exceptions</u></p>
<p>Exceptions to Do Not Track may be warranted when there is significant commercial need and privacy concerns and enforcement impact are minimal. We believe a two-step standard best captures this policy: <b>First, legitimate commercial interests must substantially outweigh privacy and enforcement interests, and second, the means of achieving the commercial interests must have no greater privacy and enforcement impact than necessary.</b> A number of tools are available for minimizing the privacy and enforcement effects of an exception, including client-side storage, dropping parts of data, secure hashing, retention periods, internal business controls, limited sharing agreements, trusted intermediaries, and audits.</p>
<p><b>Do Not Track Can Be Enforced</b></p>
<p>We envision two technical approaches to verifying Do Not Track compliance. First, most tracking at the application layer can be detected by modifying a browser to report tracking-related activity. For example, if after receiving a Do Not Track header third-party embedded content sets a unique cookie or lists the browser&#8217;s plug-ins, the third party may be violating Do Not Track. Second, behavioral advertising can be identified by monitoring ads for interest targeting. Data should be sourced using both crawling and crowdsourcing to ensure comprehensive coverage of top websites and a real-world sample of observations. We are beginning development of a Do Not Track verification system with colleagues in the <a href="http://seclab.stanford.edu/" rel="nofollow">Stanford Security Laboratory</a>, and we look forward to sharing our work in the coming months.</p>
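<p>To make the first approach concrete, here is a minimal Python sketch of one possible heuristic; it is not the verification system described above. It assumes a crawler has visited the same pages with two fresh browser profiles, both sending &#8220;DNT: 1,&#8221; and has recorded the Set-Cookie header values returned by each third-party host. A cookie whose value differs between the two profiles may be a unique identifier. The function names and the host <code>ads.example</code> are hypothetical.</p>

```python
from http.cookies import SimpleCookie


def cookies_set(set_cookie_headers):
    """Parse a list of Set-Cookie header values into a name-to-value dict."""
    jar = {}
    for header in set_cookie_headers:
        parsed = SimpleCookie()
        parsed.load(header)
        for name, morsel in parsed.items():
            jar[name] = morsel.value
    return jar


def likely_unique_ids(profile_a, profile_b):
    """Return, per third-party host, the cookie names whose values differ.

    Each argument maps a third-party host to the Set-Cookie header values
    it returned to one fresh browser profile that sent "DNT: 1".
    """
    flagged = {}
    # Only hosts observed by both profiles can be compared.
    for host in sorted(set(profile_a).intersection(profile_b)):
        jar_a = cookies_set(profile_a[host])
        jar_b = cookies_set(profile_b[host])
        differing = sorted(
            name for name in jar_a
            if name in jar_b and jar_a[name] != jar_b[name]
        )
        if differing:
            flagged[host] = differing
    return flagged


# Hypothetical observations from two fresh profiles.
first = {"ads.example": ["uid=a1b2c3; Path=/", "lang=en"]}
second = {"ads.example": ["uid=z9y8x7; Path=/", "lang=en"]}
flagged_hosts = likely_unique_ids(first, second)
```

<p>A differing value is only a prompt for closer review, since session tokens and timestamps also vary between profiles.</p>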
<p><b>Do Not Track Should Be Extended to Mobile Platforms</b></p>
<p>Third-party tracking is <a href="http://online.wsj.com/article/SB10001424052748704694004576020083703574602.html" rel="nofollow">proliferating on mobile platforms</a>; such tracking implicates the same privacy concerns as third-party tracking on the web, and likewise warrants a Do Not Track choice mechanism.</p>
<p>The Do Not Track header can be easily adapted to mobile platforms. Instead of a universal browser setting, Do Not Track should be a platform-wide preference that adds a Do Not Track header to all HTTP requests and provides a Do Not Track signal to apps. Much as embedded third-party web trackers would check for the Do Not Track header, embedded third-party mobile app trackers would check for the Do Not Track platform preference. Paralleling the granularity of the header, apps should be able to interact with the platform to request an exception from Do Not Track. As for verification, since third-party mobile tracking is heavily concentrated the problem is much simpler than in the desktop browser context.</p>
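<p>The platform-wide preference might look something like the following Python sketch. It is purely illustrative: the class <code>PlatformPrivacySettings</code>, its methods, and the app identifier are hypothetical and do not correspond to any real mobile platform API.</p>

```python
class PlatformPrivacySettings:
    """Hypothetical platform-wide Do Not Track preference with per-app exceptions."""

    def __init__(self, do_not_track=False):
        self.do_not_track = do_not_track
        self._exceptions = set()

    def grant_exception(self, app_id):
        # Assumption: a real platform would call this only after the user
        # explicitly approves the app's exception request.
        self._exceptions.add(app_id)

    def dnt_enabled_for(self, app_id):
        """The signal an embedded third-party tracker in the app would check."""
        return self.do_not_track and app_id not in self._exceptions

    def outgoing_headers(self, app_id, headers):
        """Return a copy of the request headers, adding "DNT: 1" when it applies."""
        result = dict(headers)
        if self.dnt_enabled_for(app_id):
            result["DNT"] = "1"
        return result


settings = PlatformPrivacySettings(do_not_track=True)
with_dnt = settings.outgoing_headers("com.example.game", {"Accept": "*/*"})

# After the user approves an exception, the header is no longer added.
settings.grant_exception("com.example.game")
without_dnt = settings.outgoing_headers("com.example.game", {"Accept": "*/*"})
```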
<p>Turning to policy, the first vs. third party distinction also seamlessly transitions to the mobile context. An app is a first party; a behavioral advertising network embedded in the app would be a third party, since its presence violates reasonable privacy expectations.</p>
<p><b>FTC Involvement is Necessary</b></p>
<p>When we initially articulated our vision for Do Not Track, we noted it could be implemented voluntarily or through industry self-regulation. We now believe legislation and FTC involvement are necessary.</p>
<p>In our view, third-party opposition to Do Not Track at a technological level is largely a façade. The HTTP standard is designed to allow flexible signaling with headers; Internet Explorer alone uses <a href="http://blogs.msdn.com/b/ieinternals/archive/2009/06/30/internet-explorer-custom-http-headers.aspx" rel="nofollow">at least eight proprietary headers</a>.</p>
<p>The substantive disagreements about Do Not Track arise from policy. A number of third parties oppose a stringent definition of third-party tracking. Given the diversity of online business models and businesses Do Not Track would affect, and given the consensus-based nature of the relevant trade associations, we believe voluntary comprehensive adoption will not occur. Moreover, as an empirical matter, the online advertising industry in particular has a <a href="http://www.iab.net/iablog/2011/02/the-challenge-of-self-governan.html" rel="nofollow">track record</a> of unsuccessful self-regulation.</p>
<p>The Federal Trade Commission is the right agency to define and enforce Do Not Track. The Commission&#8217;s growing technical staff lends it unique domain expertise for defining Do Not Track, and its capacity for and experience with consumer protection actions prime it to enforce Do Not Track.</p>
<hr />
<p>The FTC received <a href="http://www.ftc.gov/os/comments/privacyreportframework/index.shtm" rel="nofollow">439 comments</a> on its draft privacy report. We strongly encourage reading the comments by our colleagues at the <a href="https://www.eff.org/files/FTCcommentsEFF.pdf" rel="nofollow">Electronic Frontier Foundation</a> and the <a href="http://www.cs.princeton.edu/~harlanyu/Yu_Soltani_Comment_on_FTC_Privacy_Report.pdf" rel="nofollow">Center for Information Technology Policy</a>.</p>
<p>We thank <a href="http://theory.stanford.edu/people/jcm/" rel="nofollow">John Mitchell</a> for his helpful feedback.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/02/24/do-not-track-ftc-comment-what-it-means-how-to-enforce-it-and-more/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Minor Updates to the Do Not Track Header</title>
		<link>http://webpolicy.org/2011/01/27/minor-updates-to-the-do-not-track-header/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=minor-updates-to-the-do-not-track-header</link>
		<comments>http://webpolicy.org/2011/01/27/minor-updates-to-the-do-not-track-header/#comments</comments>
		<pubDate>Thu, 27 Jan 2011 20:27:55 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=54</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. We&#8217;re pleased to announce we&#8217;re beginning work on an IETF Internet-Draft for the Do Not Track header. We look forward to incorporating broad feedback. In anticipation of the first version of the Internet-Draft, we&#8217;re making a few minor updates to the header. The reference implementations at [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6597">Stanford Center for Internet and Society</a>.</em></p>
<p>We&#8217;re pleased to announce we&#8217;re beginning work on an <a href="http://en.wikipedia.org/wiki/Internet_Draft" rel="nofollow">IETF Internet-Draft</a> for the Do Not Track header.  We look forward to incorporating broad feedback.</p>
<p>In anticipation of the first version of the Internet-Draft, we&#8217;re making a few minor updates to the header.  The reference implementations at <a href="http://donottrack.us" rel="nofollow">DoNotTrack.Us</a> will be revised shortly.</p>
<p><b>Dropping &#8220;X-&#8221;</b></p>
<p>Since Do Not Track is entering a standardization process, convention dictates dropping the prefix &#8220;X-&#8221;.</p>
<p><b>Abbreviating &#8220;DNT&#8221;</b></p>
<p>In keeping with header naming best practices, and to conserve network resources, we&#8217;re shortening the name.</p>
<p><b>Adding a &#8220;0&#8221; Value</b></p>
<p>There&#8217;s an important policy distinction between users who consent to third-party tracking and users who haven&#8217;t expressed a preference.  To clarify this difference, the header now has three states:</p>
<p>&#8220;DNT: 1&#8221; &#8211; The user opts out of third-party tracking.</p>
<p>&#8220;DNT: 0&#8221; &#8211; The user consents to third-party tracking.</p>
<p>[No Header] &#8211; The user has not expressed a preference about third-party tracking.</p>
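<p>As a rough illustration of the three states, a server might interpret the header along these lines. This is a minimal Python sketch, not part of the Internet-Draft: the names <code>TrackingPreference</code> and <code>read_dnt</code> are hypothetical, and treating an unrecognized value the same as a missing header is an assumption.</p>

```python
from enum import Enum


class TrackingPreference(Enum):
    OPT_OUT = "opt-out"          # "DNT: 1"
    CONSENT = "consent"          # "DNT: 0"
    UNSPECIFIED = "unspecified"  # no header sent


def read_dnt(headers):
    """Map request headers (a dict of name to value) to one of the three states."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    value = normalized.get("dnt")
    if value == "1":
        return TrackingPreference.OPT_OUT
    if value == "0":
        return TrackingPreference.CONSENT
    # Assumption: a missing header and an unrecognized value are both
    # treated as "no preference expressed".
    return TrackingPreference.UNSPECIFIED
```

<p>Keeping the third state distinct from &#8220;DNT: 0&#8221; preserves the difference between a user who consented and a user who was never asked.</p>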
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/01/27/minor-updates-to-the-do-not-track-header/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Do Not Track Is No Threat to Ad-Supported Businesses</title>
		<link>http://webpolicy.org/2011/01/20/do-not-track-is-no-threat-to-ad-supported-businesses/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=do-not-track-is-no-threat-to-ad-supported-businesses</link>
		<comments>http://webpolicy.org/2011/01/20/do-not-track-is-no-threat-to-ad-supported-businesses/#comments</comments>
		<pubDate>Thu, 20 Jan 2011 10:12:43 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=52</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. &#8220;If you remove tracking, you remove advertisers.&#8221; &#8220;Stop [data] sharing and you put a stop to the Internet as we know it.&#8221; &#8220;Thousands of small websites may disappear.&#8221; &#8220;Would you like to pay $20 a month for Facebook?&#8221; A spate of such recent commentaries have speculated [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6592">Stanford Center for Internet and Society</a>.</em></p>
<p><a href="http://www.nytimes.com/2010/12/06/business/media/06privacy.html">&#8220;If you remove tracking, you remove advertisers.&#8221;</a> <a href="http://www.usnews.com/opinion/articles/2011/01/03/do-not-track-rules-would-put-a-stop-to-the-internet-as-we-know-it.html">&#8220;Stop [data] sharing and you put a stop to the Internet as we know it.&#8221;</a> <a href="http://www.businessweek.com/technology/content/dec2010/tc20101222_392883.htm">&#8220;Thousands of small websites may disappear.&#8221;</a> <a href="http://blogs.reuters.com/mediafile/2010/12/23/privacy-regulation-and-the-free-internet/">&#8220;Would you like to pay $20 a month for Facebook?&#8221;</a> A spate of such recent commentaries has speculated that <a href="http://donottrack.us">Do Not Track</a> could hobble advertising-supported businesses. Here&#8217;s why it won&#8217;t.</p>
<p><span id="more-52"></span></p>
<p><b>Do Not Track would only affect a sliver of the online advertising market.</b></p>
<p>First, a brief overview of online advertising. Suppose you operate a high-end Napa winery and decide to run an ad. You might place your ad on a specific website (&#8220;first-party advertising&#8221;), or you might arrange your ad with an advertising network that spans thousands of sites (&#8220;third-party advertising&#8221;). Here&#8217;s a sample of how you might target your ad:</p>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Contextual_advertising">Contextual Advertising</a>: Your ad appears on pages about wine.
</li>
<li><a href="http://www.google.com/support/adxbuyer/bin/answer.py?hl=en&amp;answer=152095">Demographic Advertising</a>: Your ad appears on pages whose visitors tend to be wealthy.
</li>
<li><a href="http://en.wikipedia.org/wiki/Behavioral_advertising">Behavioral Advertising</a>: Your ad appears to users who have viewed a number of pages about wine.<sup><a href="#DNT_ECONOMICS_FN1">1</a></sup>
</li>
<li><a href="http://en.wikipedia.org/wiki/Search_advertising">Search Advertising</a>: Your ad appears on search result pages for the query &#8220;wine.&#8221;
</li>
<li><a href="https://www.google.com/adsense/support/bin/answer.py?hl=en&amp;answer=18195">Placement Advertising</a>: Your ad appears on particular pages.
</li>
<li><a href="http://en.wikipedia.org/wiki/Social_network_advertising">Social Network Advertising</a>: Your ad appears to social network users who have listed &#8220;wine&#8221; as an interest.
</li>
</ul>
<p>Of these myriad modes of advertising, Do Not Track would only affect one: third-party behavioral advertising, because it incorporates third-party tracking. And that accounted for, at most, <a href="http://democrats.energycommerce.house.gov/documents/20101201/Briefing.Memo.12.01.2010.pdf">just 4% (less than $1B)</a> of 2009 U.S. online advertising expenditures. While the use of third-party behavioral advertising is rapidly growing, so is the online advertising market; projections place behavioral advertising at only 7% of the U.S. online advertising market in 2014 (<a href="http://www.emarketer.com/blog/index.php/behavorial-targeting-outmoded/">1</a>, <a href="http://www.emarketer.com/Reports/All/Emarketer_2000722.aspx">2</a>).<sup><a href="#DNT_ECONOMICS_FN2">2</a></sup></p>
<p><b>Do Not Track would only affect a new segment of the online advertising market.</b></p>
<p>Not only is third-party behavioral advertising a small piece of the online advertising market, it&#8217;s also a new piece. Behavioral advertising accounted for a negligible share of online advertising until roughly 2007 (<a href="http://www.emarketer.com/Article.aspx?R=1006384">1</a>, <a href="http://www.emarketer.com/blog/index.php/behavorial-targeting-outmoded">2</a>). Countless ad-supported online businesses launched and thrived before then.</p>
<p><b>Do Not Track would cap&mdash;not eliminate&mdash;third-party behavioral advertising.</b></p>
<p>Do Not Track is an opt-out mechanism; uptake is likely to be far from complete. Two helpful benchmarks: After seven years of a permanent opt-out, fewer than half of U.S. phone numbers are on the Do Not Call registry (<a href="http://www.ftc.gov/opa/2010/07/dnc.shtm">1</a>, <a href="http://www.itu.int/ITU-D/ict/publications/idi/2010/Material/MIS_2010_without%20annex%204-e.pdf">2</a>). And after four years of availability, fewer than 3% of Firefox users have installed its most popular add-on (<a href="https://addons.mozilla.org/en-US/statistics/addon/1865">1</a>, <a href="http://www.mozilla.org/foundation/annualreport/2009/a-competitive-world.html">2</a>).</p>
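<p>To make the scale concrete, here is a back-of-the-envelope sketch that multiplies the figures cited in this post. It assumes, purely for illustration, that Do Not Track uptake would match one of the two benchmarks and that opted-out users account for a proportional share of behavioral ad spending; neither assumption is established by the sources. The function name and constants are hypothetical.</p>

```python
def exposed_share(behavioral_share: float, opt_out_rate: float) -> float:
    """Upper bound on the fraction of online ad spend served to opted-out users."""
    return behavioral_share * opt_out_rate


# Figures quoted in the post; each is itself an upper bound.
BEHAVIORAL_SHARE_2009 = 0.04  # "at most, just 4%" of 2009 U.S. online ad spend
DO_NOT_CALL_UPTAKE = 0.50     # "fewer than half" of U.S. phone numbers
FIREFOX_ADDON_UPTAKE = 0.03   # "fewer than 3%" of Firefox users

# Assumption: Do Not Track uptake falls somewhere between the two benchmarks.
high = exposed_share(BEHAVIORAL_SHARE_2009, DO_NOT_CALL_UPTAKE)
low = exposed_share(BEHAVIORAL_SHARE_2009, FIREFOX_ADDON_UPTAKE)

print(f"Do Not Call benchmark:    {high:.2%} of online ad spend")
print(f"Firefox add-on benchmark: {low:.2%} of online ad spend")
```

<p>Under those assumptions the affected share is at most 2% of online ad spending, and closer to 0.12% at add-on levels of uptake. Even that slice is only exposed to Do Not Track, not necessarily lost.</p>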
<p><b>Advertisers might not reallocate their ad dollars.</b></p>
<p>Websites that host third-party ads usually receive a fixed share of revenue; they earn more only if advertisers spend more (e.g. <a href="http://adsense.blogspot.com/2010/05/adsense-revenue-share.html">AdSense</a>, <a href="http://adwords.google.com/support/aw/bin/answer.py?hl=en&amp;answer=146606">DoubleClick</a>). Do Not Track would thus impact advertising revenue only if it caused advertisers to reallocate online ad dollars. But that would happen only if advertisers have a strong preference for third-party behavioral advertising. There&#8217;s some evidence that advertisers don&#8217;t: despite the growing availability of third-party behavioral advertising over the past several years, advertisers haven&#8217;t rushed to adopt it. In fact, U.S. online advertising revenues grew at an average annual rate of <a href="http://www.iab.net/media/file/IAB-Ad-Revenue-Full-Year-2009.pdf">only 3.4%</a> between 2007 (when behavioral advertising first caught on) and 2009.<sup><a href="#DNT_ECONOMICS_FN3">3</a></sup></p>
<p><b>There&#8217;s a technology fix: interest-targeted advertising without tracking.</b></p>
<p>Third-party behavioral advertising incorporates tracking to discover a user&#8217;s interests. But interest-targeted advertising can be achieved without tracking. Under one alternative model, the web browser learns a user&#8217;s interests, and then passes those interests to an advertising network. A number of research and commercial efforts do just this, including <a href="http://crypto.stanford.edu/adnostic/">AdNostic</a>, <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=137038">RePriv</a>, and <a href="http://www.google.com/ads/preferences/">Google Ads Preferences</a>.</p>
<p><b>Ad-supported businesses could ask&mdash;or possibly require&mdash;Do Not Track users to allow third-party behavioral advertising.</b></p>
<p>Do Not Track is not all-or-nothing; users who have opted out can opt back into third-party tracking on specific sites or with specific trackers. So even if third-party behavioral advertising were an important revenue source for ad-supported businesses, even if enough users opted out to have an impact, even if advertisers were inclined to pull their ad dollars, and even if alternative technologies for interest-targeted advertising weren&#8217;t available, a business would <i>still</i> have an easy remedy: ask&mdash;or possibly require<sup><a href="#DNT_ECONOMICS_FN4">4</a></sup>&mdash;visitors to disable Do Not Track on the site. The proposal is about increasing privacy choice and transparency, not restricting online business practices.</p>
<p>The reports of advertising&#8217;s death are greatly exaggerated.</p>
<p><a name="DNT_ECONOMICS_FN1">[1]</a> For simplicity this post glosses over <a href="http://en.wikipedia.org/wiki/Behavioral_retargeting">behavioral retargeting</a>, a small subset of behavioral advertising.</p>
<p><a name="DNT_ECONOMICS_FN2">[2]</a> These figures reflect both first- and third-party behavioral advertising. They should be taken as an upper limit on the market size for third-party behavioral advertising.</p>
<p><a name="DNT_ECONOMICS_FN3">[3]</a> One possible reason: behavioral ads may be only a marginally better deal for advertisers. In Q4 2009, a behavioral ad was 2.1x as effective as the average online ad&mdash;but it cost 2x as much (<a href="http://www.emarketer.com/Article.aspx?R=1007599">1</a>).</p>
<p><a name="DNT_ECONOMICS_FN4">[4]</a> Some regulatory proposals have called for disallowing such &#8220;tiered&#8221; access.</p>
<p>Many thanks to <a href="http://randomwalker.info">Arvind Narayanan</a> for endless patience in reviewing drafts.  All views are my own.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2011/01/20/do-not-track-is-no-threat-to-ad-supported-businesses/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Do Not Track &#8211; Q &amp; A</title>
		<link>http://webpolicy.org/2010/11/22/do-not-track-q-a/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=do-not-track-q-a</link>
		<comments>http://webpolicy.org/2010/11/22/do-not-track-q-a/#comments</comments>
		<pubDate>Tue, 23 Nov 2010 07:18:52 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=50</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. Since our introduction of DoNotTrack.Us last week we&#8217;ve received a deluge of questions. This post answers some of the most common inquiries. If we haven&#8217;t covered an issue you&#8217;d like a response on, shoot us an email and stay tuned &#8211; more Q &#38; A posts [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6559">Stanford Center for Internet and Society</a>.</em></p>
<p>Since our introduction of <a href="http://donottrack.us" rel="nofollow">DoNotTrack.Us</a> last week we&#8217;ve received a deluge of questions.  This post answers some of the most common inquiries.  If we haven&#8217;t covered an issue you&#8217;d like a response on, shoot us an email and stay tuned &#8211; more Q &amp; A posts are in the pipeline.</p>
<p><span id="more-50"></span></p>
<p><b>Q: Do Not Track does not block third-party tracking.  Wouldn&#8217;t that be a better solution?</b></p>
<p>Some privacy-conscious users block third-party tracking, most commonly through browser add-ons.  This type of self-help is completely compatible with and complementary to Do Not Track; many Do Not Track users may elect to use blocking software.  But blocking alone is not a complete solution to web tracking.  Here are our chief concerns:</p>
<ul>
<li><b>Universal blocking is infeasible.</b>  Web security research (<a href="https://panopticlick.eff.org/browser-uniqueness.pdf" rel="nofollow">1</a>, <a href="http://crypto.stanford.edu/~dabo/pubs/papers/privatebrowsing.pdf" rel="nofollow">2</a>, <a href="http://samy.pl/evercookie/" rel="nofollow">3</a>) has uncovered dozens of means of tracking users; technical barriers to all these approaches are not practical.  And a <a href="http://privacychoice.wordpress.com/2010/11/17/are-privacy-add-ons-effective-surprising-results-from-our-testing/" rel="nofollow">recent informal study</a> of popular Firefox blocking add-ons suggests that blocking is, in practice, far from a universal opt out.  Users should not be left guessing as to whether they&#8217;ve actually opted out of tracking.
</li>
<li><b>Blocking software requires perpetual development and user vigilance.</b>  There is frequent turnover of tracking services and tracking technologies.  If a developer takes a break, its blocking tool will diminish in effectiveness.  Users must, consequently, periodically ensure their blocking software is still maintained and up-to-date.
</li>
<li><b>Blocking inhibits third-party tools.</b>  A number of popular website tools and plug-ins are hosted by a third party that also tracks users.  Blocking would disable these tools, while Do Not Track accommodates them.
</li>
</ul>
<p><b>Q: Would Do Not Track require users to opt out of all third-party tracking, on all sites?</b></p>
<p>No.  Do Not Track users would have the ability to whitelist sites and tracking services by simply not sending the Do Not Track header (or otherwise signaling that tracking is permissible).  Whitelist management is primarily a user interface problem; we look forward to creative solutions in this space.</p>
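<p>A minimal sketch of that whitelisting logic, as a browser or add-on might apply it when assembling request headers. The header name <code>DNT</code> with value <code>1</code> is an assumption (it is the form browsers later shipped; this post does not name the header), and the function name and example hosts are hypothetical.</p>

```python
from urllib.parse import urlsplit


def headers_for(url: str, do_not_track: bool, whitelist: set) -> dict:
    """Return the extra request headers to send for this URL.

    The opt-out header is simply omitted for whitelisted hosts.
    """
    host = urlsplit(url).hostname or ""
    headers = {}
    if do_not_track and host not in whitelist:
        # Assumed header form: "DNT: 1".
        headers["DNT"] = "1"
    return headers


# Hypothetical user who has opted out globally but trusts one ad network.
trusted = {"ads.example"}
print(headers_for("http://tracker.example/pixel.gif", True, trusted))
print(headers_for("http://ads.example/banner.js", True, trusted))
```

<p>The sketch matches hosts exactly; whether a whitelist entry should also cover subdomains, or apply only on particular first-party sites, is part of the user interface problem noted above.</p>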
<p><b>Q: Ads provide important context and notice for managing tracking preferences.  How would Do Not Track be linked to ads?</b></p>
<p>Linking ads, privacy notices, and other visible content to a user&#8217;s Do Not Track settings is certainly possible.  A software interface (e.g. a protocol handler or JavaScript API) could facilitate querying the user&#8217;s Do Not Track settings and prompting the user to make changes.</p>
<p><b>Q: Would Do Not Track be enabled by default?</b></p>
<p>Do Not Track could be a default; this is up to browser vendors, legislators, and regulators.  Owing to conflicting business interests and engineering demands, we find it unlikely the major browser vendors will voluntarily enable Do Not Track by default.  We caution in general, however, against regulation of client software, which has historically proven challenging.</p>
<p><b>Q: How difficult would it be for a tracking service to implement support for Do Not Track?</b></p>
<p>We did not have difficulty implementing Do Not Track for popular <a href="http://donottrack.us/server.html" rel="nofollow">web servers</a> and <a href="http://donottrack.us/application.html" rel="nofollow">web application frameworks</a>.  We believe tracking services will have a similar experience, though the business logic required may involve more effort.</p>
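<p>For a sense of the server-side work, here is a minimal sketch of a tracking endpoint that honors the preference. It is not the prototype code linked above. It assumes the header is <code>DNT: 1</code> (this post does not name it), uses Python&#8217;s standard WSGI calling convention, and the endpoint and cookie names are hypothetical.</p>

```python
import uuid


def tracking_pixel_app(environ, start_response):
    """Hypothetical WSGI tracking endpoint that honors the opt-out header."""
    # WSGI exposes the request header "DNT" as environ["HTTP_DNT"].
    opted_out = environ.get("HTTP_DNT") == "1"

    headers = [("Content-Type", "text/plain")]
    if not opted_out:
        # Assign a tracking identifier only when the user has not opted out.
        headers.append(("Set-Cookie", "uid=" + uuid.uuid4().hex))

    start_response("200 OK", headers)
    return [b"ok"]
```

<p>Checking the header is the easy part. A real tracking service would also need to decide what to do with identifiers and logs it already holds for an opted-out user, which is the business logic referred to above.</p>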
<p><b>Q: Does Do Not Track require the cooperation of browser vendors?</b></p>
<p>To support Do Not Track, a browser must at minimum provide an add-on API that allows header modification (we review browser APIs <a href="http://donottrack.us/support.html" rel="nofollow">here</a>).  Browser vendors may also choose to implement Do Not Track themselves.</p>
<p><b>Q: Will older browsers be left out of Do Not Track?</b></p>
<p>Backwards compatibility is a challenge for any new web technology.  We are aiming for as broad support for Do Not Track as is possible, but we recognize that a minority of older browsers may be left out.</p>
<p><b>Q: Is Do Not Track a modification to HTTP?</b></p>
<p>No.  Custom HTTP headers are a component of the HTTP standard (see IETF <a href="http://www.ietf.org/rfc/rfc2616.txt" rel="nofollow">RFC 2616</a>).</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2010/11/22/do-not-track-q-a/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ending the Web Privacy Stalemate &#8211; DoNotTrack.Us</title>
		<link>http://webpolicy.org/2010/11/15/ending-the-web-privacy-stalemate-donottrack-us/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=ending-the-web-privacy-stalemate-donottrack-us</link>
		<comments>http://webpolicy.org/2010/11/15/ending-the-web-privacy-stalemate-donottrack-us/#comments</comments>
		<pubDate>Tue, 16 Nov 2010 05:05:00 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Do Not Track]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=47</guid>
		<description><![CDATA[Original at the Stanford Center for Internet and Society. The web privacy debate is stuck. Privacy proponents decry the diffusion of behavioral advertising and tracking services (1, 2, 3); industry coalitions respond by expounding the merits of personalized content and advertising revenue (1, 2). But for the average user, the arguments are academic: there is [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at the <a href="http://cyberlaw.stanford.edu/node/6556">Stanford Center for Internet and Society</a>.</em></p>
<p>The web privacy debate is stuck. Privacy proponents decry the diffusion of behavioral advertising and tracking services (<a href="http://www.democraticmedia.org/doc/privacy-legislative-primer" rel="nofollow">1</a>, <a href="http://www.cdt.org/issue/behavioral-advertising" rel="nofollow">2</a>, <a href="https://www.eff.org/issues/online-behavioral-tracking" rel="nofollow">3</a>); industry coalitions respond by expounding the merits of personalized content and advertising revenue (<a href="http://www.iab.net/media/file/ven-principles-07-01-09.pdf" rel="nofollow">1</a>, <a href="http://naiblog.org/wp-content/uploads/2010/06/Commerce-Comments1.pdf" rel="nofollow">2</a>).  But for the average user, the arguments are academic: there is no viable technology for opting out of web tracking. A registry of tracking services, like <a href="http://www.cdt.org/privacy/20071031consumerprotectionsbehavioral.pdf" rel="nofollow">privacy advocates proposed years ago</a>, is cumbersome and unmanageable. Fiddling with cookies, as many advertising networks and anti-regulation advocates recommend, is an incomplete and temporary fix; both <a href="http://www.google.com/ads/preferences/plugin/" rel="nofollow">Google</a> and <a href="http://www.networkadvertising.org/managing/protector_license.asp" rel="nofollow">NAI</a> (an advertising industry association) have already moved away from opt-out cookies.</p>
<p>Do Not Track ends this standoff. It provides a web tracking opt-out that is user-friendly, effective, and completely interoperable with the existing web. The technology is simple: whenever your web browser makes a request, it includes an opt-out preference. It&#8217;s then up to advertisers and tracking services to honor that preference – voluntarily, by industry self-regulation, or by law.</p>
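<p>As a rough illustration of how little the client side involves, the sketch below adds the preference to the headers of an outgoing request. The header name <code>DNT</code> with value <code>1</code> is an assumption (it is the form browsers later adopted; this post does not specify one), and the function name and user agent string are hypothetical.</p>

```python
def request_headers(do_not_track: bool) -> dict:
    """Build the headers a browser would attach to every outgoing request."""
    headers = {"User-Agent": "example-browser/0.1"}  # hypothetical user agent
    if do_not_track:
        # Assumed header form: "DNT: 1" signals the opt-out preference.
        headers["DNT"] = "1"
    return headers


print(request_headers(do_not_track=True))
print(request_headers(do_not_track=False))
```

<p>Because the preference travels with each request, no registry of tracking services and no per-network opt-out cookie is needed on the client.</p>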
<p><a href="http://randomwalker.info" rel="nofollow">Arvind Narayanan</a> and I have been researching Do Not Track for several months, and are pleased to now introduce <a href="http://donottrack.us" rel="nofollow">DoNotTrack.Us</a>, a compilation of what we&#8217;ve learned. The resource explains Do Not Track, provides prototype implementations, and answers some common questions. We&#8217;ll be updating it in the coming months with new findings and responses to feedback.</p>
<p>Excited as we are about the Do Not Track technology, it is but a first step. Important substantive policy questions remain open: What tracking should be impermissible? When a user visits a site, what constitutes a third party? We look forward to collaborating with advertising networks, NGOs, regulators, lawmakers, and other stakeholders in answering these crucial questions.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2010/11/15/ending-the-web-privacy-stalemate-donottrack-us/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cyber Détente Part III: American Procedural Negotiation</title>
		<link>http://webpolicy.org/2010/01/20/cyber-detente-part-iii-american-procedural-negotiation/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=cyber-detente-part-iii-american-procedural-negotiation</link>
		<comments>http://webpolicy.org/2010/01/20/cyber-detente-part-iii-american-procedural-negotiation/#comments</comments>
		<pubDate>Wed, 20 Jan 2010 11:40:15 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[International Relations]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=45</guid>
		<description><![CDATA[Original at Freedom to Tinker. The first post in this series rebutted the purported Russian motive for renewed cybersecurity negotiations and the second advanced more plausible self-interested rationales. This third and final post of the series examines the U.S. negotiating position through both substantive and procedural lenses. &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212; American interest in a substantive cybersecurity deal [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at <a href="https://freedom-to-tinker.com/blog/jrmayer/cyber-d%C3%A9tente-part-iii-american-procedural-negotiation">Freedom to Tinker</a>.</em></p>
<p>The <a href="http://www.freedom-to-tinker.com/blog/jrmayer/cyber-détente-part-i-security-dilemma">first post</a> in this series rebutted the purported Russian motive for renewed cybersecurity negotiations and the <a href="http://www.freedom-to-tinker.com/blog/jrmayer/cyber-détente-part-ii-russian-diplomatic-and-strategic-self-interest">second</a> advanced more plausible self-interested rationales.  This third and final post of the series examines the U.S. negotiating position through both substantive and procedural lenses.</p>
<p><span id="more-45"></span></p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</p>
<p>American interest in a substantive cybersecurity deal appears limited, and the U.S. is rightly skeptical of Russian motives (perhaps for the reasons detailed in the prior two posts).  Negotiators have <a href="http://www.nytimes.com/2009/06/28/world/28cyber.html">publicly</a> expressed support for institutional cooperation on the closely related issue of cybercrime, but firmly oppose an arms control or cyberterrorism treaty.  This tenuous commitment is further reflected in the U.S. delegation’s <a href="http://www.nytimes.com/2009/12/13/science/13cyber.html">composition</a>.  Representation of the NSA, State, DoD, and DHS suggests only a preliminary willingness to hear the Russians out and minimal consideration of a full-on bilateral negotiation.</p>
<p>While the cybersecurity talks may thus be substantively vacuous, they have great procedural merit when viewed in the context of shifting Russian relations and perceptions of cybersecurity.</p>
<p>The Bush administration’s Russia policy was marked by antagonism; proposed <a href="http://www.nytimes.com/2008/08/15/world/europe/16poland.html">missile defense</a> installations in Poland and the Czech Republic and <a href="http://www.nytimes.com/2008/04/03/world/europe/03nato.html">NATO membership</a> for Georgia and Ukraine particularly rankled the Kremlin.  Upon taking office the Obama administration committed to <a href="http://www.washingtonpost.com/wp-dyn/content/article/2009/02/07/AR2009020700756.html">“press[ing] the reset button”</a> on U.S.-Russia relations by recommitting to cooperation in areas of shared interest.</p>
<p>Cybersecurity talks may best be evaluated as a facet of this systemic “reset.”  Earnest discussions – including fruitless ones – may contribute towards a collegial relationship and further other more substantively promising negotiations between the two powers.  The cybersecurity topic is particularly well suited for this role in that it brings often less-than-friendly defense, intelligence, and law enforcement agencies to the same table.</p>
<p>Inside-the-beltway perceptions of cybersecurity have also experienced a sea change.  In the early Bush administration cybersecurity problems were predominantly construed as cybercrime problems, and consequently within the purview of law enforcement.  For example, one of the first “major actions” advocated by the White House&#8217;s 2003 <a href="http://www.us-cert.gov/reading_room/cyberspace_strategy.pdf">National Strategy to Secure Cyberspace</a> was, “[e]nhance law enforcement’s capabilities for preventing and prosecuting cyberspace attacks.”  But by the Obama administration cybersecurity was perceived as a national security issue; the 2009 <a href="http://www.whitehouse.gov/assets/documents/Cyberspace_Policy_Review_final.pdf">Cyberspace Policy Review</a> located primary responsibility for cybersecurity in the National Security Council.</p>
<p>This shift suggests additional procedural causes for renewed U.S.-Russia and UN cybersecurity talks.  Not only do the discussions reflect the new perception of cybersecurity as a national security issue, but they also nudge other nations towards that view.  And directly engaging defense and intelligence agencies accustoms them to viewing cybersecurity as an international issue within their domain.</p>
<p>The U.S. response of simultaneously substantively balking at and procedurally engaging with Russia on cybersecurity appears well-calibrated.  Although there is little opportunity to conclude a meaningful cybersecurity instrument given the Russian motives discussed earlier, the U.S. is nonetheless generating value.</p>
<p>While this favorable outcome is reassuring, it is by no means guaranteed for future cybersecurity talks.  There is already a noxious atmosphere of often unwarranted alarmism about cyberwarfare and free-form parallels drawn between cyberattack and weapons of mass destruction.  Admix the recurrently prophesied &#8220;Digital Pearl Harbor&#8221; and it is easy to imagine how an international compact on cybersecurity could look all-too-appealing.  This pitfall can only be avoided by training an informed, critical eye on states&#8217; motives to develop the appropriate &#8211; if any &#8211; cybersecurity negotiating position.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2010/01/20/cyber-detente-part-iii-american-procedural-negotiation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cyber Détente Part II: Russian Diplomatic and Strategic Self-Interest</title>
		<link>http://webpolicy.org/2010/01/15/cyber-detente-part-ii-russian-diplomatic-and-strategic-self-interest/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=cyber-detente-part-ii-russian-diplomatic-and-strategic-self-interest</link>
		<comments>http://webpolicy.org/2010/01/15/cyber-detente-part-ii-russian-diplomatic-and-strategic-self-interest/#comments</comments>
		<pubDate>Fri, 15 Jan 2010 12:45:32 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[International Relations]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=38</guid>
		<description><![CDATA[Original at Freedom to Tinker. The first post in this series rebutted the purported Russian motive for negotiations, avoiding a security dilemma. This second post posits two alternative self-interested Russian inducements for rapprochement: legitimizing use of force and strategic advantage. &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212; An alternative rationale for talks advanced by the Russians is fear of “cyberterror” – [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at <a href="https://freedom-to-tinker.com/blog/jrmayer/cyber-d%C3%A9tente-part-ii-russian-diplomatic-and-strategic-self-interest">Freedom to Tinker</a>.</em></p>
<p>The <a href="http://www.freedom-to-tinker.com/blog/jrmayer/cyber-détente-part-i-security-dilemma">first post</a> in this series rebutted the purported Russian motive for negotiations, avoiding a security dilemma.  This second post posits two alternative self-interested Russian inducements for rapprochement: legitimizing use of force and strategic advantage.</p>
<p><span id="more-38"></span></p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</p>
<p>An alternative rationale for talks advanced by the Russians is fear of “cyberterror” – not the capacity for offensive cyberwarfare, but its use against civilians.  A weapons use treaty of this sort could have value in establishing a norm against civilian cyberattack… but there are already strong international treaties and norms against attacks aimed at civilians.  And at any rate the untraceability of most cyberattacks will take the teeth out of any use-banning treaty of this sort.</p>
<p>The U.S. delegation is rightly skeptical of this motive; the Russians may well be raising cyberterror in the interest of legitimating use of conventional force.  The Russians have repeatedly likened political dissidence to cyberterror, and a substantive cyberterrorism treaty may be submitted by Russia as license to pursue political vendettas with conventional force.   To probe how such a treaty might function, consider first a hypothetical full-blown infrastructure-crippling act of cyberterror where the perpetrator is known – Russia already need not restrain itself in retaliating.  On the other hand, consider the inevitable website defacements by Chechen separatists or Georgian sympathizers in the midst of increasing hostilities – acts of cyberterrorism in violation of a treaty will assuredly be added to the list of provocations should Russia elect to engage in armed conflict.</p>
<p>This simple thought experiment reveals the deep fault lines that will emerge in negotiating any cyberterrorism treaty.  Where is the boundary between vandalism (and other petty cybercrime) and cyberterror?  What if acts are committed, as is often the case, by nationals of a state but not its government?  What proof is required to sustain an allegation of cyberterror?  Doubtless the Russian delegation would advance a broad definition of cyberterror, while the Americans would propose a narrowly circumscribed definition.  Even if, improbably, the U.S. and Russia negotiated to a shared definition of cyberterror, I fail to see how it could be articulated in a manner not prone to later manipulation.  It is not difficult to imagine, for example, how trivial defacement of a bank’s website might be shoehorned into a narrow definition: “destructive acts targeting critical civilian infrastructure.”</p>
<p>Another compelling motive for the Russians is realist self-interest: the Russians may believe they will gain a strategic advantage with a capacity-limiting cyberwarfare treaty.  At first blush this seems an implausible reading – the U.S., with its technologically advanced and integrated armed forces, appears a far richer target for cyberattack than Russia, which relies on decrepit Soviet equipment.  Moreover, anecdotally the U.S. military has proven highly vulnerable: multiple unattributed attacks have penetrated defense-related systems (<a href="http://www.cbsnews.com/stories/2009/11/06/60minutes/main5555565.shtml">most prominently in 2007</a>), and late last year the Wall Street Journal reported Iraqi militants <a href="http://online.wsj.com/article/SB126102247889095011.html">trivially intercepted live video from Predator drones</a>.  But looking ahead a Russian self-interest motive is more plausible.  Russia has made no secret of its attempts to rapidly stand up modern, professional armed forces, and in 2009 alone <a href="http://www.time.com/time/magazine/article/0,9171,1891681,00.html">increased military spending by over 25%</a> (projects include <a href="http://www.nytimes.com/2008/07/27/world/europe/27iht-kremlin.4.14813103.html">a revamped navy</a> and <a href="http://www.nytimes.com/2007/04/04/business/worldbusiness/04gps.html">a satellite positioning system</a>, among many others).  To accomplish this end the Russians may rely to a large degree on information technology, and particularly on commercial off-the-shelf hardware and software.  Lacking time and finances the Russians may be unable to secure their new military systems against cyberattack.  Thus while at present the U.S. is more vulnerable, in future Russia may have greater weaknesses.  Locking in a cyberwarfare arms control agreement now, while the U.S. is more likely to sign on, could therefore be in Russia’s long-term strategic self-interest.</p>
<p>The specific offensive capabilities Russia has reportedly sought to ban strongly corroborate this self-interest rationale.  In prior negotiations the Russian delegation has <a href="http://www.nytimes.com/2009/06/28/world/28cyber.html">signaled particular concern</a> about deliberately planted software and hardware that would allow disabling or co-opting military equipment.  The U.S. will likely have far greater success in developing assets of this sort given the at times close relationship between intelligence agencies and commercial IT firms (e.g. the NSA warrantless wiretapping scandal) and the prevalence of American IT worldwide in military applications (think Windows).  Russia, on the other hand, would likely have to rely on human intelligence to place assets of this sort.</p>
<p>Russia’s renewed interest in bilateral cybersecurity negotiations also belies its purported security dilemma rationale.  Russian interest in talks lapsed between 1996 and 2009, suggesting a novel stimulus is at work, not some long-standing fear of a security dilemma.  The recent rise of alleged “cyberterror” and attempts to modernize Russian armed forces – especially in the wake of the 2008 South Ossetia War with Georgia – far better correlate with Russia’s eagerness to come to the table.</p>
<p>To put a point on these two posts, I submit legitimization of use of force and strategic self-interest are far more plausible Russian motives for cybersecurity negotiations than the purported rationale of avoiding a security dilemma and consequent arms race or destabilization.  In the following post I will explore the U.S. delegation’s position and argue the American response to Russia’s proposals is well-calibrated.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2010/01/15/cyber-detente-part-ii-russian-diplomatic-and-strategic-self-interest/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cyber Détente Part I: A Security Dilemma?</title>
		<link>http://webpolicy.org/2010/01/11/cyber-detente-part-i-a-security-dilemma/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=cyber-detente-part-i-a-security-dilemma</link>
		<comments>http://webpolicy.org/2010/01/11/cyber-detente-part-i-a-security-dilemma/#comments</comments>
		<pubDate>Mon, 11 Jan 2010 10:39:18 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[International Relations]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=34</guid>
		<description><![CDATA[Original at Freedom to Tinker. Late last year the Obama administration reopened talks with Russia over the militarization of cyberspace and assented to cybersecurity discussion in the United Nations First Committee (Disarmament and National Security). My intention in this three-part series is to probe Russian and American foreign policy on cyberwarfare and advance the thesis [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at <a href="https://freedom-to-tinker.com/blog/jrmayer/cyber-d%C3%A9tente-part-i-security-dilemma">Freedom to Tinker</a>.</em></p>
<p>Late last year the Obama administration <a href="http://www.nytimes.com/2009/12/13/science/13cyber.html">reopened talks</a> with Russia over the militarization of cyberspace and assented to cybersecurity discussion in the United Nations First Committee (Disarmament and National Security).  My intention in this three-part series is to probe Russian and American foreign policy on cyberwarfare and advance the thesis that the Russians are negotiating for specific strategic or diplomatic gains, while the Americans are primarily procedurally invested owing to the “reset” in Russian relations and changing perceptions of cyberwarfare.</p>
<p>This first post rebuts the Russians’ purported rationale for talks: avoiding a security dilemma.</p>
<p><span id="more-34"></span></p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</p>
<p>The Russians seek a cyberwarfare arms control instrument ostensibly to avoid a security dilemma and arms race, in the vein of past arrangements for nuclear weapons (e.g. SALT <a href="http://www.state.gov/t/isn/5191.htm">I</a>/<a href="http://www.state.gov/t/isn/5195.htm">II</a>, START <a href="http://www.state.gov/t/isn/18535.htm">I</a>/<a href="http://www.state.gov/t/isn/10425.htm">II</a>, and <a href="http://assets.opencrs.com/rpts/RL31448_20081230.pdf">SORT</a>) and anti-ballistic missile technology (<a href="http://www.state.gov/www/global/arms/treaties/abm/abm2.html">ABM</a>), among others.  This basis for negotiations does not withstand scrutiny.</p>
<p>A security dilemma may arise where a state has the opportunity to develop a game-changing new weapons system, even if for purely defensive purposes.  For fear of strategic disadvantage other powers may elect to develop the weapon – an arms race – resulting in none gaining a strategic advantage and all bearing a significant cost.  Alternatively, states that are technologically incapable of matching the development, or unable to afford it, may take destabilizing offensive steps.  Arms control treaties resolve this form of security dilemma by committing states to not developing certain weapons.</p>
<p>Cyberwarfare lacks necessary elements of a security dilemma.  First and foremost, cyberwarfare capabilities defy quantification.  Consider the Cold War nuclear arms race, for example, and the strategic fixation on differences in the number and type of nuclear warheads and delivery systems (the “missile gap”).  In the absence of such a metric for cyberwarfare, the two powers have no means of calibrating their activities, and there is no persistent pressure to match or surpass some specific capability the other side maintains.</p>
<p>Intelligence might give each power a rough indication of the other’s cyberwarfare capabilities, but it will be harder to come by than for other military operations.  Unlike other weapons systems, cyberwarfare does not require special installations or resources.  There are no centrifuge sites to inspect or uranium shipments to track – just talented programmers and generic computer hardware.</p>
<p>A related issue is that a successful arms control agreement on cyberwarfare would require monitoring and enforcement provisions (“trust but verify”).  But, as discussed above, intelligence on cyberwar capabilities will be harder to come by than for other weapons systems.  The <a href="http://www.state.gov/t/isn/4718.htm">Biological Weapons Convention</a> is illustrative of how ineffective an arms control treaty may be without effective monitoring: until <a href="http://www.nytimes.com/2001/11/23/world/v-pasechnik-64-is-dead-germ-expert-who-defected.html">a 1989 defection</a> the West was unaware of the scope of the Soviet Union’s secret biological weapons program.</p>
<p>Supposing, arguendo, that cyberwarfare capabilities did form an avoidable security dilemma, the negative results that make a security dilemma worth avoiding – excessive expenditures and destabilization – do not arise.</p>
<p>Cyberwarfare is cheap.  Developing the F-22 aircraft, for example, <a href="http://www.nytimes.com/2008/12/10/us/politics/10jets.html">cost roughly $65 billion</a>; the annual Air Force cyberspace budget, on the other hand, appears to be in the low billions and consists primarily of personnel and basing expenditures (<a href="http://www.stratcom.mil/news/article/53/finalists_announced_for_cyber_command_hq/">Strategic Command Press Release</a>; <a href="http://www.saffm.hq.af.mil/shared/media/document/AFD-090511-058.pdf">FY2010 budget</a>).</p>
<p>As for destabilization, there is minimal marginal strategic gain from cyberwarfare capabilities.  In the Cold War nuclear arms race there was a perception that if the other side achieved even a slight advantage the bipolar strategic equilibrium would collapse.  Cyberwarfare is neither perceived to be – nor is it, in actuality – so effective on the margin.  While specific capabilities are not public, it is difficult to imagine cyberattacks will be consistently more effective than conventional strikes.  Moreover, given the United States’ enormous strategic advantages on the whole, even significant marginal strategic gains would do little to tip the balance of power toward Russia.</p>
<p>With the alleged Russian rationale for talks deconstructed, the next post in this series will explore alternative viable Russian rationales.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2010/01/11/cyber-detente-part-i-a-security-dilemma/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Search Neutrality ≠ Net Neutrality</title>
		<link>http://webpolicy.org/2009/12/30/search-neutrality-%e2%89%a0-net-neutrality/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=search-neutrality-%25e2%2589%25a0-net-neutrality</link>
		<comments>http://webpolicy.org/2009/12/30/search-neutrality-%e2%89%a0-net-neutrality/#comments</comments>
		<pubDate>Wed, 30 Dec 2009 18:14:01 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Net Neutrality]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=29</guid>
		<description><![CDATA[Original at Freedom to Tinker. Sunday’s New York Times featured a provocative op-ed arguing that in addition to regulating “net neutrality” the FCC should also effectuate “search neutrality” &#8211; requiring that search providers rank results without consideration of business entities. The author heaps particular scorn upon Google for promoting its own context-relevant services (e.g. maps and weather) [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at <a href="https://freedom-to-tinker.com/blog/jrmayer/search-neutrality-%E2%89%A0-net-neutrality">Freedom to Tinker</a>.</em></p>
<p>Sunday’s New York Times featured <a href="http://www.nytimes.com/2009/12/28/opinion/28raff.html">a provocative op-ed</a> arguing that in addition to regulating “net neutrality” the FCC should also effectuate “search neutrality” &#8211; requiring that search providers rank results without consideration of business entities.  The author heaps particular scorn upon Google for promoting its own context-relevant services (e.g. maps and weather) at the fore of search results.  Others have already <a href="http://balkin.blogspot.com/2009/12/dilemmas-of-domination-google-faces.html">reviewed the proposal</a>, <a href="http://madisonian.net/2009/12/28/there-is-no-search-engine-neutrality/">leveled implementation critiques</a>, and <a href="http://econsultancy.com/blog/4456-foundem-vs-google-a-case-study-in-seo-fail">criticized the author’s gripes with his own site</a>.  My aim here is to rebut the piece’s core argument: the analogy of search neutrality to net neutrality.  Clearly both are debates about the promotion of innovation and competition through a level playing field.  But beyond this commonality the parallel breaks down.</p>
<p>Net neutrality advocates call for regulation because ISP discrimination could render innovative services either impossible to implement owing to traffic restrictions or too expensive to deploy owing to traffic pricing.  Consumers cannot “vote with their dollars” for a nondiscriminatory ISP since most locales have few providers and the market is hard to break into.  Violations of net neutrality, the argument goes, threaten to nip entire industries in the bud and rob the economy of growth.</p>
<p>Violations of search neutrality, on the other hand, at most increase marketing costs for an innovative or competitive offering.  Consumers are more than clever enough to seek and use an alternative to a weaker Google offering (Yelp vs. Google restaurant reviews, anyone?).  The author of the op-ed cites Google Maps’ dethroning of MapQuest as evidence of the power of search non-neutrality; on the contrary, I would contend users flocked to Google’s service because it was, well, better.  If Google Maps featured MapQuest’s clunky interface and vice versa, would you use it?  A glance at <a href="http://weblogs.hitwise.com/us-heather-hopkins/2008/01/google_maps_making_inroads_aga.html">historical map site statistics</a> empirically rebuts the author’s claim.  The mid-May 2007 introduction of Google&#8217;s context-relevant (“universal”) search does not appear to be correlated with any irregular shift in map site traffic.</p>
<p>Moreover, unlike with net neutrality, search consumers stand ready to “vote with their [ad] dollars.”  Should Google consistently favor its own services to the detriment of search result quality, consumers can effortlessly shift to any of its numerous competitors.  It is no coincidence Google <a href="http://googleblog.blogspot.com/2008/05/introduction-to-google-search-quality.html">sinks enormous manpower</a> into improving result quality.</p>
<p>There may also be a benefit to the increase in marketing costs from existing violations of search neutrality, like Google’s map and weather offerings.  If a service would have to be extensively marketed to compete with Google’s promoted offering &#8211; say, a current weather site vs. searching for “Stanford weather” &#8211; the market is sending a signal that consumers <i>don’t care</i> about the marginal quality of the product, and the non-Google provider should quit the market.</p>
<p>There is merit to the observation that violations of search neutrality are, on the margin, slightly anti-competitive.  But this issue is dwarfed by the potential economy-scale implications of net neutrality.  The FCC should not deviate in its rulemaking.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2009/12/30/search-neutrality-%e2%89%a0-net-neutrality/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>There’s anonymity on the Internet. Get over it.</title>
		<link>http://webpolicy.org/2009/10/27/theres-anonymity-on-the-internet-get-over-it/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=theres-anonymity-on-the-internet-get-over-it</link>
		<comments>http://webpolicy.org/2009/10/27/theres-anonymity-on-the-internet-get-over-it/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 19:07:55 +0000</pubDate>
		<dc:creator>jonathan</dc:creator>
				<category><![CDATA[Anonymity]]></category>
		<category><![CDATA[Privacy]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://webpolicy.wordpress.com/?p=11</guid>
		<description><![CDATA[Original at Freedom to Tinker. In a recent interview prominent antivirus developer Eugene Kaspersky decried the role of anonymity in cybercrime. This is not a new claim – it is touched on in the Commission on Cybersecurity for the 44th Presidency Report and Cybersecurity Act of 2009, among others – but it misses the mark. [...]]]></description>
				<content:encoded><![CDATA[<p><em>Original at <a href="https://freedom-to-tinker.com/blog/jrmayer/there%E2%80%99s-anonymity-internet-get-over-it">Freedom to Tinker</a>.</em></p>
<p>In a <a href="http://www.zdnetasia.com/insight/security/0,39044829,62058697,00.htm">recent interview</a> prominent antivirus developer Eugene Kaspersky decried the role of anonymity in cybercrime.  This is not a new claim – it is touched on in the <a href="http://csis.org/files/media/csis/pubs/081208_securingcyberspace_44.pdf">Commission on Cybersecurity for the 44th Presidency Report</a> and <a href="http://cdt.org/security/CYBERSEC4.pdf">Cybersecurity Act of 2009</a>, among others – but it misses the mark.  <em>Any</em> Internet design would allow anonymity.  What renders our Internet vulnerable is primarily weakness of software security and authentication, not anonymity.</p>
<p><span id="more-11"></span></p>
<p>Consider a hypothetical of three Internet users: Alice, Bob, and Charlie.  If Alice wants to communicate anonymously with Charlie, she may relay her messages through Bob.  While Charlie knows Bob is an intermediary, Charlie does not know with whom he is ultimately communicating.  For even greater anonymity Alice can pass her messages through multiple Bobs, and by <a href="http://en.wikipedia.org/wiki/Onion_routing">applying cryptography</a> she can ensure no individual Bob can piece together that she is communicating with Charlie.  This basic approach to anonymity is remarkable in its independence of the Internet’s design: it only requires that some Bob(s) can and do run intermediary software.  Even on an Internet where users could verify each other’s identity this means of anonymity would remain viable.</p>
<p>The sad state of software security – the <a href="http://www.us-cert.gov/cas/bulletins/SB09-292.html">latest DHS weekly bulletin</a> alone identified over 40 “high severity” vulnerabilities – is what enables malicious users to exploit the Internet’s indelible capacity for anonymity.  Modifying the prior hypothetical, suppose Alice now wants to spam, phish, hack, or launch a denial of service (DoS) attack against Charlie.  After compromising Bob’s computer with malicious software (malware), Alice can send emails, host websites, and launch DoS attacks from it; Charlie knows Bob is apparently misbehaving, but has no means of discovering Alice’s role.  Nearly all spam, phishing, and DoS attacks are now perpetrated with networks of compromised computers like Bob’s (botnets).  According to a <a href="http://www.m86security.com/newsimages/trace/Marshal8e6_TRACE_Report_July_2009.pdf">July 2009 private sector report</a>, just five botnets were the source of nearly 75% of spam.  Worse yet, botnets are increasingly self-perpetuating: spam and phishing websites propagate malware that compromises new computers for the botnet.</p>
<p>Shortcomings in authentication, the means of proving one’s identity either when necessary or at all times, are a secondary contributor to the Internet’s ills.  Most applications rely on passwords, which are easily guessed or divulged through deception – the very mechanisms of most phishing and account hijacking.  There are <a href="http://en.wikipedia.org/wiki/Password-authenticated_key_agreement">potential technical solutions</a> that would enable a user to authenticate themselves without the risk of compromising accounts.  But any approach will be undermined by weaknesses in underlying software security when a malicious party can trivially compromise a user’s computer.</p>
<p>The policy community is already trending towards acceptance of Internet anonymity and refocusing on software security and authentication; the recent <a href="http://www.whitehouse.gov/assets/documents/Cyberspace_Policy_Review_final.pdf">White House Cyberspace Policy Review</a> in particular emphasizes both issues.  To the remaining unpersuaded, I can only offer at last a <a href="http://www.wired.com/politics/law/news/1999/01/17538">truism</a>: There’s anonymity on the Internet.  Get over it.</p>
]]></content:encoded>
			<wfw:commentRss>http://webpolicy.org/2009/10/27/theres-anonymity-on-the-internet-get-over-it/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
