Co-authored by Arvind Narayanan.
Measurement is central to online advertising: it facilitates billing, performance measurement, targeting decisions, spending allocation, and more. In a pair of earlier posts we explained how advertisement frequency capping and behavioral targeting are achievable without compiling a user’s browsing history. This post similarly proposes practical, privacy-improved approaches to advertising measurement.
There are, broadly, three advertising events that might require measurement.
- Impression. An advertisement is displayed to a user. Measured details might include the webpage the ad was served on, the time the ad was served, the user’s location, and the user’s browser.[1]
- Click. The user clicks the ad.
- Action. After viewing the ad, the user later does something on a different webpage. For example, the user might buy the product or service that was advertised. An action may occur days or weeks after an impression.
Sometimes advertisers pay per impression (“CPM” billing), sometimes per click (“CPC”), sometimes per action (“CPA”), and sometimes for a combination of these events (“hybrid”).
The following sections explain how to conduct a privacy-improved measurement of each advertising event.
Impression
When a user’s browser loads an advertisement, the company serving the ad ordinarily learns the URL of the current webpage,[2] as well as the user’s IP address and `User-Agent` string.[3] Practical, privacy-improved impression measurement is a granularity problem: How can a website generalize the impression data it collects without substantially compromising the utility of that data?
Requirements will undoubtedly vary by service. We present here a rough design spectrum of the information that a third-party website might retain.[4]
| Information | Current Approach | Better Approach | Even Better Approach |
|---|---|---|---|
| Webpage | URL | Fully qualified domain name | Public suffix + 1 |
| Time | Precise timestamp | Day | Week |
| User Location | IP address | Truncated IP address | Coarse geolocation |
| Browser | `User-Agent` string | Browser/OS major versions | Browser/OS |
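As a rough illustration of the "Better Approach" column, the sketch below generalizes an impression before it is retained. The names `GeneralizedImpression`, `truncate_ip`, and `generalize_impression` are hypothetical, and the truncation widths (/24 for IPv4, /48 for IPv6) are assumptions rather than recommendations from this post. Reducing a hostname to public suffix + 1 would additionally require the Public Suffix List, and `User-Agent` parsing is omitted.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from ipaddress import ip_address, ip_network
from urllib.parse import urlsplit


@dataclass(frozen=True)
class GeneralizedImpression:
    site: str     # fully qualified domain name instead of the full URL
    day: str      # ISO date (UTC) instead of a precise timestamp
    network: str  # truncated IP address, written as a CIDR network


def truncate_ip(raw_ip: str, v4_bits: int = 24, v6_bits: int = 48) -> str:
    """Zero the low-order bits of an address; the prefix lengths are assumptions."""
    addr = ip_address(raw_ip)
    bits = v4_bits if addr.version == 4 else v6_bits
    return str(ip_network(f"{addr}/{bits}", strict=False))


def generalize_impression(url: str, served_at: datetime, raw_ip: str) -> GeneralizedImpression:
    """Coarsen one impression; served_at is expected to be timezone-aware."""
    host = urlsplit(url).hostname or ""
    day = served_at.astimezone(timezone.utc).date().isoformat()
    return GeneralizedImpression(site=host, day=day, network=truncate_ip(raw_ip))
```

Only the generalized record would be written to a log; the raw URL, timestamp, and IP address would be discarded once the ad request has been served.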
Click
Click measurement is identical to impression measurement, plus one extra piece of information: whether the user clicked the ad.[5]
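To make this concrete, a click can be modeled as a generalized impression record carrying one additional boolean. The sketch below is illustrative only: `AdEvent` and `tally` are hypothetical names, and the choice of (site, day) as the aggregation key is an assumption.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class AdEvent(NamedTuple):
    site: str      # generalized webpage, e.g. a fully qualified domain name
    day: str       # generalized time, e.g. an ISO date
    clicked: bool  # the one extra field that click measurement adds


def tally(events: Iterable[AdEvent]) -> Tuple[Counter, Counter]:
    """Count impressions and clicks per (site, day) bucket."""
    impressions: Counter = Counter()
    clicks: Counter = Counter()
    for event in events:
        key = (event.site, event.day)
        impressions[key] += 1
        if event.clicked:
            clicks[key] += 1
    return impressions, clicks
```

Aggregate counts like these would plausibly suffice for per-impression and per-click billing without retaining any per-user browsing history.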
Action
Action measurement is a more difficult engineering problem. An ad impression is, for measurement purposes, a one-shot event: it occurs within the context of a single webpage. Measuring an action, on the other hand, requires linking an ad impression on one webpage with a subsequent action on another webpage.[6]
A pairing of client-side storage and selective information disclosure can enable privacy-improved action measurement, much like our previous approaches to frequency capping and behavioral targeting. When an ad is displayed, information about the impression can be stored in the browser. If the user later completes an action, the ad company can query the browser for relevant impression information.
Implementing a prototype of our action measurement algorithm was straightforward using HTML 5 local storage. Source is available on GitHub. Performance is a non-issue, as with our prototypes for frequency capping and behavioral targeting.
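The following is not the prototype mentioned above but a minimal sketch of the general idea, written in Python for readability. A plain dictionary stands in for the browser's local storage, and every name here (`STORAGE_KEY`, `record_impression`, `disclose_for_action`, the campaign identifiers) is hypothetical. It assumes the ad company asks about one campaign at a time and that only the most recent matching impression is disclosed.

```python
import json
from typing import Dict, List, Optional

STORAGE_KEY = "ad_impressions"  # hypothetical storage key


def record_impression(storage: Dict[str, str], campaign: str, site: str, day: str) -> None:
    """Runs when an ad is displayed: append a coarse record to client-side storage."""
    records: List[dict] = json.loads(storage.get(STORAGE_KEY, "[]"))
    records.append({"campaign": campaign, "site": site, "day": day})
    storage[STORAGE_KEY] = json.dumps(records)


def disclose_for_action(storage: Dict[str, str], campaign: str) -> Optional[dict]:
    """Runs on the action page: reveal only the latest impression for one campaign."""
    records: List[dict] = json.loads(storage.get(STORAGE_KEY, "[]"))
    matches = [r for r in records if r["campaign"] == campaign]
    if not matches:
        return None  # nothing is disclosed about other campaigns
    latest = matches[-1]
    return {"campaign": campaign, "site": latest["site"], "day": latest["day"]}
```

In this sketch, impressions for unrelated campaigns never leave the client: the ad company learns only whether the queried campaign was seen, and where and when at coarse granularity.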
1. Many advertising companies also record whether this was a first-time (“unique”) impression. Our algorithm for frequency capping can be trivially modified to provide this functionality.
2. Third-party websites usually learn the first-party webpage URL from a `Referer` header or explicit `Request-URI` parameter. There are some methods for a first-party webpage to hide its URL from third-party content, including `iframe` sandboxing and the HTML 5 `noreferrer` link annotation. For the moment, these techniques are not sufficiently simple, comprehensive, or supported to anticipate widespread use.
3. A semi-trusted intermediary or anonymizing network could conceal or generalize a user’s IP address, `User-Agent` string, and other information. See Adnostic and Privad for examples. These approaches are, at present, not practical for broad deployment.
4. Present Do Not Track proposals diverge on impression measurement; some would allow current approaches to continue, while others would require quickly generalizing impression data.
5. Do Not Track would allow the current approach to click measurement since the user has (somewhat constructively) interacted with content from the advertising company.
6. Because of this property, some Do Not Track proposals would require a privacy-improved approach to action measurement.