Thirdparty: Difference between revisions

From MozillaWiki

Revision as of 17:37, 8 June 2010

Overview

Problem: Third party cookies and other browser fingerprinting techniques allow behavioral tracking, which is frequently undesirable to the user. However, disabling third party cookies by default (as attempted in the past) breaks too many legitimate cases.

Goals:

  • Stop behavioral tracking to the maximum extent possible, except where the user specifically wants it. (Note: this includes more than cookies.)
  • Allow common legitimate cases, such as Facebook Connect, OpenID, bank logins and such, to work as seamlessly as possible.
  • Make this the default setting in Firefox 4.

Use cases

Evil:

1. User visits multiple shopping sites, which have resources from ad sites embedded in iframes, images, or requests made directly from script. They do not want the advertiser to track their movements across those sites.
2. User visits a site that embeds Facebook Connect, but does not want their Facebook login cookies automatically sent.

Good:

3. User visits their credit union, which uses third party resources for banking functions, and wants those functions to work.
4. User visits a site that uses OpenID, Facebook Connect, or other federated login service, and wants to be able to log in to those services and use them with the site.

Proposal

Currently, cookies are keyed by the domain that set them (i.e. they are set for, and sent back to, that domain). Instead, double-key the cookies by (first party base domain, setting domain). A cookie is first party if the second key is derived from the first key, e.g. (google.com, mail.google.com); otherwise it is third party, e.g. (huffingtonpost.com, doubleclick.net).

Cookies are only sent back in situations where the double-keys are the same. For instance, when browsing buy.com, cookies set by an image hosted on ads.google.com would only be sent back when browsing buy.com; not when browsing another site.
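A minimal sketch of the double-keyed lookup described above may help. The class and method names are made up for illustration, and example.org stands in for "another site"; none of this is taken from the actual patch.

```python
from collections import defaultdict


class DoubleKeyedCookieJar:
    """Toy cookie store keyed by (first party base domain, setting domain)."""

    def __init__(self):
        # (first_party, setting_domain) -> {cookie name: value}
        self._store = defaultdict(dict)

    def set_cookie(self, first_party, setting_domain, name, value):
        self._store[(first_party, setting_domain)][name] = value

    def cookies_for(self, first_party, request_domain):
        # Only cookies stored under the identical double key are sent back.
        return dict(self._store.get((first_party, request_domain), {}))


jar = DoubleKeyedCookieJar()
# An image from ads.google.com sets a cookie while the user browses buy.com.
jar.set_cookie("buy.com", "ads.google.com", "id", "abc123")

on_buy = jar.cookies_for("buy.com", "ads.google.com")        # cookie is sent
on_other = jar.cookies_for("example.org", "ads.google.com")  # nothing is sent
```

The same ads.google.com host ends up with a separate cookie store per first party site, which is what blocks cross-site tracking.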

In addition, third party cookies are discarded after the session (i.e. on browser close). (This part may be non-default behavior; it does not necessarily strike a good balance wrt UX/privacy.)
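The session-only rule could look roughly like the sketch below. The Cookie type and end_session function are hypothetical, both domain fields are assumed to already be reduced to base domains, and each cookie's own expiry is ignored for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Cookie:
    first_party: str      # base domain shown in the urlbar
    setting_domain: str   # base domain of the host that set the cookie
    name: str
    value: str

    @property
    def is_third_party(self):
        # Simplification: both fields are assumed to already be base domains.
        return self.first_party != self.setting_domain


def end_session(cookies, third_party_session_only=True):
    """Return the cookies that survive a browser close."""
    if not third_party_session_only:
        return list(cookies)
    return [c for c in cookies if not c.is_third_party]


cookies = [
    Cookie("google.com", "google.com", "sid", "1"),
    Cookie("huffingtonpost.com", "doubleclick.net", "track", "2"),
]
survivors = end_session(cookies)
```

Passing third_party_session_only=False models the non-default case where third party cookies are allowed to persist.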

Definitions:

first party domain: the domain of the site that the user is browsing; specifically, what appears in the urlbar.
base domain: the registrable domain for a given site (the public suffix plus one more label), e.g. for mail.google.com the base domain would be google.com.
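To make the definitions concrete, here is a rough sketch of deriving a base domain and classifying a cookie as first or third party. It assumes the base domain is the public suffix plus one label. PUBLIC_SUFFIXES is a tiny hypothetical stand-in for a real public suffix list, and the function names are illustrative only.

```python
# Hypothetical, tiny stand-in for a real public suffix list.
PUBLIC_SUFFIXES = {"com", "net", "org", "co.uk"}


def base_domain(host):
    """Return the public suffix plus one label, e.g. mail.google.com -> google.com."""
    host = host.lower().rstrip(".")
    labels = host.split(".")
    # Try the longest candidate suffix first so "co.uk" wins over "uk".
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            return ".".join(labels[i - 1:])
    # No known suffix (e.g. "localhost"): fall back to the host itself.
    return host


def is_third_party(first_party_host, setting_host):
    """A cookie is third party when the two hosts have different base domains."""
    return base_domain(first_party_host) != base_domain(setting_host)
```

With this, (google.com, mail.google.com) is classified as first party and (huffingtonpost.com, doubleclick.net) as third party, matching the examples in the proposal.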

Discussion

This prevents automatic tracking by third parties across different sites (cases 1 and 2), since there's effectively a separate third party store per first party site. It also prevents automatic sending of session information in a third party context, even when the user has logged into that site as a first party, since the third party store is separate from the first party store for a given site.

It allows cases where temporary third party cookies are required on a given first party site (case 3), since those situations will have the same (first party, third party) key. Whether we limit third party cookie lifetime to session only will have no effect here.

It can also allow OpenID and Facebook Connect to work, with some additional user interaction. The (first party, third party) context will prevent an existing Facebook login (via facebook.com) from automatically carrying over to another site (huffingtonpost.com). However, the user can log in to Facebook from Huffington Post, and separately for each site that embeds Facebook, and things can work as usual. If we optionally limit third party cookie lifetime to session, this login will persist for the session only. If the user trusts Facebook, they can whitelist facebook.com (via the usual mechanism) to circumvent the double keying restrictions, so that their Facebook login carries over to all other sites.
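One way the whitelist could interact with double keying is sketched below. This is only a guess at the mechanics: the SHARED marker, the cookie_key function, and the whitelist parameter are all hypothetical, and the real whitelist mechanism is not specified here.

```python
SHARED = "*"  # hypothetical marker meaning "not partitioned by first party"


def cookie_key(first_party, setting_domain, whitelist=frozenset()):
    """Pick the storage key for a cookie under double keying."""
    if setting_domain in whitelist:
        # A whitelisted domain gets one store shared across all first parties.
        return (SHARED, setting_domain)
    return (first_party, setting_domain)


# Without a whitelist, the Facebook login is partitioned per first party.
direct = cookie_key("facebook.com", "facebook.com")
embedded = cookie_key("huffingtonpost.com", "facebook.com")

# With facebook.com whitelisted, both contexts share one key.
trusted = frozenset({"facebook.com"})
direct_trusted = cookie_key("facebook.com", "facebook.com", trusted)
embedded_trusted = cookie_key("huffingtonpost.com", "facebook.com", trusted)
```

Under these assumptions, direct and embedded differ, so a login made on facebook.com is not visible from huffingtonpost.com, while the two trusted keys are equal and the login carries over.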

The tricky part is defining in what cases the first party context should carry over. For instance, an iframe within a page has an obvious first party domain (the urlbar). What about a redirect (such as a clickthrough ad, or an OpenID login)? Since it's an obvious hole, we have to track first party context through redirects. (So going to digg.com --> redirect to clickthrough ad on ads.google.com --> click back to digg.com would maintain a first party context of digg.com throughout.) If we didn't, those clickthrough ads would be first parties, and could track the user across sites.

Facebook Connect uses a JS lightbox to throw the login dialog (http://wiki.developers.facebook.com/index.php/Authenticating_Users_with_Facebook_Connect). This counts as part of the page, rather than a popup window, and thus would be considered a third party. So double-keying would work fine here. Note that the embedder can specify they want to use a popup dialog instead, but let's say that's not the common case.

OpenID probably uses redirects in general (http://www.merchantos.com/makebeta/php/single-sign-on-with-openid-and-google-part-1/), though I'm not sure about provider specifics. If we track redirects and consider them third parties -- which would require some extra mechanics -- then this would work just fine. As in the digg.com example above, carrying the first party context through redirects also stops clickthrough ads from becoming first parties, so doing this is good all around.

Note that Opera does something interesting here: by default, they consider redirects to be "unverified transactions", which are considered third party. Link clicks are verified transactions -- first party. This is actually part of RFC 2965 (http://www.faqs.org/rfcs/rfc2965.html) section 3.3.6: "A transaction is verifiable if the user, or a user-designated agent, has the option to review the request-URI prior to its use in the transaction." In Opera, with "automatic redirection" turned off, I believe this means that redirects show a page which says "this is a redirect to http://foo.com, continue?" or somesuch. Clicking that link then makes the transaction verified, and the cookies are first party.

With that, I propose (where it is implied that the first party domain carries over, until reset):

1. Typing in the urlbar, loading bookmarks, other totally toplevel actions -- resets first party domain.
2. Link clicks (anchor tags with an href) -- resets (but I'm not sure about this yet).
3. Setting document.location -- carries over first party domain. (It's hard to distinguish a user-initiated action that results in a document.location change vs. an automated change. So we have to go with carrying over here.)
4. Redirects -- carries over.
5. Popup windows -- carries over.
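The five rules above can be sketched as a small state machine. The Nav enum, next_first_party, and replay are hypothetical names, and rule 2 is encoded as "resets" even though that choice is still tentative. Targets are assumed to already be base domains.

```python
from enum import Enum


class Nav(Enum):
    URLBAR = "urlbar"                        # typed URL, bookmark, other toplevel action
    LINK_CLICK = "link_click"
    DOCUMENT_LOCATION = "document_location"
    REDIRECT = "redirect"
    POPUP = "popup"


# Rules 1-2 reset the first party; rules 3-5 carry it over.
RESETS_FIRST_PARTY = {Nav.URLBAR, Nav.LINK_CLICK}


def next_first_party(current_first_party, nav, target_base_domain):
    """Return the first party base domain in effect after a navigation."""
    if nav in RESETS_FIRST_PARTY or current_first_party is None:
        return target_base_domain
    return current_first_party


def replay(navigations):
    """Apply a sequence of (Nav, target base domain) pairs from a fresh tab."""
    first_party = None
    for nav, target in navigations:
        first_party = next_first_party(first_party, nav, target)
    return first_party


# digg.com typed in the urlbar, then a redirect to a clickthrough ad host.
after_redirect = replay([(Nav.URLBAR, "digg.com"), (Nav.REDIRECT, "google.com")])
```

Here after_redirect stays "digg.com", so the ad host is still treated as a third party. Removing Nav.LINK_CLICK from RESETS_FIRST_PARTY would model the alternative where link clicks carry over.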

We might want to make link clicks carry over the first party. Rationale: a site that relies on an href click (to a third party) to perform a login operation, rather than using a redirect or document.location, needs that load to carry over the first party such that things work when redirected back. The downside is that in a long browsing session in a single tab, every site visited after the first will be considered third party, which allows behavioral tracking for the lifetime of that tab. Having it reset is probably a good tradeoff, since it's less surprising. But it would allow holes, e.g. where a site has a link targeted at ads.google.com which then redirects back to some content.

Implementation

Step 1: Make third party cookies persist for the session only, by default. (Can be disabled by a network.cookie.thirdparty.sessionOnly pref.) See bug 565475; patch up.

Step 2: Double-key cookies by (first party domain, setting domain). See bug 565965; patch in progress.

Step 3: Implement the first party carry-over rules described above, probably as a separate service such that localstorage etc. can use it.

Further Steps

Other services such as localstorage should use a set of policies consistent with the above.

Make the browser fingerprint more anonymous, by reducing the uniqueness of queryable information other than cookies. See Fingerprinting for details.