
Do cookie-free analytics need cookie compliance popups?

Published Jun 09, 2022

It might sound like a bit of a weird ques­tion; if we don’t use cook­ies, surely there’s no way we’d be vi­o­lat­ing cookie laws. That’s why it was sur­pris­ing to me to learn that we ac­tu­ally do: if we want to be in com­pli­ance with the EU’s ePri­vacy Di­rec­tive (com­monly called the cookie law) with­out using cookie ban­ners, we’ll need to meet far stricter cri­te­ria than just “not using cook­ies”. Keep read­ing for a deep dive.

Tech­ni­cal back­ground

In re­cent years there has been an in­creased focus on on­line pri­vacy and how our data is han­dled when we browse the web. This comes in the wake of the mas­sively privacy-​invading track­ing done by com­pa­nies like Google and Face­book be­com­ing pub­lic knowl­edge, of EU leg­is­la­tion forc­ing com­pa­nies to get in­formed con­sent be­fore pro­cess­ing our data, and of bet­ter pri­vacy tools be­com­ing eas­ily avail­able in our browsers.

This has driven a heightened demand for “privacy-aware analytics”, a term reserved for analytics solutions that aim to anonymize the collected data enough that it is no longer considered private data. An early example is Fathom, but other self-hosted solutions like Plausible, Ackee, and Counter have also started popping up.

Privacy-​aware an­a­lyt­ics are faced with an ob­sta­cle: most an­a­lyt­ics re­quires some way to rec­og­nize re­peat vis­i­tors. Since HTTP is state­less, know­ing whether a user has vis­ited be­fore (to count how many unique vis­i­tors we’ve had), vis­ited an­other page (to cal­cu­late bounce rate), and when they left (to cal­cu­late av­er­age visit du­ra­tion) re­quires some kind of user track­ing. Tra­di­tion­ally this has been done by stor­ing a small token - in the form of a cookie, usu­ally - on the user’s de­vice, which their browser then at­taches to fu­ture re­quests. The an­a­lyt­ics back­end can then com­pare that to a list of pre­vi­ously seen to­kens, and thereby know whether the user is new or has vis­ited be­fore.
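
As a minimal sketch of the cookie-based approach described above, the following Python snippet issues a random token to first-time visitors and recognizes it on later requests. The cookie name `visitor_id`, the in-memory `seen_tokens` set, and the `handle_request` function are all hypothetical and chosen for illustration; a real backend would persist tokens and set cookie attributes such as expiry.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory store of tokens the backend has already issued.
seen_tokens = set()

COOKIE_NAME = "visitor_id"  # assumed name, not taken from any real product


def handle_request(cookie_header: str) -> tuple[bool, str]:
    """Return (is_repeat_visitor, set_cookie_value).

    `cookie_header` is the raw value of the request's Cookie header
    (an empty string when the browser sent none).
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header)

    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value in seen_tokens:
        # The browser attached a token we issued earlier: repeat visitor.
        return True, ""

    # New visitor: mint a random token and ask the browser to store it.
    token = secrets.token_urlsafe(16)
    seen_tokens.add(token)
    response_cookie = SimpleCookie()
    response_cookie[COOKIE_NAME] = token
    return False, response_cookie[COOKIE_NAME].OutputString()
```

The first call with an empty header reports a new visitor and returns a `visitor_id=...` value for the `Set-Cookie` header; passing that value back as the `Cookie` header on a later call reports a repeat visitor.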

Cookie-free analytics work slightly differently. Rather than storing a small token on the user’s device, the backend calculates a unique token from the user’s available data when the request comes in. This process - known as fingerprinting - allows the analytics solution to skirt any legislation that limits when data can be stored on a user’s device, and often works just as well as cookie-based tokens. Privacy-aware analytics usually combine this concept with a significant focus on privacy and extend it with things like hashing, rotating salts, and limited fingerprint data (for more information see Fathom’s great explainer on their algorithm). This generally produces what most people would consider anonymized data: in almost all situations it is impossible to trace any data back to the individual user, especially once the salt has been rotated.

While this im­ple­men­ta­tion is enough to sat­isfy even me when it comes to re­spect­ing users’ pri­vacy, it also causes a com­mon mis­con­cep­tion: namely that since it is a privacy-​aware an­a­lyt­ics so­lu­tion that doesn’t use cook­ies, we don’t have to ask users for con­sent. To work out whether that’s true, we’ll have to look at why we ask for con­sent in the first place.

Note: I am not a lawyer. The following is my simplistic understanding of the relevant legislation. Do not use it as legal advice.

When we’re talk­ing about an­a­lyt­ics there are pri­mar­ily two dif­fer­ent EU laws we’ll need to cover: the big bad­die, GDPR, and the older cookie law, the ePri­vacy Di­rec­tive. These two both re­late to data pri­vacy and are often con­fused, so here’s a quick break­down:

GDPR [Regulation (EU) 2016/679]
Regulation adopted in 2016, applicable since May 2018.
Huge law primarily addressing the right of an individual to control their personal data. Trivial examples include names, birthdays, and emails; more complicated examples include browsing history, timestamps, and preferences. If there is any theoretical way that data can be traced back to a person, that data is probably covered by GDPR.
GDPR is a huge beast and one that I am far from competent enough to cover. I will be assuming that the analytics solution you're looking to implement already strongly anonymizes all data, and therefore won't cover GDPR further.
ePrivacy Directive [Directive 2002/58/EC]
Directive from 2002, later amended in 2009.
One of the early attempts at legislating online privacy. Of interest to us is its regulations on what data can be processed for what uses without requiring informed consent from the user.
Enforces privacy of any data read from a user's device, personal or not. This will be especially important for us, as it limits us greatly in what data we can use, even if we anonymize it heavily.
Note that this is a directive, not a regulation, meaning it is up to the individual EU countries to implement the directive into law. This distinction isn't that important for our uses, and I will only be considering the wording of the directive itself in this article.

We’ll be fo­cus­ing our ef­forts on the cookie law since the con­sent re­quired by GDPR can be largely skirted around by anonymiz­ing data prop­erly.

Two common misconceptions about the cookie law are that 1) it only applies to data stored in cookies, and 2) it only applies to personal data.
As put in Ar­ti­cle 5(3) of the ePri­vacy Di­rec­tive:

[…] the stor­ing of in­for­ma­tion, or the gain­ing of ac­cess to in­for­ma­tion al­ready stored, in the ter­mi­nal equip­ment of a sub­scriber or user is only al­lowed on con­di­tion that the sub­scriber or user con­cerned has given his or her con­sent […]

This cov­ers in­for­ma­tion stored in cook­ies, in­for­ma­tion stored in local stor­age, and even (as we shall see later) in­for­ma­tion pre­vi­ously stored on the de­vice by oth­ers, such as browser user agents. We there­fore can’t skirt the leg­is­la­tion by sim­ply using other stor­age mech­a­nisms, de­spite the un­of­fi­cial “cookie law” mis­nomer.
The di­rec­tive also doesn’t have any ex­cep­tions for anonymized data. As put by the EU Data Pro­tec­tion Work­ing Party in a 2014 opin­ion on anonymiza­tion, “anonymi­sa­tion is a tech­nique ap­plied to per­sonal data in order to achieve ir­re­versible de-​identification. There­fore, the start­ing as­sump­tion is that the per­sonal data must have been col­lected and processed in com­pli­ance with the ap­plic­a­ble leg­is­la­tion on the re­ten­tion of data in an iden­ti­fi­able for­mat.” This means that the sim­ple act of ac­cess­ing the data in the first place is re­stricted.

If we look at the algorithm used by Fathom, we see that it collects the following information for fingerprinting: IP address, user agent, site domain name, and site ID. Two of these can be accessed without relying on the user’s device: the site domain and the site ID. The IP address and user agent, as we’ll see below, are explicitly covered by the ePrivacy Directive, and therefore require user consent to access. This means that even if we use Fathom analytics, despite its cookie-free and privacy-friendly design, we’ll still need a cookie banner to get user consent.

A sim­i­lar story is true for Plau­si­ble’s al­go­rithm. It col­lects the fol­low­ing in­for­ma­tion: IP ad­dress, user agent, page URL, and re­fer­rer. Here the IP ad­dress, user agent, and re­fer­rer re­quire con­sent ac­cord­ing to the ePri­vacy Di­rec­tive, so we are in ex­actly the same boat as with Fathom: privacy-​aware and cookie-​free or not, a cookie popup is legally re­quired. Even the peo­ple be­hind Plau­si­ble seem to be con­fused here, using (at the time of this writ­ing) “No need for cookie ban­ners or GDPR con­sent” as one of the ti­tles on their land­ing page.

Every other privacy-​aware an­a­lyt­ics so­lu­tion I’ve found, in­clud­ing Ackee and Counter, also uses data cov­ered by the ePri­vacy Di­rec­tive.

Let’s have a look at the var­i­ous data points we can use to iden­tify re­peat vis­its, and what the ePri­vacy Di­rec­tive says about them:

IP address or other traffic data
Covered by the Directive under traffic data:
(26) The data [...] processed within electronic communications networks to establish connections and to transmit information contain information on the private life of natural persons [...]. Such data may only be stored to the extent that is necessary for the provision of the service for the purpose of billing and for interconnection payments, and for a limited time. Any further processing of such data [...] may only be allowed if the subscriber has agreed to this on the basis of accurate and full information given by the provider [...]. Traffic data used for marketing communications services or for the provision of value added services should also be erased or made anonymous after the provision of the service. Service providers should always keep subscribers informed of the types of data they are processing and the purposes and duration for which this is done.

Not only does this require us to get consent before processing a user’s IP address, but it also requires us to immediately anonymize it (if we weren’t already). This covers all traffic data regardless of how it is accessed. You can of course still use the IP address for traffic purposes, but anything else requires informed consent.

Since things like GeoIP lookups use the IP ad­dress and there­fore re­quire ac­cess­ing it first, using a user’s coun­try or city is also cov­ered.
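
One common way to anonymize an IP address is to truncate it, so that only the network part survives. The sketch below does this with Python's standard `ipaddress` module. The prefix lengths (/24 for IPv4, /48 for IPv6) and the function name `truncate_ip` are assumptions for illustration, and whether truncation is sufficient anonymization in a legal sense is not something this snippet can settle. It also does not change the consent requirement described above, since the address still has to be read before it can be truncated.

```python
import ipaddress


def truncate_ip(address: str) -> str:
    """Zero out the host part of an IP address.

    Assumed prefix lengths: /24 for IPv4 and /48 for IPv6.
    """
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)
```

With these prefix lengths, `truncate_ip("203.0.113.77")` returns `"203.0.113.0"`, and `truncate_ip("2001:db8:abcd:1234::1")` returns `"2001:db8:abcd::"`.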

Screen size, browser version and other client data
Covered by the Directive as data that has previously been stored by the browser vendor or device manufacturer, which we then access. As put by the EU Data Protection Working Party in a 2014 opinion on fingerprinting:
Information that is stored by one party (including information stored by the user or device manufacturer) which is later accessed by another party is therefore within the scope of Article 5(3). [...] The consent requirement also applies when a read-only value is accessed (e.g. requesting the MAC address of a network interface via the OS API).

In the same opin­ion, they also cover the act of fin­ger­print­ing using HTTP head­ers, using the User-​Agent header as an ex­am­ple, but do not specif­i­cally men­tion it as re­quir­ing in­formed con­sent.

User-Agent, Referer, and other HTTP headers
It could be argued that reading these headers isn't gaining access to data stored on the user's terminal equipment. However, these headers are specifically designed to allow tailoring of a response to the requester, and are sent for the purpose of allowing the server to transmit the correct data (as described in the specification).

Since the ePri­vacy Di­rec­tive de­fines traf­fic data as being “any data processed for the pur­pose of the con­veyance of a com­mu­ni­ca­tion on an elec­tronic com­mu­ni­ca­tions net­work or for the billing thereof”, this means these HTTP head­ers also re­quire in­formed con­sent.

Any data stored by third parties (or us)
This is trivially covered by Article 5(3) of the Directive, which we have covered already:
[...] the storing of information, or the gaining of access to information already stored, in the terminal equipment of a subscriber or user is only allowed on condition that the subscriber or user concerned has given his or her consent [...]

As pointed out by the Work­ing Party in the pre­vi­ously quoted opin­ion on fin­ger­print­ing, this in­cludes in­for­ma­tion stored by the user them­selves or by de­vice man­u­fac­tur­ers - not to men­tion third par­ties. It blocks es­sen­tially all di­rect track­ing, as it is in­tended to.

This leaves us with very few, if any, data points left to use for fin­ger­print­ing with­out ask­ing the user for in­formed con­sent. While this is cer­tainly a win for pri­vacy when it comes to com­pa­nies that are in­ter­ested in find­ing loop­holes that let them track users, it does ex­tin­guish any hope we have of es­cap­ing the popup night­mare the web has be­come today.

So what can we use cook­ies for?

Just because we cannot track users using any sort of data without consent doesn’t mean every use of cookies needs consent. As put by Article 5(3): “[the restrictions] shall not prevent any technical storage or access for the sole purpose of carrying out the transmission of a communication over an electronic communications network, or as strictly necessary in order for the provider of an information society service explicitly requested by the subscriber or user to provide the service.”

The first part, using stored or accessed data for the sole purpose of transmission, allows us to store whatever we need in order to route information to the correct place (e.g. an assigned server when using load balancing). The second part, using stored or accessed data to service an explicit request by the user, allows us to use cookies for many of the common website tasks we have, such as tracking whether a user is logged in, what they have in their cart on e-commerce sites, and how far they’ve gotten in questionnaires.

For each of those cases it is im­por­tant to no­tice that we can­not use the in­for­ma­tion for any­thing other than the le­git­i­mate use out­lined above: just be­cause we use a cookie to track whether a user is logged in doesn’t give us a free pass to also use that cookie for track­ing pur­poses.
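
The two exemptions and the purpose limitation can be modelled as a simple check keyed on why a piece of data is stored or read, not on which cookie it lives in. The following Python sketch is a deliberately simplified model of the reading given in this article; the `Purpose` categories and the `may_use` function are hypothetical names introduced for illustration.

```python
from enum import Enum


class Purpose(Enum):
    TRANSMISSION = "transmission"      # e.g. load-balancer routing
    USER_REQUESTED = "user_requested"  # e.g. login session, shopping cart
    ANALYTICS = "analytics"            # visitor statistics, tracking


# Purposes assumed to fall under the two exemptions discussed above.
EXEMPT_PURPOSES = {Purpose.TRANSMISSION, Purpose.USER_REQUESTED}


def may_use(purpose: Purpose, has_consent: bool) -> bool:
    """Decide per purpose, so an exempt cookie cannot be reused for tracking."""
    return purpose in EXEMPT_PURPOSES or has_consent
```

Because the check is made per purpose, reading a login cookie to keep the user signed in passes without consent, while reading that same cookie for analytics only passes once `has_consent` is true.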

If you are interested in a more in-depth analysis of when these exemptions apply, the EU Data Protection Working Party released an opinion on exemptions in 2012 (with examples) where they delve deeper into it. I can highly recommend reading it. It even includes a specific analysis of first-party anonymized analytics, concluding that “While they are often considered as a ‘strictly necessary’ tool for website operators, they are not strictly necessary to provide a functionality explicitly requested by the user […]. As a consequence, these cookies do not fall under the [exemptions]” (although the Working Party recommends that if Article 5(3) is revisited, an exemption for them could be granted - this has not happened at the time of writing).

Con­clu­sion

Although it might come as a surprise (it certainly did for me), privacy-aware and cookie-free analytics aren’t going to save us from cookie banners with the way the legislation is currently written. If you are interested in avoiding cookie popups on your website, there unfortunately aren’t any useful analytics solutions that will legally allow that. Whether that is good or bad certainly depends on perspective: user privacy has taken huge strides, and its protection is important. At the same time, even the EU Data Protection Working Party has conceded that first-party anonymized analytics might be worth an exemption. But since this exemption has not yet been implemented, there is only one conclusion possible:

Like it or not, the cookie ban­ners are here to stay.