Do cookie-free analytics need cookie banners?
Published Jun 09, 2022 — last updated Jan 20, 2025
In recent years there’s been a bit of a push for “privacy-aware analytics”, a group of web analytics solutions that claim to do analytics without any of the morally dubious tracking the industry has otherwise been known for. In addition to putting a lot of effort into not tracking individual users, their primary claim to fame is that they say they don’t need cookie banners.
But… how true is that? While I can be convinced about the privacy aspect of it, the legal aspect seems a bit less well-argued. EU laws are complex and far-reaching, and it wasn’t entirely clear to me that you could skirt them like this.
As just some dude with no legal background but with way too much time on my hands, I figured I’d be as good a candidate as any to dive into EU laws on “cookie banners”, and see if you can actually do analytics without them.
The technical background
All web analytics need to somehow track individual users. Since HTTP is a stateless protocol, it’s up to the analytics solution to correlate HTTP requests from the same user so it can count unique visitors (have we seen this user before?) and calculate bounce rates (is the user that just requested /about the same user as the one that requested /home 3 minutes ago?).
Traditionally this has been done by storing a small token on the user’s device which their browser then attaches to future requests: a cookie. The analytics backend can then compare that to a list of previously seen tokens, and thereby know whether the user is new or has visited before.
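To make that mechanism concrete, here is a minimal sketch of the scheme in Python. The cookie name visitor_id and the in-memory set are illustrative choices of mine, not any particular vendor’s implementation:

```python
import secrets

# In practice this would be a database; a set suffices for the sketch.
seen_tokens: set[str] = set()

def handle_request(request_cookies: dict[str, str]) -> dict[str, str]:
    """Count the visitor and return the cookies to set on the response."""
    token = request_cookies.get("visitor_id")
    if token is None or token not in seen_tokens:
        # No recognized token: a new unique visitor. Mint a token for them,
        # and remember it so later requests can be matched up.
        token = secrets.token_hex(16)
        seen_tokens.add(token)
    # Setting the cookie asks the browser to store the token and attach it to
    # every future request: exactly the "storing of information" on terminal
    # equipment that the ePrivacy Directive regulates.
    return {"visitor_id": token}

cookies = handle_request({})  # first visit: a fresh token is issued
handle_request(cookies)       # later visit: recognized, not counted again
print(len(seen_tokens))       # -> 1 unique visitor
```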
This is how the web has worked for decades at this point. Unfortunately it has also been abused for roughly as long, causing widespread tracking that the EU has tried to crack down on.
Most notably they’ve done this by passing the ePrivacy Directive, a directive from 2002 that (amongst other things) prohibits storing cookies on end-user devices without consent (unless strictly required for functionality), and later the GDPR that requires consent for processing personal information (see the next section for more details).
Some crafty souls have come up with a way to skirt this requirement: “cookie-free analytics”, a.k.a. fingerprinting. Instead of storing a small token on the user’s device, the backend calculates a unique token when the request comes in, based on the user’s available data (e.g. their browser version and IP address).
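Schematically, the derivation looks something like this minimal sketch; the exact inputs, separators, and encoding vary per vendor:

```python
import hashlib

def fingerprint(ip: str, user_agent: str) -> str:
    """Derive a stable visitor ID from request data; nothing is stored on the device."""
    return hashlib.sha256(f"{ip}|{user_agent}".encode()).hexdigest()

# Two requests from the same browser and IP address yield the same ID:
fingerprint("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64) ...")
```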
In many ways this fingerprinting approach can be used for far worse things than cookies can (it’s a lot harder to switch browsers than it is to delete a cookie), but it does seem like a promising way to avoid the EU’s consent requirements. There’s still the problem of the GDPR, but we can largely get around that by strongly anonymizing the gathered data using binning, differential privacy, or similar.
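To give a flavour of what “strongly anonymizing” can mean, here are illustrative sketches of two such techniques; real solutions layer several of these and tune the parameters far more carefully:

```python
import ipaddress
import math
import random

def bin_ip(ip: str) -> str:
    """Binning: coarsen an IPv4 address to its /16 network, discarding host bits."""
    return str(ipaddress.ip_network(f"{ip}/16", strict=False).network_address)

def noisy_count(true_count: int, epsilon: float = 1.0) -> float:
    """Differential privacy: add Laplace noise to a count (sensitivity 1)."""
    u = random.uniform(-0.5, 0.5)
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

bin_ip("203.0.113.7")  # -> "203.0.0.0": whole neighbourhoods share one value
noisy_count(1234)      # -> roughly 1234, give or take some noise
```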
This is the approach taken by companies delivering “privacy-aware analytics”. One of the first was Fathom, but other (more or less) self-hosted solutions like Plausible, Ackee, and Counter have also started popping up. They all share the goal of not tracking individual users, while still gathering useful insights into how visitors interact with a website - and without using cookies.
At least Fathom and Plausible explicitly claim that you don’t need cookie banners if you use their products.
The legal background
Note: I am not a lawyer. The following is my simplistic understanding of the relevant legislation. Do not use it as legal advice.
When we’re talking about cookie banners there are primarily two different EU laws we’ll need to cover: the older cookie law, the ePrivacy Directive, and the big baddie, GDPR. These two both relate to data privacy and are often confused, so here’s a quick breakdown:
- ePrivacy Directive [Directive 2002/58/EC]
  Directive from 2002, later amended in 2009. One of the early attempts at legislating online privacy. Puts rather strict limits on what data can be accessed or processed without requiring informed consent from the user. This is regardless of whether the data is personal or not.
  Note that this is a directive, not a regulation, meaning it is up to the individual EU countries to implement the directive into law. We’ll arbitrarily ignore this distinction, and I will only be considering the wording of the directive itself in this article.
- GDPR [Regulation (EU) 2016/679]
  Regulation adopted in 2016, came into force in 2018. Huge law primarily addressing the right of an individual to control their personal data. Trivial examples include names, birthdays, and emails; more complicated examples include browsing history, timestamps, and preferences. If there is any theoretical way that data can be traced back to a person, that data is probably covered by GDPR.
GDPR is a huge beast and one that I am far from competent enough to cover. I will be assuming that the analytics solution you’re looking to implement already strongly anonymizes all data, since (I believe) that largely takes care of the consent GDPR requires; I therefore won’t cover GDPR further. Instead we’ll focus our efforts on the ePrivacy Directive.
The ePD Article 5(3)
Article 5(3) of the ePrivacy Directive, which is the article that requires consent for cookies, says:
Article 5(3). Member States shall ensure that the storing of information, or the gaining of access to information already stored, in the terminal equipment of a subscriber or user is only allowed on condition that the subscriber or user concerned has given his or her consent […] This shall not prevent any technical storage or access […] as strictly necessary in order for [a web service provider] explicitly requested by the subscriber or user to provide the service.
Considering how legalese this is, it isn’t that difficult to parse: it’s just saying that you can’t access any data that’s stored on terminal equipment (nor store any new data), unless
- you have consent, or
- it’s strictly necessary for fulfilling what the user asked for.
There are some details we’ll have to walk through though:
- What does “gaining of access” mean?
This actually isn’t well-defined in the ePD itself; instead, the EU Data Protection Working Party and the EDPB have tackled the problem in various opinions. The EDPB’s 2023 guidelines summarize it in paragraph 32:
Whenever an entity takes steps towards gaining access to information stored in the terminal equipment, Article 5(3) ePD would apply. Usually this entails the accessing entity to proactively send specific instructions to the terminal equipment in order to receive back the targeted information. For example, this is the case for cookies, where the accessing entity instructs the terminal equipment to proactively send information on each subsequent Hypertext Transfer Protocol (HTTP) call.
So essentially any kind of “take this information and send it to me” process would fall under “gaining of access” to said information. This counts even if the request is implicit, e.g. the browser is asked to send a new request which just so happens to come with additional data because of an agreed upon protocol (paragraph 34), and even if the information was stored by a different entity than the one accessing it (paragraph 31).
- What exactly is “terminal equipment”?
The ePD piggybacks on prior and rather extensive legislation for defining this. For our purposes it’s enough to know that both the user’s computer and each individual component of it is terminal equipment. There are some less well-defined details for when in the networking path things stop being terminal equipment (is the customer’s residential router still terminal equipment? the ISP’s NAT? the routing infrastructure of the internet at large?), but for those details you’ll have to look at section 2.3 of the EDPB 2023 guidelines.
- What information is considered “already stored”?
As noted in section 2.6 of the 2023 guidelines:
‘Stored information’ refers to information already existing on the terminal equipment, regardless of the source or nature of this information. This includes […] storage on the terminal equipment by the user or subscriber themselves, or by a hardware manufacturer (such as the MAC addresses of network interface controllers), sensors integrated into the terminal equipment or processes and programs executed on the terminal equipment, which may or may not produce information that is dependent on or derived from stored information.
This covers data explicitly stored in a file (e.g. cookies), data stored in memory by other parties (e.g. the browser’s User-Agent string), and even hardware information (e.g. MAC addresses or screen sizes). In essence, any and all persistent information from the terminal equipment is “stored” at some point, and so would require consent to access for purposes other than what is strictly necessary.
- What does “strictly necessary” mean?
According to a 2012 opinion on exemptions from the EU Data Protection Working Party, this means
[The access/storage] is strictly needed to enable the information society service: if [the access/storage] is disabled, the service will not work.
Some site owners argue that analytics are “strictly necessary” to deliver their website, so Art. 5(3) shouldn’t apply. This exact argument is addressed by the 2012 opinion:
While [cookie-based analytics] are often considered as a ‘strictly necessary’ tool for website operators, they are not strictly necessary to provide a functionality explicitly requested by the user […]. As a consequence, these cookies do not fall under the [exemption].
So in short: if you have a technical need to access the data to serve the user’s request, then you’re good without explicit consent. If it’s just because you want to, then that’s no good.
All in all, the most important part to remember is that Art. 5(3) blocks us from storing or accessing pretty much any data from the terminal equipment, unless it’s strictly required to fulfill the user’s request, or we have gathered informed consent.
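Condensed into code form, the decision boils down to something like this (a deliberately simplistic reading, of course; real legal analysis doesn’t reduce to a boolean):

```python
def epd_requires_consent(touches_terminal_data: bool, strictly_necessary: bool) -> bool:
    """Art. 5(3) in a nutshell: any storage on, or access to, the terminal
    equipment needs consent unless it is strictly necessary to provide the
    service the user explicitly requested."""
    return touches_terminal_data and not strictly_necessary
```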
Cookie-free doesn’t mean free of cookie banners
To see what all this means for cookie-free analytics, let’s have a look at the algorithm used by Fathom. It is essentially identical to that of Plausible, and presumably to the rest of the privacy-aware analytics solutions out there:
To [fingerprint the user], we combine the following data to create a unique hash for the user:
- A salt based on IP address & Site ID
- IP Address
- User-Agent
- Hostname (The domain of the website, e.g. https://usefathom.com)
- Site ID
We then take this data and perform a SHA256 hash on it.
If we boil it down to just the parts that are relevant for us, Fathom (and Plausible et al.) are generating a hash based on the IP address of the incoming request and the User-Agent HTTP header. They do this by including a bit of JavaScript on the page that sends a request to their backend.
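In code, that recipe amounts to something like the following; the concatenation order and the example values are my assumptions, since Fathom only documents the list of inputs:

```python
import hashlib

def fathom_style_id(salt: str, ip: str, user_agent: str,
                    hostname: str, site_id: str) -> str:
    """SHA-256 over the five documented inputs; the exact format is a guess."""
    material = "".join([salt, ip, user_agent, hostname, site_id])
    return hashlib.sha256(material.encode()).hexdigest()

fathom_style_id(
    salt="<salt derived from IP address & site ID>",  # per Fathom's description
    ip="203.0.113.7",
    user_agent="Mozilla/5.0 (X11; Linux x86_64) ...",
    hostname="https://usefathom.com",
    site_id="EXAMPLE",  # hypothetical site ID
)
```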
Unfortunately it is my belief that even this falls within the scope of ePD Art. 5(3), and thus requires consent:
- It’s accessing the User-Agent HTTP header, which is transmitted by the browser as part of an agreed upon protocol (specifically RFC 9110). This is done when the analytics solution explicitly asks the browser to send a new request. This is a clear case of “accessing” the User-Agent under the ePD.
- The User-Agent HTTP header is stored by the browser vendor (or is derived from information previously stored) on the end-user’s machine, before being transmitted to the analytics solution. It almost certainly falls under “information already stored”.
- It’s not strictly necessary in order to fulfill the user’s request to see the web page, so it’s not exempt.
The EDPB (unfortunately) seems to agree with that analysis. In paragraph 51 of their 2023 guidelines they say:
The addition of tracking information to URLs or images (pixels) sent to the user constitutes an instruction to the terminal equipment to send back the targeted information (the specified identifier). In the case of dynamically constructed tracking pixels, it is the distribution of the applicative logic (usually a JavaScript code) that constitutes the instruction. As a consequence, it can be considered that the collection of identifiers provided through such tracking mechanisms constitutes a ‘gaining of access’ in the meaning of Article 5(3) ePD, thus it applies to that step as well.
So sending out JavaScript code that instructs the terminal equipment to send back information is… accessing the information. It doesn’t matter what kind of hashing or anonymization the analytics engine is doing, neither before nor after the information leaves the terminal equipment: the simple act of accessing it is enough to require consent under the ePD.
Even worse, though luckily a bit less definitively, merely accessing the IP address might be a violation as well:
In [the context of IP-only tracking] Article 5(3) ePD could apply even though the instruction to make the IP available has been made by a different entity than the receiving one. However, gaining access to IP addresses would only trigger the application of Article 5(3) ePD in cases where this information originates from the terminal equipment of a subscriber or user […]. Unless the entity can ensure that the IP address does not originate from the terminal equipment of a user or subscriber, it has to take all the steps pursuant to the Article 5(3) ePD
All of this would apply even if the analytics solution fingerprinted the original request, instead of asking the browser to send a new one, as noted in EDPB 2023 paragraph 41:
Network communication usually relies on a layered model that necessitates the use of identifiers to allow for a proper establishment and carrying out of the communication. The communication of those identifiers to remote actors is instructed through software following agreed upon communication protocols. As outlined above, the fact that the receiving entity might not be the entity instructing the sending of information does not preclude the application of Article 5(3) ePD. This might concern routing identifiers such as the MAC or IP address of the terminal equipment, but also session identifiers (SSRC, Websocket identifier), or authentication tokens.
So what now?
I originally started exploring this topic because I wanted to create a privacy-aware analytics solution of my own (this was back in 2020, when the available options looked a little different). This was partly for fun, partly to see if I could design something that didn’t require cookie banners.
Unfortunately I came to the rather demotivating conclusion that there simply isn’t any way to implement web analytics without running afoul of the ePrivacy Directive.
This was a surprising conclusion at the time. Morally we can go very far: we can put a lot of smart stuff together and create a system that can’t be used to track individual users. But legally, that doesn’t particularly matter. The ePrivacy Directive is written as it is.
Even the EU Data Protection Working Party laments this. In their 2012 opinion they write:
the Working Party considers that first party analytics cookies are not likely to create a privacy risk when they are strictly limited to first party aggregated statistical purposes and when they are used by websites that already provide clear information about these cookies in their privacy policy as well as adequate privacy safeguards. […] In this regard, should article 5.3 of the Directive 2002/58/EC be re-visited in the future, the European legislator might appropriately add a third exemption criterion to consent for cookies that are strictly limited to first party anonymized and aggregated statistical purposes.
This “re-visiting” of the ePrivacy Directive is supposed to come eventually in the form of the ePrivacy Regulation. At the time of writing it has been delayed for 6 years and counting.
Until then we’re stuck with cookie banners. Or just ignoring the issue and doing it anyway, I guess.