I built a privacy-friendly analytics project (https://github.com/sheshbabu/freshlytics) and I've been wondering how to correctly count unique visitors to a website. I don't store cookies or any PII at all, so by definition it's hard to tell whether two visits come from the same person or from different people.
An alternative approach is used by Simple Analytics (https://docs.simpleanalytics.com/uniques), where they use the referrer header to derive unique visits. They mention that they don't use IP addresses because they consider that fingerprinting.
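To make the referrer idea concrete, here is a minimal sketch of how I understand it, not Simple Analytics' actual code. The assumption is that a page view counts as a unique visit when the referrer is missing or points to a different hostname than the page being viewed. The function name `is_unique_visit` is made up for illustration.

```python
from typing import Optional
from urllib.parse import urlsplit


def is_unique_visit(page_url: str, referrer: Optional[str]) -> bool:
    """Guess whether a page view starts a new visit, using only the referrer.

    Assumption: an empty referrer or a referrer on another hostname means
    the visitor just arrived; a same-host referrer means internal navigation.
    """
    if not referrer:
        return True  # direct visit, bookmark, or referrer stripped by the browser
    page_host = urlsplit(page_url).hostname
    referrer_host = urlsplit(referrer).hostname
    return referrer_host != page_host
```

No identifier is stored at all here, but nothing ties two arrivals together either: with this sketch, someone who arrives from a search engine twice in one day counts twice.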
But it looks like a hash function (whose salt gets rotated daily) strikes a good balance between telling visitors apart and maintaining user privacy. Any downsides to this approach?
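For reference, a rough sketch of what a daily-salted hash could look like. Which inputs go into the hash is my assumption (site, IP address, and user agent here), and `salt_for` and `visitor_id` are hypothetical names, not taken from any of the projects mentioned.

```python
import hashlib
import hmac
import secrets
from datetime import date

# Hypothetical in-memory salt store holding only the current day's salt.
_salts: dict[date, bytes] = {}


def salt_for(day: date) -> bytes:
    """Return the salt for `day`, creating it on first use."""
    if day not in _salts:
        _salts.clear()  # drop older salts so old hashes cannot be recomputed
        _salts[day] = secrets.token_bytes(32)
    return _salts[day]


def visitor_id(ip: str, user_agent: str, site: str, day: date) -> str:
    """Keyed hash of the request attributes; the raw IP is never stored."""
    message = "\x00".join((site, ip, user_agent)).encode("utf-8")
    return hmac.new(salt_for(day), message, hashlib.sha256).hexdigest()
```

Two requests with the same inputs on the same day produce the same ID, so they can be deduplicated. Once the salt is thrown away, the ID can no longer be linked to the inputs or to the next day's ID.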
The downsides are basically that, in order to gain the benefits for user privacy and compliance with regulations, you lose a bit of accuracy depending on the situation. We cannot see whether the same person returns to a site on a different day, so we count them as a new unique visitor.
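A small worked example of that accuracy loss, under made-up data: the salts are fixed strings only to keep the example reproducible (real salts would be random), and the `visitor_id` helper is a hypothetical stand-in for whatever hash is actually used.

```python
import hashlib
import hmac


def visitor_id(salt: bytes, ip: str, user_agent: str) -> str:
    """Salted hash of the request attributes (hypothetical helper)."""
    return hmac.new(salt, f"{ip}|{user_agent}".encode(), hashlib.sha256).hexdigest()


# One salt per day; fixed here for reproducibility only.
salts = {"mon": b"salt-monday", "tue": b"salt-tuesday"}

# (day, ip, user_agent) per page view; two real people in total.
views = [
    ("mon", "203.0.113.5", "Firefox"),   # person A
    ("mon", "203.0.113.5", "Firefox"),   # person A again, same day
    ("mon", "198.51.100.7", "Safari"),   # person B
    ("tue", "203.0.113.5", "Firefox"),   # person A returns the next day
]

# Collect the distinct hashed IDs seen on each day.
daily: dict[str, set[str]] = {}
for day, ip, user_agent in views:
    daily.setdefault(day, set()).add(visitor_id(salts[day], ip, user_agent))

daily_uniques = {day: len(ids) for day, ids in daily.items()}
reported_total = sum(daily_uniques.values())
```

Within Monday the repeat view is deduplicated (2 uniques), but person A's Tuesday hash differs from their Monday hash, so the two-day total comes out as 3 even though only 2 people visited.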
I assume that using the referrer header to count uniques has even more downsides, as I imagine the number of unique visitors reported with that method would be much higher than the actual number.
That's a pretty interesting technique!