No, still laggy for me; I only use it for some alternate accounts. It just feels jittery compared to Chrome, especially when you're using the element inspector.
I have the complete opposite experience. Firefox runs smoothly
even when I have 150+ tabs open in 4-5 windows. Meanwhile
running 20+ tabs in Chromium bogs down my system so hard that
the mouse can start lagging.
To be fair, it runs very nicely when I'm not excessively tabbing everything, but that's just how I prefer to use the web.
Same. Firefox gets bogged down debugging one of our more complex websites. Was quite surprised when Chrome didn't. Figured all the spaghetti meant it was doomed either way.
We generate a daily changing identifier using the visitor’s IP address and User Agent. To anonymize these datapoints, we run them through a hash function with a rotating salt.
This generates a random string of letters and numbers that is used to calculate unique visitor numbers for the day. Old salts are deleted to avoid the possibility of linking visitor information from one day to the next.
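For illustration, here is a minimal Python sketch of that scheme as described above. The choice of SHA-256, the concatenation order and the names are assumptions on my part; the description only says that IP and User-Agent are run through a hash function with a rotating salt.

```python
import hashlib
import secrets

# Assumed sketch: one salt per day, regenerated (and the old one discarded)
# at rollover so identifiers from different days cannot be linked.
daily_salt = secrets.token_hex(16)

def visitor_id(ip: str, user_agent: str) -> str:
    """Derive the anonymous daily visitor identifier from IP + User-Agent."""
    data = f"{daily_salt}{ip}{user_agent}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()

# The same visitor yields the same id for the whole day...
print(visitor_id("203.0.113.7", "Mozilla/5.0"))
# ...but once the salt rotates and the old salt is deleted,
# yesterday's ids can no longer be reproduced or matched.
```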
Doesn’t this mean that if you accidentally serve your website on two domains (e.g. example.com and example.net with no redirect) it will count one visitor twice if they visit both domains? What is the benefit to including the domain in the hash?
Not really keen on the use of the IP address. I’ve been behind load balancing proxies and weird mobile networks often enough to know that I can appear from a dozen different IPs in the space of an hour just by browsing the web normally with default settings.
Have you considered requesting a 24 hour private cacheable resource and counting the requests on the server? Or is the browser cache too unreliable?
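For what it's worth, the cache-based idea in that question could look roughly like the hypothetical Flask sketch below: serve a tiny resource with a 24-hour private Cache-Control header and count hits server-side, so repeat visits within the window are absorbed by the browser cache. The endpoint name and in-memory counter are made up, and the approach inherits exactly the unreliability being asked about (cleared or disabled caches, private windows).

```python
from flask import Flask, make_response

app = Flask(__name__)
unique_hits = 0  # illustration only; a real deployment would persist this

@app.route("/uniq.gif")
def unique_beacon():
    """Each request here is (roughly) one unique visitor per 24h window:
    browsers that cached the response won't ask again until max-age expires."""
    global unique_hits
    unique_hits += 1
    resp = make_response(b"GIF89a", 200)
    resp.headers["Content-Type"] = "image/gif"
    # 'private' keeps shared proxy caches from serving it to multiple users.
    resp.headers["Cache-Control"] = "private, max-age=86400"
    return resp

if __name__ == "__main__":
    app.run()
```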
To be more accurate, we use the site_id in the hash. You can serve your website from many different domains, but as long as the site_id in your script tag is the same, the hash remains the same.
The site_id is included in the hash to prevent cross-site tracking. Otherwise the hash would act almost like a third-party cookie and people could be tracked across different sites.
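Continuing the earlier sketch, including the site_id in the hashed input is what keeps identifiers site-scoped; the exact field order here is still an assumption:

```python
import hashlib

def site_scoped_visitor_id(salt: str, site_id: str, ip: str, user_agent: str) -> str:
    # Including site_id means the same visitor gets *different* hashes on
    # different sites (no cross-site linking), but the *same* hash on
    # example.com and example.net as long as both use the same site_id.
    data = f"{salt}{site_id}{ip}{user_agent}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()
```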
> Not really keen on the use of the IP address.
Yeah, it's not ideal. We do check the X-Forwarded-For header, so as long as the proxies are being good citizens, the client IP is present in that header.
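A rough sketch of that header handling is below; the function name and trust model are assumptions (in practice you would only honour X-Forwarded-For when the request arrives through proxies you trust, since clients can set the header themselves):

```python
def client_ip(headers: dict, remote_addr: str) -> str:
    """Prefer the client address reported by well-behaved proxies.

    X-Forwarded-For is a comma-separated list; by convention the leftmost
    entry is the original client and each proxy appends its own address.
    """
    forwarded = headers.get("X-Forwarded-For", "")
    if forwarded:
        return forwarded.split(",")[0].strip()
    return remote_addr

# e.g. client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.0.2"}, "10.0.0.2")
# -> "203.0.113.7"
```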
Is it possible to track users over a week or month? Often the same user returns to our website several times before buying, and it's important to learn about this behavior.
no. that's the decision we made in order to be as privacy-focused as possible. there's no way to know whether the same person comes back to a site the day after or later. they will always be counted as a new visitor after the first day.
Not sure if privacy is a valid argument here. As long as visitors are anonymised and it is self-hosted, I don't see how this would invade a person's privacy.
The problem with Google Analytics is not that they track a user on a single domain; the problem is that users are tracked on a *.google.com domain, and Google knows who you are based on your other Google sessions and knows everything you do on the internet because every website uses GA. With a self-hosted product that wouldn't be the case, so privacy comes by default even if you tracked a user over the space of a month or longer.
I agree, self-hosting and decentralizing analytics data is the biggest improvement to user privacy that can be done today and has multiple other benefits for both the user and the webmaster.
Now, if only the EU would push harder for decentralizing data instead of writing cookie policies that result in a horrendous user experience on the web...
That is not how Google Analytics works. Its cookie (the Client ID) is a first-party cookie set on the domain of the website hosting it.
Google Analytics optionally integrates with both Google Ads and DoubleClick, and both of those integrations do a cookie-match against .google.com or .doubleclick.net cookies. But those integrations are optional and off by default.
This will exclude your solution from many deployments.
Knowing returning visitors from first-time visitors is quite important and helps to assess whether viewership, audience and customer base are growing over time.
For startups, "how many unique visitors do you get in a month" may be an important KPI, and you're saying your solution cannot answer this question, so another solution will need to be deployed.
Unique visitor data is also needed to assess the effectiveness of campaigns and run e-commerce operations. There are often campaigns to bring back a user who previously didn't buy (email, ads etc.). It's important to measure the effectiveness of these investments separately in web analytics, given that the campaigns will be different for new and recurring visitors.
The correlation between unique hashes and unique visitors will hold, so you can still assess the increase in unique visitors for your campaign. Even the percentage of these visits that are returning is likely constant. All you would be giving up is the ability to measure variation in the returning percentage across several days (even so, you could probably modify the code to change the salt every x days without losing much of the privacy benefit).
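As a hedged illustration of that parenthetical idea, one way to stretch the salt lifetime from one day to a configurable window would be to bucket timestamps; the bucketing scheme and names here are hypothetical, not anything Plausible actually does:

```python
import time

SALT_PERIOD_DAYS = 7      # hypothetical: rotate weekly instead of daily
SECONDS_PER_DAY = 86_400

def salt_bucket(now: float | None = None) -> int:
    """All timestamps in the same N-day window map to one bucket; a fresh
    salt would be generated per bucket and the old one deleted."""
    now = time.time() if now is None else now
    return int(now // (SALT_PERIOD_DAYS * SECONDS_PER_DAY))

# visitors are then only linkable *within* a window, never across windows
```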
There will always be value to be extracted from the invasion of your users' privacy, but you also hit diminishing returns over this increasingly invasive probing. Plausible is aiming for "good enough" whilst respecting people's privacy, and that is a good compromise IMHO.
There is a trade-off. You will never get 100% of the information without all the tracking, but there is information that represents more bang for the privacy buck.
Would you not have an acceptable error increase in your decisions with a bit less information and a lot less privacy invasion?
EDIT> I think more control for the user is better, so instead of canvas fingerprinting, shady cross-site tracking and all that, I would rather have a UUID that my browser reported but that I controlled, so I could be anonymous when I want to and be tracked when I don't care, or when I genuinely agree it adds value.
"this will limit your solution from many deployments."
I think tech companies (particularly Google) have conditioned us to expect analytics [1] to be an essential component of all apps and web services. Developers have happily accepted this rather than questioned it (unless they happen to be the ones being tracked). But actually, analytics may not need to be as detailed (or as intrusive) as many think it needs to be.
Here is a blog post from Whimsical (an online flow chart tool) who decided to remove Google Analytics:
> "We realized that all our tracking stuff had barely marginal value. We were just accumulating data because it might be useful someday. And because everybody else was doing it. Yet 100% of our product decisions over the past year were based on strategy, qualitative feedback and our own needs." (My emphasis)
[1] Words like 'Analytics', 'Telemetry', 'Web Beacon' etc are examples of the dishonesty of the tech industry in using words to hide their real purpose and soften their impact. All of these words are about tracking online behaviour, but no-one would dare use the clearer, more honest word – tracking – in their app or web copy.
Apples and oranges. You have taken the example of a SaaS tool who might not need that level of information. But, a newspaper or a magazine absolutely needs deep information in order to sell advertising and sponsorship on their site.
yeah, i understand. we're trying to have a balance between these:
1. privacy of site visitors
2. compliance with privacy regulations
3. useful and actionable data for site owners
it's difficult to track people from visit to visit or from one device to another without breaking the first two (cookies, browser fingerprinting...) so we had to make some decisions.
in general, sites that try to get visitor consent to cookies and/or to tracking realise that the majority of visitors don't give it, so even data that may not be as accurate as full-on tracking becomes very valuable.
The problem is that I most likely have to get consent (opt-in) from the user in order to collect that data - at that point I can either trick my visitors into giving it by using various dark patterns or I end up with a dataset that is pretty much useless since nobody will opt in.
If those are my options I prefer not having to annoy my visitors with consent banners over extended and persistent analytics data. That's why we actually migrated from Google Analytics to Plausible.
By the way: Import of analytics data from Google would be a great feature if anyone from Plausible is still reading here.
If you have a customer relationship, then they can sign in (and accept your ToS) - and then you can, if they allow, track them that way. Similarly, for re-activation campaigns, one can use something like UTM parameters.
Don't need to track everyone on the internet with generic stats mechanisms (like Plausible provides) for that.
I built a privacy-friendly analytics project https://github.com/sheshbabu/freshlytics and I've been wondering how to correctly count unique visitors to a website. I don't store cookies or any PII data at all, so by definition it's hard to distinguish between two different visits - are they from the same person or different people?
An alternative approach is used by Simple Analytics - https://docs.simpleanalytics.com/uniques - where they use the referrer header to derive unique visits. They mention that they don't use IP addresses as those are considered fingerprinting.
But it looks like a hash function (whose salt gets rotated daily) strikes a good balance between distinguishing visitors and maintaining user privacy. Any downsides to this approach?
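As a hedged sketch of the referrer-based method the linked docs describe, the heuristic is roughly "count a pageview as unique unless it came from an internal navigation". The function below is an illustration of that idea, not Simple Analytics' actual code:

```python
from urllib.parse import urlparse

def is_unique_visit(referrer: str | None, own_hostname: str) -> bool:
    """Treat a pageview as a new unique visit when it doesn't come from
    an internal navigation, i.e. the Referer is absent or external."""
    if not referrer:
        return True                      # direct visit / stripped referrer
    return urlparse(referrer).hostname != own_hostname

# is_unique_visit(None, "example.com")                             -> True
# is_unique_visit("https://news.ycombinator.com/", "example.com")  -> True
# is_unique_visit("https://example.com/pricing", "example.com")    -> False
```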
downsides are basically that in order to gain the benefits for user privacy and compliance with regulations, you lose a bit of accuracy depending on the situation. we cannot see whether the same person returns to a site on a different day, so we count them as a new unique visitor.
i assume that using the referrer header to count uniques has even more downsides as i imagine the number of unique visitors with that method would be much higher than it actually is.
yes, that's a trade-off in this privacy-first approach. if several people are on the same ip address, visiting the same website and having the same user agent on the same day, they look the same to us.
That seems pretty robust, but I have a question about the salt. How does the daily salt part work? How is it stored during the day? Are historic salt values stored?
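The thread doesn't answer this, but based only on the earlier description (one rotating daily salt, old salts deleted), one hypothetical way to handle storage is a single-slot store that regenerates on the first request of a new day. Everything below is a guess for illustration, not Plausible's implementation:

```python
import secrets
from datetime import date

# Hypothetical single-slot store: today's salt only, nothing historic.
_salt_store: dict[str, str] = {}

def get_daily_salt(today: date | None = None) -> str:
    """Return today's salt, generating a fresh one (and dropping the old
    one) the first time it's requested on a new day."""
    key = (today or date.today()).isoformat()
    if key not in _salt_store:
        _salt_store.clear()              # old salts are deleted, not archived
        _salt_store[key] = secrets.token_hex(16)
    return _salt_store[key]
```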
That's a pretty valid criticism. I am wary of significant whitespace too, as are many people (and it's one of the reasons I use Ruby for scripting rather than Python).
It's also hard to argue against it. In the end, Google uses all that data to make more money with ads. And having ads relevant to your taste isn't necessarily bad for many.
The amount of data collection is truly bad for people living in totalitarian regimes or who have sensitive jobs, but that doesn't describe most of the people we know. For the rest, the lack of privacy has no negative impact on their life and is unlikely to have one in the near future, so there's no reason to change.
Wanting change that benefits all of us is the same thing we try (and fail) to do with preventing climate change.