Hacker News | seiflotfy's comments

I fail to see how this works with HyperLogLog, but I will read more.


I meant the `get_beta` function, which does polynomial evaluation.
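For context, `get_beta` in LogLog-Beta-style estimators evaluates a bias-correction polynomial in zl = ln(ez + 1), where ez is the number of zero registers. A minimal Python sketch of that evaluation, using Horner's method; the function and parameter names are mine, and the real coefficients are fitted per precision p (the paper publishes them), so they are passed in rather than invented here:

```python
import math

def get_beta(ez: float, coeffs: list[float]) -> float:
    """Evaluate the LogLog-Beta bias-correction polynomial.

    ez     -- number of zero-valued registers in the sketch
    coeffs -- eight fitted constants c0..c7 (precision-dependent,
              taken from the paper; illustrative here)

    The polynomial has the shape
        c0*ez + c1*zl + c2*zl^2 + ... + c7*zl^7,  zl = ln(ez + 1),
    with the zl terms evaluated via Horner's method.
    """
    zl = math.log(ez + 1)
    acc = 0.0
    for c in reversed(coeffs[1:]):  # c7 .. c1
        acc = acc * zl + c          # Horner step
    return coeffs[0] * ez + acc * zl
```

Horner's method keeps the evaluation to one multiply and one add per coefficient, which is why implementations write it as a single nested expression rather than computing each power of zl.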


Actually I found Zig easier to read than Rust… something that I found more appealing!


This is awesome... question though:

```zig
pub fn HyperLogLog(comptime p: u8) type {
    return struct {
        dense: [1 << p]u6 = undefined,

        const Self = @This();

        pub fn init() Self {
            var s = Self{};
            for (s.dense) |*x| x.* = 0;
            return s;
        }
    };
}
```
Doesn't this allocate 1<<p registers upfront though? If yes, then the HLL takes 16384 bytes upfront, which kind of defeats the purpose of having a sparse representation, no?


> doesn't this allocate 1<<p upfront though

Yes, it does. The idea (see the last code snippet in that post) is that the user delays the creation of the HLL until they are ready to switch to a dense representation. Before then, they just use a std.AutoHashMap directly.

Or, alternatively, the HLL could use the same buffer for both the dense and sparse representations (see the Redis code).
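The "hashmap until dense" idea above can be sketched in a few lines of Python. This is an illustration, not the actual Zig or Redis implementation: the class, method names, and promotion threshold are all assumptions, chosen only to show that the (1 << p)-byte array is not allocated until the sparse form stops paying off:

```python
class HLL:
    """Sketch of lazy dense allocation for a HyperLogLog.

    Registers live in a dict (index -> max rank) while the set is
    small; the (1 << p)-byte dense array is only allocated once the
    sparse form would cost more memory than the array itself.
    """

    def __init__(self, p: int = 14):
        self.p = p
        self.sparse = {}   # register index -> highest rank seen
        self.dense = None  # bytearray, allocated lazily on promotion

    def _promote(self) -> None:
        # Allocate the dense array and copy the sparse entries over.
        self.dense = bytearray(1 << self.p)
        for idx, rank in self.sparse.items():
            self.dense[idx] = rank
        self.sparse = None

    def set_register(self, idx: int, rank: int) -> None:
        if self.dense is not None:
            self.dense[idx] = max(self.dense[idx], rank)
            return
        self.sparse[idx] = max(self.sparse.get(idx, 0), rank)
        # Rough heuristic: each dict entry costs far more than one
        # byte, so promote well before len(sparse) reaches 1 << p.
        if len(self.sparse) > (1 << self.p) // 16:
            self._promote()
```

The single-buffer variant Redis uses is the same idea taken further: the sparse entries are run-length encoded into the front of the buffer that will eventually hold the dense registers, so promotion is an in-place rewrite rather than a second allocation.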


Author here, will publish my vlang take on it in the next couple of days.


Looking forward to it :).

It seems the V port should be pretty straightforward (the repo is 880 LOC and doesn't have any dependencies).


CTO here... we built our own homegrown TSDB that works on top of S3. Coordination-free ingestion, serverless querying :D


Interesting. What made you believe that a homegrown solution was better than existing alternatives?

I'm not being critical, I'm genuinely interested


Great question (and apologies for the length of the answer)! Through our previous experiences in building services around tsdbs & relying on tsdbs to host monitoring data, we kept hitting the same issues around ingest, retention, and querying capabilities.

Existing solutions, and those architected in a more traditional way than Axiom, would require highly coordinated nodes running on expensive VMs, bound by CPU, memory, and storage depending on your use case:

- Want to store TBs of data for months/years and query any piece of that at any time? Prepare to have expensive SSDs or wait for 'archived' data to swap in from cheaper storage.

- Want to run a query that combines N datasets, calculates aggregations over TBs of data, and then compares that against data with the time shifted back a week, month, or year? Great, fire up some heavy VMs with enough CPU, memory, and bandwidth to compute all that.

- What if your use case varies greatly in each dimension (how much ingest you use, how much storage you need, and how heavy your queries are) across a day, week, or month? You'd have to find a way to adapt to changing requirements, or just scale up for twice the worst case and pay the $$$.

With Axiom, we had three key goals:

- Hyper-efficient, co-ordination free, schema-less, and index-free ingest (~1.4TB/day on a $5/mo container)

- Cheapest storage for hot, cold, archived, and warehoused data: all data is stored in object storage, highly compressed & ready to query. Data that's 10 seconds old and data that's 10 years old are treated the same, already as cheap as possible.

- Serverless querying that can expand from 0 to as-much-as-AWS-will-allow depending on what your query needs + how many of your team are querying at once.

The above was achieved through a lot of trial-and-error, tweaking, testing, and head-scratching, but we're really proud of what we've built. We can run super-fast ad-hoc queries, we've eliminated the need to think about retention, we have live streaming, and we have an incredible query language inspired by Microsoft's Kusto (Splunk users will be familiar!).

We still have a lot I'd like to see done, but I think we really do have something unique :)


I love this and I love the response!!

And I'm glad you went to the lengths you did in your explanation!

Those specs are _insane_. What was your stack other than S3?

Do you have plans on PaaS/SaaS-ing your tsdb?

I've kind of taken it on as my life's work to study the software development process when it comes to "Product vs Platform" solutions. It's something I'm passionate about, and I'm forming strong opinions, loosely held for now.

So often in my (still short) career I've encountered "we couldn't find the right fit and so we built our own"-isms, and with very, very few of them did it ever feel necessary.

Here is the rough draft of my opinion on internal solutions vs 3rd-party/FOSS solutions. It was written in the heat of building a series of internal solutions that aren't necessary, given the constraints and resources available.

https://mikercampbell.bearblog.dev/build-tech-or-product/

I'm currently working on a blog post that explains (this time with more tech than just opinion) how 95% of tech problems could easily be solved by 5% of the technology out there.

With the traffic and demands you have, you're in the 5% of tech problems where it's entirely plausible that no existing tech could meet your needs.

I just need more experience in the field of tooling before I feel confident putting my foot down anywhere and drawing the line. So I'd love to talk more if you're open to it!


Leerob and co are fantastic :D The feedback we got from the Vercel team while building the integration was invaluable :D

Also feel free to join our slack for more feedback :D


Sorry had to edit my reply ;)


We are familiar with Honeycomb, and we also have a Honeycomb->Axiom multiplexer, so you can use their great tools, forward to us, and compare side by side.

https://github.com/axiomhq/axiom-honeycomb-proxy It's called a proxy but it's actually a multiplexer ;)

Some things where we are different:

* We don't sample if you push to us directly.

* Our pricing is different (https://www.axiom.co/pricing#calculator)

* We are younger but we built it a bit differently, with object storage being our only storage.

* We have a Kusto(-ish) language (https://play.axiom.co/axiom-play-qf1k/explorer?qid=uqXTzFSTv...)

* We provide a streaming view https://play.axiom.co/axiom-play-qf1k/stream/github-issue-co...


renamed to variance


The beginning of Lockdowns in 2020 is when there was more mention of death than life on HN.

