I'd worry a bit about the chance of collisions. They correctly state you can gen...

zuzun · on Jan 22, 2021

Look for a different solution if you need 1 quintillion IDs per year. Noted.

ktpsns · on Jan 22, 2021

I guess you assume to build one timeflake per ms for one year. That's about 3e13 timeflakes. Ordinary UUIDs have a collisions at the order of 1e18 hashes (cf. https://en.wikipedia.org/wiki/Universally_unique_identifier#...). So we are "only" 5 orders of magnitude away. For me the collision quality feels the same.

Ellipsis753 · on Jan 22, 2021

If you assume one per ms then timeflake and UUIDv1 both have a 0% chance of collision as they encode the timestamp.

You might go above this rate some of the time though and we might as well strive for perfection.

One of the nice things about UUIDs is that you can just merge data together willy-nilly without worrying about collisions - for example, if you gets bought by Google. You can't really do this with timeflake.

phkahler · on Jan 22, 2021

>> If you assume one per ms then timeflake and UUIDv1 both have a 0% chance of collision as they encode the timestamp.

They can be generated anywhere at any time. It two machines are generating even 1 per hour, they will have collisions if their clocks are synchronized. The random bits are important.

pas · on Jan 22, 2021

You'd run into a bigger problem than collisions a lot sooner (IMHO): time ordering. "ms" precision is very misleading, because keeping clocks really in sync while processing this much data is a real challenge.

Sure, maybe it doesn't matter that one server/emitter and thus a bunch of IDs are out of order, but if this is really just for helping indexes, then a simple 0.1 second precision would make more sense. (Or even simply truncating the timestamp to however many bits to get ~2-30 sec precision, and using a data store with LevelDB-like staged compaction. Which is virtually a requirement at this scale.)

doersino · on Jan 22, 2021

> Unfortunately this means the chance of a collision in a year is 99.999999999998% which isn't great!

How was this computed? My intuition begs to differ.

onetoo · on Jan 22, 2021

Probability of no collision in a single millisecond: p_1 = 999999999/1000000000

Milliseconds in a year: t = 1000 * 60 * 60 * 24 * 365

Probability of no collision happening for every millisecond of a year: p_2 = p_1^t

Probability of a collision happening for at least one millisecond of a year: p_3 = 1 - p_2

https://www.wolframalpha.com/input/?i=1+-+%28999999999%2F100...

jychang · on Jan 22, 2021

There are 31,536,000,000 milliseconds in a year. If you do (1-1E-9)^3.1536E10, you get a 2E-14 chance of no collision in a year.

hoseja · on Jan 22, 2021

I think the presumption is you will not be generating 50 million of them each millisecond.

alpaca128 · on Jan 22, 2021

Which is a pretty safe assumption, considering just generating those 128bits 50 million times a millisecond results in over 20 million TB of data in a year. I think it's safe to say this is not a common scenario.

mytailorisrich · on Jan 22, 2021

There are close to 32 billion millisecond in a year. If the chance of a collision happening within a millisecond is 1 in a billion then my intuition is that the probability of a collision happening at least once a year is quite high, indeed.

It's like a dice: You have 1 in 6 chance of getting a 6 every time your roll the dice. Now, if you roll the dice 100 times the chance that you'll get a 6 at least once is then pretty high.