There’s a spec called ULID that’s pretty much this with default base32 encoding ...

j-pb · on Jan 22, 2021

ULID has a serious security flaw though. In order to try to improve performance / increase prefix similarity, there is a mode where it reuses the random value, and just increments it, for each new ULID.

With a purely random UUID, you can be pretty sure that nobody can guess it, so it's essentially an identifier and access token in one.

However, once you can easily predict the next UUID from the previous one (for example by incrementing the random part by one), the access-token property vanishes.
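
To make that concrete, here is a minimal sketch of the problem. It assumes a 48-bit timestamp followed by an 80-bit random part packed into one integer; `pack`, `monotonic_batch` and `predict_next` are hypothetical names for illustration, not any library's API.

```python
import secrets

RANDOM_BITS = 80  # size of the random part, as discussed in this thread


def pack(timestamp_ms, random_part):
    """Pack a timestamp and a random part into one 128-bit integer."""
    return (timestamp_ms << RANDOM_BITS) | random_part


def monotonic_batch(timestamp_ms, count):
    """Hypothetical monotonic mode: one random draw, then +1 per further ID."""
    first = secrets.randbits(RANDOM_BITS)
    if first + count > (1 << RANDOM_BITS):
        raise OverflowError("random part would overflow within this millisecond")
    return [pack(timestamp_ms, first + i) for i in range(count)]


def predict_next(observed_id):
    """All an attacker needs: one observed ID from the same millisecond."""
    return observed_id + 1


batch = monotonic_batch(1611273600000, 5)
# Every later ID in the batch follows from the one before it.
print(all(predict_next(a) == b for a, b in zip(batch, batch[1:])))
```

The first ID is still unguessable, but every ID after it in the same millisecond is not.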

At least this proposal doesn't make the same mistake. However, 80 bits isn't THAT much entropy. (Remember, every bit counts with exponential growth, and 48 bits fewer means 281474976710656 times easier to guess.)
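
The arithmetic can be checked in a few lines. The 128-bit baseline is the one used in the comment above; as an aside, a version 4 UUID carries 122 random bits (6 are fixed for version and variant), so the second figure shows that comparison too. The variable names are just for illustration.

```python
FULL_UUID_BITS = 128       # baseline used in the comment above
UUID4_RANDOM_BITS = 122    # random bits in a version 4 UUID
ID_RANDOM_BITS = 80        # random bits in the proposal

# Each missing bit halves the search space.
factor_vs_128 = 2 ** (FULL_UUID_BITS - ID_RANDOM_BITS)
factor_vs_uuid4 = 2 ** (UUID4_RANDOM_BITS - ID_RANDOM_BITS)

print(factor_vs_128)    # 281474976710656
print(factor_vs_uuid4)  # 4398046511104
```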

So it boils down to this: would you be OK securing all your data with 10-character one-time passwords, which might not all be guessed at once, but where guessing SOME is sufficient to get owned?

Death by a thousand paper-cuts.

m1245 · on Jan 22, 2021

Thanks for the comment!

Yes, I'd like to clarify Timeflake (and many alternatives suggested) is NOT meant to be used for security.

There's an important note on the readme about this:

> Since the timestamp part is predictable, the search space within any given millisecond is 2^80 random numbers, which is meant to avoid collisions, not to secure or hide information. You should not be using timeflakes for password-reset tokens, API keys or for anything which is security sensitive. There are better libraries which are meant for this use case (for example, the standard library's secrets module).
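
For the security-sensitive cases the readme mentions, a minimal sketch with the standard library's `secrets` module might look like this. The token sizes and the `tokens_match` helper are illustrative assumptions, not a recommendation from the readme.

```python
import secrets

# 32 random bytes rendered as URL-safe base64 text, e.g. for a reset link.
reset_token = secrets.token_urlsafe(32)

# 16 random bytes rendered as 32 hexadecimal characters, e.g. for an API key.
api_key = secrets.token_hex(16)


def tokens_match(expected, presented):
    """Compare tokens in a way designed to reduce timing side channels."""
    return secrets.compare_digest(expected, presented)
```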

j-pb · on Jan 22, 2021

You have to ask yourself the question, though: if "URL-safety" is a main feature, what URLs / user data are NOT security relevant?

Leaking users' private pictures / documents / whatever will also kill a company.

I guess one could maybe circumvent this, by using a hash of the ID, but then you have to store that somewhere too, and you're back to square one.

Nevertheless, I like that you fixed the main flaw with ULID. Also you provide way better insight into the tradeoffs, so kudos!

remcob · on Jan 22, 2021

> using a hash of the ID

Hashing the IDs won't solve their lack of entropy. Crude example: If you hash your pincode I still have only 10^4 values to try.
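
A small sketch of the pincode example, assuming an unsalted SHA-256 hash of a 4-digit PIN (the helper names are made up for illustration):

```python
import hashlib


def sha256_hex(value):
    """Hash a string with SHA-256 and return the hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def crack_pin(target_hash):
    """Try all 10^4 four-digit PINs; hashing does not enlarge this space."""
    for candidate in range(10_000):
        pin = f"{candidate:04d}"
        if sha256_hex(pin) == target_hash:
            return pin
    return None


leaked_hash = sha256_hex("4821")
print(crack_pin(leaked_hash))  # 4821
```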

The easiest way to fix this is to add an access token column that is cryptographically random and use both the ID and the token in the URL.

If you trust the 80 bits already in the ID, the token only needs to be 80 bits, for a total of 160 bits of entropy. But if you do that, you have to make sure that a missing ID and an invalid token are handled identically from the attacker's perspective (same reply, same timing).

j-pb · on Jan 22, 2021

> Hashing the IDs won't solve their lack of entropy.

Yes and no. It would still significantly enlarge the search space, as an attacker wouldn't get a point of reference to latch on to.

sudhirj · on Jan 22, 2021

Which implementation are you referring to? The Go package uses crypto/rand, and generation blocks if the system can’t provide enough randomness. It’s possible to override this with custom implementations, but either way it’s no less secure than a UUID unless the implementation is horrible.

The spec doesn’t specify a source of randomness - an implementation that uses https://xkcd.com/221/ will of course not be a very good one.

sudhirj · on Jan 22, 2021

If you’re referring to the monotonic counter implementation, then yes, that’s a documented choice you can make if you explicitly want strictly increasing ULIDs on the same millisecond, but given that you can’t ensure this across servers anyway, it’s more an edge case.

j-pb · on Jan 22, 2021

Yes, monotonic counters are ULID's "billion dollar mistake".

They're not an edge case, in that the standard displays them quite prominently. They're actually the only code examples given.

The standard should deprecate them, because people WILL use them "wrong".

sudhirj · on Jan 22, 2021

Dunno about that. The implementations I’ve seen so far default to /dev/random and make you jump through hoops to get the monotonic counter, so if you manage to enable it they should assume it’s what you want. I’ve actually used them effectively in a couple of NoSQL problems where I didn’t need cryptographic security, just distributed timestamps - I was creating two sequential events (create and update), and they had to always order one after another, even though I created them in the same loop and millisecond.
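
For the ordering use case, a minimal sketch of a per-process monotonic generator could look like the following. `MonotonicIds` is a hypothetical class, not a ULID library's API, and treating a backwards clock like "same millisecond" is my own assumption.

```python
import secrets

RANDOM_BITS = 80


class MonotonicIds:
    """Hypothetical generator of strictly increasing 128-bit IDs in one process."""

    def __init__(self):
        self._last_ms = -1
        self._last_random = 0

    def next_id(self, now_ms):
        if now_ms <= self._last_ms:
            # Same millisecond (or the clock stepped back): keep the
            # timestamp and bump the random part so ordering is preserved.
            self._last_random += 1
            if self._last_random >= (1 << RANDOM_BITS):
                raise OverflowError("random part exhausted for this millisecond")
        else:
            self._last_ms = now_ms
            self._last_random = secrets.randbits(RANDOM_BITS)
        return (self._last_ms << RANDOM_BITS) | self._last_random


gen = MonotonicIds()
create_event = gen.next_id(1611273600000)
update_event = gen.next_id(1611273600000)  # same millisecond, still sorts after
print(create_event < update_event)
```

This only orders IDs produced by one generator instance, which matches the point above that it can't be ensured across servers.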

mgkimsal · on Jan 22, 2021

I've been using ULID on a couple projects. In one case, we're trying to convert an existing project to use them, and... it's a bit of a pain, but it's not ULID's fault as much as just... changing primary keys all over the place is time consuming and not simple.

The biggest pushback (and it's not a terribly big one, imo) I've read on ULID is that the time portion is sold as 'sortable' but is 'only ms-level resolution'. Apparently the critics I was initially reading on ULID needed sub millisecond accuracy, and somehow thought this made ULIDs bad. Seemed a non-useful criticism, as needing that level of time granularity isn't a day to day issue for most people, but might be something to be aware of if the idea of overloading one field with multiple uses is a requirement you have.

techdragon · on Jan 22, 2021

If you’re an edge case you’re an edge case. It’s frustrating when people start assuming their use case is normal without any considerations of the wider world.

Sub-millisecond precision sorting of high-speed database inserts is hardly a normal requirement. At that point even the size of your transaction can matter (depending on the database), and you have to start asking questions like: do I want ordering by the time the transaction starts or the time it ends? And if you're loading existing data from other sources and care about date ordering, you should just be using a dedicated column with the database-appropriate time/datetime/timestamp field anyway.

spiffytech · on Jan 22, 2021

I adopted ULID for a project after seeing it recently mentioned in another HN comments thread[0] and it's very useful. The project uses Azure Table Storage, which encourages querying by ranges of record IDs for efficiency, instead of filtering datasets by arbitrary record properties.

My project revolves around processing windows of time in timestamp-ordered datasets, and ULID works well as a way to generate record IDs in a distributed system that naturally sort in timestamp order. With ULID, all of my major queries can operate on ranges of record IDs, as Azure Table Storage encourages.
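
As a sketch of how a time window can turn into an ID range: the ULID text form is 10 Crockford base32 characters of timestamp followed by 16 characters of randomness, so lexicographic bounds can be built from two timestamps. `encode_timestamp` and `id_range` are hypothetical helpers; the actual query call against the storage service is left out.

```python
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_timestamp(ms):
    """Encode a 48-bit millisecond timestamp as 10 Crockford base32 chars."""
    if not 0 <= ms < (1 << 48):
        raise ValueError("timestamp must fit in 48 bits")
    chars = []
    for _ in range(10):
        chars.append(CROCKFORD[ms & 31])  # lowest 5 bits first
        ms >>= 5
    return "".join(reversed(chars))


def id_range(start_ms, end_ms):
    """Inclusive lower and upper ID bounds for the window [start_ms, end_ms]."""
    lower = encode_timestamp(start_ms) + "0" * 16  # smallest random part
    upper = encode_timestamp(end_ms) + "Z" * 16    # largest random part
    return lower, upper


lower, upper = id_range(1611273600000, 1611277200000)
print(lower <= upper)
```

Because the alphabet is in ascending ASCII order, plain string comparison on the IDs matches timestamp order.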

[0] https://news.ycombinator.com/item?id=25650907

mfollert · on Jan 22, 2021

This is super useful and exactly what I need for a current project. Thank you for that hint!