> Are you saying they actively remove ISBN numbers from scans? No, he’s playing ...

NoMoreNicksLeft · 2025-01-10T20:18:52 1736540332

No, I'm saying that the ISBN doesn't describe titles, it describes editions, and editions matter.

nickelpro · 2025-01-10T23:27:06 1736551626

You said:

> From a strict point of view, they've released new editions of these books.

And this is clearly a semantically worthless distinction from the point of view of the archive.

When different editions have different content, archiving those differences in that content may matter (arguably not for simple typographical corrections, printing errors, etc). When different ISBNs have identical content, it is totally irrelevant to the goals of the archive.

edflsafoiewq · 2025-01-11T11:22:08 1736594528

This is addressed somewhat in the post "The critical window of shadow libraries":

> Until now, the only options to shrink the total size of our collection has been through more aggressive compression, or deduplication. However, to get significant enough savings, both are too lossy for our taste. Heavy compression of photos can make text barely readable. And deduplication requires high confidence of books being exactly the same, which is often too inaccurate, especially if the contents are the same but the scans are made on different occasions.

Finnucane · 2025-01-11T02:50:32 1736563832

A text may be derived from an edition with an ISBN, but the ISBN wouldn’t apply to that file; it is effectively a different edition.