> From a strict point of view, they've released new editions of these books.
And this is clearly a semantically worthless distinction from the point of view of the archive.
When different editions have different content, archiving those differences in that content may matter (arguably not for simple typographical corrections, printing errors, etc.). When different ISBNs have identical content, it is totally irrelevant to the goals of the archive.
This is addressed somewhat in the "The critical window of shadow libraries" post:
> Until now, the only options to shrink the total size of our collection has been through more aggressive compression, or deduplication. However, to get significant enough savings, both are too lossy for our taste. Heavy compression of photos can make text barely readable. And deduplication requires high confidence of books being exactly the same, which is often too inaccurate, especially if the contents are the same but the scans are made on different occasions.
No, he's playing the pointless "well, actually a scan of a book is a different thing from the book itself" game.