Hacker News

> Are you saying they actively remove ISBN numbers from scans?

No, he's playing the pointless "well, actually, a scan of a book is a different thing from the book itself" game.



No, I'm saying that the ISBN doesn't describe titles, it describes editions, and editions matter.


You said:

> From a strict point of view, they've released new editions of these books.

And this is clearly a semantically worthless distinction from the point of view of the archive.

When different editions have different content, archiving those differences may matter (arguably not for simple typographical corrections, printing errors, etc.). When different ISBNs have identical content, the distinction is totally irrelevant to the goals of the archive.


This is addressed somewhat in the post "The critical window of shadow libraries":

> Until now, the only options to shrink the total size of our collection has been through more aggressive compression, or deduplication. However, to get significant enough savings, both are too lossy for our taste. Heavy compression of photos can make text barely readable. And deduplication requires high confidence of books being exactly the same, which is often too inaccurate, especially if the contents are the same but the scans are made on different occasions.
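To make the deduplication point in that quote concrete, here is a minimal sketch, not anything the archive is known to run. It assumes the simplest possible approach, grouping files by a SHA-256 hash of their bytes; the file names and contents are made up for illustration. Byte-identical copies collapse into one group, but a rescan of the same book does not, which is the "same contents, different scans" case the post describes.

```python
import hashlib
from collections import defaultdict


def group_by_content_hash(files):
    """Group file names by the SHA-256 of their bytes (exact-match dedup only)."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return list(groups.values())


# Hypothetical inputs: two byte-identical copies and one rescan of the same book.
files = {
    "copy_a.pdf": b"same scan bytes",
    "copy_b.pdf": b"same scan bytes",
    "rescan.pdf": b"same book, different scan bytes",
}

groups = group_by_content_hash(files)
for group in groups:
    print(group)
```

Catching the rescan would need some fuzzy comparison of the extracted text or page images, and that is where the confidence problem the post mentions comes in.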


A text may be derived from an edition with an ISBN, but that ISBN wouldn't apply to the file; the file is effectively a different edition.



