Hacker News

Not every book is readily available as a digital copy. Things like textbooks, older technical books or just books that weren't too popular can be easier to source as physical books and scan destructively.

A lot of digital copies are also DRM'd to shit - to obtain raw text usable for AI training, you'd have to break DRM. Which isn't that hard, on a technical level - but DMCA exists.

DMCA is a shit law that should have been dismantled two decades ago - but as long as it's around, bypassing DRM on things you own can be illegal. Scanning sidesteps that.



I'm anti-DRM personally, but I suppose in this case we could argue it's serving its purpose; it's just that a workaround has been found in the form of scanning physical books.

If no physical copies existed and everything were available only as DRM'd digital copies, the companies scanning books for AI training would be forced to work out some deal with the DRM overlords to have it removed for their use. That (I think) would be a net benefit, as hopefully the authors would get paid too.


You say you're anti-DRM but that sounds like a very pro-DRM stance to me.

Within the bounds of personal use, copyright holders should have no say over what people do with media after it is sold. That goes equally when the entity that buys the media is a company rather than a person. The entire reason DRM is a problem is that it subverts that principle using technical means.


That wasn't a stance, it was a hypothetical.

I'm totally in agreement with you: once we buy something, it should be ours to do with as we wish, company or person. DRM is a sketchy technical measure that doesn't really serve a technical purpose, since it's easily broken, but it does serve a legal one: the act of breaking it is what's illegal.

I act on that stance by avoiding DRM'd content where possible (DRM-free games and digital books), but it's not always possible to avoid: if I buy a BD, I can't rip it to my NAS without circumventing the DRM.

Linux is also the only OS running in my home (on computers with screens and keyboards), so I mostly can't even legitimately play DRM'd media if I buy it, whether that's a BD, Netflix in my web browser, or anything else.

I'm very, very much anti-DRM.

EDIT: Typo


> Within the bounds of personal use, copyright holders should have no say over what people do with media after it is sold. That goes equally when the entity that buys the media is a company rather than a person.

How can a company be covered under personal use?


The problem is the transition from analog to digital. It is entirely legal for an entity to buy physical books and then loan them out, aka a library. That entity is free to charge money, or might even be part of the local government. But copyright is something that was invented; why should it even exist in the first place? Of course we can't argue with the fact that the world we live in has copyright, but in countries with less copyright protection, it doesn't seem like the sky has fallen either. We want to promote science and the useful arts and incentivize creation. Copyright is supposed to be a temporary monopoly granted by the government before works fall into the public domain: originally 14 years, with another 14 years if the author was still alive. We should absolutely do what we can to encourage science and the arts, but Disney has managed to push it way further than was originally intended.


Given that "training AI on books you own" was ruled fair use, the "purpose" DRM is serving here is preventing fair use.

Which is the kind of thing you would expect it to do.



