> Do you ask for permission when you train your mind on copyrighted books?
I pay for the books directly (cash, credit) or indirectly (school books via taxes). I pay the Louvre to look at the paintings. I also pay to listen to music, either through ads (YouTube) or via subscription (YT Music and Spotify).
But the datasets these tools are using are available to view for free. The AI isn't stealing physical books or paintings, it's viewing the same data that you or I can by sending an HTTP request, for free.
Could an AI view you for free in a street or even through a window? Does that imply it can use that view data to create advertising using your modified likeness, for example?
Just because you can view something for free doesn't mean you can use it any way you want.
> Just because you can view something for free doesn't mean you can use it any way you want.
This whole thread really makes me want to pull my hair out.
There is a difference between illegally creating a copy (even a temporary one) of a copyrighted work (e.g. streaming a movie) and creating a derivative work of said copyrighted work. These are two completely different things, with completely different legal outcomes.
If OpenAI in any shape or form creates a temporary copy (<--- by the copyright definition of what a copy is!), then this needs to be addressed as the former. If OpenAI creates a work that is considered to be a derivative work (<---- by the copyright definition of what a derivative work is!), then that needs to be addressed as the latter.
The crux of this whole thing is: Human minds cannot make a copy of a copyrighted work by definition of copyright laws (in Germany, I presume the same can be said for pretty much all western copyright laws), while anything that a computer does can be construed as making a copy.
> anything that a computer does can be construed as making a copy.
but that's not the point of contention. The training data set has been granted the right to be distributed (by virtue of it being available for viewing already - it's not hidden or secret). The proof is that a human can already view it manually. Let's call this 'public'.
The question is, whether using this public training dataset constitutes creating a derivative work. Is the ML model sufficiently transformative, that the ML model is itself a new work and thus does not fall under the copyright of the original dataset?
>but that's not the point of contention. The training data set has been granted the right to be distributed (by virtue of it being available for viewing already - it's not hidden or secret). The proof is that a human can already view it manually. Let's call this 'public'.
This is wrong. My paintings are publicly available (especially going by your definition [whose origin I'm confused by]). Taking a photograph of my paintings is still a copyright violation. I hope we can ignore all the legal kerfuffle about personal use, as it has no bearing on our discussion. Again -- all of this boils down to what I've said before: Bare human consumption does not constitute making a copy; nearly everything else does.
Your second point -- a copyrighted work automatically granting someone else any rights (especially distribution rights) just by being available to be consumed -- is even more wrong. I'm not going to go further into that, as you can very easily prove yourself wrong by googling it.
>The question is, whether using this public training dataset constitutes creating a derivative work
I'm not well versed in US copyright law, but I would assume (strongly so) that this would not be the case. I -- again, for US copyright law -- assume that for something to be considered a derivative work, it needs to include (or otherwise contain) copyrightable (!) parts of the original work(s). In other words, the original work needs to "shine through" the derivative work in one way or another. The delta of parameter changes of an ML model would (imo) not constitute such a thing.
Problems with derivative works will come into play when considering the things ML models produce.
But the AI is (supposedly) not making a copy of your painting. It is ingesting it, and adjusting its internal "model of what a good painting looks like" to accommodate the information it gleaned from your work. This seems more similar to what a human might do when they draw inspiration from another's work. The question is - to what extent does the exact image of your painting remain within the AI's data matrices? That, no one knows for sure.
> But the AI is (supposedly) not making a copy of your painting.
You are mixing up the two things that I mentioned in my original comment. You have to differentiate between creating a copy and creating a derivative work. Both of those things matter when talking about AI, but the former is way more clear-cut.
>The question is - to what extent does the exact image of your painting remain within the AI's data matrices?
And the answer is: It's irrelevant. The model has to be fed a copy of something. That's all that matters. The AI could even reject learning from that something. By the time that something reaches the AI so it can do anything with it, it's been copied (in the literal sense) who knows how many times, each of those times being a copyright violation.
I see what you're saying, though could you not say the same thing about the browser's internet cache? That copies the file from its original server to the user's local machine in order to display it efficiently.
The poster is wrong about what constitutes a copy (for the purposes of distribution). The temporary copies that reside in your browser's memory or local caches aren't considered violations unless they are publicly accessible.
I would put the same criteria to the copy made for the purpose of AI training. As long as you have the right to view the image, you would also have the right to ingest that image using an algorithm.
Yes, of course. It could even create advertising using my unmodified likeness, and that wouldn't be a problem. A person's appearance in public is public domain, you don't need permission to use it.
I think that's an entirely different scenario. For one, I'm not displaying myself in my front window with the explicit intent of people viewing me. If you replace the AI in your example with a human taking a photograph, I would be equally appalled at the misuse of my image, and I'm very confident I'd have legal recourse to stop it.
Rephrasing the question: If you lay your eyes on a single pirated work in your lifetime, all your future creations are potentially inspired by that experience. Is every future creation of a human who has laid eyes on a pirated work copyright infringement?