Hacker News

> If the AI industry is to survive, we need a clear legal rule that neural networks, and the outputs they produce, are not presumed to be copies of the data used to train them. Otherwise, the entire industry will be plagued with lawsuits that will stifle innovation and only enrich plaintiff’s lawyers

Or maybe, get this, how about people running AI only feed them information that they legally have the right to use? How is it a bad thing that somebody can't legally steal other people's work without their permission because of pesky copyright?



As an extension of this, only allow children to look at works they purchased publication rights to, lest their creative output becomes influenced by a different person's style.


There is absolutely no comparison here, because children don't charge you to look at their artwork; if you ask nicely, they will probably give it to you for free. Companies using other people's work without permission to train AI will charge.

Your suggestion would be accurate if we lived in a world where we all shared, and there was no money, and copyright didn't exist, but we don't.


It is my understanding that it makes no legal difference (at least in my country) whether I charge for my work or not when it infringes somebody's copyright. Simply sharing it is sufficient to get into trouble.


I don't know which weird country you're from, but in mine profiting from it changes things a lot.


It sounds like your country is the weird one. Try burning the complete works of Disney onto stacks of DVDs, then go down to your high street and hand them out. See how long you get away with that for.


I think the big difference is distribution vs. consumption. Where I live, there are additional clauses in law for mass reproduction and selling.

But republishing any work as your own probably falls into that category. And it isn't about profit, but commercial use; thus pasting onto a blog to improve your business (rankings, hit count) is a business use case.


It's ultimately a commercial use because you are directly affecting Disney's market. This should be obvious.


Unless I made like millions of them, literally nothing would happen.


Do you think they would act much faster if you charged a dollar or five? I don't.


There are plenty of competitors to the corporate AI models that are freely distributed, free to use, etc. You just need to have the hardware, which is admittedly pricey. The worst outcome is if there is a legal risk in creating AI models that only big companies with an army of lawyers large enough to fend off lawsuits can afford to face. Then you'd have the continuation of big tech controlling things for "responsibility" reasons instead of AI being a technology anybody can use.


I have no idea what your point is here.

These AI companies are making serious amounts of money (OpenAI is valued in tens of billions) on the back of artists who never gave permission for their work to be used in this way.

If a child took an artist's work, copied it and made significant amounts of money from selling it then yes they should be within the purview of copyright law.


> who never gave permission for their work to be used in this way.

Copyright isn't all-encompassing. There's only an enumerated set of rights granted, and "this way" (i.e., training an AI model) is not one of those restricted activities (like distribution or broadcast).

Unless the model can be argued to be a derivative work of the training data set (which I don't believe it is, since the process of training is sufficiently transformative, IMHO), the original copyright holders of the training data do not need to be asked for permission.


AI doesn't copy one to one. It mixes, like humans.


And sometimes it mixes from just one. See GitHub Copilot.


Anthropomorphism isn't an argument.


I wouldn't bet for or against that without legal advice, and even then it might vary by jurisdiction. Legal fictions are a thing, but I'm no lawyer, and I know better than to assume my interpretation of any legal issue is any better than a Hollywood script writer's: https://en.wikipedia.org/wiki/Legal_fiction


AI is not human children.


I can instruct an AI to draw Mickey Mouse and infringe copyright and I can instruct a child to draw Mickey Mouse and infringe copyright.


And when you try to sell that work without attribution or compensation, then there is a problem.


Exactly. If I produce a work, and then you produce an identical-enough work, you're infringing my copyright. I don't care how you did it, it makes absolutely no difference to me.


The problem is there's no way a user will know whose copyright they are infringing when they ask AI to "paint a landscape."

Maybe the AI needs to be able to print out a list of sources to provide attribution. That would be interesting.


Artists get inspired by others all the time, and if the results are far enough from each other, then nobody has a problem with that. In fact, pre copyright, the similarities used to be even larger. Art lives from the concept of taking ideas, and improving on them.


> they ask AI to "paint a landscape."

that's the responsibility of the user of said AI to check.


The most original authors are those who never learned how to read.


> legally have the right to use

is the "right to use in ML training" well defined?


The EU has a copyright exemption for noncommercial model training, and at least the UK is changing that to even allow commercial model training, without even an opt-out required. So it appears they legally have the right to use anything.

Why should you want a model designed to know all human knowledge to know only public-domain knowledge?


> Or maybe, get this, how about people running AI only feed them information that they legally have the right to use? How is it a bad thing that somebody can't legally steal other peoples' work without their permission because of pesky copyright?

On one extreme:

"Unless you pay your annual Disney fee for having watched Disney films in early childhood, you will need to return your brain to us for processing. Disney was used as the basis for all concepts you know, and as such, Disney owns all subsequent intellectual output of your brain."

And on the other:

In the age of AI, copyright will cease to hold weight. We'll make more new content on a per-month basis than all of recorded human history. The old regime must be thrown away to accommodate the radically new world we're entering.

We'll land somewhere in-between, and I'm hoping it's much closer to (or even precisely) the latter.


Very curious that so many people adopted this position exactly when it became feasible for giant corporations to profit by mass producing laundered copyrighted works!


Implying that I haven’t always been anti-copyright. How is either this comment or yours supposed to be productive?


Those wanting AI to respect copyright are going to find that the big players will navigate copyright just fine. It's the small players that won't. They're advocating for institutional control over AI.


One of the first commercial uses of modern neural networks was Microsoft laundering GPL code with Copilot, so I’m not really sure what you mean by saying that “big players will navigate copyright just fine”.


There are laws in the EU that only apply to large companies, like Facebook et al., because they have much more power in certain spaces. Similar laws can be made for Disney vs. small studios, e.g. "if turnover is less than 100M EUR/month..." I feel this is often proposed as a false dichotomy.


So if I want AI to respect copyright to protect individual artists, designers, etc., then I am not fighting for the smaller players but for large enterprises? That is illogical.


The problem lies with AI artists wanting copyright for me but not for thee.


> how about people running AI only feed them information that they legally have the right to use?

Well, this is the crux of the matter, isn't it? Do you, a human, have the right to look at copyrighted works and learn from them? Do you have the right to use AI to do the same?


But we don't want poor innocent Microsoft to train models on their own code, do we?


You’re talking about training. Training is legal.

- If you bought the book, you can read it.

- If the book is free, you can read it.

- If the painting is in a museum, or on Wikipedia, you can visit it.

- If Bozo the clown says you’re not allowed to look at drawings he posted online, it’s ok. You still can.

Same for AI.


> how about people running AI only feed them information that they legally have the right to use?

That's what they did!

It was in fair use. So yes, they did have the right to legally train on copyrighted images.


> It was in fair use

Many artists don't believe this and the law is very much unclear.

In many cases the AI generated work literally looks like a clone.


I'm curious about the effort spent prompt-crafting and searching for a seed to arrive at a similar image. Have they provided the prompt and seed used?


Train on data: sure… but sell the output? Different question altogether.


Copyright only governs publishing. So you have the right to train AI with any and all data you have access to, as far as copyright is concerned.


> Copyright only governs publishing. So you have the right to train AI with any and all data you have access to, as far as copyright is concerned.

Sure, but you still just cannot output anything that looks like a derived or copied work.

So, maybe... how about if image generation nets held onto their training images, so that they could compare the generated output against the training data to ensure it is not too similar?

/s (but only a little)
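The half-serious suggestion above can at least be prototyped. Here is a toy sketch of my own (not any real model's pipeline): compare a generated image against retained training images via an average-hash fingerprint and a Hamming-distance threshold. The grayscale-grid image format, the hash choice, and the threshold are all illustrative assumptions.

```python
# Toy similarity check: flag generated images whose average-hash
# fingerprint is too close to any retained training image.
# Images here are plain grayscale pixel grids (lists of rows).

def average_hash(pixels):
    """One bit per pixel: 1 where the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))

def too_similar(generated, training_set, max_distance=2):
    """True if any training image's hash is within max_distance bits."""
    g = average_hash(generated)
    return any(hamming(g, average_hash(t)) <= max_distance for t in training_set)

# A 2x2 "training image", a near-identical output, and an inverted one:
train = [[[10, 200], [30, 220]]]
print(too_similar([[12, 198], [28, 225]], train))   # True  (same light/dark pattern)
print(too_similar([[200, 10], [220, 30]], train))   # False (inverted pattern)
```

Real systems would use perceptual hashes or learned embeddings over millions of images, but the shape of the check is the same: fingerprint the output, do a nearest-neighbor lookup, and refuse to emit anything under the distance threshold.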


Train on it, perhaps. But personal not-published use is a rather small niche compared to the current commercial explosion.


Yes but using AI to generate works that can be used commercially is the way commercial AI companies plan to monetize AI.


There have been leaks where OpenAI is charging $42/month to use their service.

How much of that is going back to the copyright holders whose work their service derives value from?


> How much of that is going back to the copyright holders whose work their service derives value from?

How much of the earnings of an art student goes to the authors of the textbooks, paintings, and learning materials he used to get to where he is today?


Derivative works are their own things (when sufficiently derivative). And AIs are not humans - using an algorithm does not automatically remove the copyright. See also "I uploaded a movie to youtube but it's upside down, why did it get taken down".


But that's because the movie is still recognisable. Using an algorithm doesn't automatically remove copyright, but if the algorithm transforms the data to a point where it can't be recognised as the original work, then it isn't breaking copyright.


Those may be fine, yes.

But the AI as a whole is capable of reproducing the original in a recognizable form, and it does so on demand quite easily, because it was trained on them - how is it different than selling a zip file containing millions of copyrighted works, and also a bunch of new stuff?


> how is it different than selling a zip file containing millions of copyrighted works

so you're saying that the digits of pi are violating copyright then?


Copyright law hinges on human element of the actions taking place, not on mathematical technicality. The digits of pi are not creative human expression, nor are they derived from human expression, they're a factual mathematical discovery. They can neither infringe on copyright, nor are they subject to it themselves.


So what differentiates the matrix of numbers in the AI model, vs digits of pi?


I can ask one to produce copyrighted works. The other, not so much. That's a rather weak straw man.


If you publish/present something for consumption, it ends up in the consumer's neural network. It's not stealing.



