Hacker News

Is the choice of what to train upon not creative? I feel like it can be.


Possibly, but even if that were the case, it would protect NovelAI, not Stability.

The closest analogue I can think of would be copyrighting a Magic: The Gathering deck. Robert Hovden did that[0], and somehow convinced the Copyright Office to go along with it. As far as I can tell this never actually got court-tested, though. You can get a thin copyright on arrangements of other works you don't own, but a critical wrinkle in that is that an MTG deck is not merely "an arrangement of aesthetically pleasing card art". The cards are picked because of their gameplay value, specifically to min-max a particular win condition. They are not arrangements, but strategies.

Here's the thing: there is no copyright in game rules[1]. Those are ideas, which you have to patent[2]. And to the extent that an idea and an expression of that idea are inseparable, the idea part makes the whole uncopyrightable. This is known as the merger doctrine. So you can't copyright an MTG deck if doing so would give you de facto ownership over a particular game strategy.

So, applying that logic back to the training set, you'd only have ownership inasmuch as your training set was selected for a particular artistic result, and not just for "reducing the loss function" or "scoring higher on a double-blind image preference test".

As far as I'm aware, there are companies that do creatively select training set inputs, e.g. NovelAI. However, most of the "generalist" AI art generators, such as Stable Diffusion, Craiyon, or DALL-E, were trained on crawled data with little or no tweaking of the inputs[3]. A lot of them have overfit text prompts, because the people training them didn't even filter for duplicate images. You can also specifically fine-tune an existing model to achieve a particular result, which would be a creative process if you could demonstrate that you picked all the images yourself.

But all of that only applies to the training set list itself; the actual training is still noncreative. The creativity has to flow through to the trained model. There's one problem with that, though: if it turns out that AI training for art generators is not fair use, then your copyright over the model dissolves like cotton candy in water. This is because without a fair use argument, the model is just a derivative work of the training set images, and you do not own unlicensed derivative works[4].

[0] https://pluralistic.net/2021/08/14/angels-and-demons/#owning...

[1] Which is also why Cory Doctorow thinks the D&D OGL (either version) is a water sandwich that just takes away your fair use rights.

[2] WotC actually did patent specific parts of MTG, like turning cards to indicate that they've been used up that turn.

[3] I may have posted another comment in this thread claiming that training sets are kept hidden. I had a brain fart, they all pull from LAION and Common Crawl.

[4] This is also why people sell T-shirts with stolen fanart on them. The artists who drew the stolen art own nothing and cannot sue. The original creator of that art can sue, but more often than not they don't.



