It looks like OpenAI trained Sora on game content (techcrunch.com)
21 points by alex_young on Dec 13, 2024 | hide | past | favorite | 16 comments


The number of layers is irrelevant. It's transformative. The author had to work hard just to get it to produce inexact imitations of existing videos.

Where it's going to get interesting is with terms of service, where companies start using AI to copy SaaS products by having an AI log in and replicate competitors' features. That isn't just a copyright problem; it's a breach of contract.


Copying SaaS products has been happening for decades, even without AI.


It has, but what will change is the speed and who has the capability to copy.


And I have to work hard to break DRM, produce accessible transcriptions of images-of-text / images without alt text, and ensure that there is a meaningful machine-readable outline of the document. Doesn't make it transformative.

If something is being done by a computer program, and that computer program is not customised (i.e., if you fed it something else, it would do the same "transformation" to a different work), and features of the original are clearly apparent in the derived version, I would be hard-pressed to call that transformative.


I agree with you that there are different interpretations of transformative. This one will play out in the courts. The definition the copyright courts use is shaped by case law. In that regard I do believe what the models do is transformative, which exculpates them from copyright violations. Not just because someone had to work hard; I agree I didn't say that very well.

I do think they have a much bigger problem with breach-of-contract claims for potentially violating a ToS. It's becoming clear that the way they acquired some of the data likely has issues. However, they are very proactive at signing agreements to resolve them (e.g., the deal with Reddit).


I don't think there's anything new here. It's the same unanswered question of whether you can train AI on copyrighted material.

The fact that it's video instead of photos or text doesn't seem like it should make any difference.

We need to wait for the legal system to decide.


More specifically, we need to wait for the legal systems to decide. There's an increasing split between what the US courts decide and what much of the rest of the world wants.


> We need to wait for the legal system to decide.

This sounds like "I have no opinion so the legal system will pick the side for me and everyone else"


> We need to wait for the legal system to decide.

I think it has.


I don't think so. There are loads of ongoing lawsuits.

https://www.techtarget.com/whatis/feature/AI-lawsuits-explai...

I don't think any significant ones have been decided yet, but please post a link if I'm wrong.


I was speaking more philosophically—I don’t believe our legal systems, much like their failure to stop Trump’s reelection in 2024, are designed or capable of meaningfully altering the trajectory of something like AI.


I think it's clearly wrong to use generative AI, or any tool, to produce what are basically copies of a work, and then profit off that as if you were the original creator.

I think it's just as clearly okay to use content that other people produce as part of learning to create your own stuff.

Generative AI can be used for copyright infringement - just like any creator technology could be.


> Generative AI can be used for copyright infringement - just like any creator technology could be.

And yet, all generative AI services are pretty confident that copyright infringement is not something you should worry about when using their outputs.

For example, this FAQ from Cursor:

> Who owns the code generated in Cursor?

> You! Regardless of whether you use the free, pro or business version of Cursor, all generated code is yours and free to be used however you like, including commercially.

Note how it's specifically not saying "it's up to you to determine if the code infringes on any of the million codebases we trained our models on".

The difference with video and Sora is that it does this with the properties of entities that have considerable legal departments behind them.


Seems like an ongoing trend where wealthy AI companies ignore copyright and try to settle outside court using money they raised from the hype created with said copyrighted material.

Meanwhile, the RIAA and the new generation of streaming companies are attacking non-wealthy consumers left and right.


Is AI generated content any different from a human reading a bunch of books or watching movies and being inspired by them to create something similar but not identical?


lol it's going to be much harder to launder the training data for videos than for text and images. Not that it matters legally, but who's going to be impressed by an app that so obviously produces facsimiles?



