So you are saying that you can now bypass a lot of solutions offered by a mix of small and large providers by using a single solution from a huge provider, and that this is the opposite of a centralization of power?

>"by using a single solution from a huge provider"

The parent didn't say that though and clearly didn't mean it.

Smaller SaaS providers have a problem right now. They can't keep up with the big players in terms of features, integrations, and aggressive sales tactics. That's why concentration and centralisation are growing.

If a lot of specialised features can be replaced by general-purpose AI tools, that could weaken the stranglehold that the biggest SaaS players have, especially if open-weights models can be deployed by a large number of smaller service providers, or even self-hosted or operated locally.

That's the hypothesis, I think. I'm not sure it will turn out that way, though.

I'm not sure whether the current hyper-competitive situation, where we have a lot of good-enough open-weights models from different sources, will continue.

I'm not sure that AI models alone will ever be reliable enough to replace deterministic features.

I'm not sure AI won't create so many tricky security issues that, once again, only the biggest players can be trusted to manage them or provide sufficient legal liability protection.


With AI-specialized hardware you can run the open-source models locally too, without the huge provider stealing your precious IP.

Ah, so what you are saying is this: now you can buy your own specialized hardware, which is realistically produced and sold by a single company on Earth; compete with ~3 of the largest multinational corporations to do so (consider RAM prices lately, to get a sense of the effect of this competition); spend tens of thousands in the process; and run your 'own' model, which someone spends millions to train and makes open for some reason (this is not a point about its existence, it's about its reliability; I don't think it's wise to assume the open models will stay roughly in line with SOTA forever). This way, by spending roughly 1-2 orders of magnitude more, you can eliminate a handful of the SaaS products you use.

Sorry, I don't see this happening, at least not for the majority. Even if it does, it would still be arguably centralizing.


Roofs do, though.


Hmm. So OpenAI doesn't care about other people's terms, copyrights or any sort of IP; but they get to have their own terms.


Controlling other people's things and gaining from the control is kind of the point of being a thief.


Welcome to the internet! Have a look around

(https://www.youtube.com/watch?v=k1BneeJTDcU)

Another obligatory song: "Jeff Bezos" by Bo Burnham (https://www.youtube.com/watch?v=lI5w2QwdYik)

C'mon Jeffrey, you can do it! (I mean, you can break half the internet with AWS us-east-1, but in all seriousness both songs are really nice lol)


I don't understand this outrage. People put things on the internet for all to see but now you're mad someone saw it and made use of it.

If you didn't want others to read your information you shouldn't have published it on the internet. That's all they're doing at the end of the day, reading it. They're not publishing it as their own, they just used publicly available data to train a model.

It's quite the same as if I read an article and then tell someone about it. If I'm allowed to learn from your article, then why isn't OpenAI?

Also the terms are just for liability. Nobody gives a shit what you use ChatGPT for, the only thing those terms do is prevent you from turning around and suing OpenAI after it blows up in your face.


And I feel like the difference between, say:

- Paying for a ticket/dvd/stream to see a Ghibli movie

- Training a model on their work without compensating them, then enabling everyone to copy their art style with ~zero effort, flooding the market and diluting the value of their work. And making money in the process.

should be rather obvious. My only hypothesis so far is that a lot of the people in here have a vested interest in not understanding the outrage, so they don't.


I don't have any vested interest. I'm just a guy. Don't work on AI, don't use AI much in my day job.

I just legitimately think the outrage is unreasonable. It is completely infeasible for AI companies to provide any meaningful amount of compensation to all the data sources they use.

Alternatively they could just not use any of the data, in which case we wouldn't have as good LLMs and other than that the world would be exactly the same. These data owners don't notice any difference. Using their data doesn't harm them in any way.

You seem to be arguing that enabling people to mimic Ghibli's art style somehow harms them; I don't see how it does at all. People have been able to mimic it already, so what's the difference? More people can? Does that make a difference to Ghibli? I mean, can you point to some concrete negative effects that this phenomenon has had on Studio Ghibli?

I don't think you can. And I think that proves my point. Anything can be mimicked. People can play covers of songs, paint their own versions of famous paintings, copy Louis Vuitton bag designs, whatever they want. The effort it takes is irrelevant.

You don't even have to train AI on Studio Ghibli's art to mimic it. You could just train it on other stuff, and then the user could feed it Studio Ghibli art and tell it to mimic it. The specific training data itself is irrelevant; it's the volume of data that trains the models. Even if they specifically avoided training on Studio Ghibli's art, there would likely be basically no difference. It wouldn't be worth paying them for it.


You ever see warning labels on products? That's because putting them in terms and conditions isn't enough to avoid liability in a products liability case.


Well, it certainly helps.


Who told you that? ChatGPT?

No, not really. It does not help. It evidences that OpenAI knew the risks were present but did not take action to adequately warn the consumers of its products, who it knows do not read the entirety of the TOS.

Again, this is exactly why you see warning labels on products. Prohibiting certain uses in small text, hidden in the TOS, is not a warning.


Quite the over-simplified straw man you have there.


Quite the lack of substance you have there.


People publish stuff on the Internet for various reasons. Sometimes they want to share information, but if the information is shared, they want to be *attributed* as the authors.

> If you didn't want others to read your information you shouldn't have published it on the internet. That's all they're doing at the end of the day, reading it. They're not publishing it as their own, they just used publicly available data to train a model.

There is some nuance here that you fail to notice, or you pretend you don't see it :D I can't copy-paste a computer program and resell it without a license. I can't say "Oh, I've just read the bits, learned from them, and based on this knowledge I created my own computer program that looks exactly the same except the author name is different in the »About...« section". Clearly, some criterion has to be used to differentiate reading-learning-creating from simply copying...

What if, instead of copy-pasting the digital code, you print it onto a film, pass light through the film onto ants, let the light kill the exposed ants, wait for the rest of the ants to go away, and then use the dead ants as another film to somehow convert that back into digital data? You could now argue that you didn't copy: you taught the ants, the ants learned, and they created a new program. But you would fool no one.

AI models don't actually learn; they are a different way the data is stored. I think when a court decides whether a use is fair and transformative enough, it investigates how much effort was put into the transformation: a lot of effort was put into creating the AI, but once it was created, the effort put into any single work is nearly zero, just the electricity, bandwidth, and storage.


> There is some nuance here that you fail to notice, or you pretend you don't see it :D

I could say the same for you:

> I can't copy-paste a computer program and resell it without a license. I can't say "Oh, I've just read the bits, learned from them, and based on this knowledge I created my own computer program that looks exactly the same except the author name is different in the »About...« section"

Nobody uses LLMs to copy others' code. Nobody wants a carbon copy of someone else's software; if that's what they wanted, they would have used those people's software. I mean, maybe someone does, but that's not the point of LLMs and it's not why people use them.

I use LLMs to write code for me sometimes. I am quite sure that nobody in history has ever written that code. It's not copied from anyone; it's written specifically to solve the given task. I'm sure it's similar to a lot of code out there, I mean it's not often we write truly novel stuff. But there's nothing wrong with that. Most websites are pretty similar. Most apps are pretty similar. Developers all over the world write the same-ish code every day.

And if you don't want anyone to copy your precious code, then don't publish it. That's the most ironic thing about all this - you put your code on the internet for everyone to see, and then you make a big deal about the possibility of an LLM copying it as a response to a prompt?

Bro if I wanted your code I could go to your public github repo and actually copy it, I don't need an LLM to do that for me. Don't publish it if you're so worried about being copied.


I feel the same way.

Yet people are using AI for therapy, listening to AI-written novels read by AI, gladly watching AI-made slop. There seems to be real actual demand for it. Feels like total insanity to me; but here we are and facts are facts.

I guess whether the emperor is wearing clothes or not never actually mattered.


I saw someone post a Gemini summary from a Google search as their interpretation of a legal question (not a lawyer) and, when they were called on it, scoffed that they hadn’t used AI. People don’t even know when they’re relying on AI at this point. Which isn’t great given the current state of things.


I am also paying for Crunchyroll and trying to support the creators in various ways.

But still, I often find myself watching anime from fansub groups even though I have a legitimate, official way of watching them. Paying for a streaming service that is objectively, significantly worse than even the shittier pirate offerings does make me feel like a fool.


I have a couple of friends working at Google. They don't care about this stuff at all. They seem to be completely bought into the "every man for himself" neoliberal worldview. My sample size is obviously small, but judging by the actions of the company, my friends seem not to be the exception.


While your arguments are correct, I don't think they are relevant.

Like, for example, the government could give me a few billion bucks and everything you said would still be correct. I would also spend it, etc. etc.


  > a few billion bucks  ... I would also spend it,
You as a person? No. You might spend some of it, but good luck spending even a billion. If you put that in some investment account and just buy index funds, you'll be making money faster than you can spend it, even if you don't do the typical billionaire shenanigans like getting loans that mature upon death.
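
Rough arithmetic, as a sketch; the $2B principal, 5% return, and $100k/day spending below are all assumed numbers for illustration, not anyone's actual figures:

  # Back-of-the-envelope: index-fund returns vs. lavish personal spending
  principal = 2_000_000_000   # "a few billion": assume $2B
  annual_return = 0.05        # assumed conservative index-fund return
  daily_spend = 100_000       # assumed absurdly lavish personal spending

  for year in range(1, 11):
      principal = principal * (1 + annual_return) - daily_spend * 365
      print(f"year {year}: ${principal:,.0f}")
  # ~$100M/yr in returns vs. ~$36.5M/yr spent: the balance only grows.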

You as a company? Sure, you can spend a few billion.

What does this have to do with the comment I responded to? Who knows[0]

[0] https://www.youtube.com/watch?v=BknZGQoCFt4


While it is completely absurd, I don't see why it would be non-enforceable. They can very well enforce it.

Here is one way to do it: they could take a page out of Google's Web Environment Integrity proposal and make it illegal to serve any page within Germany unless the integrity is proven. Done. VPNs are problematic? Ban them. Seems very enforceable to me.

Why do you think it is unenforceable?


Web Environment Integrity was so heavily criticized at the time that it made Google itself backtrack. The same Google that forged ahead with Manifest V3. There is no realistic way the German government could get websites to implement an even worse version of that.

The whole Web would simply become incompatible with Germany. So this would be trivial to bypass on a technical level, and unacceptable on a social level. Completely unenforceable indeed.


> Web Environment Integrity was so heavily criticized at the time that it made Google itself backtrack. The same Google that forged ahead with Manifest V3. There is no realistic way the German government could get websites to implement an even worse version of that.

I don't think this is a good comparison, though. Google cannot force people to use WEI, *yet*. The government can.

> The whole Web would simply become incompatible with Germany.

I think the ad-supported web would just LOVE this idea and would become compatible with Germany ASAP.

> So this would be trivial to bypass on a technical level

I don't think so. Don't get me wrong, there will always be a way for the tech-savvy. But all the trivial ways can very well be blocked.

> unacceptable on a social level

In Germany, you cannot install security cameras in a building unless all the owners agree, on grounds of privacy. But the ISPs keep all of your traffic logs, law firms get these logs, and mass-send cease-and-desist letters using automated systems. This is also not particularly acceptable, but it happens every day and looks like it is very enforceable.

Let's not be naive and think this is unenforceable on the grounds of being "socially unacceptable".


> But the ISPs keep all of your traffic logs,

Only for the short time allowed by the law.

> law firms get these logs,

Not if you keep them for law enforcement; then it's illegal to give them to someone else.


Yes, but the same logic doesn't fly for basically any other topic, since the German public is very sensitive about their privacy. We need to protect the right to privacy at all costs. Unless it's for copyright enforcement. If it is for copyright enforcement purposes, timeouts and pinky promises about not sharing my ID-associated private data with anyone other than for law enforcement purposes are all we need...


I think this situation is described best as being "above" the law.


Yes, and my understanding is that this is precisely the point.

