My question still stands. How could anyone besides OpenAI be confident that there is a limit when no one else has managed to build a model as strong as OpenAI's? Only Claude Opus seems close, and it's still weaker at reasoning than GPT-4o, though better at creative writing.
And after only 1.5 years? Especially if we just had a happy surprise like you mentioned. How does it make sense to already start claiming that we have hit the limits? How do we know there are no more scaling gains, optimisations and happy surprises ahead?
> That happy discovery was never really a linear improvement path, though. We had an explosion of capability, but all along there have been active questions about how far the improvements would go with the current approach.
> I think the point that a lot of researchers are making is that we're starting to see those limits (with LLMs, at least).
The kinds of limitations we're "starting to see" are largely the same as they were a year ago. People were talking about it on here back then, but now it's becoming more apparent to more people as they get used to LLMs.
For those who saw it back then, this does look like we're hitting a limit. For others, not so much.
How do active questions about a technology imply we are approaching a brick wall?
How could researchers without access to the latest state of the art, whether from OpenAI or from companies we haven't heard of, even test whether we're approaching a brick wall? It seems to me that it would take trillions of dollars to find out where the exact limit is.
It's possible that we will get diminishing returns, but I don't see how anyone can confidently claim to know that.
> The kinds of limitations we're "starting to see" are largely the same as they were a year ago. People were talking about it on here back then, but now it's becoming more apparent to more people as they get used to LLMs.
I don't follow. GPT-3.5 was borderline useless at reasoning.
But it still seemed amazing, something I wouldn't have thought possible in the near future.
And then GPT-4 was a crazy advancement over that to me. I've been using it daily since it became available, for various use cases. Are you saying we are seeing the limitations of GPT-4 specifically? Because, sure, GPT-4 is far from AGI, but I don't see how that implies that further scaling, optimisation, better training data, techniques like multimodality, and other strategies I might not be aware of couldn't bring another explosive step.
Also, the fact that no one else has reproduced GPT-4's reasoning skill so far leaves me thinking that everyone except OpenAI is clueless. Claude Opus is close, as I've said, but not quite at GPT-4's level in the specific reasoning tasks I'm using the API for.
If no one can reproduce GPT-4, how can we trust the assessment that we've hit a limit?