
From the amount of data each successive generation used (which grew by orders of magnitude each time) to the diminishing, roughly logarithmic gains in performance, it's quite clear the steam is running out on shoving more data into it. If one plots data against performance, the curve is horribly logarithmic. Seen another way, the ability of LLMs to transfer learning actually decreases exponentially as they and their data sets get larger. This fits with how humans have to specialise in topics, because the mental models of one field are very difficult to transfer to another.
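To make "horribly logarithmic" concrete, here is a minimal sketch assuming a Chinchilla-style power-law relation between training tokens and loss, loss = E + A / D**alpha. The constants are made up for illustration, not fitted to any real model:

    # Hypothetical power-law scaling: loss = E + A / D**alpha
    # (constants are illustrative, not fitted to any published model)
    E, A, alpha = 1.7, 400.0, 0.34

    def loss(tokens: float) -> float:
        return E + A / tokens**alpha

    for tokens in [1e9, 1e10, 1e11, 1e12, 1e13]:
        print(f"{tokens:.0e} tokens -> loss {loss(tokens):.3f}")

    # Each 10x increase in data buys a smaller absolute improvement,
    # and the curve flattens toward the irreducible term E.

Under these (assumed) constants, going from 1e9 to 1e10 tokens improves loss by about 0.19, while going from 1e12 to 1e13 improves it by less than 0.02, which is the diminishing-returns shape the comment is describing.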


If that's the case, why aren't we yet seeing specialized LLMs for, say, only JavaScript, or for translating from English to Portuguese, etc.?


We are likely going to get there. As with steam and combustion engines (and other core technologies like computers, wireless transmission, etc.), there's first a massive rush to increase raw power, at the cost of efficiency and effectiveness for more niche use cases. Then the technology is specialised to various use cases, with large improvements in efficiency and effectiveness. My own prediction for where most gains will now come from is:

1) Creating new "harnesses" for models that connect them to various systems, APIs, frameworks, etc. While this sounds "trivial", a lot of gains can come from it. The voice version of ChatGPT was (apparently) amazing, yet all you really had to do was add a speech-to-text layer in front of the model and a text-to-speech layer behind it (see the sketch after this list).

2) Increasing specialisation of models. I predict that, over time, end-user AI companies (i.e. those that just use models rather than develop them) will use more and more specialised models. The current, almost monolithic, setup where every service from text summarisation to homework help is plugged into the same model will slowly change.
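For point 1, a minimal sketch of such a voice "harness" around an existing text model. The functions transcribe, complete and synthesize are placeholders for whatever speech-to-text, LLM and text-to-speech services you actually use; the point is that the model itself is untouched and the value comes from the wrapper:

    # Hypothetical voice harness: the LLM is unchanged, the gains
    # come from the layers bolted on around it.

    def transcribe(audio: bytes) -> str:
        raise NotImplementedError  # call your speech-to-text service here

    def complete(prompt: str) -> str:
        raise NotImplementedError  # call your LLM here

    def synthesize(text: str) -> bytes:
        raise NotImplementedError  # call your text-to-speech service here

    def voice_turn(audio_in: bytes) -> bytes:
        text_in = transcribe(audio_in)   # speech -> text
        text_out = complete(text_in)     # text -> text (the existing model)
        return synthesize(text_out)      # text -> speech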


We kind of have; that's what fine-tuning is trying to achieve.

We haven't seen wholesale specialised models yet because creating foundation models is expensive and difficult, and the current highest ROI is in making a general model.
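For reference, a minimal sketch of that kind of specialisation via fine-tuning with the Hugging Face transformers Trainer. The base model, data file and hyperparameters are placeholders (a real run would add evaluation, checkpointing, and probably parameter-efficient methods like LoRA):

    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    base = "gpt2"  # placeholder base model
    tok = AutoTokenizer.from_pretrained(base)
    tok.pad_token = tok.eos_token
    model = AutoModelForCausalLM.from_pretrained(base)

    # Placeholder domain corpus, e.g. a JavaScript-only text file.
    data = load_dataset("text", data_files={"train": "js_corpus.txt"})
    data = data.map(lambda ex: tok(ex["text"], truncation=True, max_length=512),
                    batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="js-specialist",
                               num_train_epochs=1,
                               per_device_train_batch_size=4),
        train_dataset=data["train"],
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    )
    trainer.train()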


> to the decreasing, logarithmic performance

In what measure, loss? Loss can't go below the inherent entropy of the text: with overfitting it could get closer to 0, but never all the way, since with next-token prediction the same prefix can be followed by multiple different tokens.
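A toy illustration of that floor: if the same prefix is followed by different tokens in the data, even a model that exactly matches the true conditional distribution still pays the conditional entropy, so loss cannot reach 0. The counts below are made up:

    import math
    from collections import Counter

    # Toy corpus: a single prefix followed by different next tokens.
    continuations = ["sat", "sat", "sat", "ran", "ran", "slept", "purred"]
    counts = Counter(continuations)
    total = sum(counts.values())

    # The best possible model predicts the true conditional distribution,
    # and its expected next-token loss equals the conditional entropy.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    print(f"irreducible next-token loss for this prefix: {entropy:.3f} nats")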

With respect to hallucinations, 4 got incredibly better compared to 3.


In intelligence/performance. It's admittedly a fuzzy notion; most benchmarks will probably show decreasing gains between generations. As with time/space complexity, trying to debate what performance/intelligence actually is gets into a million definitions, caveats and technicalities. But a relative comparison between inputs and outputs still gives us useful information.

The inputs going into training these models - data, compute and parameters - have grown by many orders of magnitude with each generation. There's a lot of fuzziness about how much better each generation has become, but clearly 4 is not many orders of magnitude better than 3 by any reasonable definition. This mental model isn't useful for saying how good each generation is, but it is quite useful for seeing the trend and making long-term predictions.



