
Yeah it’s a real valid business strategy.

It’s also silly.

The Crytek case is similar, but there's a big difference: Crytek was betting that hardware performance would increase. A lot of AI startups are betting that future LLMs will gain new, fairly vague capabilities (AGI, whatever that actually means).



I'd also add that what Crysis did was pretty typical at the time. It was an era when a new computer was a bit dated within a few months and obsolete within a couple of years. Carmack/id Software/Doom was a more typical example of this, as they did it repeatedly and regularly, frequently in collaboration with the hardware companies of the time. But there was near-zero uncertainty: there was a clear path to the goal, down to the exact expected specs.

With LLMs there's not only no clear path to the goal, but there's every reason to think that such a path may not exist. In literally every domain where neural networks have been applied, you eventually hit asymptotic, diminishing returns. Truly full self-driving vehicles are just the latest example. They're just as far away now as they were years ago. If anything they now seem much further away, because years ago many of us expected the exponential progress to continue, meaning that full self-driving was just around the corner. We now have the wisdom to understand that, at a minimum, that's one rather elusive corner.
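To make "asymptotic, diminishing returns" concrete: empirical scaling laws for neural networks are usually fit with a power law plus an irreducible floor, roughly L(C) = a * C^-b + c. A minimal sketch of what that shape implies, with made-up constants (not taken from any real fit):

    # Hypothetical power-law scaling curve: loss vs. training compute.
    # The functional form is the usual empirical fit; a, b, c are invented
    # for illustration only.
    def loss(compute, a=10.0, b=0.1, c=1.5):
        return a * compute ** -b + c

    prev = loss(1e18)
    for exp in range(19, 26):
        cur = loss(10.0 ** exp)
        # Each extra 10x of compute buys a smaller absolute improvement,
        # and the loss can never drop below the floor c.
        print(f"1e{exp} FLOPs: loss={cur:.3f}  gain from last 10x={prev - cur:.3f}")
        prev = cur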


"In literally every domain neural networks have been utilized in you reach asymptotic level diminishing returns."

Is that true, though? I think of "grokking", where long training runs produce huge shifts in generalization, but only after orders of magnitude more training than it took for training error to seemingly bottom out.

This'd suggest both that the asymptotic limit you refer to isn't really there (something very important is happening much later) and that there are potentially some important paths to generalization on smaller amounts of training data that we haven't yet figured out.
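For anyone unfamiliar with the setup: the grokking papers train a small network on a toy task like modular addition with strong weight decay, for far more steps than it takes to fit the training set, and watch validation accuracy jump long after training accuracy saturates. A rough, hypothetical sketch of that kind of experiment (hyperparameters are guesses, and whether/when the jump appears is sensitive to all of them):

    import torch
    import torch.nn as nn

    P = 97                      # modulus for the toy task (a + b) mod P
    torch.manual_seed(0)

    # Build the full (a, b) -> (a + b) % P dataset and split it 50/50.
    pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
    labels = (pairs[:, 0] + pairs[:, 1]) % P
    perm = torch.randperm(len(pairs))
    split = len(pairs) // 2
    train_idx, val_idx = perm[:split], perm[split:]

    # One-hot encode both operands and concatenate them as the input.
    def encode(idx):
        x = torch.cat([nn.functional.one_hot(pairs[idx, 0], P),
                       nn.functional.one_hot(pairs[idx, 1], P)], dim=1).float()
        return x, labels[idx]

    x_train, y_train = encode(train_idx)
    x_val, y_val = encode(val_idx)

    model = nn.Sequential(nn.Linear(2 * P, 256), nn.ReLU(), nn.Linear(256, P))
    # Strong weight decay is one of the knobs reported to matter for grokking.
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
    loss_fn = nn.CrossEntropyLoss()

    def accuracy(x, y):
        with torch.no_grad():
            return (model(x).argmax(dim=1) == y).float().mean().item()

    for step in range(1, 50_001):
        opt.zero_grad()
        loss = loss_fn(model(x_train), y_train)
        loss.backward()
        opt.step()
        if step % 1000 == 0:
            # Typically: train accuracy hits ~1.0 early, while val accuracy
            # sits near chance for a long plateau and then rises sharply
            # much later in training.
            print(step, accuracy(x_train, y_train), accuracy(x_val, y_val))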


I think training error itself has diminishing returns as a metric for LLM usefulness.

A lower error, after a certain point, does not suggest better responses.



