drmindle12358's comments | Hacker News

Curious to hear my fellow hackers' feedback on my New Year's Day musing.


Interesting. I also wonder if this idea can be generalized to larger social groups, which collectively can enforce divergence from reality.


As an AI/ML practitioner in the Third World, I'm curious to hear your thoughts on the proposed three-world division, and why nobody seems interested in the huge market in lifting the third world of AI/ML.


Did you mean to ask "Has R1 made SFT obsolete?"


The authors are from Northeastern University in Shenyang, China, not the Northeastern University in Boston. I don't understand why two Chinese professors would write an LLM book in English; it's definitely not from experience, probably under pressure to publish.


prob. not profs; just PhD students who need pubs to graduate


Do you have an appetite for a poem? @sama made it onto my list of Silicon Valley villains [1] a long time ago:

"Villain staging the show / open, close / you can count on the con man to wow you / even though, the only trick he knows / is the “law” of scale / but let's just hope / The con man doesn't turn into evil / when the thing he has is real and powerful"

[1] https://www.drmindle.com/ai-is-not-dangerous/#villains-in-th...


Dude, it's not the LLM that does the reasoning. Rather, it's the layers and layers of scaffolding around the LLM that simulate reasoning.

The moment 'tooling' became a thing for LLMs, it reminded me of the 'rules' for expert systems, which caused one of the AI winters. The number of 'tools' you need to solve real use cases will become untenable soon enough.


Well, I agree that the part that does the reasoning isn't an LLM in the naive form.

But that "scaffolding" seems to be an integral part of the neural net that has been built. It's not some Python for-loop that has been built on top of the neural network to brute force the search pattern.

If that part isn't part of the LLM, then o1 isn't really an LLM anymore, but a new kind of model. One that can do reasoning.

And if we choose to call it an LLM, well, then LLMs can now also do reasoning intrinsically.


Reasoning, just like intelligence (of which it is a part), isn't an all-or-nothing capability. o1 can now reason better than before (in a way that is more useful in some contexts than others), but it's not as if a more basic LLM can't reason at all (i.e. generate output that looks like reasoning by copying reasoning present in the training set), or that o1's reasoning is human-level.

From the benchmarks, it seems o1-style reasoning enhancement works best in mathematical or scientific domains: self-consistent, axiom-driven domains where combining different sources for each step works. It might also be expected to help in strict rule-based logical domains such as puzzles and games (it wouldn't be surprising to see it do well as a component of a Chollet ARC Prize submission).


o1 has moved "reasoning" from something that happens purely at training time to something that partly happens at inference time.

I think of this difference as analogous to the difference between my first intuition (or memory) about a problem, as a human, and what I can achieve by carefully thinking about it for a while, gradually building much more powerful arguments, verifying whether they work, and rejecting the parts that don't.

If you're familiar with chess terminology, it's moving from a model that can just "know" what the best move is to one that combines that knowledge with the ability to "calculate" future moves for the most promising candidates, several moves deep.

Consider Magnus Carlsen. If all he did was play the first move that came to mind, he could still beat 99% of humanity at chess. But to beat 2700+ rated GMs, he needs to combine that with "calculation".

Not only that, but the skill of doing such calculations must itself be trained: not only calculating with speed and accuracy, but also knowing which parts of the search tree are worth analyzing.
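To make the "know" vs. "calculate" distinction concrete, here's a toy sketch (hypothetical function names, not a real engine and not anything o1-specific): a policy ranks candidate moves by intuition, and a shallow minimax search verifies only the top few, a couple of plies deep.

    # Toy "intuition + calculation" sketch. policy, legal_moves, apply_move
    # and evaluate are hypothetical stand-ins; evaluate scores a position
    # from our side's point of view.
    def choose_move(state, policy, legal_moves, apply_move, evaluate,
                    top_k=3, depth=2):
        # "Intuition": keep only the few moves the policy likes best.
        candidates = sorted(legal_moves(state),
                            key=lambda m: policy(state, m), reverse=True)[:top_k]

        # "Calculation": plain depth-limited minimax below each candidate.
        def search(s, d, maximizing):
            moves = legal_moves(s)
            if d == 0 or not moves:
                return evaluate(s)
            values = [search(apply_move(s, m), d - 1, not maximizing) for m in moves]
            return max(values) if maximizing else min(values)

        # After our candidate move it's the opponent's turn (minimizing).
        return max(candidates,
                   key=lambda m: search(apply_move(state, m), depth, False))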

o1 is certainly optimized for STEM problems, but not necessarily only for applying strict rule-based logic. In fact, most hard STEM problems require more than the ability to perform deductive logic, just as chess does. They require strategic thinking and intuition about which solution paths are likely to be fruitful (especially once you go beyond problems that can be solved by software such as WolframAlpha).

I think the main reason STEM problems were used for training is not so much that they're solved with strict rule-based strategies, but rather that a large number of such problems exist that have a single correct answer.
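One concrete way to see why "a single correct answer" matters: grading can be fully automatic, so reinforcement-style training can run at scale without human judges. A minimal sketch, assuming a hypothetical convention where the model ends its output with "Answer: <value>":

    def extract_final_answer(text):
        # Hypothetical convention: the model's output ends with "Answer: <value>".
        return text.rsplit("Answer:", 1)[-1].strip()

    def verifiable_reward(model_output, reference_answer):
        # Automatic grading: reward 1.0 iff the final answer matches exactly.
        # No human judgment needed, unlike grading an essay or a proof sketch.
        return 1.0 if extract_final_answer(model_output) == reference_answer else 0.0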


Tech celebrities need to be held accountable by us, the people. Curious to hear HN's opinions on the rating system I proposed.


Bill Gates cheated on his wife like Bezos did.


My brother-in-law is the VP of engineering at a competitor to LinkSpace. No, they didn't steal it; they don't even have access to Google!

Private aerospace startups face huge pushback from the state-owned agencies, for two apparent reasons: (1) most of the startups' engineers left their jobs at those agencies to join the startups; (2) the progress these startups have made exposes how slow, bloated, and inefficient the government-run agencies are. As a result of this hostility, the startups have little access to suppliers, testing facilities, etc., not to mention the technological know-how.


>> No, they didn't steal it; they don't even have access to Google!

I didn't say they searched for it. Stealing isn't a Google search.

