
I don’t think any goalposts need to be redecorated. The “inner monologue” isn’t a reliable witness to o3’s internal processing; it’s at best a post-hoc estimation of what a human inner monologue might be in this circumstance. So its “testimony” about what it is doing is unreliable, and therefore it doesn’t move the needle on whether or not this is “real reasoning” for some value of that phrase.

In short, it’s still anthropomorphism and apophenia locked in a feedback loop.



Devil's advocate: as with most LLM issues, this applies to the meatbags that generated the source material as well. A quick example is asking someone to describe their favorite music and why they like it, and noting the probable lack of reasoning on the `this is what I listened to as a teenager` axis.


Something as inherently subjective as personal preference doesn't seem like an ideal example to make that point. How could you expect to objectively evaluate something like "I enjoy songs in a minor scale" or "I hate country"?


The point is to illustrate the disconnect between stated reasoning and proximate cause.

Consider your typical country music enjoyer. Their fondness for the art, as it were, is far more a function of cultural coding during their formative years than a deliberate personal choice to savor the melodic twangs of a corncob banjo. The same goes for people who like classic rock, rap, etc. The people who "hate" country are likewise far more likely to do so out of oppositional cultural contempt, same as people who hate rap or those in the not so distant past who couldn't stand rock & roll.

This of course fails to account for higher-agency individuals who have developed their musical tastes, but that's a relatively small subset of the population at large.


Good point. When we try to explain why we're attracted to something or someone, what we do seems closer to modeling what we like to think about ourselves. At the extreme, we're just telling stories about an estimation we like to think is true.


I largely agree! Humans are notoriously bad at doing what we call reasoning.

I also agree with the cousin comment that (paraphrased) “reasoning is the wrong question, we should be asking about how it adapts to novelty.” But most cybernetic systems meet that bar.


I don't think the inner monologue is evidence of reasoning at all, but doing a task which can only be accomplished by reasoning is.


Geoguessr is not a task that can only be accomplished by reasoning. Famously, it took less than a day of compute time in 2011 to SLAM together a bunch of pictures of Rome (https://grail.cs.washington.edu/rome/).


Such as? Geoguessing certainly isn't that.


> it’s at best a post-hoc estimation of what a human inner monologue might be in this circumstance

Nope. It's not autoregressive training on examples of human inner monologue. It's reinforcement learning on the results of generated chains of thought.


"It's reinforcement learning on the results of generated chains of thoughts."

No, that's not how LLMs work.



Base models are trained using autoregressive learning. "Reasoning models" are base models (maybe with some modifications) that were additionally trained using reinforcement learning.
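To make the distinction concrete, here's a toy sketch of the second stage: reward a sampled "chain of thought" only when it ends in the right answer, and nudge the policy toward chains that do. Everything here is an illustrative assumption (a three-chain tabular policy, REINFORCE-style updates, made-up reward), not how o3 or any real reasoning model is actually trained.

```python
import math
import random

# Toy sketch of RL on generated chains of thought (NOT a real training setup).
# A "model" with logits over three candidate chains for one question;
# chains 0 and 1 happen to reach the correct answer, chain 2 does not.
random.seed(0)
logits = [0.0, 0.0, 0.0]
reaches_correct_answer = [True, True, False]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def sample(probs):
    r = random.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

lr = 0.5
for step in range(200):
    probs = softmax(logits)
    i = sample(probs)  # "generate" a chain of thought
    reward = 1.0 if reaches_correct_answer[i] else 0.0  # score only the outcome
    # REINFORCE update: grad of log pi(i) is one_hot(i) - probs
    for j in range(len(logits)):
        grad = (1.0 if j == i else 0.0) - probs[j]
        logits[j] += lr * reward * grad

final = softmax(logits)
print(final)  # probability mass shifts toward chains that end correctly
```

The point of the toy: nothing in the reward ever checks whether the chain is a faithful trace of a computation, only whether the final answer scored well, which is roughly why the thread above disputes treating the monologue as testimony.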



