> Those questions aren’t monetizable. ... There’s an inability to monetize the simple voice commands most consumers actually want to make.
Therein lies the problem. Worse, someone may solve it in the wrong way:
I'll turn on the light in a minute, but first, a word from our sponsor...
Technically, this will eventually be solved by some hierarchical system. The main problem is developing systems with enough "I don't know" capability to decide when to pass a question to a bigger system. LLMs still aren't good at that, and the ones that are require substantial resources.
What the world needs is a good $5 LLM that knows when to ask for help.
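A minimal sketch of the hierarchical setup being described, assuming a cheap local model that handles the easy commands and escalates to a larger model only when it isn't confident. The names and the 0.8 threshold here are hypothetical, and the hard part, as noted above, is making the small model's confidence signal trustworthy in the first place:

    import dataclasses

    @dataclasses.dataclass
    class Answer:
        text: str
        confidence: float  # the model's own estimate that it is right

    def small_model(query: str) -> Answer:
        # Hypothetical cheap on-device model: fine for "turn on the light"
        # style commands, but its self-assessment is only as good as its
        # calibration, which is exactly the unsolved part.
        if "light" in query.lower():
            return Answer("Turning on the light.", confidence=0.95)
        return Answer("I think the answer is 42.", confidence=0.30)

    def large_model(query: str) -> Answer:
        # Hypothetical expensive hosted model, called only on escalation.
        return Answer(f"(big model's careful answer to: {query})", confidence=0.90)

    def respond(query: str, threshold: float = 0.8) -> str:
        first = small_model(query)
        if first.confidence >= threshold:
            return first.text
        # The "I don't know" path: pass the question up the hierarchy.
        return large_model(query).text

    print(respond("turn on the kitchen light"))  # handled locally
    print(respond("is cheese safe for dogs?"))   # escalated to the big model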
This type of response has been given by Alexa on an Echo device in my house. I asked, “play x on y”, and the response was something like “ok, but first check out this new…”. I immediately unplugged that device and every other Alexa-enabled device in the house. We have not used them since.
This is the monetization wall they have to figure out how to break through. For me, the first inkling of advertising means immediate turn-off and destroy.
Even worse than ads, mine keeps trying to jam "News" down my throat. I keep disabling the news feeds on all my devices and they keep re-enabling themselves against my wishes. Every now and then I'll say something to Alexa and she'll just start informing me about how awful everything is, or the Echo Show in the kitchen will stop displaying the weather in favor of some horrific news story.
Me: "Alexa, is cheese safe for dogs?"
Alexa: "Today, prominent politician Nosferatu was accused by the opposition of baby-cannibal sex trafficking. Nosferatu says that these charges are baseless as global warming will certainly kill everyone in painful ways by next Tuesday at exactly 3pm. In further news, Amazon has added more advertisements to this device for only a small additional charge..."
If I wanted to feel like crap every time I go to the kitchen I'd put a scale in there. /s
I find this a really interesting observation. I feel like 3-4 trivial ways of doing it come to mind, which is sort of my signal that I’m way out of my depth (and that anything I’ve thought of is dumb or wrong for various reasons). Is there anything you’d recommend reading to better understand why this is true?
You are asking why someone wouldn't want to ship a tool that obviously doesn't work? Surely it's always better/more profitable to ship a tool that at least seems to work.
GP means they aren't good at knowing when they are wrong and should spend more compute on the problem.
I would say the current generation of LLMs that "think harder" when you tell them their first response is wrong is a training ground for knowing when to think harder without being told, but I don't know the obstacles.
Are you suggesting that when you tell it "think harder" it does something like "pass a question to a bigger system"? I have doubts... If so, it would be gated behind a more expensive plan.
In part because model performance is benchmarked using tests that favor giving partly correct answers as opposed to refusing to answer. If you make a model that doesn't go for part marks, your model will do poorly on all the benchmarks and no one will be interested in it.
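A rough illustration of that incentive, with made-up numbers: under a benchmark that scores 1 for a correct answer and 0 for both wrong answers and refusals, guessing always has a higher expected score than abstaining, even when the model suspects it is wrong.

    # Toy expected-score calculation; the 0.3 is illustrative, not from any real eval.
    p_correct = 0.3  # the model thinks it is right only 30% of the time

    score_if_guess = p_correct * 1 + (1 - p_correct) * 0   # 0.30
    score_if_abstain = 0.0                                  # refusals earn nothing

    print(score_if_guess, score_if_abstain)  # guessing wins, so models are tuned to guess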
Because people make them, and people make them for profit. Incentives make the product what it is.
An LLM just needs to confidently return something that is good enough for the average person in order to make money. If an LLM said "I don't know" more often, it would make less money, because for the user that means the thing they pay for failed at its job.
Useful Douglas Adams reference: [1]
[1] http://technovelgy.com/ct/content.asp?Bnum=135