> Those questions aren’t monetizable. ... There’s an inability to monetize the simple voice commands most consumers actually want to make.
Therein lies the problem. Worse, someone may solve it in the wrong way:
I'll turn on the light in a minute, but first, a word from our sponsor...
Technically, this will eventually be solved by some hierarchical system. The main problem is developing systems with enough "I don't know" capability to decide when to pass a question to a bigger system. LLMs still aren't good at that, and the ones that are require substantial resources.
What the world needs is a good $5 LLM that knows when to ask for help.
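A minimal sketch of the hierarchical setup being described, assuming a cheap local model that handles the easy commands and escalates to a larger model only when it isn't confident. The names and the 0.8 threshold here are hypothetical, and the hard part, as noted above, is making the small model's confidence signal trustworthy in the first place:

    import dataclasses

    @dataclasses.dataclass
    class Answer:
        text: str
        confidence: float  # the model's own estimate that it is right

    def small_model(query: str) -> Answer:
        # Hypothetical cheap on-device model: fine for "turn on the light"
        # style commands, but its self-assessment is only as good as its
        # calibration, which is exactly the unsolved part.
        if "light" in query.lower():
            return Answer("Turning on the light.", confidence=0.95)
        return Answer("I think the answer is 42.", confidence=0.30)

    def large_model(query: str) -> Answer:
        # Hypothetical expensive hosted model, called only on escalation.
        return Answer(f"(big model's careful answer to: {query})", confidence=0.90)

    def respond(query: str, threshold: float = 0.8) -> str:
        first = small_model(query)
        if first.confidence >= threshold:
            return first.text
        # The "I don't know" path: pass the question up the hierarchy.
        return large_model(query).text

    print(respond("turn on the kitchen light"))  # handled locally
    print(respond("is cheese safe for dogs?"))   # escalated to the big model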
This type of response has been given by Alexa on an Echo device in my house. I asked, “play x on y”, and the response was something like “ok, but first check out this new…”. I immediately unplugged that device and every other Alexa-enabled device in the house. We have not used them since.
This is the monetization wall they have to figure out how to break through. For me, the first inkling of advertising means immediate turn-off and destroy.
Even worse than ads, mine keeps trying to jam "News" down my throat. I keep disabling the news feeds on all my devices and they keep re-enabling themselves against my wishes. Every now and then I'll say something to Alexa and she'll just start informing me about how awful everything is, or the Echo Show in the kitchen will stop displaying the weather in favor of some horrific news story.
Me: "Alexa, is cheese safe for dogs?"
Alexa: "Today, prominent politician Nosferatu was accused by the opposition of baby-cannibal sex trafficking. Nosferatu says that these charges are baseless as global warming will certainly kill everyone in painful ways by next Tuesday at exactly 3pm. In further news, Amazon has added more advertisements to this device for only a small additional charge..."
If I wanted to feel like crap every time I go to the kitchen I'd put a scale in there. /s
I find this a really interesting observation. I feel like 3-4 trivial ways of doing it come to mind, which is sort of my signal that I’m way out of my depth (and that anything I’ve thought of is dumb or wrong for various reasons). Is there anything you’d recommend reading to better understand why this is true?
You are asking why someone wouldn't want to ship a tool that obviously doesn't work? Surely it's always better/more profitable to ship a tool that at least seems to work.
GP means they aren't good at knowing when they are wrong and should spend more compute on the problem.
I would say the current generation of LLMs that "think harder" when you tell them their first response is wrong is a training ground for knowing when to think harder without being told, but I don't know the obstacles.
Are you suggesting that when you tell it "think harder" it does something like "pass a question to a bigger system"? I have doubts... If so, it would be gated behind a more expensive plan.
In part because model performance is benchmarked using tests that favor giving partly correct answers as opposed to refusing to answer. If you make a model that doesn't go for part marks, your model will do poorly on all the benchmarks and no one will be interested in it.
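A rough illustration of that incentive, with made-up numbers: under a benchmark that scores 1 for a correct answer and 0 for both wrong answers and refusals, guessing always has a higher expected score than abstaining, even when the model suspects it is wrong.

    # Toy expected-score calculation; the 0.3 is illustrative, not from any real eval.
    p_correct = 0.3  # the model thinks it is right only 30% of the time

    score_if_guess = p_correct * 1 + (1 - p_correct) * 0   # 0.30
    score_if_abstain = 0.0                                  # refusals earn nothing

    print(score_if_guess, score_if_abstain)  # guessing wins, so models are tuned to guess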
Because people make them, and people make them for profit. Incentives make the product what it is.
An LLM just needs to confidently return something that is good enough for the average person in order to make money. If an LLM said "I don't know" more often, it would make less money, because for the user that means the thing they pay for failed at its job.
Useful Douglas Adams reference: [1]
[1] http://technovelgy.com/ct/content.asp?Bnum=135