Not that I defend all of Valve's business practices, but last I heard, you're only prevented from selling games more cheaply on other platforms if the thing you're selling is Steam keys.
AFAIK, this restriction doesn't apply if you're distributing via a different marketplace.
If that’s true, that seems fair since you’re relying on Valve’s infrastructure to support the sale.
Personally, I'm not bothered much by LLM confabulation as long as it's the result of missing context. In most practical tasks, we either give the model the context or tell it to find it itself on the internet. What I am concerned with is confabulation that contradicts available in-context information, but that doesn't seem to be what is measured here.
This must be easy to benchmax, because I have never gotten an "I don't know"-style answer from the Western frontier models. All my personal "real world" use cases still end in hallucinations.
The output of any LLM is always 100% hallucination in principle. On top of that, most benchmarks are at best an approximation of LLM quality; your use case decides which one matters. That said, I haven't tested v4 yet, but the old 3.2 is still a decent model. And on the subject of use cases, I've had coding problems that Opus couldn't solve but a local 35B model did.
All the talk about frontier models and SOTA is there to dig deeper and deeper into the pockets of VCs and finally do an IPO.
People in China live under totalitarian rule, that much is true.
But how free is the average North American, where getting sick can bring you and your family financial ruin? Where the "free press" is controlled by corporations that are also the main source of campaign funding for politicians? Where urban spaces are designed to require a car and to produce completely atomized individuals?
All of these things come from the private sector, and you can leave them behind if you like (do younger generations even watch corporate news?).
The real issues are government surveillance and the government increasingly getting involved in my personal matters, but the US is still freer than any other country I could go to. Look at countries in Europe like the UK, which lacks true freedom of the press and arrests people for mean tweets, handing them years in prison.
We can talk about all this stuff on an American forum, but good luck talking about any of China's issues on a Chinese forum. Let's not even get into how China regularly kills Catholic priests and bishops. Anyone who tries to glaze China is a propagandized fool.
...you forgot to mention that any technology in China, foreign or domestic, can and will be used for and to the benefit of the -military- party... But like someone posted: "not perfect" fits the bill.
Check out the Sean Ryan Show with Palmer Luckey on China and military tech.
"With Opus 4.6, extended thinking was a toggle you managed: turn it on for hard stuff, off for quick stuff. If you left it on, every question paid the thinking tax whether it needed to or not. Now, with Opus 4.7, extended thinking becomes adaptive thinking. "
You want extended thinking? There's no toggle anymore: it's now adaptive thinking, and Opus turns it on only if it thinks it needs to. According to user reports it often won't, since thinking tokens are expensive. Except that Opus 4.7 now uses 35% more tokens and produces more thinking output anyway.
I am getting pretty good performance. Even on trivial questions it seems to go through the thinking process to the end. If they are using adaptive thinking, it seems to work much better than before. I'll see how my experience goes with more usage.
Not when you want extended thinking: you select extended thinking and Opus decides whether you actually get it, via adaptive thinking.
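For reference, the explicit toggle still exists at the API level as a thinking budget. Here's a minimal sketch using the Anthropic Python SDK; the 4.7 model ID is a placeholder, and how adaptive thinking interacts with an explicit budget is my reading of the announcement, not documented behavior:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

resp = client.messages.create(
    model="claude-opus-4-7",  # placeholder model ID
    max_tokens=16000,
    # Explicit extended-thinking toggle: budget_tokens caps the thinking,
    # and (as I understand it) adaptive thinking decides how much to use.
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)

# Thinking arrives as separate content blocks you can inspect or ignore.
for block in resp.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking[:200], "...")
    elif block.type == "text":
        print(block.text)
```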
"With Opus 4.6, extended thinking was a toggle you managed: turn it on for hard stuff, off for quick stuff. If you left it on, every question paid the thinking tax whether it needed to or not. Now, with Opus 4.7, extended thinking becomes adaptive thinking. "
I've gotten quite a bit of work done on claude.ai and the mobile app though. It's been good for code review. The GitHub connector is a bit clunky but it works.
"Opus 4.7 thinks more at higher effort levels, particularly on later turns in agentic settings. This improves its reliability on hard problems, but it does mean it produces more output tokens. "
That's a good point. AA's Cost Efficiency section says the opposite: you can hover to see the breakdown between input, reasoning and output tokens.
I'm not sure where that discrepancy comes from (is Anthropic using different benchmarks?).
There are a few different theories, but all we have right now are synthetic benchmarks, anecdotes, and speculation.
(Benchmarks are misleading; I think our best bet now is for individuals to run real-world tests, giving the same task to each model and comparing quality, cost, and time. A rough sketch of what that could look like is below.)
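Something along these lines, using the Anthropic Python SDK; the model IDs and per-million-token prices are placeholders to fill in with current values, and quality still has to be judged per response:

```python
import time
import anthropic

client = anthropic.Anthropic()

# Placeholder model IDs and dollar prices per million tokens; substitute real ones.
MODELS = {
    "claude-opus-4-6": {"in": 15.00, "out": 75.00},
    "claude-opus-4-7": {"in": 15.00, "out": 75.00},
}

TASK = "Refactor this function to remove the nested loops: ..."

for model, price in MODELS.items():
    start = time.monotonic()
    resp = client.messages.create(
        model=model,
        max_tokens=4096,
        messages=[{"role": "user", "content": TASK}],
    )
    elapsed = time.monotonic() - start
    usage = resp.usage
    cost = (usage.input_tokens * price["in"] + usage.output_tokens * price["out"]) / 1e6
    print(f"{model}: {elapsed:.1f}s, ${cost:.4f}, "
          f"{usage.input_tokens} in / {usage.output_tokens} out")
```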
The input cost inflation, however, is real, and dramatic.
I would have expected them to lower input costs proportionally, because otherwise you're getting less intelligence per dollar even with the smarter model. I think that would have been the smartest thing for them to do, at least PR-wise. And maybe a bit of free usage as an apology :)
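To make the "less intelligence per dollar" point concrete, here's a toy calculation; the prices and token counts are made up for illustration, not Anthropic's actual figures:

```python
# Illustrative per-token prices (dollars per million tokens).
PRICE_IN = 15 / 1_000_000
PRICE_OUT = 75 / 1_000_000

task_in = 20_000               # prompt + context tokens for one agentic turn
old_out = 4_000                # output + thinking tokens on the old model
new_out = int(old_out * 1.35)  # the reported ~35% increase

old_cost = task_in * PRICE_IN + old_out * PRICE_OUT
new_cost = task_in * PRICE_IN + new_out * PRICE_OUT
print(f"old: ${old_cost:.3f}  new: ${new_cost:.3f}  "
      f"(+{(new_cost / old_cost - 1) * 100:.0f}% per task)")
# Unless the quality gain outpaces that increase, you get less done per dollar.
```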
https://www.bbc.co.uk/news/articles/cx2g1md0l23o