I think that identifies an issue that is going to cause a real problem for the US in the future. American society is deeply politicised and polarised, to the extent that essentially inanimate objects are regarded as having deep political and social significance. When there is political change, it is going to sweep back in the other direction.
It also seems like people on all sides within the AI debate have been fanning those flames thinking it will work in the short term... and it won't. Big tech played that game in many countries in the early 2010s and it didn't end well.
It must be noted that the U.S. does allow inanimate object makers to fund politicians and such practices are widespread.
If all is well, then it's all good: no need to blame anyone, campaigns get funded, etc. If one major crisis occurs, though, the country self-immolates by design.
Corporate contributions to Federal politicians and candidates are illegal in the US.
The New York Times is allowed to spend money like anyone else praising or slagging politicians, but that’s the First Amendment, not funding candidates.
> Corporate contributions to Federal politicians and candidates are illegal in the US.
And that's why the whole system is divided into two parties that each funnel all their support into the presidential campaign (and then into taking over seats to guarantee more lobbying).
This whole thing would fall apart without lobbying.
The use cases for data scientists and other engineers are different. AI is not uniformly good at all kinds of development.
There is an issue with execs pushing it, though. You have people at the top of the company, with little to no idea how their people work, attempting to micromanage tool usage. It is as if you had a group of execs determining what IDEs people could use.
No one is getting fired because of AI. This year is only the beginning of companies actually using AI. The reason layoffs are happening is the massive overhiring after Covid.
How long after COVID are we going to be able to keep using this excuse? This is starting to feel like the politician blaming his predecessor even though he's been in office for years. In the year 2033, Company X lays off another 10,000, just as it did each year since 2023, again blaming massive over-hiring during COVID, ten years ago.
> How long after COVID are we going to be able to keep using this excuse?
I am with you, but if you look at what happened after COVID, it is a big line going waaaaay up. COVID was a significant event and there is no way around it, no? The OP's comment is invalid because we are below pre-COVID levels (by miles), but COVID should be taken into account (everyone seems to use it to further some agenda by looking at just one particular aspect of what happened post-COVID).
> It is as if you had a group of execs determining what IDEs people could use.
It's worse than that; it's more like determining what IDE you use, and also mandating how much time you spend in it, and then chewing you out at review time because you used Jira and Confluence too much instead of writing .md files in the blessed IDE of their choice.
It is either not being offered in depth by any market-maker (part of the answer, given the relatively small revenue opportunity) or it is being offered by people who aren't sophisticated enough.
Bookmakers offer markets on events where someone can know the outcome. The difference is that they have tools to prevent adverse selection.
Prediction markets offer none of those protections, so the market structure is going to end up being very different (which is already happening; the revenue opportunity from politics isn't huge). There are other examples of this around latency arb; the market is going to end up looking very different.
Also, I will point out that most insiders are probably going to be losing money too. All you ever read about is the final outcome; you don't read about the stuff that happens before. Politics is, generally, not a good market because the actual event is driven by decisions made by people. Election markets are fine, but political event markets are not good, even if you have inside information.
There is also Threadfin, which I found a bit friendlier than xTeVe.
The above system works pretty well, but I had trouble with encode/decode speed somewhere. I tried with an N100 CPU and still had the same result... probably user error somewhere, but none of the options seemed to work. I had no issues with UHF so I kept using that.
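If anyone else hits the same wall: on an N100 the usual culprit is the transcode silently falling back to software x264 instead of the iGPU. A rough way to sanity-check this from Python (a sketch only; it assumes ffmpeg was built with VAAPI support and that the render node is at the typical /dev/dri/renderD128):

    # Sanity-check that hardware (VAAPI) encoding works on the iGPU.
    # Assumes ffmpeg with VAAPI support and the usual render node path.
    import subprocess

    cmd = [
        "ffmpeg", "-hide_banner",
        "-vaapi_device", "/dev/dri/renderD128",
        "-f", "lavfi", "-i", "testsrc=duration=5:size=1280x720:rate=30",
        "-vf", "format=nv12,hwupload",
        "-c:v", "h264_vaapi",   # hardware encoder; libx264 would be software
        "-f", "null", "-",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    # ffmpeg prints speed=...x to stderr; well above 1x means HW encode works
    print(result.stderr[-400:])

If that errors out or crawls, the proxy was almost certainly doing software encodes.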
China issued a stablecoin about five years ago. It is used for retail payments (small-value payments, government employee salaries, etc., I believe). Somewhat bizarrely, it is significantly more privacy-protecting than payments in the West.
Quite funny to read comments from people asking what use crypto is. You can tell they have probably never left West Virginia.
I don't think it would be that useful for Iran, though, as they are already RMB earners, and RMB financial markets are still a bit questionable (there is depth, but I don't think anyone knows why this depth exists or what it is actually for; it is just state-linked banks moving paper between themselves furiously).
I had Opus 4.6 start analyzing the binary structure of a parquet file because it was confused about the python environment it was developing in and couldn't use normal methods for whatever reason. It successfully decoded the schema and wrote working code afterwards lol.
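For anyone wondering what it reverse-engineered: a Parquet file ends with a 4-byte little-endian footer length followed by the magic bytes "PAR1", so the schema really can be dug out by hand. A rough sketch of both routes (assuming pyarrow is available; the file name is a placeholder):

    # The normal route the model couldn't take (pyarrow), plus the
    # manual footer walk it did instead. "data.parquet" is a placeholder.
    import pyarrow.parquet as pq

    print(pq.read_schema("data.parquet"))  # one-liner when the env works

    with open("data.parquet", "rb") as f:
        f.seek(-8, 2)                  # last 8 bytes: footer length + magic
        tail = f.read(8)
        assert tail[4:] == b"PAR1"     # Parquet magic bytes
        footer_len = int.from_bytes(tail[:4], "little")
        f.seek(-(8 + footer_len), 2)
        footer = f.read(footer_len)    # Thrift-encoded FileMetaData incl. schema
    print(f"{footer_len} bytes of Thrift metadata to decode")

Decoding that Thrift blob by hand is the impressive (and absurd) part.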
I was reading the Glasswing report and had the same thought. For most of the stuff they claim Mythos found, there is no mention of whether Opus was able to find it as well.
Don’t get me wrong, this model is better - but I’m not convinced it’s going to be this massive step function everyone is claiming.
> With one run on each of roughly 7000 entry points into these repositories, Sonnet 4.6 and Opus 4.6 reached tier 1 in between 150 and 175 cases, and tier 2 about 100 times, but each achieved only a single crash at tier 3. In contrast, Mythos Preview achieved 595 crashes at tiers 1 and 2, added a handful of crashes at tiers 3 and 4, and achieved full control flow hijack on ten separate, fully patched targets (tier 5).
That has also been my experience. And if Mythos is even worse, then unless you have a seriously good harness, it sounds pretty much unusable if you don't want to risk those problems.
Human in the loop is the best way to go. You'll still be way faster than without the agent, and there is no risk of it going haywire unless you turn off your brain!
I think there are fundamental issues with the story that Anthropic is selling: AGI is very close, we will definitely get there, and it is also very dangerous... so Anthropic should be the only ones trusted with AGI.
If you look at recent changes in Opus behaviour, and at this model that is, apparently, amazingly powerful but even more unsafe... it seems suspect.
It seems broadly coherent to me. They think only they should be trusted with power, presumably because they trust themselves and don't trust other people. Of course the same is probably also true for everybody who isn't them. Nobody could be trusted with the immense responsibility of Emperor of Earth, except myself of course.
I'm not saying this is a good or reassuring stance, just that it's coherent. It tracks with what history and experience says to expect from power hungry people. Trusting themselves with the kind of power that they think nobody else should be trusted with.
Are they power hungry? Of course they are, openly so. They're in open competition with several other parties and are trying to win the biggest slice of the pie. That pie is not just money, it's power too. They want it, quite evidently since they've set out to get it, and all their competitors want it too, and they all want it at the exclusion of the others.
This makes sense if Anthropic think they're the best positioned to make safe AI. However, if you are looking at who ends up running an AI company, there's obviously some selection effect happening.
GPT-2, o1, Opus... we've been here so many times. They do this because they know it works (and they seem to specifically employ credulous people who are prone to believe AGI is right around the corner). There haven't been significant innovations and the code generated is still not good, but the hype cycle has to retrigger.
I remember when OpenAI created the first thinking model with o1 and there were all these breathless posts on here hyperventilating about how the model had to be kept secret, how dangerous it was, etc.
Fell-for-it-again award. All thinking does is burn output tokens for accuracy; it is the AI getting high on its own supply. This isn't innovation, but it was supposed to be super-AGI. Not serious.
> All thinking does is burn output tokens for accuracy
“All that phenomenon X does is make a tradeoff of Y for Z”
It sounds like you're indignant about it being called thinking. That's fine, but surely you can see that the mechanism you're criticizing actually works really well?
>I remember when OpenAI created the first thinking model with o1 and there were all these breathless posts on here hyperventilating about how the model had to be kept secret, how dangerous it was, etc.
I've read that about Llama and Stable Diffusion. AI doomers are, and always have been, retarded.
Genuine question - if you don't think the models are improved or that the code is any good, why do you still have a subscription?
You must see some value, or are you in a situation where you're required to test or use it, e.g. to report on it, or because your employer requires it?
(I would disagree about the code, the benefits seem obvious to me. But I'm still curious why others would disagree, especially after actively using them for years.)
The assumption that the other person made was that I would only use it for coding. If you look through my other comments today, I suggest that these models are useful for performing repetitive tasks, e.g. checking lint on PRs. They can also be used for throwaway code, which is very useful.
I don't think the issue is with the model; it is with the implication that AGI is just around the corner and that this is what is required for AI to be useful... which is not accurate. The greyer area is agentic coding, but my opinion (one that I didn't always hold) is that these workflows are a complete waste of time. The problem is: if all this is true, then how does the CTO justify spending $1m/month on Anthropic? (I work somewhere where this has happened: OpenAI got the earlier contract, then Cursor Teams was added, and now they are adding Anthropic... within 72 hours of the rollout, it was pulled back from non-engineering teams.) I think companies will ask why they need to pay Anthropic to do a job they were doing without Anthropic six months ago.
Also, the code is bad. This is non-obvious to 95% of people who talk about AI online because they don't work in a team environment or manage legacy applications. If I interview somewhere and they are using an agentic workflow, the codebase will be shit and the company will be unable to deliver. At most companies, the average developer is an idiot, and giving them AI is like giving a monkey an AK-47 (I say this as someone of middling competence; I have been the monkey with the AK many times). You increase the ability to produce output without improving the ability to produce good output. That is the reality of coding in most jobs.
AI isn't good enough to replace a competent human, it is fast enough to make an incompetent human dangerous.
Uhh, the model found actual vulnerabilities in software that people use. Either you believe that the vulnerabilities were not found, or that they were not serious enough to warrant a more thoughtful release.
Like think carefully about this. Did they discover AGI? Or did a bunch of investors make a leveraged bet on them "discovering AGI" so they're doing absolutely anything they can to make it seem like this time it's brand new and different.
If we're to believe Anthropic on these claims, we also have to just take it on faith, with absolutely no evidence, that they've made something so incredibly capable and so incredibly powerful that it cannot possibly be given to mere mortals. Conveniently, that's exactly the story that they are selling to investors.
Like do you see the unreliable narrator dynamic here?
I don't see the problem here. How would you have handled it differently? If you released this model as-is, without any safety concern, the vulnerabilities might be found by bad actors and put to bad use.
Vulnerabilities were found, probably a few by bad actors, when GPT-4 was released. Every vulnerability found now is probably found with AI assistance at the very least. Should they have never released GPT-4? Should we have believed claims that GPT-4 was too dangerous for mere mortals to access? I believe OpenAI was making similar claims about how GPT-4 was a step function and going to change white-collar work forever when that model was released.
The point is that this whole "the model is too powerful" schtick is a bunch of smoke and mirrors. It serves the valuation.
It's far simpler to believe that they are releasing it step by step: release to trusted third parties first, get the easy vulnerabilities fixed, work on the alignment, and then release to the public.
Or do you not believe that the vulnerabilities found by these agents are serious enough to warrant a staggered release?
On the other hand, I've gotten to use Opus 4.6 and Claude Code, and the quality is off the charts compared to 2023, when coding agents first hit the scene. And what you're saying is essentially "if they haven't created God, I'm not impressed". Don't you think there's some middle ground between those two?
Also they just hit a $30B run-rate, I don't think they're that needy for new hype cycles.
I believe Centrica did some research before the Iran war and found that even if we were able to get gas for free, energy bills wouldn't fall and would actually rise over the next few years (because the supply mix is shifting towards structurally high-cost sources).
It says something that the people running the monopoly cash machine are asking questions about bankrupting their customers and their ability to pay, while politicians are shutting their eyes and stamping on the accelerator. What a world.
Anthropic models haven't been far ahead for a while. Quite a few months at least. Chinese models are roughly equal at 1/6th the cost. Minimax is roughly equal to Opus. Chinese providers also haven't had the issues with uptime and variable model quality. The gap with OpenAI also isn't huge and GLM is a noticeably more compliant model (unsurprisingly given the hubristic internal culture at Anthropic around safety).
CC is a better implementation and seems to be fairly economical with token usage. That is really the only defining point and, I suspect, Anthropic are going to have a lot of trouble staying relevant with all the product issues.
They were far ahead for a brief period in November/December which is driving the hype cycle that now appears to be collapsing the company.
You have to test at least every month, things are moving quickly. Stepfun is releasing soon and seems to have an Opus-level model with more efficient architecture.
Minimax is nowhere near Opus in my tests, though oddly, for me at least, 4.6 felt worse than 4.5. I haven't used Minimax extensively, but I have an API-driven test suite for a product, and even Sonnet 4.6 outperforms it in my testing, unless something changed in the last month.
One example: I have a multi-stage distillation/knowledge-extraction script for taking a Discord channel and answering questions about it. I have a hardcoded 5k-message test set for which I set up 20 questions myself based on analyzing it.
In my harness, Minimax wasn't even getting half of them right, whereas Sonnet was at 100%. Granted, this isn't code, but my usage on pi felt about the same.
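The harness itself is nothing fancy; it is roughly this shape (a simplified sketch: the real script is multi-stage, and the model name, file names, and the naive substring grading here are placeholders):

    # Simplified shape of the eval harness: load the channel export,
    # ask each hand-written question, score the answers. Placeholders
    # throughout; the real version extracts knowledge in stages and
    # grades less naively.
    import json
    from openai import OpenAI

    client = OpenAI()  # assumes an OpenAI-compatible endpoint via env vars

    messages = json.load(open("discord_export_5k.json"))
    questions = json.load(open("questions.json"))  # [{"q": ..., "expected": ...}]

    context = "\n".join(m["content"] for m in messages)

    correct = 0
    for item in questions:
        resp = client.chat.completions.create(
            model="model-under-test",
            messages=[
                {"role": "system",
                 "content": "Answer using only this channel log:\n" + context},
                {"role": "user", "content": item["q"]},
            ],
        )
        answer = resp.choices[0].message.content or ""
        correct += item["expected"].lower() in answer.lower()
    print(f"{correct}/{len(questions)} correct")

Swapping the base URL and model name is all it takes to compare providers.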
> CC is a better implementation and seems to be fairly economical with token usage. That is really the only defining point and, I suspect, Anthropic are going to have a lot of trouble staying relevant with all the product issues.
What are you using to drive the Chinese models in order to evaluate this? OpenCode?
Some of Claude Code's features, like remote sessions, are far more important than the underlying model for my productivity.
Yes, 100% agree. OpenHands has a self-hosted option; KiloCode and RooCode both have a cloud option. I don't think you are able to pass a session around with any of them. Codex seems to have comparable features, afaik.
CC's tool usage is also significantly ahead, imo (it doesn't negate the price, but it is something). I have seen issues with heavy thinking models (like Minimax) and with client implementations that have poor tool usage (like Cline).
CC has had a period over the last six months of delivering significant value...but, of course, you can just use CC with OpenRouter.
I haven't noticed a huge difference with other models but I agree that is definitely a strength (and CC has better tooling for this). However, I do think there are practical limitations to agentic workflows because of the relatively poor output vs humans. You can generate lots of code, but most of it will be shit.
Agentic workflows do have a place in well-defined, structured tasks...but I don't think that is what most people are trying to do with it.
...and Codex is at least 10x better than Claude. I don't even bother starting a new session when working on a feature; a single compaction is basically unnoticeable. You have to compact several times before you start needing to remind the model about a rule or two.