Voiceless groups do not appear in the training data? How could they? They are voiceless. You think voiceless people are represented in today's training data? They can't be; they are voiceless.
Nothing tragic about using data from a time period.
Common words used in the 1900s are labeled racist now. I doubt anyone was wondering whether they filtered those words against modern safe-word lists.
>The voiceless groups or fringe opinions which we take as normative today do not appear.
Times are different. Anybody with an internet connection can "publish" their thoughts and perspective online. LLMs scrape all of this. Modern datasets like CommonCrawl capture a vastly wider spectrum of humanity than a printing press ever could.
The pre-1930 model acts as a time capsule of "gatekept publishing", but modern LLMs are trained on the democratized web.
>Does this encourage us to write in the present such that we influence the models in perpetuity?
I noticed a bunch of LLM-powered Reddit accounts praising products/services in dead threads. Or one bot posting a setup question, then a few other bots responding with praise or questions about a specific product.
I don't know why they're doing this, but I'm beginning to suspect it's something like that: getting positive sentiment into the datasets for the next generation of LLMs.
I'd be more worried if words from that era were fully aligned with present-day notions of morality. Wouldn't that indicate a certain stagnation & lack of progress?
Let us hope, 100 years from now, there will be people who look back unkindly on us.
> OpenAI has contracted to purchase an incremental $250B of Azure services, and Microsoft will no longer have a right of first refusal to be OpenAI’s compute provider.
Azure is effectively OpenAI's personal compute cluster at this scale.
That article doesn't give a timeframe, but most of these deals use 10 years as a placeholder. Spread evenly, $250B over 10 years is $25B a year, but I'd also imagine they're not required to spend it evenly, so it could be back-loaded.
OpenAI is a large customer, but this is not making Azure their personal cluster.
I wonder how this figure was settled on. Is it based on consumer pricing? Can't Microsoft and OpenAI just make a number up, aside from a minimum to cover operating costs? At what point is the number just a marketing ploy to make it seem huge, important, and inevitable (and too big to fail)?
You're right. We should absolutely only rely on "Ask sales for price" closed-source software from megacorps, which gets worse with every release and gets sunset anyway when the funding runs out.
Of all the things to judge this on, you chose the most ridiculous one. Why shouldn’t a project like this exist just because there are “bigger” alternatives out there?
If you're going to shut this one down, at least do it for the right reasons, such as the fact that this is a web wrapper. Absolutely disgusting: either go native or don't bother shoving your webpage into a browser container and calling it what it is not (an app).
This feels like an unethical release of a model. They've opened a can of worms without investing in defense first.
Anthropic announced their capabilities in advance, did a private release, then put up $100M in credits for Fortune 500 companies and OSS projects to secure themselves.
OpenAI saw that, made a model equally capable of exploiting vulnerabilities, then released it to the public with no equivalent program [1]
1. They changed the default in March from high to medium; however, Claude Code still showed high (took 1 month and 3 days to notice and remediate)
2. Old sessions had their thinking tokens stripped, so resuming the session made Claude stupid (took 15 days to notice and remediate)
3. A system prompt change meant to make Claude less verbose reduced coding quality (4 days, better)
All this to say... the experience of suspecting a model is getting worse while Anthropic publicly gaslights its user base ("we never degrade model performance") is frustrating.
Yes, models are complex and deploying them at scale given their usage uptick is hard. It's clear they are playing with too many independent variables simultaneously.
However, you are obligated to communicate honestly with your users to manage expectations. Am I being A/B tested? When was the last system prompt change? I don't need to know what changed, just that it did, etc.
Doing this proactively would certainly match expectations for a fast-moving product like this.
> 2. Old sessions had their thinking tokens stripped, so resuming the session made Claude stupid (took 15 days to notice and remediate)
This one was egregious: after a one-hour user pause, they apparently cleared the cache and then kept applying the "forgetting" for the rest of the session after the resume!
Seems like a very basic software engineering error that would be caught by normal unit testing.
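For illustration, here's roughly the shape of a regression test that could have caught it. Everything in it is hypothetical (new_session, resume_session, the message schema); it's a sketch of the invariant, not Anthropic's actual internals:

    # Invariant: resuming a session after the cache TTL expires
    # must not silently drop assistant "thinking" blocks.
    def test_resume_preserves_thinking_blocks():
        session = new_session()  # hypothetical test harness
        session.send("Refactor this function to be tail-recursive.")
        before = [block for msg in session.messages
                  for block in msg.content if block.type == "thinking"]
        # Resume just past the assumed 60-minute cache TTL.
        resumed = resume_session(session.id, idle_seconds=3601)
        after = [block for msg in resumed.messages
                 for block in msg.content if block.type == "thinking"]
        assert after == before, "thinking blocks were elided on resume"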
It's unfortunate that the word "performance" is overloaded and ML folks have a specific definition that isn't what the rest of CS uses, but I understand Anthropic to mean response quality when they say this, not any other dimension you could measure performance on.
You can argue they're lying, but I think this is just folks misunderstanding what Anthropic is saying.
They didn't just drop the cache. They elided thinking blocks even if you recached. That permanently degraded the model's output for the rest of the session, even ignoring the bug, if you waited 60 minutes instead of 59.
Sure, but it gives the impression of degraded model performance. Especially when the interface is still saying the model is operating on "high", the same as it did yesterday, yet it is in "medium" -- it just looks like the model got hobbled.
> Anthropic publicly gaslights its user base ("we never degrade model performance") is frustrating.
They're not gaslighting anyone here: they're very clear that the model itself, as in Opus 4.7, was not degraded in any way (i.e. if you take them at their word, they do not drop to lower quantisations of Claude during peak load).
However, the infrastructure around it - Claude Code, etc - is very much subject to change, and I agree that they should manage these changes better and ensure that they are well-communicated.
Degrading model performance at inference in a data center vs. stripping thinking tokens: the two are effectively the same.
Sure, they didn't change the GPUs they're running or the quantization, but if valuable information is removed and the model performs worse, then performance was degraded.
In the same way, uptime doesn't care about the incident's cause... if you're down, you're down; no one cares that it was "technically DNS".
I thought these days the thinking tokens sent by the model (as opposed to used internally) were just for the user's benefit. When you send the convo back, you have to strip the thinking stuff for the next turn. Or is that just local models?
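For what it's worth, the client-side strip usually looks something like this. This is a generic sketch over block-structured message dicts, not any particular vendor's schema:

    # Drop "thinking" blocks from prior assistant turns before
    # sending the conversation back for the next turn.
    def strip_thinking(messages):
        cleaned = []
        for msg in messages:
            content = msg.get("content")
            if msg.get("role") == "assistant" and isinstance(content, list):
                kept = [b for b in content if b.get("type") != "thinking"]
                cleaned.append({**msg, "content": kept})
            else:
                cleaned.append(msg)
        return cleaned

    # e.g. client.send(strip_thinking(history) + [new_user_turn])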
Claude Code is not infra; the model is the infra. They changed settings to make their models faster and probably cheaper to run too. Honestly, with adaptive thinking it no longer matters what model it is if you can dynamically make it do less or more work.
Notion, like any other thin AI product out there, is now in Anthropic/OpenAI/Google's crosshairs. Unless one has a moat the size of SharePoint, Google Docs, or OneDrive, it's just a feature away.
I don't think they have added an Obsidian Bases / Notion Database-like feature yet, right? I saw some discussion of adding a NocoDB integration, but didn't see that happen yet either.
I know this is probably out of scope, but I'd love it as well if Notion could slowly accrete the features of Airtable... at least expose some form of programmatic access to tables!
It works, but models seem to produce these insanely long traces to do the most basic things. I had to create a couple of skills so they know how to use the thing properly without breaking, and don't always try to pass the wrong parameters to it.
It also doesn't let us change a couple of things (like icons). Or, if it does, not even Opus 4.6 can figure out how to do it.
It's funny how adding AI to Notion actually made it a lot more usable. Most products force it on you, but here I feel like it's actually a massive benefit.
It was hard to find content, and using the filters felt clunky. (And the whole UI, whether in a browser or their app, feels buggy and slow.) But with their Notion AI / MCP it's gotten super easy to get information in and out.
Software engineering is certainly not engineering, even at the highest levels. Real engineering has infinitely more complex interactions with the physical world than symbolic instructions for machines.
That's right, no need to understand anything other than symbols on a machine. No people involved. No reality to model. No economics to think about. Nothing like real engineering. That's for the big boys and girls.
Unsurprisingly, the texts written up until that time were dominated by such individuals, which is tragic for LLM training if you think about it.
The voiceless groups or fringe opinions which we take as normative today do not appear.
Does this encourage us to write in the present such that we influence the models in perpetuity?