Eufrat's comments

You’re forcing a binary choice here.

I think for a lot of minor things, having AI generate stuff is okay, but it’s rather astounding how verbose and sometimes bizarre the code is. It mostly works, but it can be hard to read. What I’m reading from a lot of people is that they’re enjoying coding again because they don’t have to deal with the stuff they don’t want to do, which... I mean, that’s just it, isn’t it? Everyone wants to work on what they enjoy, but that’s not how most things work.

Another problem is that if you just let the AI do a lot of the foundational stuff and only focus on the parts you’re interested in, you sometimes miss giant pieces of important context. I’ve tried reading AI-driven code; sometimes it makes sense, sometimes it’s just unextendable nonsense that superficially works.

This isn’t tech that should replace anything, and it needs to be monitored judiciously. It can have value, but what I suspect is going to happen is that people are going to have a field day fixing and dealing with ridiculous security holes for the next decade once this irrational exuberance goes away. It should be used the same way any other ML technique should be: judiciously, and in a specific use case.

Said another way, if these models are the future of general programming, where are the apps already? We’re years into this, and where are they? We have no actual case studies, just a bunch of marketing copy and personal anecdotes. I went hunting for some business case studies a while ago, and I found a Deloitte “case study” that was just pages of “AI may help” without any concrete cases. Where are the actual academic studies showing that this works?

People claiming AI makes them code faster remind me of how Apple, years ago, demonstrated in multiple human-interaction studies that the mouse is faster, while test subjects all thought keyboard shortcuts were faster [1]. Sometimes objective data doesn’t matter, but it’s amusing that the whole pitch for agentic AI is that it is faster, when the evidence for this is murky at best.

[1] https://www.asktog.com/TOI/toi06KeyboardVMouse1.html


There was a post about Erdős 728 being solved with Harmonic’s Aristotle a little over a week ago [1] and that seemed like a good example of using state-of-the-art AI tech to help increase velocity in this space.

I’m not sure what this proves. I dumped a question into ChatGPT 5.2 and it produced a correct response after almost an hour [2]?

Okay? Is it repeatable? Why did it come up with this solution? How did it arrive at the connections in its reasoning? I get that it looks correct, and Tao’s approval definitely lends credibility to it being a valid solution, but what exactly is it that we’ve established here? That the corpus ChatGPT 5.2 was trained on is better tuned for pure math?

I’m just confused what one is supposed to take away from this.

[1] https://news.ycombinator.com/item?id=46560445

[2] https://chatgpt.com/share/696ac45b-70d8-8003-9ca4-320151e081...


Also #124 was proved using AI 49 days ago: https://news.ycombinator.com/item?id=46094037


Thanks for the curious question. This is one in a sequence of efforts to use LLMs to generate candidate proofs of open mathematical questions, which are then generally formalized in Lean, a proof assistant for pure mathematics.
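
For anyone who hasn’t seen Lean, a proof there is just a small program the checker verifies. A deliberately trivial sketch (the theorem name is made up, and this has nothing to do with Erdős 728; Nat.add_comm is a standard library lemma):

    -- Toy Lean 4 example: addition on the naturals commutes.
    -- `toy_add_comm` is an invented name for illustration only.
    theorem toy_add_comm (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b
    -- If either the statement or the proof were wrong, Lean would refuse
    -- to check it; that is what makes the correctness question unambiguous.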

Erdős was prolific, and many of his open problems are numbered and have dedicated pages to discuss them online, so it’s become fairly common to run through them with frontier models and see if a good proof can be produced; there have been some notable successes here this year.

Tao seems to engage in a sort of two-step approach with these proofs. First, are they correct? Lean formalization makes that unambiguous, but not all proofs are easily formalized in Lean, so he also just, you know, checks them. Second, a literature search, inside LLMs and out, for prior results; this is to check where frontier models are at in the ‘novel proofs or just regurgitated proofs’ space.

To my knowledge, we’re currently at the point where we are seeing some novel proofs offered, but I don’t think we’ve seen any that have absolutely no precedent in the literature.

As you might guess, this is itself sort of a Rorschach test for what AI could and will be.

In this case, it looked at first like a totally novel solution to something that hadn’t been solved before. On deeper search, Tao noted it’s almost trivial to prove with results Erdős knew, and that it had also been proved independently; this proof doesn’t use the prior proof’s mechanism, though.


People arguing this is a first step in true reform fail to see what Texas has been doing: trying to return to the days of “Pappy” O’Daniel.

I suppose naked grifting is just the law of the land at this point, and we should all be gaslit into accepting it as reform.


> When the tools are ready, I'd argue that they will probably be safer out of the box compared to a whole lot of users that just blindly copy-paste stuff from the internet, adding random dependencies without proper due diligence, etc. These tools might actually help users acting more secure.

This speculative statement is carrying way too much of the argument that these are just “beta tools”.


I don’t think this is a fair retort. This is not being marketed towards people who have any inkling about how any of this works. The linked press release is clearly trying to get the average person jazzed up about wiring their medical history and fitness data to ChatGPT.

ChatGPT is just supposed to “work” for the lay person, and quite often it just doesn’t. OpenAI is already being sued by people for stochastic parroting that ended in tragedy. In one case they’ve tried the rather novel affirmative defense that they’re not liable because using ChatGPT for self-harm was against the terms of service the victim agreed to when using the service.


Doctors get sued all the time. It doesn't mean doctors are no good. I also don't think ChatGPT will pretend it's replacing doctors or committing to a diagnosis with this tool. They will cover their ass legally.


I find this really frustrating and confusing about all of the coding models. These models are all ostensibly similar in their underpinnings and their basic methods of operation, right?

So why does it all feel so fragile, like a gacha game?


OpenAI actually has different models in the CLI (e.g. gpt-5.2-codex).


Naming things is hard. So hard that every AI company has stopped even trying to come up with good names.


I fear people will just get used to it, like other means of mass surveillance, and then wonder why they're being harassed on petty pretexts based on this data.


This is already the case. The largest supermarket chain in my relatively wealthy area has had, for over a decade now, multiple cameras per aisle hanging about three feet above your head, plus monitors in each aisle that show some, but not all, of the camera views.

As with ALPR cameras and now Flock cameras, no one cares, and if you seem to care, people assume you're up to no good.

This is the same culture that obsessively watches their Ring cameras and posts videos of people innocently walking down the street on the Nextdoor app because seeing the wrong people existing outside scares them.


It's so weird to me that the stores in "nicer" areas seem to be at the forefront of this crap.

I suspect it may have more to do with how local law enforcement handles shoplifting and theft generally than actual customer demographics.


> I suspect it may have more to do with how local law enforcement handles shoplifting and theft generally than actual customer demographics.

They literally have nothing better to do; this, traffic enforcement, and bothering kids who are trying to have a good time are the bulk of their duties. So I'd agree.

> It's so weird to me that the stores in "nicer" areas seem to be at the forefront of this crap.

I think a certain kind of person is comforted by surveillance. They perceive it, usually somewhat correctly from places of immense privilege, to be for their benefit and protection. The idea that it would be used against them, who are Good, and not against those people, who are Bad, is laughable to them, if the concept even crosses their minds.

"Maybe you're one of those people, if the cameras bother you" is the sentiment.


What I was getting at is that these richer areas are pretty bimodal. They either support the shit out of the police, or they think that enforcing petty theft laws is racist, and both cases lead to more Orwellian crap (the latter because the retailer has to basically serve up felony prosecutions on a silver platter if they want anything to happen).


Harassed how?



Are there any more details on this investment? Do they have the hardware, or even the power, to support such growth?

