Hacker Newsnew | past | comments | ask | show | jobs | submit | minimaltom's commentslogin

high surface energy?

mahm/father I yearn for the spicy surface


The archive-handling code was in lean-zip, it just seems the verifiers forgot to write proofs for it (still a bug).

Thats not the main finding of the article however. The main bug found was actually in the lean runtime, affecting all proofs using scalar arrays where the size of the array is not bounded.


Is it fair when it comes to formally-verified software to only consider bugs that violate a proof, and ignore everything else?

Formally-verified software is usually advertised "look ma no bugs!" Not "look ma no bugs*" *As long as this complicated chain of lemmas appropriately represents the correctness of everything we care about.

In boating theres often debate of right of way rules in certain situations, and some people are quick to point out that giant tanker ships should be giving way to tiny sailboats and get all worked up about it*. The best answer I've heard: they're dead right! that is to say as right as they are dead (if they didnt yield) lol. In the same vein, I think someone who assumed that a formally-verified software was perfect and got hacked or whatever is going to be a bit wiggly about the whole thing.

* = Technically the rules prioritize the tankers if they are "restricted in ability to maneuver" but everyone loves to argue about that.


Nobody with experience in the field advertises formally-verified software like that, and it is understood that the spec may as well be wrong. It is also understood that the non-verified parts may have bugs (surprise). There is no news here.

Unless "with experience in the field" == academia, disagree. In particular I remember the early discourse & hype around Wireguard, it was discussed as if perfection was an achieved outcome.

Looks like a normal distribution about the chaos mean to me. I appreciate its not everyones cup of tea, but I like this style of writing.

This 'fingerprint' function is super interesting, I imagine this is a signal they use to detect non-claude-code use of claude-code tokens: src/utils/fingerprint.ts#L40-L63


Strong agree on it all being a legal minefield / new grass.

> But was almost certainly part of its training data, which complicates things

On this point specifically, my read of the Anthropic lawsuit was one of the precedents was that if it trains on something but does not regurgitate it, its fair use? Might help the argument that it was clean-room but ¯\_(ツ)_/¯


Is your (1) description of clean room implementation, and (2) description of what was done, actually correct?

(1): my understanding was that a party _with access to copyrighted material_ made the functional spec, which was communicated to a party without access [1]. Under my understanding, theres no requirement for the authors of the functional spec to be 'clean'.

(2) Afaict, they limited the AI to access of just the functional spec and audited that it did not see the original source.

Edit: Not sure if sharing the 'test suite' matters, probably something for the courts in the unlikely event this ever gets there.

[1] Following the definition of clean room re implementation as it relates to US precedent, ie that described in the wikipedia page.


I feel exactly this.

One thing I will add: while AI is getting really good at _doing_ the software building bits, I haven't yet seen it well integrated into the decision-making and political structure of organizations. Right now, it seems best in the hands of a high-agency individual empowered and able to make changes or 'ship' something, with them acting as the bridge.

This of course, is not a technical challenge, but I would expect the change in structure of organizations to make this more efficient to be slower than the pace of improvement we've seen over the last few years.


Well, I, for one, am guilty of developing AI agents for public project evaluation for current governments. It's coming.


On (1): Might be useful to separate investment flows from the rest of US's economic activity.

AI investment is propping up capital flows, the GDP statistic, and responsible for most of the gains on SPX, but its still a small fraction of the economy.


> Does Waymo have the same object permanence and trajectory prediction (combined) to that of a human?

On this note specifically ive actually been impressed, ie when driving down Oak st in SF (fast road, tightly parked cars) I've often observed it slow if someone on a scooter on the sidewalk turns to look toward oncoming traffic (as if to start riding), or to slow passing parked box trucks (which block vision of potential pedestrians)


Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: