Surely if we can recognize this, an AI worthy of the name would be able to recognize this at scale, and what can be recognized can be remediated…
Or perhaps this could serve as a kind of test: a technology that cannot be reliably used in tax evasion enforcement simply isn’t worthy of the name AI.
Or perhaps it reveals that we have structural problems, and that certain concentrations of wealth, with or without automation, are a threat to the just and effective operation of society and should therefore be opposed as vigorously as crime or foreign attacks.
Piggybacking on this… do all nvidia cards have the same issue with Linux drivers, where the fan won’t ever go below 30%? I have a 3090 on Ubuntu 24 and hours of googling netted nothing that worked.
FWIW - the fans on my 4070ti super turn off during idle in pop, bazzite, and cachy (just like on windows). Definitely haven't experienced anything getting stuck at 30%.
Retention is one good line between enforcement and tracking.
If we wanted to solve this problem while retaining the benefits, what we'd do is have stiff penalties for warrantless retention and require yearly independent audits of these systems.
Is the apparent lack of displayed anxiety on Gemini’s part a sign of good-natured humor, blithe confidence in its own value regardless of cloud lineup, or proof of absence of self-awareness?
Sometimes it strikes me that something like this might be one of the better litmus tests for AI: if it’s really good enough to start 10x-ing engineers (let alone replacing them), it should be more common for projects like this to accelerate to practical usability.
If not, maybe the productivity dividends are mostly shallow.
The problem is that many of these clean room reimplementations require contributors to not have seen any of the proprietary source. You can't guarantee that with ai because who knows which training data was used
> You can't guarantee that with ai because who knows which training data was used
There are no guarantees in life, but with macOS you can know it is rather unlikely any AI was trained on (recent) Apple proprietary source code – because very little of it has been leaked to the general public – and if it hasn't leaked to the general public, the odds are low any mainstream AI would have been trained on it. Now, significant portions of macOS have been open-sourced – but presumably it is okay for you to use that under its open source license – and if not, you can just compare the AI-generated code to that open source code to evaluate similarity.
It is different for Windows, because there have been numerous public leaks of Windows source code, splattered all over GitHub and other places, and so odds are high a mainstream AI has ingested that code during training (even if only by accident).
But, even for Windows – there are tools you can use to compare two code bases for evidence of copying – so you can compare the AI-generated reimplementation of Windows to the leaked Windows source code, and reject it if it looks too similar. (Is it legal to use the leaked Windows source code in that way? Ask a lawyer–is someone violating your copyright if they use your code to do due diligence to ensure they're not violating your copyright? Could be "fair use" in jurisdictions which have such a concept–although again, ask a lawyer to be sure. And see A.V. ex rel. Vanderhye v. iParadigms, L.L.C., 562 F.3d 630 (4th Cir. 2009))
In fact, I'm pretty sure there are SaaS services you can subscribe to which will do this sort of thing for you, and hence they can run the legal risk of actually possessing leaked code for comparison purposes rather than you having to do it directly. But this is another expense which an open source project might not be able to sustain.
Even for Windows – the vast majority of the leaked Windows code is >20 years old now – so if you are implementing some brand new API, the odds of accidentally reusing leaked Windows code are significantly reduced.
Other options: decompile the binary, and compare the decompiled source to the AI-generated source. Or compile the AI-generated source and compare it to the Windows binary (this works best if you can use the exact same compiler, version and options as Microsoft did, or as close to the same as is manageable.)
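As a toy illustration of what such a comparison can look like (the file paths here are made up, and a real audit would use proper fingerprinting tooling and human review rather than this sketch):

```typescript
import { readFileSync } from "node:fs";

// Toy similarity check: shingle each file into k-grams of tokens and compute
// Jaccard overlap. A real audit would use winnowing/MOSS-style fingerprinting
// plus human review; this only flags candidates for a closer look.
function kgrams(source: string, k = 8): Set<string> {
  const tokens = source
    .replace(/\/\/.*|\/\*[\s\S]*?\*\//g, " ") // strip C-style comments
    .split(/\W+/)
    .filter(Boolean);
  const grams = new Set<string>();
  for (let i = 0; i + k <= tokens.length; i++) {
    grams.add(tokens.slice(i, i + k).join(" "));
  }
  return grams;
}

function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  for (const g of a) if (b.has(g)) shared++;
  return shared / (a.size + b.size - shared || 1);
}

// Hypothetical file names, purely for illustration.
const generated = kgrams(readFileSync("reimpl/window_mgmt.c", "utf8"));
const reference = kgrams(readFileSync("reference/window_mgmt.c", "utf8"));
const overlap = jaccard(generated, reference);
console.log(`k-gram overlap: ${(overlap * 100).toFixed(1)}%`);
if (overlap > 0.05) {
  console.log("Too similar: send for human review before accepting.");
}
```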
Are those OSes actually that strict about contributors? That’s got to be impossible to verify and I’ve only seen clean room stuff when a competitor is straight up copying another competitor and doesn’t want to get sued
ReactOS froze development to audit their code.[1] Circumstantial evidence was enough to call code not clean. WINE are strict as well. It is impossible to verify beyond all doubt of course.
I’ve been thinking a long time about using AI to do binary decompilation for this exact purpose. Needless to say, we’re still a fundamental leap short of being able to do that.
This was my thought here as well. Getting one piece of software to match another piece of software is something that agentic AI tools are really good at. Like, the one area where they are truly better than humans.
I expect that with the right testing framework setup and accessible to Claude Code or Codex, you could iterate your way to full system compatibility in a mostly automated way.
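A rough sketch of the loop I have in mind, in TypeScript; `make check`, the "FAIL" log format, and askAgentToFix are all placeholders for the project's real conformance suite and whichever agent CLI you'd actually drive:

```typescript
import { execSync } from "node:child_process";

// Sketch of the loop: run the conformance suite, hand the failure log to an
// agent, repeat while the failure count keeps dropping.
function runSuite(): { failures: number; log: string } {
  let log = "";
  try {
    log = execSync("make check", { encoding: "utf8" });
  } catch (err: any) {
    log = String(err.stdout ?? err.message); // a non-zero exit still yields a log
  }
  return { failures: (log.match(/FAIL/g) ?? []).length, log };
}

// Stand-in for driving Claude Code / Codex / etc. in headless mode; the real
// invocation depends on whichever agent tool you use.
function askAgentToFix(log: string): void {
  console.log("hand this failure log to the agent and let it edit the tree:\n", log);
}

let previous = Infinity;
for (let round = 0; round < 50; round++) {
  const { failures, log } = runSuite();
  if (failures === 0) { console.log("suite is green"); break; }
  if (failures >= previous) { console.log("no progress; stop and review by hand"); break; }
  previous = failures;
  askAgentToFix(log);
}
```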
If anyone on the team is interested in doing this, I’d love to speak to them.
In an actual business environment, you are right that it's not a 10x gain, more like 1.5-2x. Most of my job as an engineer is gathering and understanding requirements, testing, managing expectations, making sure everyone is on the same page, etc. It seems only 10-20% is writing actual code. If I do get AI to write some code, I still need to do all of these other things.
I have used it for my solo startups much more effectively, with no humans to get in the way. I've used AI to replace roles like designers that I didn't have to hire (nor did I have the funds for that).
I can build mini AI agents with my engineering skills for simple non-engineering tasks that might otherwise need a human specialist.
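To give a concrete sense of scale, these "agents" are often little more than a focused prompt plus glue code. A minimal sketch using the OpenAI Node SDK, where the model name and the email-triage task are just placeholders for whatever specialist task you'd otherwise hire out:

```typescript
import OpenAI from "openai";

// Minimal "agent": one focused prompt doing a narrow non-engineering task
// (here, triaging a support email). Model name and task are placeholders.
const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function triageEmail(body: string): Promise<string> {
  const completion = await client.chat.completions.create({
    model: "gpt-4o-mini", // any capable model works
    messages: [
      {
        role: "system",
        content: "Classify this email as billing, bug, or sales, then draft a two-sentence reply.",
      },
      { role: "user", content: body },
    ],
  });
  return completion.choices[0].message.content ?? "";
}

triageEmail("Hi, I think I was charged twice this month...").then(console.log);
```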
Who's paying $30 to run an AI agent to run a single experiment that has a 20% chance of success?
On large code-bases like this, where a lot of context gets pulled in, agents start to cost a lot very quickly, and open source projects like this are usually quite short on money.
I have the unpopular opinion that, just as I witnessed in person the transition from Assembly into high-level languages, eventually many tasks that we manually write programs for will be done with programmable agents of some sort.
In an AI driven OS, there will be less need for bare bones "classical" programming, other than the AI infrastructure.
Is this possible today? Not really, as the missteps from Google, Apple and Microsoft are showing; however, eventually we might get there with a different programming paradigm.
Having LLMs generate code is a transition step, just like we run to Compiler Explorer to validate how good the compiler optimizer happens to be.
Be sure to correct all the people who are using the term “cool” for things other than relative temperature, as it was originally defined.
See also the dictionary fallacy, and again descriptivism vs prescriptivism.
Additionally, even leaving alone the div/dynamic language issue, there really isn’t a point in usage history where DHTML came without JS — believe me, I was doing it when the term first came into usage. JS was required for nearly all dynamic behavior.
> See also the dictionary fallacy, and again descriptivism vs prescriptivism
DHTML is an acronym that expands to: Dynamic HyperText Markup Language.
There is no dictionary fallacy or descriptivism vs prescriptivism or defined meaning. It was simply an industry standard way to shorten all those words.
Changing one of the letters to stand for something else reassigns the pointer to something else entirely, or is the making of a joke, which I think the above may have been.
DHTML is literally just HTML that is dynamically modified by JavaScript. DHTML became a term when JavaScript became ubiquitous. It was not an extension.
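For anyone who wasn't around then, the whole technique boiled down to scripts grabbing elements and mutating them in place. In modern syntax (the element ids here are made up), it was essentially:

```typescript
// The whole trick: grab an element and rewrite it from script, no page reload.
// Element ids are invented for the example.
const headline = document.getElementById("headline");
if (headline) {
  headline.style.color = "red";
  headline.textContent = "Updated in place by a script";
}

// Wire it to an event and you have the classic late-90s "dynamic" page.
document.getElementById("show-more")?.addEventListener("click", () => {
  const box = document.createElement("div");
  box.textContent = "New content inserted into the document";
  document.body.appendChild(box);
});
```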
Javascript was not ubiquitous when the term DHTML was last seriously used. And yes, CSS and javascript were extensions at the time, not very widely supported across all browsers.
We had table based layouts and then divs when CSS started to take off, mostly used by artists rather than companies at first.
Javascript had vanishingly limited uses at first, too. I don't remember exactly how long it took us to get XHR but before that we had "Comet frames", before iframe security was given much focus. Javascript couldn't do that for a while. It was also dodgy and considered bad practice for quite a while, too.
I don't remember when the term javascript was even really used in regular vernacular but DHTML was not so much referring to CSS as it was the myriad of weird mechanisms introduced to make pages dynamic. It was never "Div-based HTML" or whatever, the div craze came way later once CSS was Good Enough to eschew table layouts - after which, Dreamweaver died and photoshop's slice tool finally got removed, and we started inching toward where the web sits today.
I also do distinctly recall needing a doctype for DHTML for some browsers.
> Javascript was not ubiquitous when the term DHTML was last seriously used.
It wasn't as fast or as usable as it is today, but Javascript has been in every mainstream browser since before Microsoft started pushing "DHTML".
Interestingly, in my memory, it seemed like we had JS for a long time before DHTML, but it was only a couple years between Eich writing it and IE4, which was the start of the "DHTML" moniker. Looking back at the timeline, everything seems much more compressed than it felt at the time.
DHTML was just JavaScript that mutated the DOM. That’s literally all it ever was. There was also not a DHTML doctype. There was also not anything even called “an extension”. There were Java applets, ActiveX controls, and ActionScript -> JavaScript bridges, which the concept of DHTML (dynamic HTML) eventually fully replaced.
Divs weren’t a “craze”. They were popularized by the (brand new) XHTML spec, which did have its own doctype.
JS was ubiquitous when DHTML was pushed by Microsoft, because DHTML required JScript aka JS.
CSS was not there in the '90s because Netscape didn't implement it, and MS did its own subset.
JS and CSS both suffered from wildly inconsistent support between Netscape and IE, but JS at root had interop enough to support hotmail, later OddPost, much more. CSS had no extension mechanism based on JS then, so it suffered browser-specific support that was IMHO worse than JS suffered. No way to polyfill.
> I don't remember when the term javascript was even really used in regular vernacular
2004 or 2005. Gmail and Google Maps were a "holy crap this is actually possible?" moment for a lot of people, both technical and non, and that was when javascript switched from mostly-ignored* to embraced.
*Just minor enhancements; outside of technical circles it was mostly only known to MySpace users who wanted to add a little flair to their page. XMLHttpRequest was almost entirely unknown even in technical spaces until Gmail showcased interaction without page refreshes.
Every settlement reaches a level of density where it has to re-reckon with land use rules collectively; that's one of the deep flaws in a primarily private approach to land, however appealing (and particularly popular in boomer heydays).
I think this is a really good question, and the answer might be that ideally you move up and down the ladder of abstraction, learning from concrete examples in some domains, then abstracting across them, then learning from applying the abstractions, then abstracting across abstractions, then cycling through the process.