Hacker News | snailmailman's comments

How is the second LLM not also vulnerable to prompt injection? In order to supervise the first, it must receive data (presumably output from the first LLM?). All output generated after the user input enters the context should be considered possibly compromised/prompt injected. Having a second LLM just adds more obfuscation, but prompt injection could be chained.

That's when you bust out the third LLM. Nobody expects the fourth LLM to be the REAL LLM in the chain.

Quis custodiet ipsos custodes?

I’ve seen it directly contradict the citation so many times that I disregard the text and just click the citation or scroll past every single time. Just today I caught it making up the date for an event, and the citation had accurate information when clicked through.

It’s super easy to catch on dates and numbers, but it gets other details wrong all the time too. But so many people won’t be double checking the results.


I thought most modern Bluetooth devices essentially randomize the Bluetooth MAC address periodically, specifically to prevent this sort of tracking? And random MAC addresses too on WiFi.

If someone has a half dozen BT devices on their person/in their car and they randomize MACs hourly (but not all at once) I bet you could still track people pretty accurately.

I had a problem recently where I ran a script with the wrong set of permissions, and accidentally screwed up the ownership of a random mix of files spread across my entire drive. This broke several pieces of software and made the system unusable.

I had enough information to reconstruct what files exactly got screwed up, and while I didn’t have a backup, I had a similar enough system I could pull “known good” file permissions from. I knew a simple script could find the problematic files and fix all of them.

I tried getting an AI to solve this. And it repeatedly gave me scripts that ignored all the details and intricacies of my issue and were functionally just "chown -R user:user /". (A command that will functionally nuke a drive, breaking ownership on every file)

The AI-provided scripts were reasonably complex and did a pretty decent job of obfuscating the disastrous outcomes they would have inflicted on my drive.

After reading the man pages myself I wrote a simple enough script by hand and fixed the issue myself. AI wasted more time than it saved.
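
For illustration, here is a minimal sketch of the kind of targeted fix described above. It is not the actual script from this comment. It assumes you have a list of affected paths and a "known good" tree to copy ownership from, and that numeric uid/gid values match between the two systems. The function names (`snapshot_ownership`, `plan_fixes`, `apply_fixes`) are hypothetical.

```python
import os
from typing import Dict, Iterable, List, Tuple

Owner = Tuple[int, int]  # (uid, gid)


def snapshot_ownership(root: str) -> Dict[str, Owner]:
    """Record (uid, gid) for every path under root, keyed by path relative to root."""
    owners: Dict[str, Owner] = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: inspect symlinks themselves, not their targets
            owners[os.path.relpath(full, root)] = (st.st_uid, st.st_gid)
    return owners


def plan_fixes(
    reference: Dict[str, Owner],
    current: Dict[str, Owner],
    affected: Iterable[str],
) -> List[Tuple[str, int, int]]:
    """Return (path, uid, gid) only for affected paths whose ownership differs.

    Paths missing from either snapshot are skipped rather than guessed at.
    """
    fixes: List[Tuple[str, int, int]] = []
    for path in affected:
        if path not in reference or path not in current:
            continue
        if current[path] != reference[path]:
            uid, gid = reference[path]
            fixes.append((path, uid, gid))
    return fixes


def apply_fixes(root: str, fixes: List[Tuple[str, int, int]], dry_run: bool = True) -> None:
    """Print each planned change; only call os.lchown when dry_run is False."""
    for path, uid, gid in fixes:
        target = os.path.join(root, path)
        print(f"chown {uid}:{gid} {target}")
        if not dry_run:
            os.lchown(target, uid, gid)  # Unix only; typically needs root
```

The point of the design is the opposite of a blanket recursive chown: it only touches paths known to be affected, only when they differ from the reference, and it defaults to a dry run so the plan can be read before anything changes.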


On my rokus, I am able to use my phone as a remote via the roku app. This includes typing on mobile via my phone's keyboard. Makes logging into things much easier.

AppleTV is like that too. It's nice being able to use the password manager on my phone rather than having to enter some long complicated password a letter at a time.

While Texas is quite red, renewables are surprisingly popular. Why should a farmer in the middle of nowhere have to rely on Texas’ power grid, when they can install a few solar panels and a battery? Especially when storms can take out power lines, or take out the entire grid.

I’m near a big city in Texas, and before any big storms here, generators frequently sell out at stores. Power outages are basically expected during any storms. Lots of people buying into solar (or backup generators/batteries) just for independence from the power grid. Especially after the huge winter storm a few years ago left people without power for days in the cold.


> Lots of people buying into solar (or backup generators/batteries) just for independence from the power grid.

Sounds like the woke mind virus has taken over /s


While fossil fuels are huge in Texas, solar and wind are too. Especially out in west Texas where there’s a lot of wide open space, wind turbines are surprisingly frequent. Texas produces the most wind power out of any state. And solar works just about anywhere in Texas. Lots of sun in the summers.

On mobile, it’s hijacking my scroll in such a way that I literally cannot move further down the page. And “reader mode” is only showing me the first paragraph or so.

I’ll have to try again later on desktop. The content looks interesting but it’s literally impossible to read. I cannot get past the section that introduces Ernst and Young.


On desktop it keeps adding forced pauses to scrolling, of varying sizes, and you need to scroll down between 1 and 10 pages' worth to begin scrolling again.

It might "work" just fine on mobile (or not) but you may have stopped trying before reaching the point of re-scrolling, because it's insane.


I eventually managed to get far enough into the article that I thought I saw the main stat - the stat that 26% of the citations were hallucinated. Then the scroll threw me back to the top again and I gave up entirely on reading from my phone.

Coming back later on desktop, I see that the percentage keeps climbing the further you manage to make it down the page. The real stat is 60% of the citations were hallucinated.


I recommend just clicking and dragging the actual scrollbar on desktop for this one. Wild

LLMs are so frequently inaccurate it's crazy to think of it fully replacing search.

I've been trying to use LLMs for things and they make mistakes all the time. Just this week I had multiple instances of various LLMs basically saying "just run the software with --flag-that-fixes-your-problem" or "edit the config and add solve-your-issue=true", hallucinating nonexistent options. Even if I manually link the relevant documentation pages, they will still make basic mistakes. And if I'm having to read the documentation myself anyway to fix the AI's mistakes, why is the AI even in the loop?

It's infecting search too, because blogspam/slop articles are managing to make their way into search results by just making up untrue information, claiming software can do things it can't, or has options that don't exist.


> LLMs are so frequently inaccurate it's crazy to think of it fully replacing search.

It's baffling that people have become so devoted to them as a source of information given how inaccurate they are. I've learned not to trust anything they say, ever, especially when it comes to technical subjects.


Perhaps I've just internalized it -- I know they're unreliable and I just deal with it. LLMs are certainly capable of searching the web and finding the right answer directly, so you still don't have to read the documentation.


This case is wild and seems to perfectly encapsulate all the problems people complain about with vibecoded projects.

The "rewrite it in rust" commit is +1M lines of code. Humans haven't looked at that in depth. In about a week, they saw the tests passed and pushed it to main. Now people have started to look through it and are pointing out glaring issues. And the solution is just going to be "feed it to another AI and ask it to fix it".

The entire codebase is slop now. Nobody knows what it does. It manages to pass some tests, but it's largely a black box, simply because humans haven't read it yet. The code isn't guaranteed to be anything close to 1:1 with the old codebase. It's probably vaguely shaped like the old codebase, but new bugs could be there, old bugs could be there, nobody knows anything yet.

It's going to be interesting to see how recoverable this is. They are almost certainly going to just hand every file to an AI, say "look for soundness issues and fix them" and then what? If AI is making huge, sweeping changes to the code so frequently that humans can't keep up, is that really maintainable? The only solution appears to be "even more AI" while anybody that looks closely gets scared away by the too-large-to-comprehend-and-entirely-slop codebase.

This kind of thing has been happening with many smaller projects already, but now it's a larger project and happening in a much more public way, with the intent to replace human-written, mostly-understood code with slop. I suspect the same thing, with the same problems, is happening inside all the largest companies, just not quite as obviously.


> only solution appears to be "even more AI"

That's the idea, to transform businesses to be wholly dependent on "AI" service to develop software. What better way than to re/write entire codebases until no human being understands it.

The Zig project knows this, and its so-called "anti-AI" policy is actually pro-community and about cultivating human understanding. It's not about the tool or technology, per se, it's about people, knowledge, and sustainability.

In contrast, the Bun project is demonstrating how it doesn't care about any of that, YOLO-ing its way to losing the trust of its users, contributors, and maintainers. Oh well, AI will maintain the project now, since no one else can.


The one thing I can't stand about the AI zealots is their anti‑intellectualism. Even before coding agents became a thing, there were so many comments here along the lines of, "doing things properly has a learning cost! I don't have time for that nonsense because, unlike you, I'm busy actually making stuff." Now, too many people openly mock the practice of reading, writing, or understanding code altogether.

It's sad to see what hacker culture has been reduced to: outright contempt for science and engineering.


One line of thinking is that most people writing software who are not software engineers prefer using AI because they don't think software is valuable in itself; it's only a way to solve a problem. So there are two camps, the other being people who like to solve "software problems". But the latter has been solved by AI.


That's exactly the thing I'm trying to call out. AI coding has attracted a flood of people whose only goal is to make a quick buck out of shoddy work. They regard science and engineering as beneath them, and they're not shy about saying it, here and elsewhere.

Any serious professional in this field knows that software development is far from a solved problem. It wasn't before LLMs, and it isn't now. Responsible development takes discipline and respect for the hard-won lessons of past and present efforts.

But no, according to many here, being responsible makes you a "luddite." "Humans make mistakes too," that's what they'll say as they inevitably screw over people's lives with their reckless disregard for others. "It's not my issue to solve."

Seriously, haven't techbros already caused enough damage throughout society with "move fast and break things"? A lot of people are losing patience for this nonsense.


This is because AI is most appealing to average and below average developers and users because it makes them feel like they can finally do something.


This is more or less my take on it.

I am not against AI code, it can be perfectly fine.

The principal issue in my mind is the rate of change.

Once you rewrite a codebase like this (in a week, no less), the only way to work on it in the future is using AI tools, because no single person has any knowledge about any specific piece of the codebase anymore.

AI generated code that is run through a classic PR process would potentially be fine, but then you sorta lose the entire point of using AI.


That happened to my project as well. The main issue hasn’t been that AI couldn’t solve the problem, but that it became so slow, and you need more and more verification layers and CI/CD, that at some point you wish for a simpler codebase back, with reasonable tests, with storylines in the code, and so on.

