pipejosh's comments | Hacker News

$2 million a year to run 244,000 searches that advanced 361 cases... That's roughly $5,500 per case advanced.

Meanwhile every car that drove past one of those cameras got logged, timestamped, and stored. These things aren't law enforcement, they're mass surveillance with a badge.


Also that $2 mil is the cost of the system, and doesn’t include all those man hours spent running unproductive queries.


>These things aren't law enforcement, they're mass surveillance with a badge.

That sounds like ChatGPT.


I settled on a similar workflow but across two agents instead of one session.

One agent writes task specs. The other implements them. Handoff files bridge the gap. The spec IS the session artifact because it captures intent, scope, and constraints before any code gets written.

The plan.md approach people are describing here is basically what happens naturally when you force yourself to write intent before execution.


Tried self-hosting with Mattermost to get around Slack's 90-day free-tier history limit, but my team didn't care for it much. Ended up back on Slack's free tier. This may solve that issue for me, will check it out.


The maintenance burden is real but I think security is the bigger gap. People vibing out code with AI aren't thinking about input validation or dependency vulnerabilities. They build it, it works, they ship it. Then they're running unpatched code with no security review. That's where things get ugly.


Security is an even bigger issue than it looks at first glance. While security risk by omission was always a thing (AI or not), we now face a whole new level of risk, from prompt injection to malicious libraries crafted to be picked up by coding agents: https://garymarcus.substack.com/p/llms-coding-agents-securit...

The shallowest layer of security, however, got easier. You can now run an automated AI security audit every day for (basically) free, no specialists or pen tests required.

Which makes the whole thing even more challenging: safe on the surface but vulnerable in the details creates a false sense of safety.

Yet all of these become a concern only once a product sees any success. Once it does, hypothetically, the company behind it should have the money to fix the vulnerabilities (I know, "hypothetically"). The maintenance cost hits much earlier than that: it kicks in even for a personal pet project that's isolated from the broader internet. So I treat it as an early filter that will dampen the enthusiasm of wannabe founders.


The automated audit only covers static analysis. When the agent actually runs, hitting MCP servers, making HTTP calls, getting responses back, that's where the real problems show up. Prompt injection through tool responses, malicious libraries that exfiltrate env vars, SSRF from agents that blindly follow redirects. Code audits miss all of it because this is a runtime and network problem, not a code quality problem.
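The redirect-following case is easy to sketch. A minimal guard (my own hypothetical helper, not anything Pipelock-specific) resolves each target host before the request and refuses private, loopback, or link-local addresses, which is exactly the check a blindly redirect-following agent skips:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_private_target(url):
    """Resolve the URL's host and flag private/loopback/link-local addresses.

    Illustrative only: a real deployment must also pin the resolved IP for
    the actual request, or DNS rebinding can flip it between check and use.
    """
    host = urlparse(url).hostname
    if host is None:
        return True  # unparseable URL: treat as unsafe
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return True  # unresolvable: treat as unsafe
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            return True
    return False
```

The same check has to run again on every redirect hop, since the whole attack is a public URL that 302s to 169.254.169.254 or an internal service.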

Built Pipelock for this actually. It's a network proxy that sits between the agent and everything it talks to. Still early but the gap is real. https://github.com/luckyPipewrench/pipelock


Yes. And the more autonomously we create code, the more of these (and not only these) vulnerabilities we'll be adding. Combine that with AI-automated attacks, and you have an all-out security mess.

It's like a Petri dish for inventing new angles of security attacks.

Oh, and let's not forget that coding agents are non-deterministic. The same prompt will yield a different result each time. Especially for more complex tasks. So it's probably enough to wait till the vibe-coded product "slips." Ultimately, as a black hat hacker, I don't need all products to be vulnerable. I can work with those few that are.


Agreed. The non-determinism makes traditional testing basically useless here. You can't write a test suite for "the agent decided to do something unexpected this time." Logging and runtime checks are the only way to catch the weird edge cases.


The part that worries me about agentic everything is the security model hasn't caught up. We're handing agents more and more access (shell, network, APIs, file systems) and the security story is still basically "the model probably won't do bad things." That's not how we secure anything else in computing. Principle of least privilege should apply to agents the same way it applies to services.
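Least privilege for agents can be as boring as it is for services: a deny-by-default policy table consulted before every tool call or network hop. A toy sketch (role names and policy shape are made up for illustration):

```python
# Hypothetical deny-by-default policy: each agent role gets an explicit
# allowlist of tools and egress domains; anything not named is refused.
POLICY = {
    "doc-writer": {"tools": {"read_file"}, "egress": set()},
    "deploy-bot": {"tools": {"read_file", "shell"}, "egress": {"api.github.com"}},
}

def authorize(role, tool, domain=None):
    """Return True only if the role's policy names both the tool and domain."""
    policy = POLICY.get(role)
    if policy is None or tool not in policy["tools"]:
        return False
    if domain is not None and domain not in policy["egress"]:
        return False
    return True
```

The point isn't the ten lines, it's that the decision lives outside the model, so "the model probably won't do bad things" stops being the security boundary.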


Circuit breakers for cost control is smart. The security equivalent is rate limiting and DLP on the egress side. If your agent suddenly starts making a bunch of requests to domains it's never hit before, or starts including high-entropy strings in URLs, something's wrong. Cost and security are two sides of the same observability problem.
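The high-entropy check is cheap to implement. A rough sketch: split the URL into tokens and flag any long token whose Shannon entropy looks more like a secret than a word (the threshold and minimum length here are illustrative knobs, not tuned values):

```python
import math
import re
from urllib.parse import urlparse

def shannon_entropy(s):
    """Bits per character of the string's empirical symbol distribution."""
    if not s:
        return 0.0
    freqs = [s.count(c) / len(s) for c in set(s)]
    return -sum(p * math.log2(p) for p in freqs)

def suspicious_url(url, threshold=3.5, min_len=20):
    """Flag URL path/query tokens that look like exfiltrated secrets."""
    parsed = urlparse(url)
    tokens = re.split(r"[/?&=._-]", parsed.path + "?" + parsed.query)
    return any(len(t) >= min_len and shannon_entropy(t) > threshold
               for t in tokens)
```

Pair that with a first-seen-domain alert and you catch most naive exfiltration without inspecting payloads at all.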


Sandboxing the filesystem is one layer but egress scanning is where it gets interesting. An agent inside a sandbox can still exfiltrate secrets through any HTTP request it's allowed to make. The request looks totally legitimate from the sandbox's perspective. You need something actually inspecting the content of outbound traffic for credential patterns.
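Content inspection for credentials is mostly pattern matching on well-known token shapes. A minimal sketch (these three patterns are illustrative; a real DLP layer would use a maintained ruleset):

```python
import re

# Illustrative shapes of common credentials, not an exhaustive ruleset.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key ID
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                 # GitHub personal token
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key
]

def scan_outbound(body):
    """Return the patterns an outbound payload matches, if any."""
    return [p.pattern for p in SECRET_PATTERNS if p.search(body)]
```

The catch is that this only works where you can see plaintext, so the scanner has to terminate TLS as a proxy, which is its own trust decision.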


This is cool for testing the model side, but the real scary part is what happens after the injection succeeds. Even if your agent fails 3 out of 10 tests, that's a 30% chance it exfiltrates whatever secrets are in its environment. The defense can't just be "hope the model catches it." You need architectural controls on the egress side too.


Everyone's talking about how productive agents are but nobody's talking about what happens when one gets prompt injected. Your agent has shell access, your API keys in env vars, and unrestricted internet. That's one bad dependency readme away from leaking everything. The productivity gains are real but so is the attack surface.

