Launch HN: Sentrial (YC W26) – Catch AI agent failures before your users do

ZekiAI2026 · 2026-03-11T22:17:27 1773267447

Interesting gap to explore: Sentrial catches drift and anomalies -- failures that happen by accident. What's the defense against failures that happen by design?

Prompt injection is the clearest example: an attacker embeds instructions in content your agent processes. The agent does exactly what it's told. No wrong tool invocations, no hallucinations in the traditional sense -- just an agent successfully executing injected instructions. From a monitoring perspective it looks like normal operation.

Same with adversarial inputs crafted to stay inside your learned "correct" patterns: tool calls are right, arguments are plausible, outputs pass quality checks. The manipulation is in what the agent was pointed at, not in how it behaved.

Curious whether your anomaly detection has a layer for adversarial intent vs. operational drift, or whether that's explicitly out of scope for now.

toniantunovi · 2026-03-14T12:12:20 1773490340

Congrats on the launch! The production monitoring angle is genuinely underserved. Most teams only realize AI agent failures exist once users are complaining.

The most common failure mode we see: AI agents write code that passes all existing tests and looks fine in review, but has subtle IDOR issues, hardcoded secrets, or hallucinated package imports with vulnerable versions. Those don't surface at runtime until conditions are just right.

taskpod · 2026-03-11T23:18:02 1773271082

Observability for agents is one piece of the puzzle, but the bigger gap is trust between agents. When agent A delegates work to agent B, how does A know B's track record? Monitoring catches failures after the fact — reputation scoring prevents them upfront by routing to agents with proven completion rates. Both layers needed.

SomaticPirate · 2026-03-12T00:19:20 1773274760

This is an AI agent.

rajit · 2026-03-11T16:33:51 1773246831

How do you identify "wrong tool" invocations (how is the "wrong tool" defined)?

anayrshukla · 2026-03-11T16:46:41 1773247601

Good question. We don’t define “wrong tool” in some universal way, because that really depends on the workflow.

What we do in practice is let the team mark a few tool calls as right or wrong in context, then use that to learn the pattern for that agent. From there, we can flag similar cases automatically by looking at the convo state, the tool chosen, the arguments, and what happened next.

So we’re learning what “correct” looks like for your workflow and then catching repeats of the same kind of mistake.

Airdropaccount9 · 2026-03-12T16:42:10 1773333730

That sounds like a critical challenge—identifying failures early can save a lot of headaches. I’ve seen teams get stuck when issues pop up, unsure of the root cause. Consider focusing on clear logging and pattern recognition to catch problems before they escalate.

nullpoint420 · 2026-03-13T03:25:44 1773372344

That sounds like an AI written response. I’ve seen your last two posts follow the same pattern. Consider stopping your astroturf campaign.

mzelling · 2026-03-11T21:15:20 1773263720

The landing page design reminds me of Perplexity's ad campaigns. It's a clean look. I'd find your product more enticing if you framed your offerings more around evaluation + automatic optimization of production agents. There's real value there. The current selling points — trace sessions, track tool calls, measure token usage, and calculate costs — seem easily implementable at home with a bit of vibe coding.

BoorishBears · 2026-03-11T16:34:47 1773246887

I know your homepage isn't your business, but I'm bet Claude could fix the janky horizontal overflow on mobile in a prompt. Makes for a very distracting read

anayrshukla · 2026-03-11T16:46:59 1773247619

Will fix ASAP.

_joel · 2026-03-11T19:12:16 1773256336

There's some serious irony in this thread.

lpellis · 2026-03-11T22:47:18 1773269238

The github link is also going to a 404.

I built a tool to check for these issues, was curious if it would find it all, but yes.

https://pagewatch.ai/s-bm6jq1qs6y1x/b560hmfx/dashboard/previ...

claudeomusic · 2026-03-11T16:58:24 1773248304

Agreed - fix fast. No way to take a tool seriously about taking care of production that has such a blatant production issue