mccoyb · 2026-02-25T11:36:54 1772019414

Written by a person who is infamously annoying open source maintainers with AI slop PRs (see the DWARF debacle in OCaml) … and missing much of pi’s philosophy

Pass for me.

mccoyb · 2026-02-25T00:58:42 1771981122

> I would say that the project actively expects you to be downloading them to fill any missing gaps you might have.

Where did you get this perspective from?

> I thought pi and its tools were supposed to be minimal and extensible. So why is a subagent extension bundling six agents I never asked for that I can’t disable or remove?

Why do you think a random subagents extension is under the same philosophy as pi?

Your blog post says little about pi proper, it's essentially concerned with issues you had with the ecosystem of extensions, often made by random people who either do or do not get the philosophy? Why would that be up to pi to enforce?

the_mitsuhiko · 2026-02-25T07:34:19 1772004859

Sharing extensions is very much the philosophy. Using them however is less so.

Pi ships with docs that include extensions and the agent looks there for inspiration if you ask it to build a custom extension.

Looking at what others publish is useful!

mccoyb · 2026-02-24T23:37:08 1771976228

You can use your Codex plan with it. OpenAI endorsed it several weeks ago, as far as I remember. That could change, however, but now seems safe.

ac29 · 2026-02-25T01:15:29 1771982129

You can use your Claude or Gemini plan with it too for now, though Anthropic and Google have made it clear this is a ToS violation.

mccoyb · 2026-02-24T23:26:56 1771975616

Pi has made all the right design choices. Shout out to Mario (and Armin the OG stan) — great taste shows itself.

semiinfinitely · 2026-02-24T23:45:58 1771976758

I do not understand why in the age of ai coding we would implement this in javascript

mccoyb · 2026-02-24T23:51:30 1771977090

It’s straightforward: JavaScript is a dynamic language, which allows code (for instance, code implementing an extension to the harness) to be executed and loaded while the harness is running.

This is quite nice. I do think there’s a version of pi’s design choices which could live in a static harness, but fully covering the same capabilities as pi without a dynamic language would be difficult. (You could imagine specifying a programmable UI, etc.: various ways to extend the behavior of the system, and you’d likely end up with an interpreter in the harness.)

At least, you’d like to have a way to hot reload code (Elixir / Erlang could be interesting)

This is my intuition, at least.
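
To make the point about dynamic languages concrete, here is a minimal sketch of loading an extension while the host process keeps running. It is written in Python rather than JavaScript (both are dynamic languages), and the `TOOLS` registry and the `register()` convention are assumptions made for illustration, not pi's actual extension API.

```python
import importlib.util
import tempfile
from pathlib import Path
from typing import Callable, Dict

# Hypothetical registry: tool name -> callable. Not pi's actual API.
TOOLS: Dict[str, Callable[[str], str]] = {}


def load_extension(path: Path) -> None:
    """Import a source file at runtime and let it register its tools."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load extension from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the extension's top-level code
    module.register(TOOLS)  # assumed convention: each extension exposes register()


def demo() -> str:
    """Write an extension to disk while 'running', load it, then call it."""
    source = (
        "def register(tools):\n"
        "    tools['shout'] = lambda text: text.upper() + '!'\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        ext = Path(tmp) / "shout_ext.py"
        ext.write_text(source)
        load_extension(ext)
    return TOOLS["shout"]("hello")


if __name__ == "__main__":
    print(demo())
```

Calling `load_extension` again on an edited file re-executes it and overwrites the registered tool, which is roughly what hot reloading amounts to in this setting. A statically compiled harness would need a plugin ABI or an embedded interpreter to get the same effect.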

jatari · 2026-02-25T00:30:36 1771979436

Code hotloading isn't a particularly difficult feature to implement in any language.

mccoyb · 2026-02-25T00:38:21 1771979901

Sure, but why implement a novel language with said feature if your concern is the harness, not building a brand new language?

jauntywundrkind · 2026-02-25T03:31:37 1771990297

Rust can't even dynamically link!

I'm super on board the rust train right now & super loving it. But no, code hot loading is not common.

Most code in the world is dead code. Most languages are for dead code. It's sad. Stop writing dead code (2022) was nowhere near the first, and is decades and decades late in calling this out, but still a good one. https://jackrusher.com/strange-loop-2022/

jasonjmcghee · 2026-02-25T05:45:30 1771998330

Incredible talk and I agree with all the things and I've worked on this problem a bunch.

But Rust can dynamically link with dylib, though I believe it's still unstable.

It can also dynamically load with libloading.

sergiomattei · 2026-02-25T00:55:31 1771980931

I built my own harness on Elixir/Erlang[0]. It's very nice, but I see why TypeScript is a popular choice.

No serialization/JSON-RPC layer between a TS CLI and Elixir server. TS TUI libraries and utilities are really nice (I rewrote the Elixir-based CLI prototype as it was slowing me down). Easy to extend with custom tools without having to write them in Elixir, which can be intimidating.

But you're right that Erlang's computing vision lends itself super well to this problem space.

[0]: https://github.com/matteing/opal

KeplerBoy · 2026-02-25T08:14:52 1772007292

This confused me about openclaw for quite some time. The whole lobster/crustacean theme is just firmly associated with rust in my head. Guess it's just a claude/claw wordplay.

sean_pedersen · 2026-02-25T01:23:24 1771982604

There is a Rust port: https://github.com/Dicklesworthstone/pi_agent_rust

mr_mitm · 2026-02-25T13:39:18 1772026758

This looked interesting because I prefer rust over npm.

The first issue I had was to figure out the schema of the models.json, as someone who hadn't used the original pi before. Then I noticed the documented `/skill:` command doesn't exist. That's also hard to see because the slash menu is rendered off screen if the prompt is at the bottom of the terminal. And when I see it, the selected menu item always jumps back to the first line, but it looks like he fixed that yesterday.

The tool output appears to mangle the transcript, and I can't even see the exact command it ran, only the output of the command. The README is overwhelmingly long and I don't understand what's important for me as a first time user and what isn't. Benchmarks and code internals aren't too terribly relevant to me at this point.

I looked at the original pi next and realized the config schema is subtly different (snake_case instead of camelCase). Since it was advertised as a port, I expected it to be a drop-in replacement, which is clearly not the case.

All in all it doesn't inspire confidence. Unfortunate.

Edit: The original pi also says that there is a `/skill` command, but then it is missing in the following table: https://github.com/badlogic/pi-mono/tree/main/packages/codin...

The `/skill` command also doesn't seem registered when I use pi. What is going on? How are people using this?

Edit2: Ah, they have to be placed in `~/.pi/agent/skills`, not `~/.pi/skills`, even though according to the docs, both should work: https://github.com/badlogic/pi-mono/tree/main/packages/codin...

This is exhausting.

saberience · 2026-02-25T09:40:30 1772012430

If you look at that code it’s possibly the worst rust code I’ve seen in my life. There are several files with 5000 to 10000 lines of code in a single file.

It looks 100% vibe coded by someone who’s a complete neophyte.

jauntywundrkind · 2026-02-25T03:34:18 1771990458

Fwiw @dicklesworthstone / jeff Emanuel is definitely my favorite dragon rider right now, doing the most with AI, to the most effect.

Their agent mail was great & very early in agent orchestration. Code agent search is amazing & will tell you what's happening in every harness. Their Franktui is a ridiculously good rust tui. They have project after project after project after project and they are all so good.

Didn't know they had a rust Pi. Nice.

saberience · 2026-02-25T09:49:43 1772012983

You should look at the code in that project. It’s terrible, I mean, really, really terrible.

It’s clear it was 100% written by Claude using sub-agents which explains the many classes with 5000 lines of rust in a single file.

It’s a huge buggy mess which doesn’t run on my Mac.

If you’re a rust engineer and want a good laugh, go take a look at the agent.rs, auth.rs, or any of the core components.

orangecoffee · 2026-02-25T13:02:45 1772024565

This matters less and less in the new world. The fact that a fully compatible 10x faster clone came up, and is continuously working and adapting/improving, tells you that this is hugely valuable. It has users and it's thriving.

Caring about taste in coding is past now. It's sad :( but also something to accept.

mr_mitm · 2026-02-25T13:40:26 1772026826

Unmaintainable messes of code are also hard to maintain for AI agents. This isn't solely about taste.

orangecoffee · 2026-02-25T14:43:10 1772030590

This project's huge commit list proves this wrong :(

mr_mitm · 2026-02-25T14:53:27 1772031207

The project also doesn't work. See my other comment.

Looks like a lot of nonsensical commits.

saberience · 2026-02-25T15:37:07 1772033827

Yeah, I tried to use this clone of pi for a while and it's very, very broken.

First of all it wouldn't build; I had to mess around with git submodules to get it building.

Then trying to use it. First of all the scrolling behavior is broken. You cannot scroll properly when there are lots of tool outputs, the window freezes. I also ended up with lots of weird UI bugs when trying to use slash commands. Sometimes they stop the window scrolling, sometimes the slash commands don't even show at all.

The general text output is flaky, how it shows results of tools, the formatting, the colors, whether it auto-scrolls or gets stuck is all very weird and broken.

You can easily force it into a broken state by just running lots of tool calls, then the UI just freezes up.

But just try it and see for yourself...

thomasfromcdnjs · 2026-02-25T06:43:00 1772001780

I am building an entire GPT model framework from the ground up in Typescript + small amounts of c bindings for gpu stuff. https://github.com/thomasdavis/alpha2 (using claude)

Don't hate me aha and no, there is no reason other than I can

raincole · 2026-02-25T09:39:12 1772012352

Thank god it's written in JavaScript. I might have skipped it if it were zig or something.

solarkraft · 2026-02-25T09:55:40 1772013340

It’s one of the most productive languages and ecosystems (IMO top 1 overall).

Blackarea · 2026-02-24T23:50:35 1771977035

yes! I just don't understand that either. Up until some time ago Claude Code's preferred install was an npm i, wasn't it? Serious answers please: why would anyone use a web language for a terminal app?

fragmede · 2026-02-25T04:08:03 1771992483

Because it's the preferred language of the person writing it.

So it can share code with the web app.

Because writing it in javascript is easier than writing it in raw brute forced assembly.

moonlion_eth · 2026-02-25T02:28:48 1771986528

i wrote an agent in zig, it kinda sucks tho. the language is just words

andai · 2026-02-25T04:02:31 1771992151

See also: pz: pi coding-agent in Zig

https://news.ycombinator.com/item?id=47120784

mccoyb · 2026-02-24T20:50:30 1771966230

Here's my question:

if agents continue to get better with RL, what is future proof about this environment or UI?

I think we all know that managing 5-10 agents ... is not pretty. Are we really landing good PRs with 100% cognitive focus from 5-10 agents? Chances are, I'm making mistakes (and I assume other humans are too)? Why not 1 agent managing 5-10 agents for you? And so on?

Most of the development loop is in bash ... so as long as agents get better at using bash (amongst other things), what happens to this in 6 months?

I don't think this is operating at a higher-level of abstraction if agents themselves can coordinate agents across worktrees, etc.

onecommit · 2026-02-24T21:00:07 1771966807

Interesting thoughts - thank you! And directionally agree - given that agents are becoming ever better, they'll take more and more of the orchestration on themselves. Still, we believe that developers need an interface to interact with these agents; see their status and review / test their work. Emdash is our approach for building this interface of the future - the ADE :)

blumomo · 2026-02-24T21:35:36 1771968936

> Still, we believe that developers need an interface to interact with these agents;

CLIs like claude code equally improve over time. tmux helps with running remote sessions as if they were local.

Why should we invest a lot of time in your „ADE“, really?

> see their status and review / test their work

Won’t that be addressed eventually by the CLIs themselves?

Maybe you’re betting on being purchased by one of the agentic coding providers given your tool has long term value on its own?

sothatsit · 2026-02-25T01:13:07 1771981987

People use UIs for git despite it working so well in the terminal... Many people I knew at uni doing computer science wouldn’t even know what tmux is. I would bet that the demand for these types of UIs is going to be a lot bigger than the demand for CLI tools like Claude Code. People already rave about cowork and the new codex UI. This falls into the same category.

mccoyb · 2026-02-24T20:36:00 1771965360

Skills feel analogous to behavioral programs. If you give an agent access to a programmable substrate (e.g. bash + CLI tools), you write these Markdown programs which are triggered and read when the agent thinks certain behaviors will be beneficial.

It's a great idea: really neat take on programmability, and can be reloaded while the agent is running without tweaking the harness, etc -- lots of benefits.

`pi` has a great skills implementation too.

I think skills might really shine if you take a minimal approach to the system prompt (like `pi`) -- a lot of the time, if I want to orchestrate the agent in some complex behavior, I want to start fresh, and having it walk through a bunch of skills ... possibly the smaller the system prompt, the more likely the agent is to follow the skills without issue.
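
As a rough illustration of skills as Markdown programs that are read on demand, the sketch below scans a directory of skill files and builds a one-line index per skill, which is all that would need to sit in a small system prompt. The directory layout (`<name>/SKILL.md`), the `name` and `description` front-matter keys, and the wording of the index line are assumptions for this example, not a description of how `pi` implements skills.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def parse_front_matter(text: str) -> Dict[str, str]:
    """Read simple 'key: value' pairs between leading '---' lines."""
    lines = text.splitlines()
    meta: Dict[str, str] = {}
    if not lines or lines[0].strip() != "---":
        return meta
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def build_skill_index(root: Path) -> List[str]:
    """One short line per skill; the full file is read only when needed."""
    index = []
    for path in sorted(root.glob("*/SKILL.md")):
        meta = parse_front_matter(path.read_text())
        name = meta.get("name", path.parent.name)
        description = meta.get("description", "")
        location = path.relative_to(root).as_posix()
        index.append(f"{name}: {description} (read {location} to use)")
    return index


def demo() -> List[str]:
    """Create one hypothetical skill on disk and index it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        skill = root / "tmux"
        skill.mkdir()
        (skill / "SKILL.md").write_text(
            "---\nname: tmux\ndescription: Run long jobs in a tmux session\n---\n"
            "Steps the agent follows go here.\n"
        )
        return build_skill_index(root)
```

Because the index is rebuilt from disk on every scan, editing or adding a skill file takes effect without touching the harness code, which matches the reload benefit described above.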

evalstate · 2026-02-24T20:58:23 1771966703

Yes -- skills live in a special gap between "should have been a deterministic program" and "model already had the ability to figure this out". My personal experience leaves me in agreement that minimal system prompts are definitely the way to go.

mccoyb · 2026-02-19T23:51:50 1771545110

I wanted to write the same comment. These people are fucking hucksters. Don’t listen to their words, look at their software … says all you need to know.

mccoyb · 2026-02-19T15:34:54 1771515294

The opposite is true.

There is barely any magic in the harness, the magic is in the model.

Try it: write your own harness with (bash, read, write, edit) ... it's trivial to get a 99% version of (pick your favorite harness) -- minus the bells and whistles.

The "magic of the harness" comes from the fun auxiliary orchestration stuff - hard engineering for sure! - but seriously, the model is the key item.
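
For a sense of how small the tool side of such a harness can be, here is a sketch of the four tools named above (bash, read, write, edit) plus a dispatcher. It is written in Python for brevity. The model call and the conversation loop are deliberately left out, and the tool-call shape `{'name': ..., 'args': {...}}` is a hypothetical stand-in for whatever a given provider API returns.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Dict


def bash(command: str) -> str:
    """Run a shell command and return combined stdout and stderr."""
    done = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=60
    )
    return done.stdout + done.stderr


def read(path: str) -> str:
    return Path(path).read_text()


def write(path: str, content: str) -> str:
    Path(path).write_text(content)
    return f"wrote {len(content)} characters to {path}"


def edit(path: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old`; refuse ambiguous edits."""
    text = Path(path).read_text()
    count = text.count(old)
    if count != 1:
        return f"error: expected 1 match for the old text, found {count}"
    Path(path).write_text(text.replace(old, new, 1))
    return "ok"


TOOLS: Dict[str, Callable[..., str]] = {
    "bash": bash, "read": read, "write": write, "edit": edit,
}


def run_tool_call(call: dict) -> str:
    """Dispatch one model-requested call (hypothetical shape, see lead-in)."""
    tool = TOOLS.get(call["name"])
    if tool is None:
        return f"error: unknown tool {call['name']}"
    return tool(**call["args"])


def demo() -> str:
    """Write a file, edit it, and read it back through the dispatcher."""
    with tempfile.TemporaryDirectory() as tmp:
        target = str(Path(tmp) / "note.txt")
        run_tool_call({"name": "write", "args": {"path": target, "content": "hello world"}})
        run_tool_call({"name": "edit", "args": {"path": target, "old": "world", "new": "pi"}})
        return run_tool_call({"name": "read", "args": {"path": target}})
```

A real harness would wrap this in a loop that sends tool results back to the model until it stops requesting calls. Context management, permissions, and UI are the parts this sketch leaves out.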

ted537 · 2026-02-19T16:02:22 1771516942

Yeah I agree with this. The only tool that really matters is file patching; you can check something like the opencode patch implementation, it's fairly straightforward.

butlike · 2026-02-19T15:41:48 1771515708

Not all percentages are weighted equally. That 1% is worth a lot more than the low-hanging 99% from your example

mccoyb · 2026-02-19T15:51:28 1771516288

Is it? Look at pi, for instance.

It turns out that "most of the bells and whistles" could amount to instructing models how to use tools like tmux

tvink · 2026-02-19T15:45:44 1771515944

This is the complete opposite of my experience.

mccoyb · 2026-02-19T15:52:46 1771516366

Does your experience include writing your own agent? Send a link

mccoyb · 2026-02-19T15:32:03 1771515123

"Training the specific harness" is marginal -- it's obvious if you've used anything else. pi with Claude is as good as (even better! given the obvious care to context management in pi) as Claude Code with Claude.

This whole game is a bizarre battle.

In the future, many companies will have slightly different secret RL sauces. I'd want to use Gemini for documentation, Claude for design, Codex for planning, yada yada ... there will be no generalist take-all model, I just don't believe RL scaling works like that.

I'm not convinced that a single company can own the best performing model in all categories, I'm not even sure the economics make it feasible.

Good for us, of course.

thepasch · 2026-02-19T15:58:03 1771516683

> pi with Claude is as good as (even better! given the obvious care to context management in pi) as Claude Code with Claude

And that’s out of the box. With how comically extensible pi is and how much control it gives you over every aspect of the pipeline, as soon as you start building extensions for your own, personal workflow, Claude Code legitimately feels like a trash app in comparison.

I don’t care what Anthropic does - I’ll keep using pi. If they think they need to ban me for that, then, oh well. I’ll just continue to keep using pi. Just no longer with Claude models.

throwaw12 · 2026-02-20T11:23:41 1771586621

As a Claude Code user looking for alternatives, I am very intrigued by this statement.

Can you please share good resources I can learn from to extend pi?

theshrike79 · 2026-02-20T21:02:47 1771621367

Pi has specific instructions to extend itself.

You can just tell it to create an extension to connect to any AI API provider and it'll most likely one or two-shot it for you.

IMO it's the most self-aware of all of the current harnesses.

mccoyb · 2026-02-19T03:58:45 1771473525

OpenAI has endorsed OAuth from 3rd party harnesses, and their limits are way higher. Use better tools (OpenCode, pi) with an arguably better model (xhigh reasoning) for longer …

wyre · 2026-02-19T04:09:05 1771474145

I am looking forward to switching to OpenAI once my claude max account is banned for using pi....