Written by a person who is infamously annoying open source maintainers with AI slop PRs (see the DWARF debacle in OCaml) … and missing much of pi’s philosophy
> I would say that the project actively expects you to be downloading them to fill any missing gaps you might have.
Where did you get this perspective from?
> I thought pi and its tools were supposed to be minimal and extensible. So why is a subagent extension bundling six agents I never asked for that I can’t disable or remove?
Why do you think a random subagents extension is under the same philosophy as pi?
Your blog post says little about pi proper; it's essentially concerned with issues you had with the ecosystem of extensions, often made by random people who may or may not get the philosophy. Why would that be up to pi to enforce?
It’s straightforward: JavaScript is a dynamic language, which allows code (for instance, code implementing an extension to the harness) to be executed and loaded while the harness is running.
This is quite nice. I do think there’s a version of pi’s design choices which could live in a static harness, but fully covering the same capabilities as pi without a dynamic language would be difficult. (You could imagine specifying a programmable UI, etc.: various ways to extend the behavior of the system, and you’d likely end up with an interpreter in the harness.)
At least, you’d like to have a way to hot reload code (Elixir / Erlang could be interesting)
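To make the dynamic-loading point concrete, here is a minimal sketch of a harness that loads and replaces extension code while it is running. The `Harness` class, the `Tool` type, and the extension shape are all hypothetical and are not pi's actual extension API; a real harness would more likely use dynamic `import()` on a file, and `new Function` is used here only to keep the example self-contained.

```typescript
// Hypothetical extension shape for illustration; not pi's real API.
type Tool = (input: string) => string;

class Harness {
  private tools = new Map<string, Tool>();

  // Evaluate extension source at runtime and register the tools it returns.
  // Loading an extension that reuses a tool name replaces the old tool,
  // which is the "hot reload" behavior discussed above.
  loadExtension(source: string): void {
    const exported: Record<string, Tool> = new Function(
      `"use strict"; return (${source});`
    )();
    Object.keys(exported).forEach((name) => {
      this.tools.set(name, exported[name]);
    });
  }

  run(name: string, input: string): string {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    return tool(input);
  }
}

const harness = new Harness();

// First version of the extension, loaded while the harness is live.
harness.loadExtension(`{ shout: (s) => s.toUpperCase() }`);
const first = harness.run("shout", "hello");

// Reload a changed version without restarting the harness.
harness.loadExtension(`{ shout: (s) => s.toUpperCase() + "!" }`);
const second = harness.run("shout", "hello");
```

In a statically compiled harness, the equivalent of `loadExtension` would need either an embedded interpreter or a plugin/ABI mechanism, which is the difficulty the comment points at.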
Sure, but why implement a novel language with said feature if your concern is the harness, not implementing a brand new language?
I'm super on board the rust train right now & super loving it. But no, code hot loading is not common.
Most code in the world is dead code. Most languages are for dead code. It's sad. "Stop Writing Dead Programs" (2022) was nowhere near the first, and is decades and decades late in calling this out, but it's still a good one. https://jackrusher.com/strange-loop-2022/
I built my own harness on Elixir/Erlang[0]. It's very nice, but I see why TypeScript is a popular choice.
No serialization/JSON-RPC layer between a TS CLI and an Elixir server. TS TUI libraries and utilities are really nice (I rewrote the Elixir-based CLI prototype as it was slowing me down). It's easy to extend with custom tools without having to write them in Elixir, which can be intimidating.
But you're right that Erlang's computing vision lends itself super well to this problem space.
This confused me about openclaw for quite some time. The whole lobster/crustacean theme is just firmly associated with rust in my head. Guess it's just a claude/claw wordplay.
This looked interesting because I prefer rust over npm.
The first issue I had was figuring out the schema of the models.json, as someone who hadn't used the original pi before. Then I noticed the documented `/skill:` command doesn't exist. That's also hard to see because the slash menu is rendered off screen if the prompt is at the bottom of the terminal. And when I do see it, the selected menu item always jumps back to the first line, but it looks like he fixed that yesterday.
The tool output appears to mangle the transcript, and I can't even see the exact command it ran, only the output of the command. The README is overwhelmingly long and I don't understand what's important for me as a first time user and what isn't. Benchmarks and code internals aren't too terribly relevant to me at this point.
I looked at the original pi next and realized the config schema is subtly different (snake_case instead of camelCase). Since it was advertised as a port, I expected it to be a drop-in replacement, which is clearly not the case.
All in all it doesn't inspire confidence. Unfortunate.
If you look at that code it’s possibly the worst rust code I’ve seen in my life. There are several files with 5000 to 10000 lines of code in a single file.
It looks 100% vibe coded by someone who’s a complete neophyte.
Fwiw @dicklesworthstone / Jeff Emanuel is definitely my favorite dragon rider right now, doing the most with AI, to the most effect.
Their agent mail was great & very early in agent orchestration. Code agent search is amazing & will tell you what's happening in every harness. Their Franktui is a ridiculously good rust tui. They have project after project after project after project and they are all so good.
This matters less and less in the new world. The fact that a fully compatible 10x faster clone came up, and is continuously working and adapting/improving, tells you that this is hugely valuable. It has users and it's thriving.
Caring about taste in coding is in the past now. It's sad :( but also something to accept.
Yeah, I tried to use this clone of pi for a while and it's very, very broken.
First of all it wouldn't build; I had to mess around with git submodules to get it building.
Then trying to use it. First of all the scrolling behavior is broken. You cannot scroll properly when there are lots of tool outputs, the window freezes. I also ended up with lots of weird UI bugs when trying to use slash commands. Sometimes they stop the window scrolling, sometimes the slash commands don't even show at all.
The general text output is flaky, how it shows results of tools, the formatting, the colors, whether it auto-scrolls or gets stuck is all very weird and broken.
You can easily force it into a broken state by just running lots of tool calls, then the UI just freezes up.
I am building an entire GPT model framework from the ground up in TypeScript + small amounts of C bindings for GPU stuff. https://github.com/thomasdavis/alpha2 (using Claude)
Don't hate me aha and no, there is no reason other than I can
Yes! I just don't understand that either. Up until some time ago Claude Code's preferred install was an `npm i`, wasn't it? Serious answers please: why would anyone use a web language for a terminal app?
if agents continue to get better with RL, what is future proof about this environment or UI?
I think we all know that managing 5-10 agents ... is not pretty. Are we really landing good PRs with 100% cognitive focus from 5-10 agents? Chances are, I'm making mistakes (and I assume other humans are too)? Why not 1 agent managing 5-10 agents for you? And so on?
Most of the development loop is in bash ... so as long as agents get better at using bash (amongst other things), what happens to this in 6 months?
I don't think this is operating at a higher level of abstraction if agents themselves can coordinate agents across worktrees, etc.
Interesting thoughts - thank you! And directionally agree - given that agents are becoming ever better, they'll take more and more of the orchestration on themselves. Still, we believe that developers need an interface to interact with these agents; see their status and review / test their work. Emdash is our approach for building this interface of the future - the ADE :)
People use UIs for git despite it working so well in the terminal... Many people I knew at uni doing computer science wouldn’t even know what tmux is. I would bet that the demand for these types of UIs is going to be a lot bigger than the demand for CLI tools like Claude Code. People already rave about cowork and the new codex UI. This falls into the same category.
Skills feel analogous to behavioral programs. If you give an agent access to a programmable substrate (e.g. bash + CLI tools), you write these Markdown programs which are triggered and read when the agent thinks certain behaviors will be beneficial.
It's a great idea: really neat take on programmability, and can be reloaded while the agent is running without tweaking the harness, etc -- lots of benefits.
`pi` has a great skills implementation too.
I think skills might really shine if you take a minimal approach to the system prompt (like `pi`). A lot of the time, if I want to orchestrate the agent in some complex behavior, I want to start fresh and have it walk through a bunch of skills ... possibly the smaller the system prompt, the more likely the agent is to follow the skills without issue.
Yes -- skills live in a special gap between "should have been a deterministic program" and "model already had the ability to figure this out". My personal experience leaves me in agreement that minimal system prompts are definitely the way to go.
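As a rough illustration of the skills idea discussed above, here is a minimal sketch in which only each skill's name and description go into the (small) system prompt, and the Markdown body is read only when the agent decides the skill applies. The frontmatter layout (`name:` / `description:` between `---` lines) and the `parseSkill` / `skillIndex` helpers are assumptions for this example, not pi's actual implementation.

```typescript
interface Skill {
  name: string;
  description: string;
  body: string; // the Markdown "program" the agent reads on demand
}

// Parse a skill file with a leading --- frontmatter block (assumed layout).
function parseSkill(markdown: string): Skill {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(markdown);
  if (!match) throw new Error("missing frontmatter");
  const fields: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return {
    name: fields["name"] ?? "",
    description: fields["description"] ?? "",
    body: match[2].trim(),
  };
}

// Only this short index is placed in the system prompt; bodies stay on disk
// until the agent chooses to read one.
function skillIndex(skills: Skill[]): string {
  return skills.map((s) => `- ${s.name}: ${s.description}`).join("\n");
}

const skill = parseSkill(
  "---\nname: release-notes\ndescription: Draft release notes from git log\n---\n1. Run `git log`.\n2. Summarize."
);
const index = skillIndex([skill]);
```

Because the index is rebuilt from files, editing a skill's Markdown changes behavior on the next read without touching the harness itself.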
I wanted to write the same comment. These people are fucking hucksters. Don’t listen to their words, look at their software … says all you need to know.
There is barely any magic in the harness, the magic is in the model.
Try it: write your own harness with (bash, read, write, edit) ... it's trivial to get a 99% version of (pick your favorite harness) -- minus the bells and whistles.
The "magic of the harness" comes from the fun auxiliary orchestration stuff - hard engineering for sure! - but seriously, the model is the key item.
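For anyone who wants to try it, a minimal sketch of the four-tool loop might look like the following. The `ToolCall` shape, `runTool`, `agentLoop`, and the scripted `Model` stand-in are all hypothetical names for illustration; a real harness would call an LLM API and parse its tool calls instead of replaying a script.

```typescript
import { execSync } from "node:child_process";
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

type ToolCall =
  | { tool: "bash"; command: string }
  | { tool: "read"; path: string }
  | { tool: "write"; path: string; content: string }
  | { tool: "edit"; path: string; oldText: string; newText: string };

// Execute one tool call and return the text that is fed back to the model.
function runTool(call: ToolCall): string {
  switch (call.tool) {
    case "bash":
      return execSync(call.command, { encoding: "utf8" });
    case "read":
      return readFileSync(call.path, "utf8");
    case "write":
      writeFileSync(call.path, call.content);
      return `wrote ${call.path}`;
    case "edit": {
      const text = readFileSync(call.path, "utf8");
      if (!text.includes(call.oldText)) return "error: oldText not found";
      // Function replacer so "$" in newText is inserted literally;
      // only the first occurrence is replaced.
      writeFileSync(call.path, text.replace(call.oldText, () => call.newText));
      return `edited ${call.path}`;
    }
  }
}

// A "model" maps the tool results so far to the next call, or null when done.
type Model = (transcript: string[]) => ToolCall | null;

function agentLoop(model: Model, maxSteps = 20): string[] {
  const transcript: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const call = model(transcript);
    if (call === null) break;
    transcript.push(runTool(call));
  }
  return transcript;
}

// Scripted stand-in for a real LLM: write a file, patch it, read it back.
const file = join(mkdtempSync(join(tmpdir(), "harness-")), "note.txt");
const script: ToolCall[] = [
  { tool: "write", path: file, content: "hello world" },
  { tool: "edit", path: file, oldText: "world", newText: "pi" },
  { tool: "read", path: file },
];
const transcript = agentLoop((t) => script[t.length] ?? null);
```

Everything interesting (deciding which call to make next) lives in the `Model` function, which is the point being made: the loop itself is small.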
Yeah, I agree with this. The only tool that really matters is file patching, for which you can check something like the opencode patch implementation; it's fairly straightforward.
"Training the specific harness" is marginal -- it's obvious if you've used anything else. pi with Claude is as good as (even better! given the obvious care to context management in pi) as Claude Code with Claude.
This whole game is a bizarre battle.
In the future, many companies will have slightly different secret RL sauces. I'd want to use Gemini for documentation, Claude for design, Codex for planning, yada yada ... there will be no generalist take-all model, I just don't believe RL scaling works like that.
I'm not convinced that a single company can own the best performing model in all categories, I'm not even sure the economics make it feasible.
> pi with Claude is as good as (even better! given the obvious care to context management in pi) as Claude Code with Claude
And that’s out of the box. With how comically extensible pi is and how much control it gives you over every aspect of the pipeline, as soon as you start building extensions for your own personal workflow, Claude Code legitimately feels like a trash app in comparison.
I don’t care what Anthropic does - I’ll keep using pi. If they think they need to ban me for that, then, oh well. I’ll just continue to keep using pi. Just no longer with Claude models.
OpenAI has endorsed OAuth from 3rd party harnesses, and their limits are way higher. Use better tools (OpenCode, pi) with an arguably better model (xhigh reasoning) for longer …
Pass for me.