Maybe a bit misleading. I have used it in two places.
One is for local opencode coding and config of stuff, the other is for agent-browser use, and for both it did better (compared with opus 4.6) for the thing I was testing at the time. The problem with opus at the moment I tried it was overthinking and sometimes moving itself in the wrong direction (not that qwen doesn't overthink sometimes). However, sometimes less is more - maybe turning thinking down on opus would have helped me. Some people said that it is better to turn it off entirely when you start to implement code: it already knows what it needs to do and doesn't need more distraction.
Another example is my ghostty config: I learned from qwen that it has theme support - opus would always just make the theme in the main file.
Set a budget. Fund an openrouter account with the max you can stomach spending on this test and give it a shot.
At least, that’s what I would do, if I had any interest in testing out gastown with my own money. If my employer wants to pay for the testing, that’s another question entirely.
I often have 10+ running in parallel. I’m attacking parallel problems that aren’t interdependent. Sometimes adding additional products can bring me up to 15+.
Gotta have really good test harnesses so they can largely fix themselves.
We have our doubts about this. Can you share your code or product?
Anecdotally, my mistakes and lack of understanding exponentiate the more I try to parallelize.
As I said in the neighboring comment, for vibe coding side projects and prototypes for work I just merge and iterate. It works out more than it doesn’t. For anything bigger at work I cannot share as I’m at Apple.
But you have to keep it in your head, and remember all the stuff at the same time. How is it possible to keep track and do reviews one after another? Or are these pretty long-running agents?
I’m not sure what you mean by keep it in your head? I know all of the parts the agents are working on. It’ll often be a mix between bigger tasks (some large refactor, new feature, etc) and small tasks (little bug fixes).
For prototyping I just merge. I don’t bother to review the code. For anything more important, then I am reviewing the code and going back and forth. Basically there’s a queue of stuff demanding my attention, and I just serially go through them.
What’s also been really helpful to me is /simplify and similar code review skills (I have my own). That alone takes an agent a while, since it has to parse through everything it’s done and self-review. It catches quite a lot itself this way.
>I’m not sure what you mean by keep it in your head?
If the project I work on is large enough, it takes me some time to get everything I need to understand for review into short-term memory. If it's small enough, it's less of a problem for me.
Honestly, I don't know. I could be mistaken about the exact number of agents, but not about the fact of AI-driven workflows that are heavily automated and go on for hours.
He's one (small) step from distinguished engineer, with 20+ patents to his name, and is an embedded programmer (largely C/C++) with 30+ years of experience in the field. I've known him for nearly as long, so I give a lot of credence to his words.
But we don't usually talk work; he's the guitarist in our band :) [I'm the bass] So we mainly chill over music + beer.
And lately, it's been less chill ¯\_(ツ)_/¯
feels euphemistic for the original “colloquial” usage I have for it.
> The killing of one in ten, chosen by lots, from a rebellious city or a mutinous army was a punishment sometimes used by the Romans. The word has been used (loosely and unetymologically, to the irritation of pedants) since 1660s for "destroy a large but indefinite number of." [0]
Yup. What amuses me is that people think that to decimate is to massively degrade something. I assume they're thinking "reduce to 1/10th" rather than "reduce to 9/10ths". The effect is markedly different.
This is very interesting. This could allow custom harnesses to be used economically with Opus. Depending on the usage limits, this may be cheaper than their API.