More

malyk · 2026-02-13T20:05:58 1771013158

Can you give an example to help us understand?

I look at my ticket tracker and I see basically 100% of it that can be done by AI. Some with assistance because business logic is more complex/not well factored than it should be, but most of the work that is done AI is perfectly capable of doing with a well defined prompt.

gordonhart · 2026-02-13T20:29:21 1771014561

Here's an example ticket that I'll probably work on next week:

    Live stream validation results as they come in

The body doesn't give much other than the high-level motivation from the person who filed the ticket. In order to implement this, you need to have a lot of context, some of which can be discovered by grepping through the code base and some of which can't:

- What is the validation system and how does it work today?

- What sort of UX do we want? What are the specific deficiencies in the current UX that we're trying to fix?

- What prior art exists on the backend and frontend, and how much of that can/should be reused?

- Are there any scaling or load considerations that need to be accounted for?

I'll probably implement this as 2-3 PRs in a chain touching different parts of the codebase. GPT via Codex will write 80% of the code, and I'll cover the last 20% of polish. Throughout the process I'll prompt it in the right direction when it runs up against questions it can't answer, and check its assumptions about the right way to push this out. I'll make sure that the tests cover what we need them to and that the resultant UX feels good. I'll own the responsibility for covering load considerations and be on the line if anything falls over.

Does it look like software engineering from 3 years ago? Absolutely not. But it's software engineering all the same even if I'm not writing most of the code anymore.

Rodeoclash · 2026-02-13T20:50:24 1771015824

This right here is my view on the future as well. Will the AI write the entire feature in one go? No. Will the AI be involved in writing a large proportion of the code that will be carefully studied and adjusted by a human before being used? Absolutely yes.

This cyborg process is exactly how we're using AI in our organisation as well. The human in the loop understands the full context of what the feature is and what we're trying to achieve.

codegangsta · 2026-02-13T21:39:10 1771018750

But planning like this is absolutely something AI can do. In fact, this is exactly the kind of thing we start with on our team when it comes to using AI agents. We have a ticket with just a simple title that somebody threw in there, and we asked the AI to spin up a bunch of research agents to understand and plan and ask itself those questions.

Funny enough, all the questions that you posed are things that come up right away that the agent asks itself, and then goes and tries to understand and validate an answer, sometimes with input from the user. But I think this planning mechanism is really critical to being able to have an AI generate an understanding, then have it be validated by a human before beginning implementation.

And by planning I don't necessarily mean plan mode in your agent harness of choice. We use a custom /plan skill in Claude Code that orchestrates all of this using multiple agents, validation loops, and specific prompts to weed out ambiguities by asking clarifying questions using the ask user question tool.

This results in taking really fuzzy requirements and making them clear, and we automate all of this through linear but you could use your ticket tracker of choice.

adriand · 2026-02-14T00:32:19 1771029139

Absolutely. Eventually the AI will just talk to the CEO / the board to get general direction, and everything will just fall out of that. The level of abstraction the agents can handle is on a steady upward trajectory.

sarchertech · 2026-02-14T03:44:53 1771040693

If AIs can do that, they won’t be talking to a CEO or the board of a software company. There won’t be a CEO or a board because software companies won’t exist. They’ll talk to the customers and build one off solutions for each of them.

There will be 3 “software” companies left. And shortly after that society will collapse because of AI can do that it can do any white collar job.

lexoj · 2026-02-14T21:56:53 1771106213

But the llm could still take ownership of that and ask clarifications, reach stakeholders while taking notes, and so on.

fragmede · 2026-02-13T20:53:30 1771016010

I mean, what is the validation system? Either it exists in code, and thus can be discovered if you point the AI at repo, or... what, it doesn't exist?

For the UX, have it explore your existing repos and copy prior art from there and industry standards to come up with something workable.

Web scale issues can be inferred by the rest of the codebase. If your terraform repo has one RDS server, vs a fleet of them, multi-region, then the AI, just as well as a human, can figure out if it needs Google Spanner level engineering or not. (probably not)

Bigger picture though, what's the process of a human logs an under specified ticket and someone else picks it up and has no clue what to do with it? They're gonna go ask the person who logged the bug for their thoughts and some details beyond "hurr Durr something something validation". If we're at the point where AI is able to make a public blog post shaming the open source developer for not accepting a patch, throwing questions back to you in JIRA about the details of the streaming validation system is well within its capabilities, given the right set of tools.

gordonhart · 2026-02-13T21:05:30 1771016730

Honestly curious, have you seen agents succeed at this sort of long-trajectory wide breadth task, or is it theoretical? Because I haven't seen them come close (and not for lack of trying)

codegangsta · 2026-02-13T21:42:20 1771018940

Yeah I absolutely see it every day. I think it’s useful to separate the research/planning phase from the building/validadation/review phase.

Ticket trackers are perfect for this. Just start with asking AI to take this unclear, ambiguous ticket and come up with a real plan for how to accomplish it. Review the plan, update your ticket system with the plan, have coworkers review it if you want.

Then when ready, kick off a session for that first phase, first PR, or the whole thing if you want.

fragmede · 2026-02-13T23:25:26 1771025126

Opus 4.6, with all of the random tweaks I've picked up off of here, and twitter, is in the middle of rewriting my golang cli program for programmers into a swiftui Mac app that people can use, and it's totally managing to do it. Claude swarm mode with beads is OP.

kolinko · 2026-02-13T22:59:26 1771023566

In my expedience, Claude Code with opus 4.5 is the first one to tackle such issues well.

lbrito · 2026-02-13T20:13:57 1771013637

Then why isn't it? Just offload it to the clankers and go enjoy a margarita at the beach or something.

Gud · 2026-02-13T20:53:38 1771016018

There are plenty of people who are enjoying margarita by the beach while you, the laborer, are working for them.

lbrito · 2026-02-13T20:55:13 1771016113

Preach. That's always been the case though, AI just makes it slightly worse.

contagiousflow · 2026-02-13T20:11:36 1771013496

Why do you have a backlog then? If a current AI can do 100% of it then just run it over the weekend and close everything

fishpham · 2026-02-13T20:12:29 1771013549

As always, the limit is human bandwidth. But that's basically what AI-forward companies are doing now. I would be curious which tasks OP commenter has that couldn't be done by an agent (assuming they're a SWE)

Analemma_ · 2026-02-13T20:14:29 1771013669

This sounds bogus to me: if AI really could close 100% of your backlog with just a couple more humans in the loop, you’d hire a bunch of temps/contractors to do that, then declare the product done and lay off everybody. How come that isn’t happening?

fishpham · 2026-02-13T21:31:36 1771018296

Because there's an unlimited amount of work to do. This is the same reason you are not fired once completing a feature :-) The point of hiring a FTE is to continue to create work that provides business value. For your analogy, FTEs often do that by hiring temp, and you can think of the agent as the new temp in this case - the human drives an infinite amount of them

sarchertech · 2026-02-14T03:50:51 1771041051

Why hasn’t any of the software I use started shipping features at a breakneck speed then? The only thing any of them have added is barely working AI features.

Why aren’t there 10x the number of games on steam? Why aren’t people releasing new integrated programming language/OS/dev environments?

Why does our backlog look exactly the same as when I left for posterity leave 4 months ago?

fishpham · 2026-02-14T04:47:20 1771044440

Questions posed in bad faith can only be answered by the author.

sarchertech · 2026-02-14T05:10:56 1771045856

Someone asked why the backlog doesn’t get finished. You answered that it does but the backlog just refills. So I asked where is the backlog evidence that the original backlog was completed.

I’m still waiting for the evidence. I still haven’t seen externally verifiable evidence that AI is a net productivity boost for the ability to ship software.

That doesn’t mean that it isn’t. It does mean that it isn’t big enough to be obvious.

I’m very closet watching every external metric I can find. Nothing yet. Just saw the steam metrics for January. Fewer titles than January last year.

catmanjan · 2026-02-14T09:00:48 1771059648

Sounds more like busy work rather than something that makes money

rockbruno · 2026-02-13T20:16:07 1771013767

I think the "well defined prompt" is precisely what the person you responded to is alluring to. They are saying they don't get worried because AI doesn't get the job done without someone behind it that knows exactly what to prompt.

dwa3592 · 2026-02-13T20:12:55 1771013575

>>I look at my ticket tracker and I see basically 100% of it that can be done by AI.

That's a sign that you have spurious problems under those tickets or you have a PM problem.

Also, a job is a not a task- if your company has jobs which is a single task then those jobs would definitely be gone.

malyk · 2026-01-15T15:46:02 1768491962

Not in the US. It’s kinda why paypal, venmo, zelle exist. Filling a gap. Our banking system is quite backwards.

malyk · 2026-01-07T21:56:27 1767822987

I think this is person dependant. A Kale salad makes almost no impact on my hunger, but a piece of bread makes me feel pretty full.

Just as an example of an opposite experience.

(american, vegetarian for 13 years, athletic, former meat eater, long carb centric diet that i'm trying to change)

com2kid · 2026-01-07T22:58:36 1767826716

This is very true, and something that people pushing keto (myself included) had to learn the hard way.

There are satiety indexes for different foods but they are not universal. I can eat almost unlimited carbs and never feel full. I'll eat multiple plates full of bread or a thousand calories in french fries and then move on to the main course.

6oz of lean meat and some salad and I'm good with 500 or so calories on my plate.

I honestly don't get how potatoes supposedly fill people up. I have made twice baked potatoes before and eaten an easy 2000 calories of them along side thanksgiving dinner.

In contrast right now I'm eating clean and doing a body recomp. Eating clean is super satiating, for me at least!

dpark · 2026-01-08T01:20:25 1767835225

> I have made twice baked potatoes before and eaten an easy 2000 calories of them along side thanksgiving dinner.

Try plain boiled potatoes. I bet you feel like stopping long before 2000 Calories. Tasty things are tasty and often easy to eat an unhealthy amount of.

macNchz · 2026-01-08T02:47:26 1767840446

This is the thing that makes any conversation about broad categories of food difficult—there’s just a huge range of ways to package those carbs, and people eat a ton of “hyper palatable” foods. A few hundred calories of Smartfood popcorn with a day’s worth of sodium and addicting flavors is quite different in my experience than, say, a few slices of chewy, crusty sourdough bread.

malyk · 2026-01-08T00:14:54 1767831294

Right. My wife doesn't feel full unless she has protein. I don't feel full unless I have a bunch of carbs. It makes life interesting.

mikestorrent · 2026-01-07T22:08:47 1767823727

Well, if you've ever cooked down a cabbage or spinach or whatever, you'll see it basically takes up no space whatsoever... so yeah, kale on its own will take a while to fill you up.

dfee · 2026-01-08T01:28:17 1767835697

Maybe true! I eat a bunch (like the formal term of 1 unit) of kale in my daily salad. That seems to be enough, alongside some Greek yogurt and blueberries to maintain me for a few hours.

Can’t help eating junk carbs when I see them, though.

mikestorrent · 2026-01-09T19:32:02 1767987122

I'm cursed with having a good wide palate - your salad sounds delightful - but nothing ever seems to make me feel full until it makes me feel Too Full and then I wish I hadn't overeaten. Normal plain satiation, where are you?

malyk · 2025-12-09T23:13:39 1765322019

Can't you just say "Hey siri, add a note"? I add reminders, send texts, etc. that way.

scratchyone · 2025-12-10T01:02:28 1765328548

Yeah, there's lots of places you can't speak out loud bc it's disruptive to others though. Personally I set a lot of Siri reminders, but it's weird and uncomfortable to talk loudly at your phone in public spaces so I can only use it at home or outdoors. If the ring can follow through on the promise of being able to whisper to it, that's fairly valuable imo

Daneel_ · 2025-12-10T10:46:54 1765363614

You can whisper to Siri on your phone or your watch too - that’s not a ring exclusive.

scratchyone · 2025-12-10T18:40:17 1765392017

Definitely, but Siri is terrible at understanding quiet speech and you have to hold your phone/watch right up to your mouth. A ring is definitely a much nicer form factor for that.

MarkusWandel · 2025-12-10T14:53:36 1765378416

There is the small matter of a 3-4dB price difference between this ring gadget and your "Hey Siri" capable watch.

malyk · 2025-11-07T15:58:56 1762531136

Isn't that a 35% reduction in police response?

malyk · 2025-10-04T22:25:03 1759616703

Use rails. it’s still great.

malyk · 2025-09-28T17:52:57 1759081977

Using an agentic workflow does not require you to delegate tge thinking. Agents are great at taking exactly what you want to do and executing. So spend an extra few minutes and lay out the architecture YOU want then let the ai do the work.

malyk · 2025-09-23T19:43:29 1758656609

I've had success here by simply telling Codex which components to use. I initially imported all the shadcn components into my project and then I just say things like "Create a card component that includes a scrollview component and in the scrollview add a table with a dropdown component in the third column"...and Codex just knows how to add the shadcn components. This is without internet access turned on by the way.

iagooar · 2025-09-23T21:04:54 1758661494

Telling which component to use works perfectly too, if you want a very specific look.

malyk · 2025-09-16T18:04:56 1758045896

Right. The idea here is to kick of 3-8 or something tasks. They finish as you finish writing the next prompt. Then you go and review/test/merge the code from the first task, then another task finishes and you review/test/merge that code.

The challenge is that you have to be working on multiple work streams at once because so far Codex isn't great at not doing work you are doing in another task even if you tell it something like "class X will have a function that returns y"...it will go write that function most times.

I've found it really good for integration work between frontend and backend features where you can iterate on both simultaneously if the code isn't in the same codebase.

Also, for Codex this works best in the web ui because it actually uses branches, opens prs, etc. I think (though could be wrong) that locally with the CLI or IDE extension you might have to manually great git worktrees, etc.

maximamas · 2025-09-16T18:13:08 1758046388

Yeah I try to keep it away from overlapping it's work as much as possible. Using plan mode in claude or just telling codex to build a plan, that is structured in a parallelized way for multiple agents usually helps delegate tasks to be handled at the same time. Typically: app code, infra, and data layer are the main three, but obviously depends on the project.

If I ever find my self just waiting, then it always gives me an opportunity to respond to messages, emails, or update tickets. Won't be long now until the agents are doing that as well...

malyk · 2025-03-27T16:29:20 1743092960

Because you'll be replaced by those engineers in N months/years when they can outperform you because they are wizards with the new tools.

It's like failing to adopt compiled code and sticking to punch cards. Or like refusing to use open source libraries and writing everything yourself. Or deciding that using the internet isn't useful.

Yes, developing as a craft is probably more fulfilling. But if you want it to be a career you have to adapt. Do the crafting on your own time. Employers won't pay you for it.

codr7 · 2025-03-27T17:36:07 1743096967

Let them replace me then, it's not a job I feel like doing anyway.

And when they have forgotten all about how to actually write software, the market is mine.