Codex works much better for long-running tasks that require a lot of planning an...

incoming1211 · 2025-10-20T22:32:33 1760999553

I disagree. Codex always gets stuck and wants to double-check and clarify things. It's like, "dammit, just execute the plan and don't tell me until it's completely finished."

The output of Codex is also not as good. Codex is great at the planning and investigation portion but sucks at execution and code quality.

ewoodrich · 2025-10-20T22:54:43 1761000883

I've been dealing with this on Codex a lot lately. It confidently wraps up a task, I go to check its work... and it's not even close.

Then I do a double take and re-read the summary message and realize that it pulled a "and then draw the rest of the owl", seemingly arbitrarily picking and choosing what it felt like doing in that session and what it punted over to "next steps to actually get it running".

Claude is more prone to occasional "cheating" with mocked data or "tbd: make this an actual conditional instead of hardcoded If True" stuff when it gets overwhelmed, which is annoying and bad. But it at least has strong task adherence for the user's prompt and doesn't make me write a lawyer-esque contract to close any loopholes Codex will use to avoid doing work.

aaronblohowiak · 2025-10-20T23:45:18 1761003918

Are you using something like spec-kit?

shmoogy · 2025-10-20T21:23:51 1760995431

Can/does Codex actually check Docker logs and other things for feedback while iterating on something that isn't working? That is where the true magic of Claude comes in for me. Often things can't be one-shot, but being able to iteratively check logs, make an adjustment, rebuild the Docker containers, send a curl, and confirm the fix is a huge improvement.

intellectronica · 2025-10-20T22:08:44 1760998124

Yes, in this regard it's very similar. It works as an agent and does whatever you need it to do to complete the task. In comparison to Claude, it tends to plan more and improvise less.