Weird, I've had the opposite experience. Codex is good at doing precisely what I tell it to do, while Opus suggests well-thought-out plans, even pushing back when it needs to.
This is just the stochastic nature of LLMs at play. I think all of the SOTA models are roughly equivalent, but without enough samples people end up reading too much into it.
There's a certain amount of variance in how people use these agents. Put five people in a room, ask them to compose the same prompt, and you'll get five distinct prompts. Couple that with the fact that models respond better or worse to a prompt depending on its stylistic composition. And since each person tends to write in a consistent style, some people will have more luck with one model over another, simply because that model happens to align more readily with their prompt style.
Case in point: I've noticed that I tend to prefer Codex's output for planning and review, but Opus's for implementation; this is the inverse of what others at work prefer.
I used to feel the way you do, but I no longer agree; I'd just say it's not consistent. For a given codebase and a given goal, sometimes Claude will be the more sensible, creative, thoughtful planner and sometimes Codex will be; sometimes Claude will make a serious oversight that Codex catches, and sometimes the opposite. But the trend for me, and seemingly for a lot of people, is that Claude is a more "human-like/human-smart" planner than Codex (in a positive way) but is more likely to make mistakes or forget details when implementing major codebase changes.
It's well worth the $20 to not have to deal with any limits and have it handle all the repetitive boilerplate BS we programmers seem forced to deal with. I think 80% of the benefit comes from spending that $20 (20%? :P) and just having it do the lame shit that we probably shouldn't have to do but somehow need to.