So let me get this straight: OpenAI previously had an issue with LOTS of different models and versions being available. Then they solved this by introducing GPT-5, which was more like a router that put all these models under the hood, so you only had to prompt GPT-5 and it would route to the best-suited model. This worked great, I assume, and made the UI comprehensible for the user. But now they are starting to introduce more different models again?
We got:
- GPT-5.1
- GPT-5.2 Thinking
- GPT-5.3 (codex)
- GPT-5.3 Instant
- GPT-5.4 Thinking
- GPT-5.4 Pro
Who’s to blame for this ridiculous path they are taking? I’m so glad I am not a Chat user, because this adds so much unnecessary cognitive load.
The good news here is the support for a 1M context window; it has finally caught up to Gemini.
The real problem that OpenAI had was that their model naming was completely incomprehensible. 4.5, o3, 4o, 4.1 which is newer than 4.5. It was a complete clusterfuck. The blowback on that issue seems to have led them to misidentify the issue, but nobody was really asking for a single router model. Having a number of sequentially numbered and clearly labelled models is not actually a problem.
I just don't understand how this happens. Either there's literally no product management at a cross-product level or there is and they had a meeting where this plan was discussed and someone approved it.
I'm not sure which would be more shocking, especially considering it's a decade-old, multi-billion-dollar company paying top salaries.
> Who’s to blame for this ridiculous path they are taking?
Variability, different pressures and fast progress. What's your concrete idea for how to solve this, without the power of hindsight?
For example, with the codex model: Say you realize at some point in the past that this could be a thing, a model specifically post-trained for coding, which makes coding better, but not other things. What are they supposed to do? Not release it, to satisfy a cleaner naming scheme?
And if, at a later point, they realize they don't need that distinction anymore, that the techniques that went into the separate coding model are somehow obsolete, what option do you have other than dropping the name again?
As someone else pointed out, the previous problems were around a very silly naming pattern. This seems about as descriptive as you can get, given what you have.
> Who’s to blame for this ridiculous path they are taking? I’m so glad I am not a Chat user, because this adds so much unnecessary cognitive load.
Most people have it on auto-select, I'm assuming, so this is a non-issue. They keep older models active likely because some people prefer certain models until they try the new one, or because they can't switch all the compute over to the new models at once.
Well, they keep older ones around, of course. But the options actual users see are "Auto", "Instant (5.3)", or "Thinking (5.4)". Not that complicated, really.
> Then they solved this by introducing GPT-5 which was more like a router that put all these models under the hood so you only had to prompt to GPT-5, and it would route to the best suitable model.
Was this ever explicitly confirmed by OpenAI? I've only ever seen it in the form of a rumor.
Ask the router "What model are you?". It will yap on and on about being a GPT-5.3 model (non-thinking OpenAI models are insufferable yappers that don't know when to shut up).
Ask it now "What model are you. Think carefully". It concisely replies "GPT-5.4 Thinking".
> GPT‑5 is a unified system with a smart, efficient model that answers most questions, a deeper reasoning model (GPT‑5 thinking) for harder problems, and a real‑time router that quickly decides which to use based on conversation type, complexity, tool needs, and your explicit intent (for example, if you say “think hard about this” in the prompt)
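To make the quoted description concrete, here is a toy sketch of what that kind of routing could look like. The heuristics, thresholds, and model labels below are entirely made up for illustration; OpenAI has not published how its real router decides.

```python
# Toy illustration of the router idea from the quote above: pick a
# deeper reasoning model for hard/explicit-intent prompts, a fast
# default model otherwise. All heuristics here are invented.

def route(prompt: str, needs_tools: bool = False) -> str:
    text = prompt.lower()
    # Explicit intent: the user asked for deeper reasoning
    # (e.g. "think hard about this").
    if "think hard" in text or "think carefully" in text:
        return "thinking-model"
    # Crude complexity proxy: tool use or very long prompts
    # go to the deeper reasoning model.
    if needs_tools or len(prompt.split()) > 200:
        return "thinking-model"
    # Everything else goes to the fast default model.
    return "fast-model"

print(route("What's the capital of France?"))        # fast-model
print(route("Think hard about this proof sketch."))  # thinking-model
```

The real system presumably uses a learned classifier over conversation type, complexity, and tool needs rather than keyword checks, but the dispatch structure is the same: one entry point, multiple models behind it.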