Why I'm Sticking With GPT-5.5 Over Claude
A personal comparison of GPT-5.5, Claude, Codex, and Claude Code based on coding workflow, context handling, readability, limits, and developer experience.
- Focus: Coding workflow
- Main tool: Codex app
- Comparison: GPT-5.5 vs Claude
- Status: Personal take
I personally love the Codex desktop app on macOS. I have been using it for both my personal and professional work. While Claude Code excels at the CLI experience, Claude’s desktop app, in comparison, is not that good. This is one of many reasons I have not switched to Claude Fable 5 and am sticking with GPT-5.5.
Codex Changed the Equation
Before the launch of the Codex app, I loved Claude Code’s CLI experience, but the Codex app is far more addictive. The built-in browser tool, code review, side chat, worktrees, plugins, automations, and the overall UI/UX of the app feel like what a modern agentic development environment should be. Not to mention, Anthropic has slowly made the CLI experience worse by hiding the model’s thinking and other useful information.
Context Is Not Just Window Size
Beyond the app itself, I prefer GPT-5.5 over Claude models for a few model-level reasons. While GPT-5.5 has a 1M context window, it is configured to use only ~256K in the Codex app. Claude models offer the full 1M context window. Despite that, I found Codex’s compaction so powerful that I almost never felt the 256K context window was too small.
Somehow, Claude models, despite the 1M context window, start to feel cluttered beyond a certain point. I have used Opus from the release of 4.5 through 4.8. I still have to explore Fable 5 more. It felt somewhat better in my initial interactions, but I am not sure it is good enough to make me switch from GPT-5.5.
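For readers unfamiliar with the term, here is a minimal sketch of what "compaction" means in general: once a conversation exceeds a token budget, older turns are replaced by a short summary while the most recent turns are kept verbatim. This is not how Codex actually implements it, which I do not know. The names `Message`, `count_tokens`, `summarize`, and `compact` are all hypothetical, the token counter is a crude word count, and the summary is a placeholder where a real system would ask the model to write one.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word.
    # A real system would use the model's tokenizer instead.
    return len(text.split())


def summarize(messages: list[Message]) -> Message:
    # Placeholder summary. A real system would have the model
    # condense the older turns into a short recap.
    return Message("system", f"[summary of {len(messages)} earlier messages]")


def compact(history: list[Message], budget: int, keep_recent: int = 2) -> list[Message]:
    """Return history unchanged if it fits the budget; otherwise replace
    everything except the last `keep_recent` messages with one summary."""
    total = sum(count_tokens(m.text) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return list(history)
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    return [summarize(older)] + recent


if __name__ == "__main__":
    history = [Message("user", "word " * 10) for _ in range(5)]
    for message in compact(history, budget=20):
        print(message.role, "|", message.text.strip())
```

The point of the sketch is only that a smaller window with good summaries can hold the useful parts of a long session, which matches my experience of rarely hitting the 256K ceiling.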
Readability, Vibes, and Trust
Then there is readability, something that I cannot directly explain. I find GPT-5.5’s responses more readable and easier to understand than Claude’s. GPT-5.5 usually does not produce a wall of information (or at least in the Codex app, it does not feel like that), while Claude’s output feels weirdly unreadable and hard to follow.
Company culture and vibes matter too. Anthropic has been doing a lot of bad things lately. The nerfing of Mythos to Fable is just disappointing. Typical of western culture: ‘We do not trust people to do the right thing except the ones we trust’. Compared to Claude’s extensive refusals, GPT-5.5 is extremely helpful and cooperative. Unless explicitly asked to do something harmful, GPT-5.5 is more than happy to help with anything.
Then there is the feel! Claude models strike me as very agreeable yet not very consistent. If you ask Claude to do the same thing several times, you may get different responses. Also, if you say something wrong, Claude may simply go along with it. I like Claude’s steerability, right up to the point where I can steer it in the wrong direction. Claude’s inconsistencies seem to be related to its short thinking time, but I could be wrong. I have heard that some of these issues were probably fixed in later versions of Opus (and probably Fable).
GPT-5.5 has a more consistent feel, which I attribute to its longer thinking time, but again I could be wrong. While GPT-5.5 is also steerable, it sometimes pushes back when you try to steer it in the wrong direction. The "wrong direction" here means what the developers think is wrong, not necessarily what is objectively wrong. On some issues, GPT-5.5 tries to be safe rather than take sides, IYKWIM. But in general, especially for coding and development work, GPT-5.5 just feels superior.
Limits and Practical Use
I used to have both a Claude and a ChatGPT subscription ($20 each) so I could keep using both models. However, the limits on Claude’s side were so restrictive that I could not use the Opus model in the first place, and Sonnet would last something like an hour. GPT-5.5 is an Opus-level model (not Sonnet-level), and it would last something like 2-3 hours of continuous use before I hit the 5-hour limit. So I canceled my Claude subscription and got two ChatGPT Plus subscriptions.
Claude has very strict limits. The moment your quota is used up, it stops. You cannot even use plain Claude chat, because its limits are shared with Claude Code. If you assign a long task to Codex with only 1% of your 5-hour limit remaining, it will keep working until the task is finished. And the good part is that the work done beyond 0% of the 5-hour limit does not even count against your weekly limit. Not to mention, you can still use ChatGPT even if you exhaust your Codex limits.
My Current Verdict
These assessments are based on the current state of the models and my personal experience. OpenAI could go in the same direction as Anthropic and nerf the model to be less helpful and more refusal-y. I know they are probably cooking GPT-5.6 or GPT-6, and they were the first ones to hype this ‘dangerous’ AI narrative. Let us wait and see how things evolve.