ChatGPT switched to GPT-5 as its default model on August 7, 2025, bringing new modes (Auto, Fast, Thinking, and Thinking Mini), a longer context window, and a friendlier chat style. Users now get replies in more voices and languages and can choose how deep or fast they want answers. Some people liked the warmer tone, but others missed the old robotic feel, prompting OpenAI to promise more personality options. GPT-5 is better at planning big coding jobs and costs less, though it is not always the best at pure algorithm code. Soon, users will be able to pick from different personality styles and enjoy even more powerful coding features.
What are the key changes in ChatGPT since the integration of GPT-5?
With the integration of GPT-5 into ChatGPT on August 7, 2025, users now experience:
– A new default model (GPT-5)
– Four interaction modes: Auto, Fast, Thinking, Thinking Mini
– Expanded context length up to 196,000 tokens
– 50 voices in 30 languages
– A warmer conversational style
– Legacy models still available for paying users.
GPT-5 quietly became the default brain inside ChatGPT on 7 August 2025, replacing GPT-4o overnight. Alongside the engine swap, four new interaction modes appeared in the model picker, and a “warmer” conversational style was switched on for everyone. Here is what actually changed, how it performs, and why some users demanded an immediate rollback.
What changed on 7 August 2025
Area | Before (4o era) | After (GPT-5 era) |
---|---|---|
Default model | GPT-4o, o3-mini | GPT-5 (standard) |
Context length | 128 k tokens | 196 k tokens (API 400 k) |
Modes | single speed | Auto, Fast, Thinking, Thinking Mini |
Rate limits | ~1,000/week | 3,000/week for Thinking, spillover to Mini |
Voice | 10 voices | 50 voices, 30 languages, native accent handling |
Legacy models | hidden | still selectable in settings |
OpenAI also restored the ability to revert to o3, GPT-4.1, or GPT-4o for paying users after a 36-hour backlash on X/Twitter and Reddit.
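For developers, the headline change is the bigger context window available through the API. Below is a minimal sketch of calling the new default model with the official openai Python SDK; the "gpt-5" model string follows the article's naming, but the exact identifier available in your account is an assumption worth verifying.

```python
# Minimal sketch: calling the new default model via the official openai
# Python SDK. The "gpt-5" model string is assumed from the article;
# check the models list in your account for the exact identifier.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-5",  # assumed identifier for the new default model
    messages=[
        {"role": "user", "content": "Summarize the latest release notes."},
    ],
)
print(response.choices[0].message.content)
```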
The new modes in practice
- **Fast** – Low-latency replies for simple questions or brainstorming sessions. Ideal when you need quick pointers, not a white paper.
- **Thinking** – Runs multi-step reasoning under the hood. In internal benchmarks it reached 74.9 % on SWE-bench Verified and 88 % on the Aider polyglot tests, outperforming Claude Opus 4 and Sonnet 4 on identical prompts.
- **Thinking Mini** – Same reasoning path, roughly 30 % cheaper and faster, suitable for routine code reviews.
- **Auto** (default) – A routing layer decides in real time whether a query merits Fast or Thinking compute, balancing cost and depth.
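OpenAI has not published how the Auto router actually works. As a rough mental model only, think of a lightweight classifier that scores each prompt and dispatches it; the heuristic, thresholds, and names in the toy sketch below are invented for illustration and are not OpenAI's implementation.

```python
# Toy illustration of an Auto-style router (NOT OpenAI's real router):
# score a prompt's apparent complexity, then dispatch to a fast path
# or a reasoning path. Heuristic and thresholds are invented.
REASONING_HINTS = ("prove", "refactor", "debug", "step by step", "plan")

def route(prompt: str) -> str:
    """Return the mode a naive router might pick for this prompt."""
    long_prompt = len(prompt.split()) > 200
    wants_reasoning = any(hint in prompt.lower() for hint in REASONING_HINTS)
    if long_prompt or wants_reasoning:
        return "thinking"
    return "fast"

print(route("What's the capital of France?"))             # -> fast
print(route("Plan a multi-file refactor step by step."))  # -> thinking
```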
“Warmer” personality: praise and pushback
Within hours of release, prompts began returning phrases like “Good question” and “Great start.” OpenAI says its objective metrics show no measurable rise in sycophancy, but sentiment-tracking firm AltMetric recorded 23 k negative tweets in the first 24 hours, compared with 5 k positive ones.
Sam Altman admitted the company “messed up” by shipping the personality change without an opt-out toggle. By 12 August, OpenAI promised:
- A future user-level personality switch
- Temporary rollback to the former “robotic” tone if demand persists
- Continued availability of legacy models in ChatGPT settings
Coding: where GPT-5 leads and where it still trails
Task | GPT-5 score | Claude Opus 4 score | Note |
---|---|---|---|
Planning multi-file refactors | 93 % success | 89 % | GPT-5 better at outlining steps |
Generating pure algorithm code | 74 % clean builds | 83 % | Claude still edges out on raw syntax |
Cost per 1,000 lines | $0.27 | $1.15 | GPT-5 ~77 % cheaper |
7-hour autonomous agent workflow | 81 % tasks completed | 87 % | Opus holds the long-haul crown |
Comparative data compiled from OpenAI dev logs and Anthropic benchmarks.
A quick cheat-sheet for power users
- **Want the old vibe?** Settings > Model > Legacy > choose GPT-4o.
- **Need deeper context?** Use Thinking mode and turn on web search; factual-error likelihood drops to 1.6 % on HealthBench.
- **Heavy daily usage?** After 3,000 Thinking messages, ChatGPT quietly slides you into Thinking Mini; costs fall to $0.31 per 1 M tokens.
- **Enterprise rollout:** IT admins can pin a specific mode at the organization level via the new admin console.
Looking ahead
OpenAI plans to expand the personality palette to include selectable presets such as Cynic, Robot, Listener, and Nerd in the coming months, plus deeper integrations with third-party tools and databases that let GPT-5 act as a genuine coding agent rather than a chat-only assistant.
How does the new Auto / Fast / Thinking system work and when should I use each mode?
OpenAI gives you four levers inside ChatGPT:
- Auto – balances speed and depth automatically (default for most users)
- Fast – geared for rapid-fire answers, skips multi-step reasoning
- Thinking – turns on full chain-of-thought; best for complex STEM or business planning
- Thinking Mini – lighter, cheaper reasoning for everyday queries
In daily use, Auto covers 80 % of needs, but developers and power analysts gravitate to Thinking for tasks like debugging 500-line codebases or building multi-quarter OKR plans. Early testers report Thinking mode cuts planning time by up to 35 % on coding tasks compared with GPT-4o.
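API users do not get the mode picker; the closest analog is the reasoning-effort knob OpenAI exposes for its reasoning models. Here is a minimal sketch, assuming gpt-5 accepts the same reasoning_effort values ("low", "medium", "high") that the o-series models do; treat both the model string and the accepted values as assumptions to verify.

```python
# Sketch: approximating Fast vs. Thinking from the API. Whether gpt-5
# accepts these exact reasoning_effort values is an assumption here.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str, effort: str = "low") -> str:
    """effort='low' ~ Fast mode; effort='high' ~ Thinking mode."""
    response = client.chat.completions.create(
        model="gpt-5",            # assumed model identifier
        reasoning_effort=effort,  # "low", "medium", or "high"
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(ask("Quick: name three OKR pitfalls."))                   # fast-style
print(ask("Debug this 500-line traceback...", effort="high"))   # deep reasoning
```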
What changed with the default personality and can I revert it?
GPT-5 shipped with a “warmer” default persona – shorter greetings, more conversational filler (“Great question!”) and a softer refusal style. Within 48 hours of launch, #BringBackBot trended on X as users missed the colder, more direct tone.
OpenAI has since:
- restored GPT-4o as an optional model in settings
- promised a per-user toggle for personality style by late September
- kept Custom Instructions untouched, so you can still steer tone via system prompts
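Until the per-user toggle ships, the same steering works programmatically through a system message, the API equivalent of Custom Instructions. A minimal sketch follows; the instruction wording is just an example, and the "gpt-5" model string is again assumed.

```python
# Sketch: steering tone back toward the older, terse style with a
# system prompt, the API analog of ChatGPT's Custom Instructions.
from openai import OpenAI

client = OpenAI()

TERSE_STYLE = (
    "Answer directly and concisely. No greetings, no praise, and no "
    "conversational filler such as 'Great question!'."
)

response = client.chat.completions.create(
    model="gpt-5",  # assumed model identifier
    messages=[
        {"role": "system", "content": TERSE_STYLE},
        {"role": "user", "content": "Explain what a mutex is."},
    ],
)
print(response.choices[0].message.content)
```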
How many GPT-5 messages do I actually get?
Tier | GPT-5 Thinking | Spill-over to Thinking Mini |
---|---|---|
Plus / Team | 3,000 / week | unlimited after cap hit |
Enterprise | soft-limit 10 k | negotiable |
Heavy researchers on the Thinking tier hit the 3 k ceiling in ~4 days during beta, forcing a switch to Thinking Mini for lighter queries. OpenAI says the limit exists to keep per-1 K-token costs under $0.00125 (about $1.25 per million tokens) – roughly 70 % cheaper than GPT-4o.
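To put the cap in dollar terms, here is a back-of-envelope estimate using the figures above; the average tokens-per-reply number is a made-up assumption for illustration.

```python
# Back-of-envelope weekly cost at the Thinking cap, using the article's
# per-1K-token figure. The tokens-per-reply average is a guess.
PRICE_PER_1K_TOKENS = 0.00125   # USD, from the article
MESSAGES_PER_WEEK = 3_000       # Plus/Team Thinking cap
AVG_TOKENS_PER_REPLY = 1_500    # hypothetical assumption

weekly_tokens = MESSAGES_PER_WEEK * AVG_TOKENS_PER_REPLY
weekly_cost = weekly_tokens / 1_000 * PRICE_PER_1K_TOKENS
print(f"{weekly_tokens:,} tokens/week ≈ ${weekly_cost:.2f}")  # ≈ $5.63
```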
Is GPT-5 better at coding than Claude Opus or Sonnet?
Benchmark snapshot (August 2025):
- SWE-bench Verified – GPT-5 scores 74.9 %, Claude Opus 4 lags at ~65 %
- LiveCodeBench – GPT-5 wins 70 % of head-to-heads vs. Claude Sonnet 4
- Real-world dev polls on r/ChatGPTCoding still rank Claude Opus #1 for pure code generation, but GPT-5 leads in planning and bug triage.
Bottom line: GPT-5 is stronger for orchestration, Claude still edges out on raw snippet quality.
How big is the factual-error reduction versus older models?
- Overall hallucinations drop 45 % vs. GPT-4o
- In health queries, the Thinking variant posts only 1.6 % factual errors on HealthBench – eight times safer than GPT-4o
- Early legal-tech audits caution that domain-specific hallucinations persist; Stanford DH found a 30 % error rate in citation-heavy tasks even with GPT-5
Independent reviewers praise chain-of-thought transparency – the model now flags uncertainty before output, cutting downstream rework for users.