The latest
OpenAI had planned a broad launch for GPT-5.6 but pulled back after the US government — briefed on the model’s capabilities before release — asked for a narrower rollout. Access is currently limited to selected partners through the API and Codex, with wider availability planned for ChatGPT and the broader API in the coming weeks.
Details
- GPT-5.6 is a three-model family: Sol is the flagship, Terra is the balanced everyday model, and Luna is the faster, lower-cost option. Pricing runs from $1 per million input tokens for Luna to $5 for Sol.
- OpenAI says Sol scored 91.9% on TerminalBench 2.1 in Ultra mode, ahead of Claude Mythos 5 at 88%. Those figures come from OpenAI’s own preview materials and have not been independently verified.
- Sol introduces Ultra mode, which distributes complex tasks across parallel subagents — a shift that could affect extended coding sessions, security analysis, and scientific workflows.
- OpenAI says Sol does not cross its “Cyber Critical” threshold under the Preparedness Framework. In testing, the model identified vulnerabilities and exploit primitives but did not autonomously produce functional full-chain exploits.
- Safeguards include model-level refusals, real-time misuse classifiers for cybersecurity and biology, and account-level review. OpenAI used more than 700,000 A100-equivalent GPU hours for automated red-teaming against universal jailbreaks.
- CEO Sam Altman said a limited rollout of this kind may be warranted when models reach major new capability levels, but added that OpenAI does not consider it the optimal process long term.
- Anthropic previously took down Fable 5 and Mythos 5 following a US export control directive. In both the Anthropic and OpenAI cases, government action is shaping who gets access to frontier models and when.
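
To put the pricing bullet in concrete terms, here is a minimal sketch of the input-side cost arithmetic. It uses only the two figures quoted above ($1 per million input tokens for Luna, $5 for Sol). Terra's price and all output-token prices are not stated, so they are left out. The function name `input_cost_usd`, the lowercase model keys, and the 250,000-token example are illustrative assumptions, not part of any OpenAI API.

```python
# Input-token prices in USD per million tokens, as quoted in the brief.
# Terra's price and all output-token prices are not given, so they are omitted.
INPUT_PRICE_PER_MILLION = {"luna": 1.00, "sol": 5.00}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Return the input-side cost in USD for a prompt of `input_tokens` tokens.

    `model` is a hypothetical lowercase key ("luna" or "sol"), not an API model ID.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    price = INPUT_PRICE_PER_MILLION[model.lower()]
    return price * input_tokens / 1_000_000


if __name__ == "__main__":
    # Illustrative workload: a 250,000-token prompt on each priced model.
    for name in ("luna", "sol"):
        print(f"{name}: ${input_cost_usd(name, 250_000):.2f}")
```

At these list prices the same prompt costs five times as much on Sol as on Luna for input alone. The total bill would also depend on output pricing, which the preview figures above do not cover.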
What to watch
The capability story is secondary to the access question: when does GPT-5.6 move from trusted partners to general availability, and what criteria govern that transition? OpenAI says it is working with the administration on a repeatable framework for future releases. Until that framework is public and tested, every major frontier model launch carries the same uncertainty — government review is now part of the release cycle.