


GPT‑5.3‑Codex‑Spark is a research preview from OpenAI, designed as a smaller, ultra-fast version of GPT‑5.3‑Codex. It’s the first model built specifically for real-time coding within Codex, delivering over 1000 tokens per second on Cerebras’ Wafer Scale Engine 3 hardware. With a 128k context window and text-only support, Codex-Spark prioritizes speed without sacrificing capability, making it ideal for interactive development where latency is as critical as intelligence.
Codex-Spark achieves more than 1000 tokens per second, thanks to its optimized design and Cerebras’ purpose-built AI accelerator. This speed enables real-time collaboration, allowing you to interrupt or redirect the model as it works.
With a 128k context, Codex-Spark can handle substantial codebases or long conversations in a single session. This makes it suitable for complex projects where maintaining context across many files is essential.
By default, Codex-Spark makes minimal, targeted edits and doesn’t automatically run tests unless you ask. This keeps interactions fast and focused, letting you control the depth of analysis or verification.
Underlying optimizations—like a persistent WebSocket connection and streamlined inference stack—reduce client/server roundtrip overhead by 80%, per-token overhead by 30%, and time-to-first-token by 50%. These benefits apply to all models, not just Codex-Spark.
"Codex-Spark is our first model designed specifically for working with Codex in real-time—making targeted edits, reshaping logic, or refining interfaces and seeing results immediately."
This focus on real-time interaction sets Codex-Spark apart from larger frontier models that excel at long-running, autonomous tasks. It bridges the gap between speed and intelligence, enabling developers to collaborate with the model as if it were a pair programmer—interrupting, redirecting, and iterating without delay. The partnership with Cerebras further amplifies this edge by providing a latency-first serving tier that feels near-instant.
You’re a developer who values rapid iteration and real-time feedback in coding workflows. If you often find yourself waiting for model responses during interactive sessions, Codex-Spark offers a compelling alternative. It’s also worth exploring if you’re a ChatGPT Pro user eager to experiment with early-stage technology and provide feedback that shapes future releases. For those who prefer lightweight, targeted edits over full-scale autonomous tasks, this model delivers speed without overwhelming complexity.
Other tools you might consider
Loading comments…
Maker
meowbyte
Visit Website
openai.com/index/introducing-gpt-5-3-codex-spark/
Project Info
Product Keywords
Achievement