GPT‑5.3‑Codex‑Spark

What is GPT‑5.3‑Codex‑Spark?

GPT‑5.3‑Codex‑Spark is a research preview from OpenAI, designed as a smaller, ultra-fast version of GPT‑5.3‑Codex. It is the first model built specifically for real-time coding in Codex, and it generates more than 1,000 tokens per second when served on Cerebras' Wafer Scale Engine 3 hardware. It has a 128k-token context window and is text-only. Codex-Spark prioritizes speed while remaining capable, which suits interactive development where latency matters as much as intelligence.

Who it's for

Real-time coders who need near-instant responses to iterate rapidly on code without waiting for long generation times.
ChatGPT Pro subscribers who want early access to cutting-edge models for experimenting with high-speed, interactive workflows.
Developers focused on targeted edits who prefer minimal, precise changes over full-scale autonomous tasks and want to control when tests run.

Key features

Ultra-fast inference

Codex-Spark achieves more than 1000 tokens per second, thanks to its optimized design and Cerebras’ purpose-built AI accelerator. This speed enables real-time collaboration, allowing you to interrupt or redirect the model as it works.
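To put the throughput figure in perspective, here is a minimal sketch that converts a decode rate into wall-clock streaming time. The 1,000 tokens-per-second rate is the lower bound quoted above; the 400-token edit size and the 100 tokens-per-second comparison rate are hypothetical values chosen for illustration, not measurements.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decode rate.

    Ignores time-to-first-token and network overhead.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


EDIT_TOKENS = 400        # hypothetical size of a targeted edit
SPARK_RATE = 1000.0      # lower bound quoted for Codex-Spark
BASELINE_RATE = 100.0    # hypothetical slower model, for comparison only

spark_time = generation_seconds(EDIT_TOKENS, SPARK_RATE)
baseline_time = generation_seconds(EDIT_TOKENS, BASELINE_RATE)

print(f"Codex-Spark (assumed 1000 tok/s): {spark_time:.1f} s")
print(f"Hypothetical 100 tok/s model:     {baseline_time:.1f} s")
```

Under these assumptions the edit streams in well under a second, which is the range where interrupting and redirecting the model mid-response becomes practical.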

128k context window

With a 128k-token context window, Codex-Spark can hold a substantial amount of code or a long conversation in a single session. That suits complex projects where keeping context across many files is essential.
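As a rough way to judge whether a set of files would fit in that window, the sketch below estimates token counts from character counts. It assumes "128k" means 128,000 tokens and uses a heuristic of about four characters per token; neither is the model's actual tokenizer behavior, and the function names and the 8,000-token output reserve are hypothetical.

```python
CONTEXT_TOKENS = 128_000   # assumption: "128k" read as 128,000 tokens
CHARS_PER_TOKEN = 4        # rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate token count by ceiling-dividing the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(files: dict[str, str], reserve_for_output: int = 8_000) -> bool:
    """Return True if the files plus an output reserve fit in the window."""
    total = sum(estimate_tokens(body) for body in files.values())
    return total + reserve_for_output <= CONTEXT_TOKENS


# Usage: file names and contents here are placeholders.
project = {
    "app.py": "x" * 200_000,
    "utils.py": "y" * 120_000,
}
print(fits_in_context(project))
```

For real budgeting, a tokenizer-based count would replace the character heuristic, since code often tokenizes differently from prose.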

Lightweight default behavior

By default, Codex-Spark makes minimal, targeted edits and doesn’t automatically run tests unless you ask. This keeps interactions fast and focused, letting you control the depth of analysis or verification.

End-to-end latency improvements

Underlying optimizations, including a persistent WebSocket connection and a streamlined inference stack, reduce client/server roundtrip overhead by 80%, per-token overhead by 30%, and time-to-first-token by 50%. These improvements apply to all models, not just Codex-Spark.
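The three percentages compound differently depending on the shape of a session, so a small worked model can help. The sketch below applies the stated reductions to a hypothetical baseline; the baseline millisecond values, the additive cost model, and all names are assumptions made for illustration, not published figures.

```python
from dataclasses import dataclass

# Fraction of each overhead that remains after the stated reductions.
ROUNDTRIP_REMAINING = 0.20   # 80% reduction
PER_TOKEN_REMAINING = 0.70   # 30% reduction
TTFT_REMAINING = 0.50        # 50% reduction


@dataclass
class Overheads:
    roundtrip_ms: float      # overhead per client/server roundtrip
    per_token_ms: float      # overhead added to each streamed token
    first_token_ms: float    # time-to-first-token


def apply_reductions(base: Overheads) -> Overheads:
    """Scale each overhead by the fraction that remains."""
    return Overheads(
        roundtrip_ms=base.roundtrip_ms * ROUNDTRIP_REMAINING,
        per_token_ms=base.per_token_ms * PER_TOKEN_REMAINING,
        first_token_ms=base.first_token_ms * TTFT_REMAINING,
    )


def total_overhead_ms(o: Overheads, roundtrips: int, tokens: int) -> float:
    """Toy additive model: roundtrips + per-token cost + time-to-first-token."""
    return roundtrips * o.roundtrip_ms + tokens * o.per_token_ms + o.first_token_ms


# Hypothetical baseline, not measured values.
before = Overheads(roundtrip_ms=50.0, per_token_ms=1.0, first_token_ms=400.0)
after = apply_reductions(before)

print(round(total_overhead_ms(before, roundtrips=5, tokens=200)))
print(round(total_overhead_ms(after, roundtrips=5, tokens=200)))
```

With this made-up baseline, total overhead falls by a bit more than half. Sessions with many short roundtrips would gain more from the 80% figure, while long generations would be dominated by the 30% per-token figure.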

What stands out

"Codex-Spark is our first model designed specifically for working with Codex in real-time—making targeted edits, reshaping logic, or refining interfaces and seeing results immediately."

This focus on real-time interaction sets Codex-Spark apart from larger frontier models that excel at long-running, autonomous tasks. It narrows the gap between speed and intelligence, letting developers work with the model as they would with a pair programmer: interrupting, redirecting, and iterating without delay. The partnership with Cerebras adds a latency-first serving tier that makes responses feel near-instant.

Worth checking out if…

You’re a developer who values rapid iteration and real-time feedback in coding workflows. If you often find yourself waiting for model responses during interactive sessions, Codex-Spark offers a compelling alternative. It’s also worth exploring if you’re a ChatGPT Pro user eager to experiment with early-stage technology and provide feedback that shapes future releases. For those who prefer lightweight, targeted edits over full-scale autonomous tasks, this model delivers speed without overwhelming complexity.



