OpenAI Leverages Cerebras Wafer-Scale Chip to Boost Codex Speed
Partnership and New Hardware
OpenAI announced a collaboration with Cerebras that brings its Codex‑Spark coding model to the Wafer Scale Engine 3. The processor, often described as the size of a dinner plate, is Cerebras' flagship hardware offering, and the Codex‑Spark deployment is the first product to emerge from the partnership the companies announced earlier this year.
Performance Benchmarks
Codex‑Spark delivers about 1,000 tokens per second, a speed that OpenAI calls modest by Cerebras standards. Cerebras has measured 2,100 tokens per second on Llama 3.1 70B and reported 3,000 tokens per second on OpenAI's open‑weight gpt‑oss‑120B model, suggesting the lower figure reflects Codex‑Spark's larger size or greater complexity.
Why Speed Matters
AI‑driven coding assistants have experienced a breakout year, with tools such as OpenAI's Codex and Anthropic's Claude Code becoming increasingly useful for rapid prototyping, interface design and boilerplate generation. Faster inference translates directly into quicker developer iteration: at 1,000 tokens per second, suggestions arrive in what developers describe as a "rip saw" of output rather than a slow, laborious trickle.
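The practical effect of those throughput figures is easy to estimate. The sketch below uses the rates quoted above; the 2,000‑token suggestion length is an illustrative assumption, not a benchmark from either company.

```python
# Back-of-envelope latency comparison at the quoted throughput figures.
# The 2,000-token suggestion length is a hypothetical example.

def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Wall-clock seconds to stream `tokens` at a given throughput."""
    return tokens / tokens_per_second

for label, tps in [
    ("Codex-Spark (1,000 tok/s)", 1_000),
    ("Llama 3.1 70B (2,100 tok/s)", 2_100),
    ("gpt-oss-120B (3,000 tok/s)", 3_000),
]:
    # A 2,000-token suggestion takes 2.0 s at 1,000 tok/s,
    # and well under a second at 3,000 tok/s.
    print(f"{label}: {generation_seconds(2_000, tps):.2f} s")
```

Even at the "modest" 1,000 tokens per second, a sizable suggestion streams in a couple of seconds, which is the difference developers feel during tight edit‑and‑retry loops.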
Competitive Landscape
The coding‑assistant market is crowded. OpenAI, Anthropic, Google and other firms are racing to ship more capable agents, and latency has become a key differentiator. OpenAI recently rolled out GPT‑5.3‑Codex after an internal “code red” memo highlighted competitive pressure from Google, following the earlier release of GPT‑5.2 in December.
Reducing Dependence on Nvidia
OpenAI has been systematically diversifying its hardware suppliers. The company signed a multi‑year deal with AMD in October 2025, entered a $38 billion cloud‑computing agreement with Amazon in November, and is designing its own custom AI chip for eventual fabrication by TSMC. A planned $100 billion infrastructure deal with Nvidia has stalled, though Nvidia later committed a $20 billion investment. Reuters reported that OpenAI grew unsatisfied with the speed of some Nvidia chips for inference tasks, a shortfall Codex‑Spark aims to address.
Implications for Developers
For developers spending hours inside a code editor waiting for AI suggestions, the speed gains offered by Codex‑Spark could meaningfully reduce friction. While the performance numbers are still modest compared with Cerebras’ top benchmarks, the partnership signals OpenAI’s commitment to delivering faster, more responsive coding tools as part of a broader hardware diversification strategy.