OpenAI Leverages Cerebras Wafer-Scale Chip to Boost Codex Speed
Partnership and New Hardware
OpenAI announced a collaboration with Cerebras that brings its Codex‑Spark coding model to the Wafer Scale Engine 3. The processor, often described as the size of a dinner plate, is Cerebras' flagship hardware offering, and the Codex‑Spark deployment is the first product to emerge from the partnership the companies announced earlier this year.
Performance Benchmarks
Codex‑Spark delivers about 1,000 tokens per second, a speed that OpenAI calls modest by Cerebras standards. Cerebras has measured 2,100 tokens per second on Llama 3.1 70B and reported 3,000 tokens per second on OpenAI's open‑weight gpt‑oss‑120B model, suggesting the lower figure reflects Codex‑Spark's larger size or greater complexity.
Why Speed Matters
AI‑driven coding assistants have experienced a breakout year, with tools such as OpenAI's Codex and Anthropic's Claude Code becoming increasingly useful for rapid prototyping, interface design and boilerplate generation. Faster inference translates directly into quicker developer iteration: at 1,000 tokens per second, suggestions arrive in what developers describe as a "rip saw" of output rather than a slow, laborious trickle.
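The practical effect of those throughput figures is easy to estimate. The sketch below uses the rates quoted above; the 2,000‑token suggestion length is an illustrative assumption, not a benchmark from either company.

```python
# Back-of-envelope latency comparison at the quoted throughput figures.
# The 2,000-token suggestion length is a hypothetical example.

def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Wall-clock seconds to stream `tokens` at a given throughput."""
    return tokens / tokens_per_second

for label, tps in [
    ("Codex-Spark (1,000 tok/s)", 1_000),
    ("Llama 3.1 70B (2,100 tok/s)", 2_100),
    ("gpt-oss-120B (3,000 tok/s)", 3_000),
]:
    # A 2,000-token suggestion takes 2.0 s at 1,000 tok/s,
    # and well under a second at 3,000 tok/s.
    print(f"{label}: {generation_seconds(2_000, tps):.2f} s")
```

Even at the "modest" 1,000 tokens per second, a sizable suggestion streams in a couple of seconds, which is the difference developers feel during tight edit‑and‑retry loops.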
Competitive Landscape
The coding‑assistant market is crowded. OpenAI, Anthropic, Google and other firms are racing to ship more capable agents, and latency has become a key differentiator. OpenAI recently rolled out GPT‑5.3‑Codex after an internal “code red” memo highlighted competitive pressure from Google, following the earlier release of GPT‑5.2 in December.
Reducing Dependence on Nvidia
OpenAI has been systematically diversifying its hardware suppliers. The company signed a multi‑year deal with AMD in October 2025, entered a $38 billion cloud‑computing agreement with Amazon in November, and is designing its own custom AI chip for eventual fabrication by TSMC. A planned $100 billion infrastructure deal with Nvidia has stalled, though Nvidia later committed a $20 billion investment. Reuters reported that OpenAI grew unsatisfied with the speed of some Nvidia chips for inference tasks, a shortfall Codex‑Spark aims to address.
Implications for Developers
For developers spending hours inside a code editor waiting for AI suggestions, the speed gains offered by Codex‑Spark could meaningfully reduce friction. While the performance numbers are still modest compared with Cerebras’ top benchmarks, the partnership signals OpenAI’s commitment to delivering faster, more responsive coding tools as part of a broader hardware diversification strategy.