Amazon Web Services has struck a partnership with Cerebras Systems to make the startup's AI inference chips available through AWS, according to announcements from both companies and reporting by the Wall Street Journal and Reuters.
Cerebras, known for wafer-scale processors that are significantly larger than conventional GPU dies, has positioned its hardware as offering a speed advantage for AI inference workloads. Both companies said the collaboration is aimed at setting a new standard for inference speed and performance in the cloud.
The deal extends Cerebras' reach considerably. The Santa Clara-based startup had previously built its customer base through direct contracts with enterprises and national laboratories, but availability on AWS gives it access to a far broader pool of developers and corporate users without requiring dedicated on-premises deployments.
For Amazon, the arrangement adds specialist inference capacity at a moment when demand for fast, cost-efficient AI inference is intensifying across the industry. AWS already offers its own custom silicon, including the Trainium and Inferentia families, as well as access to Nvidia GPUs. Adding Cerebras provides another option for latency-sensitive applications.
Cerebras filed for an initial public offering in late 2024, though the listing had not proceeded as of early 2026. The AWS partnership represents a meaningful commercial development ahead of any eventual move to public markets.