Cerebras

by Cerebras Systems · Paid · AI API Platforms

World-class LLM inference speeds via wafer-scale AI chips, ideal for agentic workloads.

From $0.10/MTok

About

Cerebras delivers industry-leading inference throughput on its wafer-scale CS-3 systems, built around the WSE-3 chip, achieving token generation speeds far beyond GPU-based systems. It serves Llama 3.1 70B and 405B models, making it well suited to latency-sensitive agentic pipelines.

Features

Wafer-scale chip for extreme speed
Llama 3.1 70B & 405B support
Ideal for agentic low-latency workloads
Simple REST API
Competitive token pricing
Streaming responses
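The REST API follows the OpenAI-compatible chat-completions shape. Below is a minimal streaming client sketch using only the standard library; the endpoint URL and model identifier are assumptions, so check the Cerebras documentation for current values:

```python
# Hedged sketch of a streaming chat-completions call against an
# OpenAI-compatible endpoint. The URL and model name are assumptions.
import json
import urllib.request

API_URL = "https://api.cerebras.ai/v1/chat/completions"  # assumed endpoint

def build_request(prompt: str, model: str = "llama3.1-70b",
                  stream: bool = True) -> dict:
    """Build the JSON body for a chat-completions request."""
    return {
        "model": model,
        "stream": stream,
        "messages": [{"role": "user", "content": prompt}],
    }

def stream_chat(prompt: str, api_key: str):
    """POST the request and yield text deltas from the SSE stream."""
    body = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        for raw in resp:  # server-sent events, one "data: ..." line per chunk
            line = raw.decode().strip()
            if line.startswith("data: ") and line != "data: [DONE]":
                chunk = json.loads(line[len("data: "):])
                delta = chunk["choices"][0]["delta"].get("content", "")
                if delta:
                    yield delta
```

A caller would iterate `stream_chat("Hello", api_key)` and print each delta as it arrives, which is the pattern low-latency agentic loops rely on.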

Specifications

Context Window 128K tokens
Tool Use
Vision
Streaming
Open Source
Self-Host
Starting Price $0.10/MTok input
