Replicate
Run thousands of open-source AI models via a simple API with serverless GPU billing.
Pay per prediction · Free tier available
About
Replicate is a cloud platform for running and deploying open-source AI models, including LLaMA, Stable Diffusion, and Whisper. Billing is per prediction on serverless GPUs, and custom models can be packaged and deployed with Cog.
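To make "run a model via the API" concrete, here is a minimal sketch of how a prediction request might be assembled in Python. It assumes the predictions endpoint at `https://api.replicate.com/v1/predictions`, bearer-token authentication, and a JSON body with `version` and `input` fields; the token, version ID, and `prompt` input are placeholders, and the helper name `build_prediction_request` is hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumed endpoint for creating predictions; check the current API reference.
API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(token, version, model_input):
    """Build (but do not send) a POST request that asks for one prediction.

    `version` identifies the model version to run and `model_input` is the
    model-specific input dictionary. Both are placeholders in this sketch.
    """
    body = json.dumps({"version": version, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values: substitute a real token and model version before sending.
req = build_prediction_request(
    "YOUR_API_TOKEN",
    "MODEL_VERSION_ID",
    {"prompt": "a watercolor fox"},
)
# Sending would be: urllib.request.urlopen(req). Omitted so the sketch runs offline.
```

Because the API is plain HTTP with JSON, the same request can be issued from any language with an HTTP client.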
Features
1000+ public models in catalog
Stable Diffusion, LLaMA, Whisper & more
Custom model deployment with Cog
Serverless GPU infrastructure
Language-agnostic REST API
Webhooks & streaming
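As an illustration of the webhook feature, the sketch below shows one way a receiving service might interpret a webhook body. It assumes the body is a JSON prediction object with `id`, `status`, `output`, and `error` fields, and that `succeeded`, `failed`, and `canceled` are the terminal statuses; the helper `summarize_webhook` and the sample payload are hypothetical and not taken from a real response.

```python
import json

# Statuses assumed to mean the prediction will not change any further.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def summarize_webhook(raw_body: bytes) -> dict:
    """Reduce a webhook body to the fields a caller usually acts on."""
    prediction = json.loads(raw_body)
    status = prediction.get("status")
    return {
        "id": prediction.get("id"),
        "status": status,
        "done": status in TERMINAL_STATUSES,
        # Only expose output once the prediction has succeeded.
        "output": prediction.get("output") if status == "succeeded" else None,
        "error": prediction.get("error"),
    }


# Made-up payload for illustration only.
sample = json.dumps(
    {
        "id": "example-id",
        "status": "succeeded",
        "output": ["https://example.com/out.png"],
    }
).encode("utf-8")

summary = summarize_webhook(sample)
```

With webhooks, the caller does not need to poll for completion: the platform posts the prediction state to a URL the caller controls, and a handler like this decides whether the result is final.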
Specifications
| Specification | Value |
| --- | --- |
| Context Window | Model dependent |
| Tool Use | |
| Vision | |
| Streaming | |
| Open Source | |
| Self-Host | |
| Starting Price | Pay per prediction |