Fireworks AI

by Fireworks AI freemium AI API Platforms

High-speed inference for open-source LLMs with function calling and structured outputs.

From $0.20/MTok Free tier available

About

Fireworks AI delivers production-grade inference for open-source models like LLaMA and Mixtral with sub-200ms latency. Supports function calling, structured JSON output, vision models, and serverless or dedicated GPU deployments.

Features

Sub-200ms inference latency
LLaMA 3, Mixtral, Gemma, Qwen models
Function calling & structured output
OpenAI-compatible API
Serverless & dedicated GPU options
Vision & multimodal models

Specifications

Context Window 128K tokens
Tool Use
Vision
Streaming
Open Source
Self-Host
Starting Price $0.20/MTok input

Community Feedback

How would you rate Fireworks AI?

Quick Info

Category AI API Platforms
Pricing freemium
Vendor Fireworks AI
From $0.20/MTok

Free tier available

Start Free