Fireworks AI
High-speed inference for open-source LLMs with function calling and structured outputs.
From $0.20/MTok · Free tier available
About
Fireworks AI delivers production-grade inference for open-source models such as Llama 3 and Mixtral with sub-200ms latency. It supports function calling, structured JSON output, vision models, and both serverless and dedicated-GPU deployments.
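Because the API is OpenAI-compatible, a standard chat-completions request body works as-is; only the base URL and model name change. A minimal sketch of building such a request with a structured-output constraint (the base URL is the commonly documented Fireworks endpoint and the model name is a placeholder, so verify both against the current Fireworks docs):

```python
import json

# Assumed OpenAI-compatible endpoint -- confirm against Fireworks docs.
BASE_URL = "https://api.fireworks.ai/inference/v1"

def build_chat_request(prompt: str,
                       model: str = "accounts/fireworks/models/llama-v3-8b-instruct"):
    """Build the JSON body for a chat completion call (not sent here)."""
    return {
        "model": model,  # placeholder model identifier
        "messages": [{"role": "user", "content": prompt}],
        # Structured output: constrain the reply to valid JSON.
        "response_format": {"type": "json_object"},
        "max_tokens": 256,
    }

body = build_chat_request("List three primary colors as a JSON array.")
print(json.dumps(body, indent=2))
```

Sending the body is then a normal `POST` to `{BASE_URL}/chat/completions` with a bearer-token `Authorization` header, exactly as with the OpenAI API.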
Features
Sub-200ms inference latency
LLaMA 3, Mixtral, Gemma, Qwen models
Function calling & structured output
OpenAI-compatible API
Serverless & dedicated GPU options
Vision & multimodal models
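The function-calling feature above follows the OpenAI tool-definition schema: tools are declared as JSON Schema objects and the model decides when to invoke one. A sketch with an illustrative tool (the function name, parameters, and model identifier are assumptions, not from Fireworks docs):

```python
# Illustrative tool (function) definition in the OpenAI-compatible format.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function name
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# Request body that offers the tool to the model.
request_body = {
    "model": "accounts/fireworks/models/llama-v3-8b-instruct",  # placeholder
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
```

When the model elects to call the tool, the response carries a `tool_calls` entry with the function name and JSON arguments; the client executes the function and feeds the result back as a `tool`-role message.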
Specifications
| Specification | Details |
| --- | --- |
| Context Window | 128K tokens |
| Tool Use | Yes |
| Vision | Yes |
| Streaming | Yes |
| Open Source | |
| Self-Host | |
| Starting Price | $0.20/MTok input |
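Per-MTok pricing means cost scales linearly with token count. A quick worked example at the listed $0.20/MTok input rate (output-token pricing varies by model and is not shown here):

```python
def input_cost_usd(input_tokens: int, rate_per_mtok: float = 0.20) -> float:
    """Cost in USD of input tokens at a per-million-token rate."""
    return input_tokens / 1_000_000 * rate_per_mtok

# 2.5M input tokens at $0.20/MTok costs $0.50.
print(round(input_cost_usd(2_500_000), 2))  # -> 0.5
```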