Open Calculator →

LLM API Cost Comparison:
Claude vs GPT-4o vs Gemini 2025

Side-by-side pricing for the leading models. Find the cheapest API for your workload — then calculate your actual Claude Code costs from session logs.

Calculate My Claude Costs →

LLM API Pricing Table (per million tokens, 2025)

Model Input $/M Output $/M Cache Read $/M Best for
Claude Haiku 3.5 Cheapest
Anthropic
$0.80 $4.00 $0.08 Routing, classification, lightweight agents
Gemini 1.5 Flash
Google
$0.075 $0.30 $0.019 Very high volume, multimodal, long context
GPT-4o mini
OpenAI
$0.15 $0.60 $0.075 Simple tasks, OpenAI ecosystem
Claude Sonnet 4.5 Recommended
Anthropic
$3.00 $15.00 $0.30 Code generation, complex reasoning, agentic
GPT-4o
OpenAI
$2.50 $10.00 $1.25 Complex tasks, OpenAI ecosystem
Gemini 1.5 Pro
Google
$1.25 $5.00 $0.31 Long-context, multimodal
Claude Opus 4 Most capable
Anthropic
$15.00 $75.00 $1.50 Hardest tasks, research, frontier reasoning

Best for low-cost production

Claude Haiku 3.5 + Gemini Flash cover most routine tasks cheaply. Haiku's cache reads at $0.08/M are the best in class for repeated-context workloads.

Best quality-per-dollar

Claude Sonnet 4.5 leads coding benchmarks at $3/M input — comparable to GPT-4o ($2.50/M) but with stronger agentic and tool-use performance.

Caching advantage

Claude's cache reads are 4× cheaper than GPT-4o's ($0.30 vs $1.25/M for Sonnet-tier). For long system prompts or document QA, this gap is significant.

Frequently Asked Questions

Is Claude API cheaper than GPT-4o?

Claude Haiku ($0.80/M input) is 3× cheaper than GPT-4o ($2.50/M) for simple tasks. Claude Sonnet ($3/M) is slightly more expensive than GPT-4o on input, but cheaper on cache reads ($0.30 vs $1.25/M). For applications with heavy prompt caching, Claude comes out cheaper at both tiers.

Which LLM API is cheapest for production in 2025?

Gemini 1.5 Flash has the lowest list prices at $0.075/M input and $0.30/M output. For Claude-specific workloads (Claude Code, Anthropic SDK apps), Haiku 3.5 is the cheapest option and benefits from deep prompt caching support. A routing architecture — Flash/Haiku for simple calls, Sonnet for hard ones — is the most cost-effective overall pattern.

Does Claude prompt caching beat GPT-4o's cached input pricing?

Yes. Claude Sonnet cache reads cost $0.30/M; GPT-4o cache reads cost $1.25/M — a 4× difference. Claude Haiku cache reads are just $0.08/M. For any workload that repeats a large system prompt, tool definition set, or long document, Claude's caching is substantially cheaper.

How do I compare actual costs for my Claude Code workload?

Claude Code writes every API call to ~/.claude/projects/<name>/*.jsonl. Drop that file into the Claude Code Cost Calculator — it applies per-model pricing and shows a full breakdown by model, tool, and hour. Fully client-side; nothing leaves your browser.