Gemini 2.0 Flash Pricing 2026
vs Claude Haiku Cost
Full Gemini 2.0 Flash and Flash-Lite pricing — tokens, context caching, 1M context window, and a direct comparison with Claude Haiku 4.5 to help you choose the cheapest model for your use case.
Flash Input
Flash Output
Flash-Lite Input
Context Window
Gemini Model Pricing Table
| Model | Input ($/M) | Output ($/M) | Image input | Context | Free tier |
|---|---|---|---|---|---|
| Gemini 2.0 Flash | $0.10 | $0.40 | Yes | 1M | Yes (15 RPM) |
| Gemini 2.0 Flash-Lite | $0.025 | $0.10 | Yes | 1M | Yes (30 RPM) |
| Gemini 2.5 Pro | $1.25 (<200K) | $10.00 | Yes | 1M | Limited |
Gemini 2.0 Flash vs Claude Haiku 4.5
| Feature | Gemini 2.0 Flash | Gemini Flash-Lite | Claude Haiku 4.5 |
|---|---|---|---|
| Input price/M (uncached) | $0.10 | $0.025 | $0.80 |
| Output price/M | $0.40 | $0.10 | $4.00 |
| Cache read/M | ~$0.05 (50% off) | ~$0.0125 | $0.08 (90% off) |
| Context window | 1M tokens | 1M tokens | 200K tokens |
| Image / vision | Yes | Yes | Yes |
| Tool use quality | Good | Basic | Excellent |
| Free tier | Yes (15 RPM) | Yes (30 RPM) | No |
| Data residency | Google (US/EU) | Google (US/EU) | Anthropic (US) |
| Prompt caching depth | 50% off (implicit) | 50% off (implicit) | 90% off (explicit) |
Monthly Cost Examples
| Workload (per month) | Gemini 2.0 Flash | Gemini Flash-Lite | Claude Haiku 4.5 | Haiku + 80% cache |
|---|---|---|---|---|
| 10M in / 2M out | $1.80 | $0.45 | $16 | $3.40 |
| 100M in / 20M out | $18 | $4.50 | $160 | $34 |
| 1B in / 100M out | $140 | $35 | $1,200 | $256 |
When to Use Gemini 2.0 Flash vs Claude Haiku
| Use case | Best choice | Why |
|---|---|---|
| Ultra-high-volume text classification | Gemini Flash-Lite | Lowest uncached price at $0.025/M in |
| Very long document processing (>200K tokens) | Gemini 2.0 Flash | 1M context vs Claude's 200K limit |
| Development / prototyping (no budget) | Gemini 2.0 Flash | Free tier at 15 RPM |
| Agentic tool-use pipelines | Claude Haiku 4.5 | Superior function calling reliability |
| Claude Code cost optimization | Claude Haiku 4.5 | 90% cache discount, native Claude Code integration |
| Google Cloud / Vertex AI ecosystem | Gemini 2.0 Flash | Native GCP integration, unified billing |
Frequently Asked Questions
How much does Gemini 2.0 Flash cost per token?
Gemini 2.0 Flash costs $0.10 per million input tokens and $0.40 per million output tokens via the Google AI API. Gemini 2.0 Flash-Lite is even cheaper at $0.025/M input and $0.10/M output. Both models have a 1M token context window and support image input at the same rate as text tokens.
Is Gemini 2.0 Flash cheaper than Claude Haiku 4.5?
At sticker price, yes — Gemini Flash ($0.10/M) is 8× cheaper than Haiku ($0.80/M). However, Claude's 90% prompt caching drops Haiku's effective input cost to $0.08/M on cache hits. For pipelines with high cache hit rates (agent loops with fixed system prompts), Haiku can match or beat Gemini Flash on cost while offering superior tool use reliability.
Does Gemini 2.0 Flash have a free tier?
Yes. The Google AI API (not Vertex AI) includes a free tier for Gemini 2.0 Flash at 15 requests per minute and 1M tokens per minute. Flash-Lite has a 30 RPM free tier. This free tier is available for development and low-volume production use. The Claude API has no comparable sustained free tier — only initial trial credits upon signup.
What is the difference between Gemini 2.0 Flash and Flash-Lite?
Gemini 2.0 Flash is the full-capability model ($0.10/M input): better reasoning, more reliable instruction following, stronger tool use, and higher quality on complex tasks. Flash-Lite ($0.025/M input) is a smaller, faster model optimized for simple tasks where quality is less critical. For classification, routing, and summarization of straightforward content, Flash-Lite is adequate. For multi-step reasoning, structured output, or complex instructions, use Flash or a stronger model.
How does Gemini context caching compare to Claude prompt caching?
Gemini uses implicit context caching — Google automatically reuses matching context prefixes. Cache reads cost roughly 50% of input price. Claude uses explicit prompt caching with cache_control markers — cache reads cost 10% of input price (90% discount). For most use cases, Claude's 90% cache discount is more aggressive. Gemini's implicit system is easier to implement (no cache markers needed) but less cost-effective at high cache hit rates.
How do I calculate Gemini 2.0 Flash monthly costs?
Monthly cost = (input_tokens × $0.0000001) + (output_tokens × $0.0000004). At 100M input + 20M output tokens/month: (100M × $0.10/M) + (20M × $0.40/M) = $10 + $8 = $18/month. Compare this with your Claude workload using our free cost calculator — paste your Claude Code session logs to get exact token counts, then apply Gemini's pricing to see the side-by-side comparison.