
LLM Pricing

Compare input/output cost, context window, and benchmarks across leading models.

Updated 2026-06-15T02:37:06.743Z

| Provider | Model | Input / 1M | Output / 1M | Context | Notes |
|---|---|---|---|---|---|
| DeepSeek | DeepSeek-V4 Flash | $0.140 | $0.280 | 1.0M | DeepSeek V4 Flash replaces deepseek-chat. 1M context. Pricing per 1M tokens. |
| DeepSeek | DeepSeek-V4 Pro | $0.435 | $0.870 | 1.0M | DeepSeek V4 Pro replaces deepseek-reasoner. 1M context. Pricing per 1M tokens. |
| OpenAI | gpt-4o | $2.50 | $10.00 | 128K | Flagship multimodal model. Pricing per 1M tokens. |
| OpenAI | gpt-4o-mini | $0.150 | $0.600 | 128K | Fast, affordable small model for everyday tasks. |
| OpenAI | o3 | $10.00 | $40.00 | 200K | Reasoning model. Higher latency, best for complex STEM tasks. |
| Anthropic | Claude 4 Sonnet | $3.00 | $15.00 | 200K | Latest Sonnet with extended thinking. Pricing per 1M tokens. |
| Anthropic | Claude 4 Opus | $15.00 | $75.00 | 200K | Most capable Claude model for complex agentic workflows. |
| Anthropic | Claude 4 Haiku | $0.625 | $2.50 | 200K | Fast, cost-effective model for high-volume tasks. |
| DeepSeek | DeepSeek-V3 | $0.140 | $0.280 | 64K | Strong open-weight model at very low cost. |
| DeepSeek | DeepSeek-R1 | $0.550 | $2.19 | 64K | Reasoning model. Output is long due to chain-of-thought. |
| Google | Gemini 2.5 Pro | $1.25 | $10.00 | 1.0M | 1M token context window. Strong coding and reasoning. |
| xAI | Grok 3 | $3.00 | $15.00 | 131K | xAI flagship with real-time X data access. |
| OpenAI | GPT-4.1 | $2.00 | $8.00 | 1.0M | Long-context coding model with 1M token context window. |
| Anthropic | Claude 3.7 Sonnet | $3.00 | $15.00 | 200K | Prior-generation Claude Sonnet with extended thinking mode. |
| Google | Gemini 2.5 Flash | $0.150 | $0.600 | 1.0M | Fast, cost-efficient Gemini with 1M context window. |
| xAI | Grok 3 Mini | $0.300 | $0.500 | 131K | Fast, affordable Grok model for everyday tasks. |
| Meta | Llama 4 Maverick | $0.200 | $0.600 | 256K | Open-weight multimodal model via API partners. |
| OpenRouter | Qwen 3 235B A22B | $0.800 | $1.60 | 128K | Mixture-of-experts model available through unified API. |
| Together AI | Llama 3.3 70B | $0.880 | $0.880 | 131K | Open-weight model hosted on serverless inference platform. |

Source: provider pricing pages. Prices are per 1M tokens unless noted. Benchmarks are from public leaderboards.
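
Because every price is quoted per 1M tokens, the cost of a single request is each token count divided by one million and multiplied by the matching rate. The sketch below shows that arithmetic for the gpt-4o row ($2.50 input, $10.00 output). The function name `request_cost` and the token counts are hypothetical, chosen only for illustration, and the calculation assumes flat per-token pricing with no caching, batch, or tiered discounts.

```python
from decimal import Decimal

# Prices in the table are quoted per one million tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: str, output_price: str) -> Decimal:
    """Return the dollar cost of one request given per-1M-token prices.

    Prices are passed as strings so Decimal can represent them exactly.
    Assumes flat pricing: no caching, batch, or volume discounts.
    """
    cost_in = Decimal(input_tokens) * Decimal(input_price) / TOKENS_PER_UNIT
    cost_out = Decimal(output_tokens) * Decimal(output_price) / TOKENS_PER_UNIT
    return cost_in + cost_out


# Hypothetical request: 10,000 input tokens and 2,000 output tokens on gpt-4o.
cost = request_cost(10_000, 2_000, "2.50", "10.00")
print(f"${cost}")
```

For this hypothetical request the input side contributes $0.025 and the output side $0.02, for a total of $0.045. Output tokens usually dominate the bill for reasoning models, since the table notes that their responses run long.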