AI Token Pricing Calculator â€” Compare GPT-4o, Claude, Gemini Costs

Compare token pricing across GPT-4o, GPT-4o-mini, Claude 3.5 Sonnet/Haiku, and Gemini 1.5 Pro/Flash. Estimate per-request, monthly, and annual costs with DAU scaling.

✓ Formula verified: January 2026

⚡

AI Token Pricing

Results update instantly as you type

Enter Values

AI Model*Current pricing as of May 2026 — 12 models across 6 providers

This field is required.

Paste your prompt to auto-count tokensRough estimate (~4 chars per token). Adjust Prompt Tokens below for exact counts.

Prompt Tokens per Request*Input tokens per request (system prompt + message). Auto-filled from text above.

This field is required.

Completion Tokens per Request*Output tokens per request (the model response)

This field is required.

Use Prompt CachingPrompt caching reduces input costs by ~75-90% for repeated context blocks

Daily Active Users (optional)Number of unique users making requests each day

Requests per User per Day (optional)Average number of requests each user makes per day

Cost per Request

$0.0165

↑ Gain

ModelGPT-5

ProviderOpenAI

Prices Last Updated2026-05-02

Input Price (per 1M tokens)$15.00

Output Price (per 1M tokens)

$60.00

Prompt Tokens

500

Completion Tokens

150

Input Cost per Request

$0.00750000

Output Cost per Request

$0.00900000

Total Monthly Tokens

19,500

Estimated Annual Cost

$297.00K

Cost per User per Month

$2.4750

Requests per Day

50,000

All Model Comparisons

[{"id":"gpt-4o-mini","label":"GPT-4o-mini","provider":"OpenAI","inputPrice":0.15,"outputPrice":0.6,"cachedInput":0.0375,"costPerRequest":0.000165},{"id":"gemini-3-flash","label":"Gemini 3.1 Flash","provider":"Google","inputPrice":0.15,"outputPrice":0.6,"cachedInput":0.01875,"costPerRequest":0.000165},{"id":"claude-haiku-4","label":"Claude Haiku 4.5","provider":"Anthropic","inputPrice":0.25,"outputPrice":1.25,"cachedInput":0.025,"costPerRequest":0.00031249999999999995},{"id":"llama-4","label":"Llama 4","provider":"Meta","inputPrice":0.5,"outputPrice":1.5,"costPerRequest":0.000475},{"id":"deepseek-v3","label":"DeepSeek V3","provider":"DeepSeek","inputPrice":0.5,"outputPrice":2,"costPerRequest":0.0005499999999999999},{"id":"gemini-3-pro","label":"Gemini 3.1 Pro","provider":"Google","inputPrice":1.5,"outputPrice":6,"cachedInput":0.1875,"costPerRequest":0.00165},{"id":"gpt-5-mini","label":"GPT-5-mini","provider":"OpenAI","inputPrice":2,"outputPrice":8,"cachedInput":0.3,"costPerRequest":0.0021999999999999997},{"id":"gpt-4o","label":"GPT-4o","provider":"OpenAI","inputPrice":2.5,"outputPrice":10,"cachedInput":0.625,"costPerRequest":0.00275},{"id":"claude-sonnet-4","label":"Claude Sonnet 4.6","provider":"Anthropic","inputPrice":3,"outputPrice":15,"cachedInput":0.3,"costPerRequest":0.00375},{"id":"grok-3","label":"Grok 3","provider":"xAI","inputPrice":3,"outputPrice":15,"costPerRequest":0.00375},{"id":"gpt-5","label":"GPT-5","provider":"OpenAI","inputPrice":15,"outputPrice":60,"cachedInput":2.5,"costPerRequest":0.0165},{"id":"claude-opus-4","label":"Claude Opus 4.7","provider":"Anthropic","inputPrice":15,"outputPrice":75,"cachedInput":1.5,"costPerRequest":0.01875}]

http://127.0.0.1:54963/engineering/ai-token-pricing

AI Token Cost BreakdownPrices updated: 2026-05-02

Cost Breakdown per Request

Input 45%

Output 55%

● Input: $0.00750000● Output: $0.00900000

Per Request

$0.0165

Monthly

$24.75K

Annual

$297.00K

Cost per User

$2.4750

All Models — Cost for Same Request

GPT-4o-miniOpenAI

$0.000165

Gemini 3.1 FlashGoogle

$0.000165

Claude Haiku 4.5Anthropic

$0.000312

Llama 4Meta

$0.000475

DeepSeek V3DeepSeek

$0.000550

Gemini 3.1 ProGoogle

$0.001650

GPT-5-miniOpenAI

$0.002200

GPT-4oOpenAI

$0.002750

Claude Sonnet 4.6Anthropic

$0.003750

Grok 3xAI

$0.003750

GPT-5OpenAI

$0.0165

Claude Opus 4.7Anthropic

$0.0187

The Formula

Cost = (PromptTokens / 1M × InputPrice) + (CompletionTokens / 1M × OutputPrice)

AI model pricing is based on token usage, split between input (prompt) tokens and output (completion) tokens. Output tokens typically cost 3-6x more than input tokens because generating tokens requires sequential compute. Prompt caching can reduce input costs by up to 90% for repeated context blocks.

Variable Definitions

Prompt Tokens

Input Tokens

Tokens in the user prompt / system message. Includes system instructions, conversation history, and the user's current query.

Completion Tokens

Output Tokens

Tokens generated by the model in its response. Depends on task complexity and model's response style.

Input Price

Price per 1M Input Tokens

Cost per million tokens for the prompt portion. Ranges from $0.075 (Gemini Flash) to $15.00 (GPT-5, Opus).

Output Price

Price per 1M Output Tokens

Cost per million tokens for the generated response. Usually 3-6x input price.

Cached Input

Cached Input Price

Discounted rate for repeated prompt context blocks. Available on OpenAI, Anthropic, and Google. Typically 10-25% of standard input price.

How to Use This Calculator

1
Select your AI model from 12 options across 6 providers.
2
Paste your actual prompt text to auto-count tokens, or enter token counts manually.
3
Toggle prompt caching to see cached vs standard pricing.
4
Enter daily active users and requests per user to project monthly and annual costs.
5
Compare costs across all models to optimize your AI spend.
6
Prices are updated regularly — last updated date is shown in results.

Common Applications

Estimating production AI costs by comparing per-request pricing across 12+ models from OpenAI, Anthropic, Google, and others
Choosing the most cost-effective model for each task — from budget-friendly (GPT-4o-mini, Gemini Flash) to premium (GPT-5, Claude Opus)
Budgeting monthly AI infrastructure costs with usage projections based on daily active users and requests per user
Optimizing prompt design and implementing caching strategies to reduce token consumption and lower operational costs

Understanding the Concept

AI token pricing varies dramatically across models and providers. GPT-5 and Claude Opus 4.7 are the most capable (and most expensive) at $15/$60 and $15/$75 per million tokens respectively. For high-volume production workloads, GPT-4o-mini ($0.15/$0.60) or Gemini 3.1 Flash ($0.15/$0.60) offer the best value. Prompt caching is now a standard feature across all major providers — cached input tokens cost 75-90% less than standard input. For production deployments, enabling caching and choosing the right model tier for each task are the two most impactful cost optimization strategies. A customer support chatbot handling 10K conversations/day might cost $50/day on GPT-5 but only $1.65/day on GPT-4o-mini — a 30x difference. Use the model comparison table below to find the optimal price/performance balance for your use case.

Frequently Asked Questions

Sources & References

OpenAI — API Pricing Anthropic — API Pricing Google AI — Gemini API Pricing

Related Calculators

Reviews

No reviews yet. Be the first to share your experience with AI Token Pricing Calculator â€” Compare GPT-4o, Claude, Gemini Costs.

Write a Review