1. Comparative Pricing Scales
Gemini 1.5 Pro is priced at $1.25 per million input tokens (MTok) and $5.00 per million output tokens for prompts of up to 128k tokens. That is half the GPT-4o rate of $2.50 / MTok input and $10.00 / MTok output. Once a prompt exceeds 128k tokens, Gemini's rates double to $2.50 / MTok input and $10.00 / MTok output, which matches GPT-4o's rates. Note that GPT-4o's own context window tops out at 128k tokens, so prompts of that size only fit on the Gemini side of the comparison.
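The tiered pricing above can be sketched as a small per-request cost calculator. This is a rough illustration, not an official billing formula: the function names are hypothetical, "128k" is assumed to mean 128,000 tokens, and the higher rate is assumed to apply to the whole request once the prompt crosses that threshold.

```python
# Rates in USD per million tokens (MTok), copied from the figures quoted above.
GEMINI_15_PRO_RATES = {
    "short": (1.25, 5.00),   # prompts up to the threshold
    "long": (2.50, 10.00),   # prompts above the threshold
}
GPT_4O_RATES = (2.50, 10.00)

# Assumption: "128k" is treated as 128,000 tokens.
TIER_THRESHOLD = 128_000


def gemini_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one Gemini 1.5 Pro request."""
    # Assumption: the long-prompt rate applies to the entire request.
    tier = "long" if input_tokens > TIER_THRESHOLD else "short"
    in_rate, out_rate = GEMINI_15_PRO_RATES[tier]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def gpt4o_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one GPT-4o request at flat rates."""
    in_rate, out_rate = GPT_4O_RATES
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    for prompt_size in (100_000, 200_000):
        print(
            prompt_size,
            round(gemini_pro_cost(prompt_size, 1_000), 4),
            round(gpt4o_cost(prompt_size, 1_000), 4),
        )
```

With a 100,000-token prompt the Gemini estimate comes out at half the GPT-4o estimate. With a 200,000-token prompt the two are equal under these assumptions.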
2. Caching Implementation and Break-Even Points
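Google's context caching for Gemini 1.5 is explicit: you create the cache yourself, it requires a minimum of 32k tokens, and it offers a 50% discount on cache hits. Google also bills for the time a cache is stored, so a cache only pays for itself if it is reused often enough. OpenAI's caching is automatic and needs no code changes, but it is not unlimited: it applies only to prompts of at least 1,024 tokens that share a prefix with a recent request. For small queries, GPT-4o caching is therefore much easier to trigger.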
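To make the effect of a minimum cache size concrete, here is a minimal sketch of the input cost of a single request when part of the prompt may be served from cache. The function and its parameters are hypothetical. It takes the minimum size and hit discount as arguments rather than hard-coding any provider's rules, and it ignores cache storage and cache-write charges.

```python
def input_cost_with_cache(
    prompt_tokens: int,
    cached_prefix_tokens: int,
    rate_per_mtok: float,
    min_cache_tokens: int,
    hit_discount: float = 0.5,
) -> float:
    """Estimated USD input cost for one request with an optional cached prefix.

    Simplifying assumptions: a prefix below the minimum size is billed at the
    full rate, and storage or cache-write fees are not modelled.
    """
    if not 0 <= cached_prefix_tokens <= prompt_tokens:
        raise ValueError("cached prefix must fit inside the prompt")

    # A prefix below the minimum size never enters the cache.
    if cached_prefix_tokens < min_cache_tokens:
        cached_prefix_tokens = 0

    fresh_tokens = prompt_tokens - cached_prefix_tokens
    billable_tokens = fresh_tokens + cached_prefix_tokens * (1 - hit_discount)
    return billable_tokens * rate_per_mtok / 1_000_000


if __name__ == "__main__":
    # 40k-token shared prefix clears a 32k minimum, so it is discounted.
    print(input_cost_with_cache(50_000, 40_000, 1.25, 32_000))
    # 20k-token shared prefix is below the minimum, so nothing is discounted.
    print(input_cost_with_cache(50_000, 20_000, 1.25, 32_000))
```

Passing a smaller `min_cache_tokens` shows why a low threshold matters for short prompts: the same 20k-token prefix that earns no discount under a 32k minimum is discounted in full once the minimum drops below it.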
3. Cost Summary: Flash vs. Mini
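In the budget tier, Gemini 1.5 Flash ($0.075 / MTok input, $0.30 / MTok output) is half the price of GPT-4o-mini ($0.15 / MTok input, $0.60 / MTok output) on both input and output. At these list prices, Flash is the cheaper of the two budget models, although list prices change often and should be checked against the providers' current pricing pages.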
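As a rough illustration of what the budget-tier gap means at volume, the sketch below estimates the cost of a fixed workload on each model using only the list prices quoted above. The workload figures (one million requests, 1,000 input and 200 output tokens each) are made up for the example, and the helper name is hypothetical.

```python
# USD per million tokens (input, output), from the figures quoted above.
BUDGET_PRICES = {
    "gemini-1.5-flash": (0.075, 0.30),
    "gpt-4o-mini": (0.15, 0.60),
}


def workload_cost(model: str, requests: int, in_tokens: int, out_tokens: int) -> float:
    """Estimated USD cost of `requests` identical calls to `model`."""
    in_rate, out_rate = BUDGET_PRICES[model]
    per_request = (in_tokens * in_rate + out_tokens * out_rate) / 1_000_000
    return requests * per_request


if __name__ == "__main__":
    # Hypothetical workload: 1M requests, 1,000 input and 200 output tokens each.
    for name in BUDGET_PRICES:
        print(name, round(workload_cost(name, 1_000_000, 1_000, 200), 2))
```

Because both of Flash's rates are exactly half of Mini's, the ratio between the two totals stays at one half regardless of the input/output mix.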