AI Goldmine

Cheapest LLM APIs: Blended Cost per Million Tokens Comparison

API token prices have fallen sharply: every model in this guide processes a million tokens for less than a dollar. This guide compares the cheapest LLM APIs across proprietary models and serverless open-weights hosting.

1. Proprietary Budget Models (Mini vs. Flash)

Gemini 1.5 Flash costs $0.075 per million input tokens (MTok) and $0.30 per MTok of output. GPT-4o-mini costs twice as much: $0.15 per MTok input and $0.60 per MTok output. Both models support prompt caching, which further lowers the cost of repeated input.
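To turn the separate input and output prices above into a single blended figure, one common approach is a weighted average over the token mix. The sketch below assumes a 3:1 input-to-output ratio (75% of tokens are input); that ratio is an illustrative assumption, not a figure from this guide, and the function name `blended_cost` is hypothetical.

```python
def blended_cost(input_price: float, output_price: float,
                 input_share: float = 0.75) -> float:
    """Return the blended USD price per million tokens.

    input_share is the fraction of all tokens that are input tokens.
    The 0.75 default (a 3:1 input:output mix) is an assumption; use
    the ratio measured on your own workload.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_price * input_share + output_price * (1.0 - input_share)


# List prices quoted in this section: (input, output) in USD per MTok.
PRICES = {
    "Gemini 1.5 Flash": (0.075, 0.30),
    "GPT-4o-mini": (0.15, 0.60),
}

for name, (inp, out) in PRICES.items():
    print(f"{name}: ${blended_cost(inp, out):.5f} per MTok blended")
```

Under that assumed mix, Gemini 1.5 Flash comes to about $0.131 per MTok and GPT-4o-mini to about $0.263. An output-heavy workload shifts both figures toward the output price.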

2. DeepSeek V3: The High-Intelligence Budget API

DeepSeek V3 costs $0.14 per million input tokens ($0.014 when the input is served from cache) and $0.28 per million output tokens, offering flagship-level capability at budget-model rates.
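How much the cached rate helps depends on how often input tokens are served from cache. As a rough sketch, the effective input price is a weighted average of the cached and uncached rates. The hit rate used here is a hypothetical workload parameter, not a published figure, and the function names are illustrative.

```python
# DeepSeek V3 prices quoted above, in USD per million tokens.
INPUT_UNCACHED = 0.14
INPUT_CACHED = 0.014
OUTPUT = 0.28


def effective_input_price(cache_hit_rate: float) -> float:
    """Average input price per MTok when cache_hit_rate of input tokens are cached."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return cache_hit_rate * INPUT_CACHED + (1.0 - cache_hit_rate) * INPUT_UNCACHED


def job_cost(input_tokens: int, output_tokens: int, cache_hit_rate: float) -> float:
    """Total USD cost of a job with the given token counts and cache hit rate."""
    input_cost = input_tokens / 1_000_000 * effective_input_price(cache_hit_rate)
    output_cost = output_tokens / 1_000_000 * OUTPUT
    return input_cost + output_cost


# Hypothetical job: 10M input tokens, 2M output tokens, half the input cached.
print(f"${job_cost(10_000_000, 2_000_000, cache_hit_rate=0.5):.2f}")
```

With half of the input cached, the effective input price falls from $0.14 to about $0.077 per million tokens. Workloads that reuse a long shared prefix, such as a fixed system prompt, would sit toward the high end of the hit-rate range.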

3. Serverless Open Weights Hosting (Llama 8B)

Serverless hosting providers such as Together AI and DeepInfra charge roughly $0.05 to $0.10 per million tokens for Llama 3 8B, making these the cheapest endpoints in this comparison for routine tasks.
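To see what that price range means for a budget, the sketch below projects a monthly bill at both ends of the range. The daily token volume is a hypothetical workload chosen for illustration, and `monthly_cost` is an illustrative helper, not part of any provider's SDK.

```python
def monthly_cost(tokens_per_day: int, price_per_mtok: float, days: int = 30) -> float:
    """Projected USD cost for a month of steady daily token volume."""
    return tokens_per_day / 1_000_000 * price_per_mtok * days


# Price range quoted above for Llama 3 8B, in USD per million tokens.
LOW_PRICE, HIGH_PRICE = 0.05, 0.10

# Hypothetical workload: 20 million tokens per day.
DAILY_TOKENS = 20_000_000

low = monthly_cost(DAILY_TOKENS, LOW_PRICE)
high = monthly_cost(DAILY_TOKENS, HIGH_PRICE)
print(f"Estimated monthly cost: ${low:.2f} to ${high:.2f}")
```

At this assumed volume the monthly bill lands between roughly $30 and $60, so the choice of provider can double the cost even within the cheapest tier.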

Frequently Asked Questions

What is the absolute cheapest LLM API?

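By list price, serverless Llama 3 8B endpoints are the cheapest option in this guide at roughly $0.05 to $0.10 per million tokens. Among proprietary budget models, Gemini 1.5 Flash is the cheapest. For flagship-level capability, DeepSeek V3 is the cheapest option.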

Are cheap APIs reliable?

Generally, yes. Google, OpenAI, and DeepSeek serve production workloads on these endpoints, though SLA terms vary by provider and plan, so confirm the guarantees that apply to your account before relying on them.