# The Price Sheet

**Last verified: July 2026.**

We keep the chapters in this part at order-of-magnitude, so this sheet is the one place the volatile numbers live. When it disagrees with a provider's pricing page, the pricing page wins.

## Per-token rates

Every major provider bills in dollars per million tokens, with separate rates for input and output, and output runs about five to six times the input rate. The tiers below cover OpenAI, Anthropic, and Google at current list prices, quoted as ranges because the exact figures move often.

- **Flagship tier** (Claude Opus 4.8, GPT-5.5, Gemini 3.1 Pro): roughly $2 to $5 per million input tokens and $12 to $30 per million output tokens.
- **Mid tier** (Claude Sonnet 4.6, GPT-5.4, Gemini 3.5 Flash): roughly $1.50 to $3 input and $9 to $15 output.
- **Small tier** (Claude Haiku 4.5, GPT-5.4-nano, Gemini Flash-Lite): roughly $0.20 to $1 input and $1 to $5 output.

Long context can carry a surcharge, and the pattern varies: Google roughly doubles Pro rates past 200K tokens, while Anthropic's current million-token models bill at flat rates. Check the fine print when your prompts run long.

## Prompt caching

Caching is the biggest single discount in the price list. When a request reuses a prefix the provider has seen recently, those input tokens bill at a fraction of the standard rate.

- **Reads:** cached input runs at roughly a tenth of standard input on current Anthropic, OpenAI, and Gemini models, though some older models still sit closer to half price.
- **Writes:** Anthropic charges to write the cache, about 1.25x standard input for a five-minute cache and about 2x for a one-hour cache. OpenAI and Google cache automatically on their default tiers with no separate write charge.

The practical move stays the same across providers: put the stable system prompt and documents up front, and pay full price only for what changes.

## Batch tiers

Work that can wait ships at about half price. OpenAI, Anthropic, and Google all run batch tiers at roughly a 50 percent discount against standard rates, with results promised within 24 hours and usually back much sooner. The batch discount stacks with caching, so high-volume offline work can land at a tenth of list price or better.

## The price trend

Prices have fallen steeply year over year and the trend has held. List prices dropped by well over half between early 2025 and early 2026, and the cost of matching a fixed capability level has fallen closer to ten times per year, because each new small model does what last year's flagship did. Capability per dollar keeps rising, so a design that barely pencils out today usually pencils out comfortably within six months.

## How to read this sheet

Rates rot in months. Before you trust any number here, check the provider's pricing page and check this sheet's date. If the date is more than a quarter old, treat every figure as a shape rather than a quote, and reverify before it goes into a forecast or a launch decision.