Pricing: charge in a currency your costs track · The Builder's Stack


Your AI writing assistant launches at a flat monthly seat price, the same pricing model as every software product you have ever bought, so nobody in the room questions it. After launch the usage dashboard tells a lopsided story: most subscribers draft a few documents a month, while a handful of agency accounts run the tool all day, pushing entire content libraries through it. The week the provider invoice lands, you do the division and find that each heavy account costs several times what it pays you, while the light users whose subscriptions carry the margin are also the likeliest to churn. The customers who love the product most, the ones writing your testimonials and filling your case studies, lose you money on every renewal. You have built a product whose best customers are its worst business.

The mismatch: flat prices, metered costs

Seat pricing survived for decades because one more user cost the software business almost nothing to serve. Whether an account lived in the product all day or opened it twice a month, the marginal cost sat near zero, so a flat monthly price was safe against any usage pattern, and the pre-AI playbook never had a reason to question it.

An AI feature breaks that assumption, because the meter runs per task. Every draft, every rewrite, every summarized thread spends tokens, and "The bill of materials: cost the task, not the call" showed you how to turn that spend into a per-task figure. Once that figure is real money, cost varies with usage while the price stays flat, which means the heaviest users eat the margin and the lightest users pay for them. The mismatch is harmless while the per-task cost is trivial, and it is dangerous the moment it is not.

Price in a currency your costs track: when the meter runs per task and the price is flat, your best users quietly become your worst business.

The failure is not hypothetical: a widely reported analysis of an AI coding assistant's early flat subscription found that the average subscriber cost more to serve than they paid, and that the heaviest subscribers cost a multiple of that. There are two ways out, and both work: charge in a unit that moves with the meter, or keep the flat price and cap what it includes.

The pricing currencies: seats, usage, and outcomes

Nearly every AI price on the market is quoted in one of three currencies, and each trades margin safety against ease of selling.

Seats charge per user per month. A seat is simple to explain, easy for procurement to budget, and quoted in the unit buyers already trust, which is why it sells best. Its weakness is exactly the mismatch above, so a seat price only stays safe with a cap or a written fair-use line that bounds how much metered work one seat can consume.
Usage charges per unit of work, whether the unit is a credit, a task, or a metered action like a generated report. It tracks your cost meter almost perfectly, so the margin holds at any volume, but it is harder to sell because the buyer cannot predict the bill, and it teaches users to ration the product at exactly the moment you want them building a habit.
Outcomes charge per result: a resolved support conversation, a completed screening, a booked meeting. It is the cleanest story a salesperson can tell, since the customer pays only when the product delivers, and it is the hardest to operate, because someone has to decide what counts as resolved, and that attribution (deciding which results the product genuinely earned) invites disputes and gaming. Support platforms that charge a fixed fee per conversation their agent resolves without a human are the visible pioneers here.

In practice hybrids dominate: a seat with an included allowance and metered overage above it, the same structure phone plans converged on for the same reason. The seat keeps the sale simple, the allowance keeps light and median users inside a predictable price, and the overage keeps the heavy tail from eating the margin.
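The hybrid structure is easy to see as a billing function. The sketch below is a minimal illustration, not a billing system: the seat price, the allowance, and the overage rate are hypothetical numbers chosen only to show the shape of the bill.

```python
def monthly_bill(tasks: int, seat_price: float, allowance: int,
                 overage_rate: float) -> float:
    """Seat price plus metered overage for tasks above the included allowance."""
    overage_tasks = max(0, tasks - allowance)
    return seat_price + overage_tasks * overage_rate


# Hypothetical plan, for illustration only.
SEAT_PRICE = 30.00    # dollars per seat per month
ALLOWANCE = 200       # tasks included in the seat
OVERAGE_RATE = 0.10   # dollars per task above the allowance

# A light user, a user exactly at the allowance, and a heavy agency seat.
for tasks in (40, 200, 3000):
    bill = monthly_bill(tasks, SEAT_PRICE, ALLOWANCE, OVERAGE_RATE)
    print(f"{tasks:>5} tasks -> ${bill:.2f}")
```

Everyone at or below the allowance sees the same predictable seat price; only the heavy tail sees a bill that grows with the meter.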

The margin sheet: test the price against your heaviest users

Whatever currency you pick, test it with a margin sheet before it ships. The sheet is one row of arithmetic run twice: revenue per user per month minus cost per user per month, computed once at your median user and once at your 95th-percentile user (the user heavier than 19 out of every 20). The median row tells you whether the business works on paper, and the 95th-percentile row tells you whether it survives contact with real customers, because AI usage is heavy-tailed: a small fraction of accounts reliably drives a large share of the token spend, the same skew the opening scene's dashboard showed.
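Here is one way to sketch the two rows for a flat seat. The task counts, the seat price, and the per-task cost are all made-up values standing in for what your own dashboard and the Price Sheet would supply; the only library used is Python's standard `statistics` module.

```python
import statistics

# Hypothetical monthly task counts for 20 accounts (stand-in for a dashboard export).
monthly_tasks = [12, 18, 25, 30, 34, 40, 45, 52, 58, 60,
                 64, 70, 75, 90, 110, 140, 200, 450, 1600, 4200]

SEAT_PRICE = 30.00     # hypothetical flat price, dollars per user per month
COST_PER_TASK = 0.04   # hypothetical per-task cost in dollars


def percentile_95(values):
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return statistics.quantiles(values, n=20, method="inclusive")[-1]


def margin_row(tasks, price=SEAT_PRICE, cost_per_task=COST_PER_TASK):
    """One row of the sheet: revenue minus cost for a user at this volume."""
    cost = tasks * cost_per_task
    return {"tasks": tasks, "revenue": price, "cost": cost, "margin": price - cost}


median_row = margin_row(statistics.median(monthly_tasks))
p95_row = margin_row(percentile_95(monthly_tasks))

for label, row in (("median", median_row), ("p95", p95_row)):
    print(f"{label:>6}: {row['tasks']:.0f} tasks, "
          f"cost ${row['cost']:.2f}, margin ${row['margin']:.2f}")
```

With these invented numbers the median row clears comfortably and the 95th-percentile row goes negative, which is the pattern the sheet exists to catch before launch.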

A price that only clears at the median user is a subsidy program for your heaviest users, funded by everyone else's margin.

Free tiers get the same two rows plus abuse limits. A free user's revenue is zero, so the sheet reads as pure cost, which you may accept as marketing spend, but only up to a ceiling you chose in advance: a daily task cap, rate limits, and terms that let you cut off automated accounts. A free tier without limits is an open API endpoint with a nicer login page.
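The ceiling can be worked in either direction: from a daily cap to a worst-case monthly cost, or from a marketing budget back to the cap it affords. A small sketch, assuming a hypothetical per-task cost, a hypothetical per-user budget, and a 30-day month:

```python
def worst_case_free_cost(daily_task_cap: int, cost_per_task: float,
                         days: int = 30) -> float:
    """Most a single free user can cost in a month if they hit the cap every day."""
    return daily_task_cap * cost_per_task * days


def daily_cap_for_budget(monthly_budget_per_user: float, cost_per_task: float,
                         days: int = 30) -> int:
    """Largest whole daily cap whose worst case stays inside the budget."""
    return int(monthly_budget_per_user // (cost_per_task * days))


COST_PER_TASK = 0.04        # hypothetical per-task cost in dollars
BUDGET_PER_FREE_USER = 3.00 # hypothetical marketing spend you accept per free user

cap = daily_cap_for_budget(BUDGET_PER_FREE_USER, COST_PER_TASK)
print(f"daily cap: {cap} tasks")
print(f"worst case per free user: ${worst_case_free_cost(cap, COST_PER_TASK):.2f}/month")
```

The cap bounds cost per account only; rate limits and terms against automated accounts are still what stop one actor from opening many accounts.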

Decide in advance where falling costs go

The last pricing decision is about a future you can already see coming: the cost side of your sheet falls. Provider rates for equivalent capability have dropped year over year for as long as these products have existed, sometimes by an order of magnitude within a couple of years, which is why every hard number in this part lives in the dated Price Sheet rather than in prose. When your per-task cost drops by half, the saving lands somewhere, and if you have not decided where, the scramble decides for you.

The saving can go to three places, and each is a legitimate strategy when you choose it rather than default into it:

Price cuts defend share, in a market where a competitor will pass the saving on if you do not.
Margin funds the business, the right call when your price already undercuts the alternative your product replaces.
A richer feature holds the price and upgrades what a task buys: a stronger model behind the same button, another verification pass, longer context.

Write the choice into the pricing plan now, so the next provider price drop executes a decision you already made instead of starting an argument under deadline.
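To make the three destinations concrete, the sketch below runs one user's numbers through each. It is a simplification under stated assumptions: the price, the volume, and both per-task costs are hypothetical, "price cut" is modeled as passing the whole saving through, and "richer feature" as spending the whole saving back on each task.

```python
def allocate_saving(price: float, tasks: int,
                    old_cost_per_task: float, new_cost_per_task: float) -> dict:
    """Per-user monthly price, cost, and margin under each of the three choices."""
    old_cost = tasks * old_cost_per_task
    new_cost = tasks * new_cost_per_task
    saving = old_cost - new_cost

    options = {
        # Pass the saving to the customer: lower price, same dollar margin as before.
        "price_cut": {"price": price - saving, "cost": new_cost},
        # Keep the saving: same price, lower cost.
        "margin": {"price": price, "cost": new_cost},
        # Reinvest the saving: same price, per-task spend back at the old level.
        "richer_feature": {"price": price, "cost": old_cost},
    }
    for option in options.values():
        option["margin"] = option["price"] - option["cost"]
    return options


# Hypothetical user: $30 seat, 200 tasks a month, per-task cost halving from $0.08 to $0.04.
for name, row in allocate_saving(30.00, 200, 0.08, 0.04).items():
    print(f"{name:>14}: price ${row['price']:.2f}, "
          f"cost ${row['cost']:.2f}, margin ${row['margin']:.2f}")
```

Real plans often split the saving across two of the three; the point of writing the split down is that the arithmetic is trivial and the argument is not.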

Try it now

This drill spends no tokens; it is one margin sheet built for one real feature.

Get your unit. Pick one AI feature, yours if you have one in flight, otherwise a public one you use, and estimate its per-task cost with the method from "The bill of materials: cost the task, not the call" and the current rates on the Price Sheet. Order of magnitude is enough.

Pick your two users. Write a plausible monthly task volume for the median user and one for the 95th-percentile user. If you have a usage dashboard, read both off it; if not, estimate from how the feature gets used, such as a support tool's typical agent against its busiest queue, or a writing tool's occasional drafter against an agency seat.

Run the flat-seat row. Pick a seat price you could imagine on the product's pricing page, then compute margin at both volumes: the price minus tasks times per-task cost.

Run the seat-plus-allowance row. Keep the same seat price, set an included allowance near the median volume with a metered overage above it, and compute both margins again.

Write the verdict. One sentence stating which pricing survives the heavy user, and what cap or overage line the flat version would need in order to survive too. Keep the sheet, because it goes into the budget you sign in "Write your Inference Budget and ship a feature that pays for itself".
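For the verdict's last clause, the cap and the overage floor both fall out of a target margin. The sketch below assumes a hypothetical seat price, per-task cost, and target gross margin, and uses `Decimal` so the money arithmetic stays exact.

```python
from decimal import Decimal


def required_cap(seat_price: Decimal, cost_per_task: Decimal,
                 target_margin: Decimal) -> int:
    """Most tasks a flat seat can include while still keeping the target margin."""
    cost_budget = seat_price * (1 - target_margin)
    return int(cost_budget // cost_per_task)


def minimum_overage_rate(cost_per_task: Decimal, target_margin: Decimal) -> Decimal:
    """Lowest per-task overage price at which overage tasks keep the target margin."""
    return cost_per_task / (1 - target_margin)


# Hypothetical inputs, for illustration only.
SEAT_PRICE = Decimal("30.00")
COST_PER_TASK = Decimal("0.04")
TARGET_MARGIN = Decimal("0.60")   # 60% gross margin

print("cap:", required_cap(SEAT_PRICE, COST_PER_TASK, TARGET_MARGIN), "tasks per month")
print("overage floor: $", minimum_overage_rate(COST_PER_TASK, TARGET_MARGIN), "per task")
```

If your 95th-percentile volume sits far above the cap this produces, the flat seat needs either that cap in writing or an overage line at or above the floor.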

Chapter Summary

Seat prices are flat and AI costs are metered, so heavy users consume the margin and light users pay for them.
Charge in a currency that tracks the meter, or keep the flat price and cap what it includes.
Seats sell easiest and need caps or fair-use lines; usage tracks cost but makes buyers nervous and users ration; outcomes tell the cleanest story and carry the hardest attribution.
Most AI pricing lands on a hybrid in practice: a seat with an included allowance and metered overage above it.
Run the margin sheet at the median user and at the 95th-percentile user, because a price that only works at the median subsidizes your heaviest users.
Free tiers get the same math plus hard abuse limits; a free tier without limits is an open endpoint.
Write caps and fair-use lines into the plan on day one, since adding them after launch reads as a takeaway aimed at your best advocates.
Provider costs fall year over year, so decide in advance whether the saving goes to price cuts, margin, or a richer feature.
This closes the last of the budget's decisions, and "Write your Inference Budget and ship a feature that pays for itself" gathers the decisions on cost, latency, routing, caching, batch, and pricing into one signed document.

Sources

Casado, M., & Bornstein, M. (2020). The New Business of AI (and How It's Different From Traditional Software). Andreessen Horowitz.
Dotan, T., & Seetharaman, D. (2023). Big Tech Struggles to Turn AI Hype Into Profits. The Wall Street Journal.
Poyar, K. (2024-2025). Growth Unhinged, essays on seat, usage, and outcome-based pricing for AI products.
Intercom Fin per-resolution pricing documentation (last verified July 2026).
Anthropic and OpenAI pricing pages (last verified July 2026).
The Price Sheet, this part's dated companion for current rates and discounts.
