Line of Sight

The Token Waterfall: A Working Financial Model for AI Pricing

This is a paid post. The model referenced throughout is available for download below.

Kyle Kelly
Mar 12, 2026

Hi, I’m Kyle Kelly and welcome to Line of Sight.

I've spent nearly two decades pricing new lines of business and products where the numbers could not afford to be wrong. At Aurora, a few basis points of model error killed deals.

AI has the same problem, at ten times the clock speed.

By the end of this read, you will have a working financial model that maps your full token cost stack, isolates hidden consumption, and tells you whether your pricing survives a provider rate shock before the market tells you.


The free deep dive told you what to model. This one hands you the model.

If you have not read “The New Math of AI,” start there. It covers the four paths, the token iceberg, and why the fastest-growing AI startups are running at 25% gross margins while the healthier cohort operates closer to 60%. This post assumes you understand the problem and sets out to solve it.

What follows is a walk-through of the Token Waterfall Model: a six-sheet working financial model built specifically for AI operators who need to price a product before launch, not discover their cost structure after. Every number in this article maps directly to a cell in the model. Open both. Work through them together.


What This Model Does That Generic SaaS Models Do Not

Every financial model template on the market was built for SaaS. Revenue minus COGS, COGS being hosting and support, gross margin being 80%. That math does not apply here.

The Token Waterfall Model: six sheets, all inputs unlocked. Available to paid subscribers via the link below.

The Token Waterfall Model is built around four realities that are specific to AI products:

Token costs are not fixed. They vary by user behavior, task complexity, model version, and provider pricing decisions. A model that assumes a flat cost-per-user will be wrong by 2x to 10x depending on your product.

Hidden token consumption is the primary cost driver. Internal reasoning chains, agent loops, retry logic, and tool calls often represent 50-90% of total token spend. Almost no standard model accounts for this.

Provider rate changes are a strategic risk, not a line item. Pricing tier restructuring by major AI providers has in some documented cases resulted in materially higher infrastructure costs for enterprise clients within a single contract period. Your model needs a shock test built in before you need it.

Power users destroy average-based pricing. The math that works for your median user can be catastrophic when applied to your top 5%. You need to know which segments are profitable and which are subsidized before you set a single price.
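The four realities above can be sketched in a few lines of Python. This is an illustrative sketch and not the downloadable model: the segment names, user counts, token volumes, hidden-token multipliers, price, blended token rate, and shock size are all hypothetical placeholders chosen only to show the mechanics.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    users: int
    visible_tokens: float     # visible tokens per user per month (hypothetical)
    hidden_multiplier: float  # hidden tokens consumed per visible token (hypothetical)


def cost_per_user(seg: Segment, usd_per_million: float) -> float:
    """Monthly token cost for one user, counting visible and hidden tokens."""
    total_tokens = seg.visible_tokens * (1 + seg.hidden_multiplier)
    return total_tokens / 1_000_000 * usd_per_million


def segment_margins(segments, price, usd_per_million, shock=0.0):
    """Gross margin per segment; `shock` is a fractional provider rate increase."""
    rate = usd_per_million * (1 + shock)
    return {s.name: (price - cost_per_user(s, rate)) / price for s in segments}


def blended_margin(segments, price, usd_per_million, shock=0.0):
    """Gross margin across all users, the figure an average-based model reports."""
    rate = usd_per_million * (1 + shock)
    revenue = sum(s.users for s in segments) * price
    cost = sum(s.users * cost_per_user(s, rate) for s in segments)
    return (revenue - cost) / revenue


# All figures below are placeholders, not data from the model.
SEGMENTS = [
    Segment("median", users=950, visible_tokens=300_000, hidden_multiplier=2.0),
    Segment("power", users=50, visible_tokens=1_500_000, hidden_multiplier=4.0),
]
PRICE = 30.0  # USD per user per month (assumed)
RATE = 10.0   # blended USD per million tokens (assumed)

# Compare the baseline against a 50% provider rate increase.
for shock in (0.0, 0.5):
    print(f"rate shock {shock:.0%}:",
          segment_margins(SEGMENTS, PRICE, RATE, shock),
          f"blended {blended_margin(SEGMENTS, PRICE, RATE, shock):.1%}")
```

With these placeholder inputs, the blended margin looks healthy at 59% even though the power segment runs at a gross margin of -150%, and a 50% rate shock pulls the blended figure down to 38.5%. The point is the shape of the calculation, not the specific numbers.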

The Token Waterfall Model addresses all four. Here is how each sheet works.

Sheet 1: The Token Flow Calculator

Where every pricing model should start. Almost none do.

Before you can price anything, you need to know what a single transaction actually costs in tokens. Not what the user prompt costs. Not what the response costs. Everything.

The Token Flow Calculator maps your product’s full token consumption across two layers.
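As a rough illustration of per-transaction costing, the sketch below assumes the two layers are a visible layer (user prompt and response) and a hidden layer (the reasoning chains, agent loops, retries, and tool calls described earlier in the post). That split, the component names, the token counts, and the single blended rate are all assumptions made for this example, not values taken from the sheet.

```python
# Hypothetical token counts for one transaction; replace with measured values.
VISIBLE = {"user_prompt": 400, "response": 600}
HIDDEN = {
    "system_prompt": 1_200,
    "reasoning": 3_000,
    "tool_calls": 1_800,
    "retries": 1_000,
}


def transaction_cost(visible, hidden, usd_per_million):
    """Total tokens, hidden share, and USD cost for a single transaction.

    `usd_per_million` is an assumed blended rate; real providers usually
    price input and output tokens separately.
    """
    visible_tokens = sum(visible.values())
    hidden_tokens = sum(hidden.values())
    total_tokens = visible_tokens + hidden_tokens
    return {
        "visible_tokens": visible_tokens,
        "hidden_tokens": hidden_tokens,
        "hidden_share": hidden_tokens / total_tokens,
        "cost_usd": total_tokens / 1_000_000 * usd_per_million,
    }


result = transaction_cost(VISIBLE, HIDDEN, usd_per_million=10.0)
print(result)
```

In this made-up case the hidden layer accounts for 7,000 of 8,000 tokens, so pricing off the prompt and response alone would understate the cost of the transaction by a factor of eight.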

© 2026 Kyle Kelly