
I Read a 30-Page Deloitte Report So You Don't Have To. Here's What It Actually Says.

Deloitte says AI tokens will drive enterprise IT budgets up 20%+ by 2027. If that sentence confused you, this 12-part series is for you.

Tags: Tokenomics · AI Economics · Learning in Public · Claude · Battle Against Chauffeur Knowledge


Published: January 27, 2026 - 10 min read

I have decided to break this down in the spirit of my ongoing Battle Against Chauffeur Knowledge.

A few days ago, Richard Turrin shared a Deloitte report on LinkedIn called "The Pivot to Tokenomics: Navigating AI's New Spend Dynamics." The findings were important enough that I plan to repost it with my own commentary on Friday, January 30. If you are reading this, you probably came here from that repost. Three findings stood out:

  • AI tokens will drive enterprise IT budgets up 20%+ by 2027.
  • Only 28% of leaders currently see clear, measurable value from their AI investments.
  • 45% of leaders expect it will take 3+ years to see ROI from basic AI.

Those are big numbers. Important numbers. Numbers that affect how companies make decisions, how budgets get approved, and ultimately, how the AI tools you use every day get priced.

But here's the thing.

If you clicked on that post, saw the document, and felt overwhelmed... I get it.

I know how hard it is to admit that something is unclear when you want to seem smart. So much is changing in this AI field that it can be difficult to concede that some of it, or even most of it, is not really clear to you. You see words like "tokenomics" and "inference costs" and "Jevons' Paradox" and your brain does that thing where it pretends to understand while secretly hoping nobody asks follow-up questions.

I've been there. I am there sometimes.

So I want to help demystify AI economics for you. I want to take all the big confusing words and speak to you in English. Plain English. The kind of English where if your grandmother asked "what does that mean?" you could actually explain it.

If you are reading this, that PDF document was probably a lot. Do not fret. By the time you are done reading this series of blog posts, you are going to understand more than most developers do.

But more importantly, you are going to feel powerful.

Yes, I mean it. Not the "I can pretend I understand big words at a dinner party" kind of powerful. The "I actually understand what's happening and can make informed decisions about the technology I use every day" kind of powerful.

Trust me. Read carefully. Enjoy the process. And do not be afraid to message me on LinkedIn if you have any questions.

Let's go.


Why I'm Writing This

I use AI every single day. Multiple times a day. I build AI agents. I architect AI workflows. I literally have a blog series teaching people how to become Claude Gods.

And when I first read that Deloitte document? I understood maybe 70% of it.

Maybe.

The rest was a fog of acronyms and assumptions about prior knowledge I didn't have. TCO. ROI. NCPs. FinOps. Jevons' Paradox. Quantization. Inference optimization.

So I did what I always do when I encounter chauffeur knowledge in myself: I sat down and actually learned it. I broke down every concept. I found analogies that made sense. I connected the dots between technical jargon and real-world implications.

And now I'm going to share all of that with you.

This is a 12-part series. Each post will tackle one or two major concepts from that Deloitte report, explained in a way that assumes you know nothing going in but leaves you feeling like an expert coming out.


Who This Series Is For

Before we dive in, let me be clear about who I'm writing this for. If you see yourself in any of these descriptions, you're in the right place.

The Curious User

You use ChatGPT or Claude every day. Maybe multiple times a day. You've hit usage limits and wondered why. You've noticed that sometimes the AI seems to "forget" things you told it earlier in the conversation. You've seen subscription prices go up and wondered if you're getting your money's worth.

You're not a developer. You're not trying to become one. You just want to understand how this thing you use constantly actually works and what you're actually paying for.

This series will explain: What tokens are, why your conversations cost more over time, and how to get more value from your subscription.

The Freelancer or Solopreneur

You've built AI into your workflow. Maybe you use it for writing, research, customer service, or creative work. AI isn't just a tool for you; it's a business expense. You're cost-conscious because every dollar matters when you're running your own show.

You want to understand when you're wasting money and when you're being efficient. You want to make smart decisions about which AI tools to use and how to use them.

This series will explain: How AI pricing actually works, optimization strategies that can save you money, and how to evaluate different AI options for your business.

The Tech-Forward Manager

You manage a team that uses AI tools. Your leadership keeps asking about AI costs and ROI, and honestly, you're not entirely sure how to answer them. You've seen the budget line items growing and you need to justify the spend.

You're not afraid of technology, but you also don't have time to become a deep technical expert. You need enough understanding to have intelligent conversations with leadership, make informed decisions for your team, and not get blindsided by costs.

This series will explain: Why AI costs grow even as prices drop, how to talk about AI ROI with leadership, and governance frameworks that keep spending under control.

The CFO or Finance Lead

You've noticed AI showing up in budgets across multiple departments. The numbers are growing and you're not sure if that's good (more productivity!) or concerning (more spend!). You need to understand the economics of AI without becoming a technologist.

You care about TCO, ROI, and long-term strategic implications. You need to make recommendations about AI investments and you want to be informed, not just reliant on what the tech team tells you.

This series will explain: Total cost of ownership for AI, the ROI visibility problem, build-vs-buy economics, and financial governance frameworks for AI.


What You're Going to Learn: The 12-Part Series

Here's the roadmap. Each post builds on the previous ones, but you can also jump to specific topics that matter most to you.

Part 1: This Post (You're Here!)

The Setup. Why this matters, who it's for, and what we'll cover. Plus, a gentle introduction to what AI actually is (no really, what it actually is).

Part 2: Tokens and Inference

The Currency. What tokens are, how they work, and why they're the fundamental unit of both AI computation and cost. This is the most important concept in the entire series.

Part 3: Context Windows

The Memory Problem. Why AI doesn't actually remember your conversation (seriously, it doesn't), why your 10th message costs more than your 1st, and why uploading a PDF and asking 10 questions means you just paid for that PDF 10 times (there's a quick arithmetic sketch of that claim right after this roadmap).

Part 4: Why AI Costs What It Costs

The Supply Chain. From your message to their bill. GPUs, electricity, data centers, and the hidden chain behind every token you use.

Part 5: Three Ways to Buy AI

The Options. SaaS subscriptions, API access, or building your own. The pros, cons, and economics of each approach.

Part 6: TCO, ROI, and the Money Nobody Talks About

The Hidden Costs. What AI really costs when you add everything up (spoiler: it's more than the subscription fee).

Part 7: Jevons' Paradox

The Budget Trap. Why your AI bill goes UP even as token prices go DOWN. An economist writing in 1865 predicted your budget problem.

Part 8: The 3-Year Study

The Data. Deloitte's actual numbers. Year 1, Year 2, Year 3. When self-hosted AI becomes half the cost of API access.

Part 9: Latency, Throughput, and RAG

The Performance. Why your chatbot is slow, what throughput means for scale, and how RAG lets you give AI your information without sending everything every time.

Part 10: Model Optimization Techniques

The Efficiency. Quantization, pruning, knowledge distillation, and fine-tuning. What these actually mean and when they apply.

Part 11: FinOps for AI

The Governance. Treating AI spend like a serious business expense. Visibility, governance, optimization, and accountability frameworks.

Part 12: Everything Connected

The Synthesis. Bringing it all together. A complete cheat sheet, glossary, and persona-specific action items.
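
Before we leave the roadmap, here's that quick taste of the Part 3 claim about paying for a PDF ten times. This is a back-of-the-envelope sketch, not real pricing: the token counts and the $3-per-million rate are numbers I made up purely for illustration.

```python
# Back-of-the-envelope: why asking 10 questions about one uploaded PDF
# can mean paying for that PDF's tokens 10 times over.
# Every number here is invented for illustration, not a real rate.

pdf_tokens = 50_000          # hypothetical size of the uploaded PDF, in tokens
question_tokens = 50         # hypothetical size of each question
price_per_million = 3.00     # made-up input price: $3 per million tokens

total_input_tokens = 0
for question_number in range(1, 11):
    # In a plain chat loop, the whole PDF (plus every earlier question)
    # gets re-sent as context with each new question.
    total_input_tokens += pdf_tokens + question_number * question_tokens

cost = total_input_tokens / 1_000_000 * price_per_million
print(f"Input tokens billed: {total_input_tokens:,}")  # 502,750 -- not 50,000
print(f"Approximate input cost: ${cost:.2f}")          # about $1.51
```

Part 3 explains exactly why that context gets re-sent. The point for now is simply that the big file, not your questions, dominates the bill.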


Let's Start with the Basics: What is AI, Really?

Okay, I know this seems too basic. But stick with me because I'm about to say something that might surprise you.

When people say "AI" (Artificial Intelligence), they're usually talking about a very specific type of software. The kind that can read your questions and write responses that make sense. The kind you're probably using when you chat with Claude or ChatGPT.

The specific type of AI we're discussing in this entire series is called an LLM.

LLM stands for Large Language Model.

Let's break that down:

  • Large = trained on enormous amounts of text (books, websites, articles, basically a significant chunk of the internet)
  • Language = it works with human language (English, French, Spanish, code, whatever)
  • Model = it's a mathematical system that predicts what words should come next

Here's the thing that most people don't realize:

An LLM doesn't "think" the way humans do.

It predicts the next word based on patterns it learned during training. That's it. That's the whole magic trick.

When you ask Claude: "What is the capital of France?"

Claude doesn't "know" the answer in the way you know your own name. It predicts that, based on all the text it was trained on, the most likely next words after your question are: "The capital of France is Paris."

It's pattern matching at an unbelievably sophisticated scale. But it's still pattern matching.
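
If it helps to see that trick in miniature, here's a toy sketch. The pattern counts below are invented, and a real LLM stores what it learned as billions of numerical parameters rather than a lookup table, but the core move (pick the statistically likeliest continuation) is the same.

```python
# A toy "language model": given the text so far, return the word that
# most often followed it in training data. The counts below are invented;
# real LLMs learn billions of parameters instead of storing a table.

patterns = {
    "the capital of france is": {"paris": 9812, "lyon": 41, "beautiful": 12},
    "the capital of japan is": {"tokyo": 9744, "kyoto": 98},
}

def predict_next_word(text: str) -> str:
    counts = patterns.get(text.lower(), {})
    if not counts:
        return "<unknown>"
    # "Knowing" the answer is really just: which continuation was
    # most common in the training data?
    return max(counts, key=counts.get)

print(predict_next_word("The capital of France is"))  # -> paris
```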

Why Does This Matter?

Because understanding what AI actually is helps you understand what AI actually costs.

Every time you send a message to Claude or ChatGPT, that message gets processed. The AI reads your words, does a massive amount of mathematical prediction, and generates a response. That processing takes computational resources. Those resources cost money.

The unit we measure that processing in? Tokens.

And tokens are going to be the most important concept in this entire series.

But that's for Part 2.
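
Still, as a tiny preview of the arithmetic we'll unpack there: providers meter the words you send and the words you get back separately, and bill per million tokens. The prices below are placeholders I invented, not any real provider's rates.

```python
# Preview of the core billing formula (fully unpacked in Part 2).
# Both prices are placeholders, not real rates from any provider.

INPUT_PRICE_PER_MILLION = 3.00    # $ per million tokens you send (made up)
OUTPUT_PRICE_PER_MILLION = 15.00  # $ per million tokens you get back (made up)

def message_cost(input_tokens: int, output_tokens: int) -> float:
    """One request's cost: both directions of the conversation are metered."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_MILLION \
        + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_MILLION

# A 200-token question that gets an 800-token answer:
print(f"${message_cost(200, 800):.4f}")  # about $0.0126
```

Fractions of a cent per message. Which is exactly why nobody notices, until it's multiplied across an entire company's usage.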


Just so we're all on the same page, here are the main players:

Name   | Created By | You Might Use It Via
Claude | Anthropic  | Claude.ai, Claude app
GPT-4  | OpenAI     | ChatGPT
Gemini | Google     | Google's AI products
Llama  | Meta       | Various apps (it's open-source)

Each of these is an LLM. Each works by predicting the next word. And each charges for usage in a way that relates to tokens.

The differences between them matter for specific use cases, but the economics work the same way for all of them. Learn how one works, and you understand the fundamentals of all of them.


What This Series Will NOT Be

Let me set expectations clearly.

This series will NOT be:

  • A deep technical dive into neural networks and transformer architectures
  • A guide on how to build your own AI model
  • A comparison of which AI is "best" (that depends on your use case)
  • Advice specific to any one company or industry

This series WILL be:

  • A plain-English explanation of AI economics
  • A translation of technical jargon into concepts anyone can understand
  • A practical guide to making informed decisions about AI costs
  • A foundation that lets you have intelligent conversations about AI spending

Your Homework (Yes, Really)

Before Part 2, I want you to do one thing.

Think about the last time you used ChatGPT or Claude. Think about:

  1. How long was your conversation?
  2. Did you upload any files?
  3. Did you ask multiple related questions?
  4. Did you notice the AI ever "forget" something you told it earlier?

Hold those observations in your mind. Because in Part 2, I'm going to explain exactly what was happening behind the scenes during that conversation. And it's going to change how you think about AI forever.


Coming Up Next

Part 2: Tokens and Inference

I'm going to explain what tokens are, how they work, and why they're the fundamental unit of both AI computation and cost. This single concept is the foundation of everything else in AI economics.

If you understand tokens, you understand 50% of everything in that Deloitte report.

If you understand tokens and context windows (Part 3), you understand 80%.

We're going to get you there.


A Note on This Series

I've written about tokens before. If you want a head start, you can check out my earlier post on the topic.

This new series builds on that foundation but goes much deeper. We're not just talking about individual usage anymore. We're talking about the economics of AI at every level, from your personal subscription to enterprise infrastructure decisions.

And we're doing it in plain English.

Because you don't need to be a software developer to understand the economics of something you use daily.

You just need someone to explain it without the jargon.

That's what I'm here for.


As always, thanks for reading!


Want This Implemented, Not Just Explained?

I work with a small number of clients who need AI integration done right. If you're serious about implementation, let's talk.