I Read a 30-Page Deloitte Report So You Don't Have To. Here's What It Actually Says.
Published: January 27, 2026 - 10 min read
I have decided to break this down in the spirit of my ongoing Battle Against Chauffeur Knowledge.
A few days ago, Richard Turrin shared a Deloitte document on LinkedIn called "The Pivot to Tokenomics: Navigating AI's New Spend Dynamics." The findings were important enough that I plan to repost it with my own commentary on Friday, January 30. If you are reading this, you probably came here from that repost. Three numbers stood out: AI tokens will drive enterprise IT budgets up more than 20% by 2027. Only 28% of leaders currently see clear, measurable value from their AI investments. And 45% of leaders expect it will take three or more years to see ROI from basic AI.
Those are big numbers. Important numbers. Numbers that affect how companies make decisions, how budgets get approved, and ultimately, how the AI tools you use every day get priced.
But here's the thing.
If you clicked on that post, saw the document, and felt overwhelmed... I get it.
I know how hard it is to admit that something is unclear when you want to seem smart. So much is changing in AI that it can be difficult to admit that some, or most, of it is not really clear to you. You see words like "tokenomics" and "inference costs" and "Jevons' Paradox," and your brain does that thing where it pretends to understand while secretly hoping nobody asks follow-up questions.
I've been there. I am there sometimes.
So I want to help demystify AI economics for you. I want to take all the big confusing words and speak to you in English. Plain English. The kind of English where if your grandmother asked "what does that mean?" you could actually explain it.
If you are reading this, that PDF document was probably a lot. Do not fret. By the time you are done reading this series of blog posts, you are going to understand more than most developers do.
But more importantly, you are going to feel powerful.
Yes, I mean it. Not the "I can pretend I understand big words at a dinner party" kind of powerful. The "I actually understand what's happening and can make informed decisions about the technology I use every day" kind of powerful.
Trust me. Read carefully. Enjoy the process. And do not be afraid to message me on LinkedIn if you have any questions.
Let's go.
Why I'm Writing This
I use AI every single day. Multiple times a day. I build AI agents. I architect AI workflows. I literally have a blog series teaching people how to become Claude Gods.
And when I first read that Deloitte document? I understood maybe 70% of it.
Maybe.
The rest was a fog of acronyms and assumptions about prior knowledge I didn't have. TCO. ROI. NCPs. FinOps. Jevons' Paradox. Quantization. Inference optimization.
So I did what I always do when I encounter chauffeur knowledge in myself: I sat down and actually learned it. I broke down every concept. I found analogies that made sense. I connected the dots between technical jargon and real-world implications.
And now I'm going to share all of that with you.
This is a 12-part series. Each post will tackle one or two major concepts from that Deloitte report, explained in a way that assumes you know nothing going in but leaves you feeling like an expert coming out.
Who This Series Is For
Before we dive in, let me be clear about who I'm writing this for. If you see yourself in any of these descriptions, you're in the right place.
The Curious User
You use ChatGPT or Claude every day. Maybe multiple times a day. You've hit usage limits and wondered why. You've noticed that sometimes the AI seems to "forget" things you told it earlier in the conversation. You've seen subscription prices go up and wondered if you're getting your money's worth.
You're not a developer. You're not trying to become one. You just want to understand how this thing you use constantly actually works and what you're actually paying for.
This series will explain: What tokens are, why your conversations cost more over time, and how to get more value from your subscription.
The Freelancer or Solopreneur
You've built AI into your workflow. Maybe you use it for writing, research, customer service, or creative work. AI isn't just a tool for you; it's a business expense. You're cost-conscious because every dollar matters when you're running your own show.
You want to understand when you're wasting money and when you're being efficient. You want to make smart decisions about which AI tools to use and how to use them.
This series will explain: How AI pricing actually works, optimization strategies that can save you money, and how to evaluate different AI options for your business.
The Tech-Forward Manager
You manage a team that uses AI tools. Your leadership keeps asking about AI costs and ROI, and honestly, you're not entirely sure how to answer them. You've seen the budget line items growing and you need to justify the spend.
You're not afraid of technology, but you also don't have time to become a deep technical expert. You need enough understanding to have intelligent conversations with leadership, make informed decisions for your team, and not get blindsided by costs.
This series will explain: Why AI costs grow even as prices drop, how to talk about AI ROI with leadership, and governance frameworks that keep spending under control.
The CFO or Finance Lead
You've noticed AI showing up in budgets across multiple departments. The numbers are growing and you're not sure if that's good (more productivity!) or concerning (more spend!). You need to understand the economics of AI without becoming a technologist.
You care about TCO, ROI, and long-term strategic implications. You need to make recommendations about AI investments and you want to be informed, not just reliant on what the tech team tells you.
This series will explain: Total cost of ownership for AI, the ROI visibility problem, build-vs-buy economics, and financial governance frameworks for AI.
What You're Going to Learn: The 12-Part Series
Here's the roadmap. Each post builds on the previous ones, but you can also jump to specific topics that matter most to you.
Part 1: This Post (You're Here!)
The Setup. Why this matters, who it's for, and what we'll cover. Plus, a gentle introduction to what AI actually is (no really, what it actually is).
Part 2: Tokens and Inference
The Currency. What tokens are, how they work, and why they're the fundamental unit of both AI computation and cost. This is the most important concept in the entire series.
Part 3: Context Windows
The Memory Problem. Why AI doesn't actually remember your conversation (seriously, it doesn't), why your 10th message costs more than your 1st, and why uploading a PDF and asking 10 questions means you just paid for that PDF 10 times.
Part 4: Why AI Costs What It Costs
The Supply Chain. From your message to their bill. GPUs, electricity, data centers, and the hidden chain behind every token you use.
Part 5: Three Ways to Buy AI
The Options. SaaS subscriptions, API access, or building your own. The pros, cons, and economics of each approach.
Part 6: TCO, ROI, and the Money Nobody Talks About
The Hidden Costs. What AI really costs when you add everything up (spoiler: it's more than the subscription fee).
Part 7: Jevons' Paradox
The Budget Trap. Why your AI bill goes UP even as token prices go DOWN. An economist predicted your budget problem back in 1865.
Part 8: The 3-Year Study
The Data. Deloitte's actual numbers. Year 1, Year 2, Year 3. When self-hosted AI drops to half the cost of API access.
Part 9: Latency, Throughput, and RAG
The Performance. Why your chatbot is slow, what throughput means for scale, and how RAG lets you give AI your information without sending everything every time.
Part 10: Model Optimization Techniques
The Efficiency. Quantization, pruning, knowledge distillation, and fine-tuning. What these actually mean and when they apply.
Part 11: FinOps for AI
The Governance. Treating AI spend like a serious business expense. Visibility, governance, optimization, and accountability frameworks.
Part 12: Everything Connected
The Synthesis. Bringing it all together. A complete cheat sheet, glossary, and persona-specific action items.
Let's Start with the Basics: What is AI, Really?
Okay, I know this seems too basic. But stick with me because I'm about to say something that might surprise you.
When people say "AI" (Artificial Intelligence), they're usually talking about a very specific type of software. The kind that can read your questions and write responses that make sense. The kind you're probably using when you chat with Claude or ChatGPT.
The specific type of AI we're discussing in this entire series is called an LLM.
LLM stands for Large Language Model.
Let's break that down:
- Large = trained on enormous amounts of text (books, websites, articles, basically a significant chunk of the internet)
- Language = it works with human language (English, French, Spanish, code, whatever)
- Model = it's a mathematical system that predicts what words should come next
Here's the thing that most people don't realize:
An LLM doesn't "think" the way humans do.
It predicts the next word based on patterns it learned during training. That's it. That's the whole magic trick.
When you ask Claude: "What is the capital of France?"
Claude doesn't "know" the answer in the way you know your own name. It predicts that, based on all the text it was trained on, the most likely next words after your question are: "The capital of France is Paris."
It's pattern matching at an unbelievably sophisticated scale. But it's still pattern matching.
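If you want to see the core idea in miniature, here is a toy sketch of "predict the next word from patterns." It counts which word follows which in a tiny made-up corpus and picks the most frequent follower. This is my own illustration, not how a real LLM is built; actual models use neural networks with billions of parameters, but the underlying job of predicting the next token from learned statistics is the same.

```python
from collections import Counter, defaultdict

# Tiny made-up "training data" for illustration only.
corpus = (
    "the capital of france is paris . "
    "the capital of italy is rome . "
    "the capital of spain is madrid ."
).split()

# Count how often each word follows each other word.
followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return followers[word].most_common(1)[0][0]

print(predict_next("the"))      # "capital" (it followed "the" every time)
print(predict_next("capital"))  # "of"
```

That is the whole trick, just scaled up enormously: no database of facts, only learned statistics about what tends to come next.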
Why Does This Matter?
Because understanding what AI actually is helps you understand what AI actually costs.
Every time you send a message to Claude or ChatGPT, that message gets processed. The AI reads your words, does a massive amount of mathematical prediction, and generates a response. That processing takes computational resources. Those resources cost money.
The unit we measure that processing in? Tokens.
And tokens are going to be the most important concept in this entire series.
But that's for Part 2.
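As a small teaser before Part 2: a widely cited rule of thumb is that one token is roughly 4 characters of English text. The `estimate_tokens` helper below is my own hypothetical sketch built on that rule of thumb; real tokenizers split text differently, so treat it as a ballpark only.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per English token (rule of thumb,
    not an official formula; real tokenizers vary)."""
    return max(1, len(text) // 4)

msg = "What is the capital of France?"
print(estimate_tokens(msg))  # 7 by this heuristic
```

So even a one-line question costs a handful of tokens, and every token is processing the provider has to pay for. Part 2 explains what actually counts as a token.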
Popular LLMs You've Probably Heard Of
Just so we're all on the same page, here are the main players:
| Name | Created By | You Might Use It Via |
|---|---|---|
| Claude | Anthropic | Claude.ai, Claude app |
| GPT-4 | OpenAI | ChatGPT |
| Gemini | Google | Gemini app, Google's AI products |
| Llama | Meta | Various apps (it's open-source) |
Each of these is an LLM. Each works by predicting the next word. And each charges for usage in a way that relates to tokens.
The differences between them matter for specific use cases, but the economics work the same way for all of them. Learn how one works, and you understand the fundamentals of all of them.
What This Series Will NOT Be
Let me set expectations clearly.
This series will NOT be:
- A deep technical dive into neural networks and transformer architectures
- A guide on how to build your own AI model
- A comparison of which AI is "best" (that depends on your use case)
- Advice specific to any one company or industry
This series WILL be:
- A plain-English explanation of AI economics
- A translation of technical jargon into concepts anyone can understand
- A practical guide to making informed decisions about AI costs
- A foundation that lets you have intelligent conversations about AI spending
Your Homework (Yes, Really)
Before Part 2, I want you to do one thing.
Think about the last time you used ChatGPT or Claude. Think about:
- How long was your conversation?
- Did you upload any files?
- Did you ask multiple related questions?
- Did you notice the AI ever "forget" something you told it earlier?
Hold those observations in your mind. Because in Part 2, I'm going to explain exactly what was happening behind the scenes during that conversation. And it's going to change how you think about AI forever.
Coming Up Next
I'm going to explain what tokens are, how they work, and why they're the fundamental unit of both AI computation and cost. This single concept is the foundation of everything else in AI economics.
If you understand tokens, you understand 50% of everything in that Deloitte report.
If you understand tokens and context windows (Part 3), you understand 80%.
We're going to get you there.
A Note on This Series
I've written about tokens before. If you want to get a head start, you can check out:
- Tokens, Context Windows, and Why LLM Instance Cloning Works
- Why You're Overpaying for Claude Max
- 6 Token Drains Destroying Your Quota
- 5 Ways People Waste Tokens
This new series builds on that foundation but goes much deeper. We're not just talking about individual usage anymore. We're talking about the economics of AI at every level, from your personal subscription to enterprise infrastructure decisions.
And we're doing it in plain English.
Because you don't need to be a software developer to understand the economics of something you use daily.
You just need someone to explain it without the jargon.
That's what I'm here for.
As always, thanks for reading!