I’ve been writing here, on and off, since 2007. Some of my favorite old posts were the ones where I tried to call something before it was obvious — like the Silicon Valley Bubble piece back in 2011. I want to get back into that habit, because I think we’re living through the biggest boom of my career, and the story most people are telling about it leaves out the part that keeps me up at night.
Everyone agrees AI is the most important thing to happen to computing in a generation. I mostly agree. What I can’t reconcile is the arithmetic. There are two walls in front of this industry, and I’ve come to believe they’re really the same wall.
Wall one: the money
Start with the spend. In 2024, Sequoia’s David Cahn laid out what he called AI’s $600B question: given how much is being poured into AI data centers, the industry needs on the order of $600 billion in annual revenue just to justify the capex — and the gap between that number and actual end-user revenue was enormous. Two years later, with chip spend higher still, the gap is wider, not narrower.
Then look at the flagship. Drawing on audited financials, Ed Zitron reported OpenAI’s 2025 numbers:
Roughly $13 billion in revenue against $34 billion in costs.
Revenue is growing fast — but costs are growing faster, because every more-capable model demands more compute to train and serve, and that compute doesn’t get cheaper as demand climbs.
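To make that arithmetic concrete, here is a minimal sketch in Python. The starting figures are the $13 billion and $34 billion above; the growth rates are purely hypothetical numbers I picked for illustration, not reported figures, and `years_to_breakeven` is a name I made up for this post.

```python
def years_to_breakeven(revenue, costs, revenue_growth, cost_growth, max_years=30):
    """Return the first year in which revenue covers costs.

    Growth rates are annual fractions (1.0 means doubling every year).
    Returns None if revenue never catches costs within max_years.
    """
    for year in range(max_years + 1):
        if revenue >= costs:
            return year
        revenue *= 1 + revenue_growth
        costs *= 1 + cost_growth
    return None


# Starting point from the post, in billions of dollars.
# Both growth-rate pairs below are hypothetical assumptions.
optimistic = years_to_breakeven(13, 34, revenue_growth=1.0, cost_growth=0.5)
pessimistic = years_to_breakeven(13, 34, revenue_growth=1.0, cost_growth=1.2)

print("revenue doubling, costs +50%/yr: ", optimistic)
print("revenue doubling, costs +120%/yr:", pessimistic)
```

The point of the toy model is only this: even with revenue doubling every year, the gap closes only if costs grow meaningfully slower than revenue. If costs grow faster, no amount of time fixes it.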
The standard rebuttal is “growth fixes this — give it time.” Maybe. But growth only fixes it if the product keeps getting dramatically better and cheaper to run. Which brings me to the second wall.
Wall two: the math
This is the part that worries me more, because it isn’t a financing problem — it’s an architecture problem.
Today’s models are, at their core, next-token predictors. Astonishingly good ones — I genuinely don’t fully understand how something trained to guess the next word can write working Go for me. But “I don’t understand how it works” is not the same as “it has no ceiling,” and there’s a growing pile of evidence the ceiling is real.
Yann LeCun has argued for years that autoregressive next-token prediction is the wrong foundation: every token is a sample with some error probability, so over a long chain the odds of getting every step right shrink exponentially with the length of the chain. A 1% per-step error over 100 steps (0.99¹⁰⁰ ≈ 0.37) leaves you with only about a 37% chance of a clean run. He’s now put real money behind that conviction, leaving Meta to raise around $1B for “world model” architectures that don’t predict the next word at all.
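The compounding argument is easy to play with yourself. Below is a rough sketch in Python under one loud assumption: every step fails independently with the same probability and a single failure ruins the run. Real models can sometimes recover from a mistake, so treat this as the simplified version of the argument, not a model of any actual system. The function names are mine.

```python
import math


def clean_run_probability(per_step_error, steps):
    """Chance that every one of `steps` steps is correct,
    assuming independent, identical per-step error rates."""
    return (1 - per_step_error) ** steps


def max_steps_for(target_probability, per_step_error):
    """Longest chain whose clean-run probability is still at least
    `target_probability`, under the same independence assumption."""
    return math.floor(math.log(target_probability) / math.log(1 - per_step_error))


# The example from the text: 1% error per step over 100 steps.
print(round(clean_run_probability(0.01, 100), 2))  # 0.37

# How long a chain can be before a clean run becomes a coin flip.
print(max_steps_for(0.5, 0.01))  # 68
```

Under these assumptions, a 1% per-step error rate means a clean run is already worse than a coin flip before you reach 70 steps.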
The labs aren’t blind to this, and their answer deserves a fair hearing: inference-time compute, meaning the o1/“Strawberry” lineage of “reasoning” models that think, plan, and search internally before committing to an answer. In effect they bolt scaffolding around the next-token predictor so a small early mistake doesn’t snowball. It clearly helps. But notice what Apple’s research group actually tested in The Illusion of Thinking: those very reasoning models. On controlled puzzles, their accuracy holds up as complexity rises, up to a point, and then collapses to near zero, even with plenty of token budget left. (In fairness, the paper drew sharp rebuttals arguing the failures were partly experimental artifacts; the debate isn’t settled.) If the scaffolding still falls over past a certain complexity, then inference-time compute may be papering over the architecture rather than replacing it: buying real gains, at steadily rising compute cost, without removing the ceiling.
And honestly, my own experience back in the trenches matches the theory. Relearning Go with an AI assistant beside me, the help is real and fast for the first 80% — and then it walks confidently off a cliff exactly when the chain of reasoning gets long. The last time it bit me, the assistant called a method that didn’t exist on the type; when I pasted the compiler error back, it “fixed” things by inventing a second nonexistent method, each correction drifting further from the real API instead of closer to it. That’s compounding error you can watch happen in real time — exactly what you’d expect when every step is sampled from a distribution the previous mistake has already skewed.
Why the two walls are one wall
Here’s the connection. The bull case quietly assumes both things stay true: capability keeps scaling, and cost per unit of capability keeps falling — so the industry grows into that $600B. But if next-token prediction is near a ceiling, you don’t get the capability jumps that justify the spend, and you keep burning more compute to claw out smaller gains. The money wall and the math wall reinforce each other.
But can’t you just sell shovels?
The reflexive way to play a gold rush is the line everyone repeats: don’t mine for gold — sell picks and shovels. In this rush that’s Nvidia, and it has worked spectacularly. It’s the obvious hedge, and it isn’t wrong.
But look one layer down, because this is where the cliché stops being safe. The railway and dot-com booms were picks-and-shovels stories too, and the shovel sellers did great — right up until the miners ran out of money. Shovel demand was never independent of the gold; it was a leveraged bet on the gold. Nvidia’s revenue is, ultimately, downstream of labs spending far more than they earn on the conviction that the gold is real. If the miners never strike it — if the capability jumps don’t arrive and the losses don’t turn — the first thing cut is next year’s shovel order.
So “sell shovels” doesn’t sidestep the question. It just restates it. The shovels are only a safe bet if the gold is real — and whether the gold is real is exactly what walls one and two are asking.
The honest counterargument
The smartest pushback isn’t “the models will just get better.” It’s Carlota Perez’s framework, which Paul Kedrosky applies to AI capex: bubbles still build. The railway mania and the dot-com fiber glut both wiped out investors — and both left behind infrastructure (rails, dark fiber) that powered the next era. By that logic, today’s data centers, chips, and power build-out are the railroads of our age, and it barely matters if most of the companies financing them go bust.
I find this genuinely persuasive. But notice what it concedes: it agrees the bear case is probably right about the companies. It just argues the infrastructure outlives them.
Where I land
So here’s my bet, for what it’s worth from someone who’s wrong about as often as anyone: the compute gets built and mostly stays. Most of the pure-play “AGI is imminent” companies do not survive in their current form. And the durable winners are the ones who treat today’s models as a useful-but-bounded tool rather than a religion. In practice I think that means three groups: the B2B software companies embedding AI into a narrow, verifiable workflow with a human in the loop; the teams betting on smaller, specialized models instead of one all-knowing god-model; and whoever is quietly working on the next architecture instead of only scaling this one.
That last part is the classic innovator’s dilemma: the incumbents are structurally committed to scaling the thing that’s working right now, which is exactly why they tend to miss the thing that comes next.
I came back to this blog partly to force myself to think these things through in public instead of in the shower. So tell me where I’m wrong: is the gap between the spend and the revenue just a financing detail that growth erases — or the tell of a bubble that, like every bubble before it, builds something lasting and bankrupts most of the people who built it?