Economics Always Wins

Let me start where most papers end, because I don't want you wondering what this is about. Here is the conclusion: every serious enterprise will run private AI. Not rent it. Run it.

They will take open-weight and domain-specific models, deploy them inside their own infrastructure, fine-tune them on their own data, and route work across them with a logic they design and control. The frontier labs, the OpenAIs and Anthropics of the world, will remain extraordinary. They will remain useful. But they won't be the place where the enterprise lives. They will be the place the enterprise visits.

This isn't a prediction about which model is smartest. It's a prediction about economics. And economics always wins.

Novelty is wonderful for a small business owner having a delightful conversation with a chatbot. But novelty has never once survived contact with unit economics at scale. Not in manufacturing. Not in production. Not in the data center. The history of every industrial technology is the same story. A breakthrough arrives expensive and magical, and then the market grinds it into a commodity and competes on what you build on top of it. AI will be no different. It can't be.

So that is the thesis. Now let me earn it.

01 · The cost curve

Economics always wins

Look at what it has cost to build these models, generation over generation, and the curve tells the whole story.

Take GPT-2, OpenAI's 2019 model. It is the first one most people would recognize as a real language model. It cost roughly $43,000 to train.¹ GPT-3, a year later, ran between several hundred thousand dollars and a few million.¹ GPT-4, in 2023, is where it stopped being a science project and became an industrial undertaking: roughly 25,000 high-end GPUs running for three months, on the order of 2 × 10²⁵ floating-point operations, tens of gigawatt-hours of electricity, and a training bill estimated near $80–100 million.² Each leap consumed roughly four to five times the compute of the one before it.

Figure 1 · Cost of a single frontier training run

From the price of a used car to the price of an aircraft carrier

Note the logarithmic scale. Each line up the axis is a 10× jump. We have gone from roughly $43,000 to a projected $10 billion in under a decade. * 2027–28 are projections; the billion-dollar run by 2027 is forecast by Epoch AI, and the $10B-by-2028 figure is Anthropic CEO Dario Amodei's. Sources 1–4.

That curve has not bent. It has accelerated. The frontier training runs of 2026 sit in the $200–500 million range for a top-tier model. Industry researchers project the first billion-dollar training run by 2027,³ and Anthropic's own CEO has said frontier models could cost $10 billion to train by 2028.⁴ We have gone from the price of a used car to the price of an aircraft carrier in roughly a decade.

This brute-force path of bigger clusters, more gigawatts, and more capital is the one the U.S. labs have chosen. It produces astonishing models. But it produces them at a cost structure that someone, eventually, has to pay for.

Here is what the coverage misses. There are two cost curves, not one, and they move in opposite directions. The ceiling keeps rising: the most capable model keeps getting more expensive to build. But the floor keeps collapsing. The cost to reach any given level of capability is falling off a cliff. A GPT-4-class model cost roughly $79 million to train in 2023; by 2026 the same capability can be had for $5–10 million.⁵ What was frontier-expensive two years ago is now a well-funded startup's line item.

Figure 2 · The divergence

Two curves, opposite directions

The two curves begin almost together and scissor apart. The capability an enterprise actually needs rides the green line down. * Projected. Log scale. Sources 3–5.

That second curve is the one enterprises should care about. It means the capability you actually need for your business doesn't stay locked behind the frontier. It comes to you, fast, and it gets cheaper every quarter.
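
To put rough numbers on the two curves, here is a minimal sketch that converts the endpoints quoted above into a constant yearly multiplier. The function name `annual_factor` is my own, smooth compounding is a simplifying assumption, and the 2028 figure is a projection, so read the output as an illustration, not a measurement.

```python
def annual_factor(start_value: float, end_value: float, years: int) -> float:
    """Constant yearly multiplier that carries start_value to end_value."""
    return (end_value / start_value) ** (1.0 / years)


# Ceiling: cost of a single frontier training run, using the figures in this section.
ceiling = annual_factor(43_000, 10e9, 2028 - 2019)

# Floor: cost of reaching GPT-4-class capability, using both ends of the $5-10M range.
floor_slow = annual_factor(79e6, 10e6, 2026 - 2023)
floor_fast = annual_factor(79e6, 5e6, 2026 - 2023)

print(f"ceiling multiplies by about {ceiling:.1f}x per year")
print(f"floor multiplies by about {floor_fast:.2f}x to {floor_slow:.2f}x per year")
```

On these endpoints the ceiling rises roughly fourfold a year, while the cost of a fixed level of capability falls by about half or more each year.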

02 · The proof point

The China precedent

If you want proof that cost, not novelty, is the gravitational force in this market, look at China.

Cut off from Nvidia's best silicon by U.S. export controls, Chinese labs were forced to build on cheaper, deliberately throttled chips. They couldn't out-spend the Americans, so they had to out-engineer the cost curve. The most cited example is DeepSeek, whose reasoning model R1 was reported in a peer-reviewed Nature paper to have completed its final training run on 512 export-market H800 chips in about 80 hours for roughly $294,000.⁶ Its V3 base model came in around $5.6 million in compute.⁷

Now, I'll be the honest broker. Those numbers deserve asterisks: they reflect the marginal cost of a single run, not the stockpiled hardware, the prior research, or the failed experiments behind it.⁷ There are legitimate questions about how the models were trained and what they learned from. China is, to put it gently, playing a different game on intellectual property, and a real path to domestic leading-edge fabrication is still years away.

Strip away the controversy and the signal is undeniable. A frontier-class model was produced for a fraction of the Western price, and given away with open weights.

The market reaction told you everything. A single release wiped hundreds of billions of dollars off chip valuations in a day. The constraint didn't kill Chinese AI. It made it cheaper. And cheap, open, and good-enough is exactly the combination that erodes a premium business model from underneath.

03 · The threshold

What "good enough" now means

The premium model business rests on one assumption: that the gap between the best closed model and the best open one is wide enough, and durable enough, to justify the price difference. That assumption is failing.

By Epoch AI's capability index, open-weight models now trail the closed frontier by roughly three to four months.⁸ Not three to four years. Months. Llama, Qwen, Mistral, DeepSeek, and Nvidia's Nemotron family match or beat the benchmark scores that closed frontier models posted a year ago. On broad knowledge tests like MMLU, both open and closed models have pushed past 91% and the gap is effectively gone.⁹

You don't have to take that on faith. In June 2026, a Chinese lab called Z.ai released GLM-5.2, an open-weights model you can download and run yourself under an MIT license. On the public coding and agentic benchmarks it matched or beat GPT-5.5 and came within a few points of Claude Opus 4.8, the strongest closed model on the board, at a reported fraction of the cost to run.¹⁶ The scores are vendor-reported and the hosted API sits in China, so you'd test it on your own workloads before trusting it. But the weights are yours to run on your own hardware, and that's the point. Frontier-class coding, open, self-hostable, cheap. A year ago that gap was a chasm. This week it's a few points.

Figure 3 · The closing gap

How far behind is open weight? Months, not years.

Directional, based on Epoch AI's Capability Index. The lag has compressed from years to months, and has at times closed entirely. Source 8.

For the enterprise, this is the whole ballgame. You do not run your business on the bleeding edge. You run it on the reliable edge. A model that is three months behind the absolute frontier, that you can host yourself, fine-tune on your data, and run for the cost of compute, beats a marginally smarter model you rent by the token, can't control, and have to feed your crown jewels to. Unless and until someone builds a true, measurable artificial superintelligence (still undefined, still unmeasured), "three months behind and free to run" is the better deal for almost every workload that matters.

04 · Following the money

The economics of renting

Now follow the money on the other side of the ledger, because this is where the premium model loses.

You might say renting is fine. We've rented software for twenty-five years. SaaS proved you don't need to own the system. You subscribe, the vendor runs it, everyone wins. So why would AI be any different?

Because the cost base isn't remotely the same. SaaS was R&D a vendor could spread across thousands of customers at healthy margins. AI is a different order of magnitude. In 2026, four companies, Microsoft, Amazon, Google, and Meta, will spend close to $700 billion in capital expenditure, the overwhelming majority of it on AI.¹⁴ Two years earlier those same four spent a little over $200 billion. For comparison, the entire global software industry spends on the order of $200 billion a year on R&D, combined.¹⁵ So in a single year, four firms are pouring more than three times the whole software industry's research budget into building this one capability. And that $700 billion is only their capex. It doesn't count OpenAI's and Anthropic's own losses, the $500 billion Stargate program, xAI, or China.

Someone has to pay for that. It won't be the vendors. They're losing money today. It will be you, the enterprise customer, in the price of every token you rent. That's the difference. SaaS you could rent forever, because the underlying economics worked. This you would be underwriting. And once you're underwriting a cost this staggering, owning beats renting.

The frontier labs aren't profitable. They are not close. In the first quarter of 2026 alone, OpenAI reported around $5.7 billion in revenue against a net loss exceeding $21 billion,¹⁰ and by some readings loses well over a dollar for every dollar of revenue it takes in. It projects no profitability until the end of the decade. Anthropic, the more disciplined of the two, ran roughly 40% gross margins in 2025 and doesn't expect to break even until 2028.¹¹ Together these two companies are on track to spend on the order of $65 billion in a single year on compute, training, and operations.¹²
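
The ratios in this section are easy to check. The sketch below only divides the figures already quoted above; the variable names are mine, and the net loss is stated as "exceeding" $21 billion, so the second ratio is a lower bound.

```python
# Figures as quoted in this section (all approximate).
CAPEX_2026 = 700e9          # combined 2026 capex of the four hyperscalers
SOFTWARE_RND = 200e9        # global software industry R&D per year
OPENAI_Q1_REVENUE = 5.7e9   # first quarter of 2026
OPENAI_Q1_NET_LOSS = 21e9   # "exceeding", so treat as a floor

capex_multiple = CAPEX_2026 / SOFTWARE_RND
loss_per_revenue_dollar = OPENAI_Q1_NET_LOSS / OPENAI_Q1_REVENUE

print(f"capex vs. software R&D: {capex_multiple:.1f}x")
print(f"net loss per dollar of revenue: at least ${loss_per_revenue_dollar:.2f}")
```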

You can't lose tens of billions of dollars a year forever. The capital markets will eventually demand the thing every business must eventually deliver: a profit. And when that day comes, the cost gets passed to the customer. There is no other place for it to go. The $20 flat-rate plans and the artificially low token prices were always loss leaders. A land grab, paid for by investors, to drive adoption.

The early bills are already arriving. Enterprises that budgeted around 2024 token prices are discovering that agentic workflows at 2026 adoption levels consume multiples of what they planned. One major company reportedly burned through its entire annual AI budget in four months. A major bank circulated an internal note titled, more or less, "AI bills are out of control."¹³ Even the optimistic forecasts assume inference gets dramatically cheaper. But cheaper inference helps whoever runs the model, and increasingly that will be the enterprise itself.

No enterprise CFO will tolerate an unbounded, metered cost line that scales with usage, controlled by a vendor, with a built-in mandate to raise prices. They will do what they have always done with runaway variable costs: bring them in-house and turn them into a fixed, predictable, controllable line item.
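
The metered-versus-fixed argument reduces to a break-even volume. Here is a minimal sketch of it. Every number is hypothetical and chosen only to show the shape of the calculation, and the function names are my own; plug in your own prices before drawing any conclusion.

```python
import math


def monthly_rent_cost(tokens: float, price_per_million: float) -> float:
    """Metered cost: scales linearly with usage."""
    return tokens / 1e6 * price_per_million


def monthly_own_cost(tokens: float, fixed: float, marginal_per_million: float) -> float:
    """Self-hosted cost: a fixed line item plus a small marginal cost per token."""
    return fixed + tokens / 1e6 * marginal_per_million


def break_even_tokens(fixed: float, price_per_million: float, marginal_per_million: float):
    """Monthly token volume above which owning is cheaper than renting."""
    if price_per_million <= marginal_per_million:
        return None  # renting is never more expensive per token, so no break-even exists
    return fixed / (price_per_million - marginal_per_million) * 1e6


# Hypothetical inputs, for illustration only.
FIXED = 50_000.0      # dollars per month for hardware, power, and staff
RENT_PRICE = 10.0     # dollars per million rented tokens
OWN_MARGINAL = 1.0    # dollars per million self-hosted tokens

volume = break_even_tokens(FIXED, RENT_PRICE, OWN_MARGINAL)
print(f"owning wins above about {volume / 1e9:.2f} billion tokens per month")
```

Below that volume the metered plan is cheaper; above it, every additional token widens the gap in favor of the fixed line item.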

05 · The last moat

The data argument

There is a second reason the enterprise pulls AI in-house, and it may matter more than the money.

Your data is one of the last real moats you have. Not your software. That is bought. Not your processes alone. Those get copied. But the proprietary graph of your customers, your transactions, your operations, built up over decades, that is yours, and it is hard to replicate.

When you route that data through a public model, you risk hemorrhaging exactly that advantage. The worst outcome in this entire industry is capability convergence: a world where every company in your sector queries the same public model, trained on the same public data, and gets back the same undifferentiated answers. In that world, there are no leaders. Everyone is average, expensively. Keeping your data inside your own walls, feeding it only to models you control, is how you refuse to converge.

06 · The architecture

The private model stack

So what does running private AI actually look like? It's not exotic. The pieces already exist, and most enterprises already own them. You take an AI-aware container platform, fill it with open-weight models, fine-tune them on your data, and route every request, human or agent, to the model that produces the best output at the lowest cost for that task.

And this is the part people miss when they hear the word private. It isn't all or nothing. The router still sends a job to a frontier model when the job truly calls for it, a hard, novel reasoning problem, or a capability you don't yet have in-house. You don't stop using public models. The default just flips. Private becomes the floor you run on. The frontier becomes the exception you reach for on purpose, and pay up for, when the value is clearly there.
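
The routing rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not a production router: the model names, costs, and quality scores are all hypothetical, and a real system would draw them from your own evaluations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    private: bool         # True if it runs inside your own infrastructure
    cost_per_task: float  # hypothetical blended cost in dollars
    quality: dict         # hypothetical eval score per task type, 0 to 1


def route(task_type: str, quality_bar: float, task_value: float, models: list) -> Model:
    """Cheapest private model that clears the bar; otherwise burst to a public
    model only when the task's value covers its cost."""
    def clears(m: Model) -> bool:
        return m.quality.get(task_type, 0.0) >= quality_bar

    private_ok = [m for m in models if m.private and clears(m)]
    if private_ok:
        return min(private_ok, key=lambda m: m.cost_per_task)

    public_ok = [m for m in models
                 if not m.private and clears(m) and m.cost_per_task <= task_value]
    if public_ok:
        return min(public_ok, key=lambda m: m.cost_per_task)

    # Nothing clears the bar at a justifiable price: use the best private model.
    private = [m for m in models if m.private]
    if not private:
        raise ValueError("no private model available")
    return max(private, key=lambda m: m.quality.get(task_type, 0.0))


# Hypothetical fleet, for illustration only.
FLEET = [
    Model("workhorse-open", True, 0.002,
          {"summarize": 0.90, "code": 0.80, "novel_reasoning": 0.60}),
    Model("vertical-small", True, 0.0005, {"summarize": 0.92, "code": 0.40}),
    Model("frontier-api", False, 0.05,
          {"summarize": 0.95, "code": 0.93, "novel_reasoning": 0.90}),
]

print(route("summarize", 0.85, 0.01, FLEET).name)
print(route("novel_reasoning", 0.85, 1.00, FLEET).name)
```

The design choice worth noticing is the order of the checks: private models are tried first, and the public model is reached only when no private model clears the bar and the task is worth the price.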

Figure 4 · Anatomy of the new enterprise engine

The moat · Secret sauce, yours alone
Proprietary data · custom fine-tuning · adjusted weights · feedback loops from your own people

Control · Intelligent prompt router
Routes each task in real time to the cheapest model that clears the quality bar; bursts to a public API only when the value justifies it

Commodity · Open-weight + domain models
A general workhorse plus smaller vertical models, free to run beyond compute cost

Commodity · AI-aware container platform
Kubernetes, or a managed layer like OpenShift AI, on your own cloud or metal

This is the architecture. But the architecture is the commodity. The architecture is not the advantage. What you do inside it is.

07 · The differentiator

The secret sauce

Here is the part I want to be unambiguous about, because it is the heart of the whole argument.

If everyone can download the same open-weight model, then the model is not your advantage. It can't be. It's a commodity, like the CPU, like the database, like electricity. The advantage is the secret recipe you build around it.

Think of it as the eleven herbs and spices. The chicken is just chicken; anyone can buy chicken. What you can't buy is the blend. In an enterprise AI stack, the blend is the mix of your proprietary data, your custom fine-tuning, your adjusted weights, your routing logic, and the accumulated feedback loops from your own people doing your own work. That combination exists nowhere else in the world and cannot be replicated by a competitor downloading the same base model. It's defensible because it's yours.

And notice what the frontier labs structurally cannot do here. They are building one brilliant, generic model to serve everyone. By design, they can't bake your eleven herbs and spices into a product they sell to your competitor next door. Generic capability is their business. Specific advantage is yours. The more the base models commoditize, the more the differentiation migrates to the layer only you can build.

08 · The precedent

We have run this play before

If this pattern feels familiar, it should. We lived through it once already, and it was called ERP.

When enterprises deployed systems like SAP, the core platform was, in the end, a commodity. Everyone could buy the same boxes. The differentiation, the real secret sauce, lived in the customization: the ABAP code, the bespoke workflows, the data models, the process logic, the hard-won feedback from the people who ran the business every day. That is why serious companies spent not thousands, not hundreds of thousands, but in many cases millions of dollars customizing their ERP. It is also why two companies running the identical SAP installation could end up with wildly different capabilities. The platform was table stakes. The customization was the moat.

Enterprise AI is the same play, one layer up. The open-weight model is the commodity SAP. The fine-tuning, the proprietary data, the routing, the workflow integration, the feedback loops: that is the new ABAP.

09 · The conclusion

The new enterprise engine

For thirty years, ERP was the beating heart of the enterprise. It was the system of record, the place where the business's truth lived, the engine everything else ran against. That era is ending.

1995–2025 · ERP · The system of record
Stores what happened · Differentiated by ABAP & customization · System of truth · Runs the back office

2026 → · The Enterprise Engine · The system of intelligence
Decides, acts & improves · Differentiated by data, weights & routing · System of advantage · Runs the whole business, privately

The new beating heart of the enterprise won't be a system of record. It will be a system of intelligence. A privately run, continuously fine-tuned stack of models, fed by proprietary data, orchestrated by intelligent routing, executing work across the business. It does not merely store what happened. It decides, acts, and improves. And critically, it runs privately, inside your walls, on your data, under your control, for all the reasons of cost, data sovereignty, and differentiation laid out above.

ERP told you what your business did. The new enterprise engine will help run what your business does next.

What I'm telling you

Economics always wins

So let me tell you what I told you.

The frontier labs are a marvel, and they are losing tens of billions of dollars a year to stay that way. That bill is coming due, and it will be paid by the customer. Which means the price of renting intelligence is under pressure to rise, not fall, exactly as enterprise consumption explodes. At the same time, open-weight models are arriving three to four months behind the frontier and free to run. China has already proven that good-enough intelligence can be built and given away for a rounding error of the Western price. And no enterprise wants to hemorrhage its most valuable asset, its data, into a public model that flattens it into the competition.

Add those forces together and the conclusion isn't a maybe. It's an inevitability. Enterprises will run private AI. They will build their advantage in the secret recipe: the data, the fine-tuning, the weights, the routing, the feedback loops. Exactly as they once built it in ABAP. And that private, intelligent, continuously improving stack will become the new engine of the business, displacing ERP as its beating heart.

This was never going to be decided by which model writes the prettiest slide. It was always going to be decided by unit economics at scale.

Because economics always wins.

Sources & Notes