30+ sources. Zero spin.
Cross-referenced, unbiased news. Both sides of every story.
Model Routing Emerges as Corporate America's Answer to Runaway AI Bills — and It Threatens OpenAI and Anthropic's Revenue Models

Since our earlier coverage confirmed that Uber burned its entire 2026 AI coding budget by April and Cisco overran its own targets with a projected $900 million annual tab for 90,000 employees, the corporate AI spending crisis has moved from shock to strategy.
The new strategy is called model routing — a key development in enterprise AI that could reshape how companies manage AI costs.
What Model Routing Actually Is
The concept is straightforward. Instead of firing every query — from 'summarize this email' to 'redesign our entire database architecture' — through GPT-5.1 or Claude Opus 4.5, you match the task to the cheapest model capable of handling it.
Scott Wu, CEO of Cognition (makers of the coding agent Devin), told CNBC that for routine boilerplate work, companies can achieve five to ten times better cost efficiency by routing to cheaper alternatives. Arvind Jain, CEO of enterprise AI company Glean, estimates that roughly 95% of enterprise AI usage today still runs on the most expensive frontier models — even for tasks a $5-per-million-token model could handle just fine.
Wu made the point bluntly: ask any AI model, expensive or cheap, to name the third U.S. president. Every single one says Thomas Jefferson. You're paying GPT-5.1 rates for a Wikipedia lookup.
The Numbers That Should Make Every CFO Angry
Cisco Chief Product Officer Jeetu Patel laid out the math to CNBC: at roughly $200 in token usage per employee per week, that's $10,000 per person per year. For a 90,000-employee company, that's $900 million annually. Patel confirmed Cisco came in well over budget and has had to reallocate resources, with 30,000 engineers now building products largely written with AI assistance.
Priceline's situation is worse. Senior Director of IT Finance Chris Reed told TechCrunch that a routine Cursor contract renewal came back 4-5x more expensive than expected. Reed's description of the dynamic was characteristically blunt: 'It's like the crack-cocaine epidemic. They let you try it to get you hooked on it, and now you're kind of beholden to it.' Priceline has responded by placing token limits on certain employee groups.
One unnamed company reportedly ended up with a $500 million Claude bill after simply forgetting to set usage limits for employees, according to TechCrunch.
What's Actually Driving the Cost Surge
Per-token prices have actually fallen. This isn't a pricing gouge story. The crisis is driven by consumption volume, not unit cost.
New model releases in November — Anthropic's Claude Opus 4.5, OpenAI's GPT-5.1, and Google's Gemini 3 Pro — brought major improvements to agentic tools. Agentic AI systems don't just answer one question; they chain together dozens or hundreds of queries to complete a task autonomously. Each step burns tokens. The better the agents got, the more companies deployed them. The more they deployed them, the more tokens got consumed — without anyone watching the meter.
Alexander Embiricos, OpenAI's head of enterprise, told TechCrunch at an event in New York City this week that the corporate conversation has completely flipped. Six months ago, he said, every client wanted to know what the technology could do. Now they want to know: 'What visibility do you have? What auditability do you have? What token controls do you have?'
A New Standards Body Enters the Picture
The Linux Foundation announced plans this week for the Tokenomics Foundation, a new standards body explicitly designed to bring the same cost discipline to AI tokens that the FinOps movement brought to cloud computing.
J.R. Storment, executive director of the FinOps Foundation (itself a Linux Foundation project), told TechCrunch the inflection point hit hard: 'In April and May, I started hearing from companies: Oh my god, we are 3x over our entire 2026 token budget and it's only April. We started hearing existential crises, and the whole conversation shifted from tokenmaxxing and go fast to we need guardrails, how do we control this?'
This is the cloud cost management story playing out again — except faster and with higher stakes.
Revenue Pressure on OpenAI and Anthropic
Both companies' valuations are built on an assumption: enterprises will keep routing maximum volume through maximum-cost frontier models indefinitely. Model routing challenges that assumption directly.
If 95% of enterprise usage is currently on frontier models but only 5-10% of tasks actually require frontier models, the addressable market for GPT-5.1 and Claude Opus 4.5 at premium prices could shrink significantly as companies optimize spending.
Microsoft already revoked its developers' Claude Code licenses months after rolling them out, according to TechCrunch. That's a significant signal — the largest software company on earth decided the ROI math doesn't work.
What This Means for Regular People
If you work at any company deploying AI tools, your access is about to get tiered — and probably more restricted. The era of 'use the best model for everything' is over at most large enterprises. Expect usage caps, routing policies, and a lot more questions from finance about what exactly your team is doing with AI.
For taxpayers: the federal government is also deploying AI at scale. Equivalent scrutiny is not being applied to federal token consumption. Nobody is asking what federal agencies are paying per query or whether they've set any usage limits whatsoever. Someone should start asking.