Coffee’s for Closers

There’s a line an enterprise executive gave The Deep View this week at the Databricks Data + AI Summit, and I haven’t been able to put it down. “Last year the executives were letting every flower bloom. Now they’re coming in with a lawnmower.” He was describing the morning the inference invoice landed on his CFO’s desk. Another exec in the same room: “The coding agents have run our token bills out of control.” These are not dramatic people. They run finance and operations at companies you’ve heard of. And they are, by their own account, panicking.

I want to sit with that, because it’s the whole issue.

For two years the story the industry told itself was a capability story. Whose model is smartest. Who topped which benchmark this week. Which lab shipped the major version bump. We played that game too — guilty, repeatedly. And the entire time, underneath the leaderboard chatter, a second clock was running: the meter. Every one of those clever agent demos was burning tokens, and somebody was going to get the bill. This week the bill arrived, and it arrived at the one desk in the building where nobody grades on a benchmark. The CFO doesn’t care that your model is state of the art. The CFO cares that the line item tripled and nobody can explain why.

Ali Ghodsi, who runs Databricks, said the quiet part into a microphone at his own keynote. The current spending, he said, is “completely unsustainable.” And the number one question he now gets from customers is not “which model is best.” It’s “how do we curb the cost but still invest in AI.” When the CEO of a company whose entire business is selling you more compute stands on his own stage and tells you the spending can’t continue, that’s not a vendor managing expectations. That’s the market turning a corner in public.

We Wrote This Movie Eight Days Ago

I’m going to do the unseemly thing and point at our own receipt, because it matters for what comes next.

Eight days ago, June 11, we ran a column called Buy Wins, Not Players. The argument: the nerds would spend the summer fighting over the last ten percent of model quality, while the allocators who learned to meter and route quietly took the season. We borrowed it from Billy Beane and the 2002 A’s — buy wins, not players, because the market overpays for the thing that looks impressive and underpays for the thing that actually scores. We said the public benchmark had died as a buying tool and that your defense was to keep your own box score, because nobody credible was keeping it for you.

I’m not running a victory lap. Victory laps are cheap and they curdle fast. I’m pointing at the receipt because the most expensive lesson in this business is the one you learn a quarter late, and this week the people who waved every flower through got the invoice. When the CEO of a $130-billion company independently lands on the exact thing you wrote a week earlier, that’s not us being clever. That’s the market arriving. And the useful question is never “were we right.” It’s “okay, we were early, so what do you do on Monday.”

Here’s what you do. But first, look at who’s getting rich while everyone else panics, because that tells you where to stand.

The Toll-Collector Gets Paid Either Way

Databricks just crossed $6.9 billion in revenue, growing eighty percent in a year. Its lead over Snowflake — the other big data shop — was $490 million back in March. Today it’s $1.6 billion. The gap more than tripled in a single quarter. Tomasz Tunguz, the VC, has a name for companies positioned like this: the “first derivative of inference.” They don’t sell you the model. They sit in the path the tokens travel and clip a fee off every query that goes by. Databricks’ AI products alone now run at $1.7 billion annualized, a quarter of the whole company, growing faster than anything else it does.

Read that against the panic and the shape of the thing snaps into focus. The same token bill that’s wrecking the buyer is the invoice making the toll-collector rich. It’s the oldest pattern in any boom. In 1849 the men who dug for gold mostly went home broke. The men who got rich sold them shovels, sold them denim, sold them eggs at a dollar apiece — Sam Brannan bought every pan and pick in San Francisco and resold them at a markup before he ever touched a river. Levi Strauss never panned an ounce. The picks-and-shovels crowd doesn’t care whether you strike gold. They get paid on the digging. AI in 2026 has its own picks-and-shovels layer, and Databricks just printed the proof.

So you have two choices, and only two. Be the toll-collector, or get disciplined about the toll. Most of you reading this aren’t going to become the toll road. That leaves discipline.

Discipline Looks Like a Routing Map

Stop firing every query at a frontier model in the cloud. That’s the whole game, and almost nobody is playing it on purpose.

Using a top-tier frontier model to answer a customer FAQ is cutting a daisy with a chainsaw. It works. It is also insane on a unit-cost basis, and at scale it’s the thing blowing up the bill. The fix is unglamorous and it is sitting right there: route the eighty percent of dull, high-volume, low-stakes work — the ticket tagging, the doc summarizing, the FAQ answering, the first-draft boilerplate — to a cheap open model. Run it on your own hardware if the volume justifies the buildout. Then save the frontier, with its premium per-token price, for the twenty percent of work where the extra intelligence actually changes the outcome and earns its keep.
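If you want to see that split as arithmetic, here is a minimal sketch, not anyone’s production router. Everything in it is an assumption made for illustration: the two lane names, the per-million-token prices (placeholders, not vendor quotes), the task labels, the “contract_review” workload, and the monthly token volumes. The only thing it takes from the argument above is the shape: dull, high-volume work goes to the cheap lane, and the frontier is held back for the rest.

```python
from dataclasses import dataclass

# Hypothetical prices in dollars per million tokens. Placeholders, not vendor quotes.
PRICE_PER_MTOK = {"cheap_open": 0.50, "frontier": 15.00}

# Task labels for the dull, high-volume, low-stakes work (labels are assumptions).
LOW_STAKES = {"ticket_tagging", "doc_summary", "faq_answer", "boilerplate_draft"}


@dataclass
class Workload:
    task_type: str
    tokens: int  # tokens consumed per month


def route(task_type: str) -> str:
    """Send low-stakes task types to the cheap lane, everything else to the frontier."""
    return "cheap_open" if task_type in LOW_STAKES else "frontier"


def monthly_cost(workloads, router=route) -> float:
    """Total dollars per month under the given routing function."""
    return sum(
        w.tokens / 1_000_000 * PRICE_PER_MTOK[router(w.task_type)]
        for w in workloads
    )


# Illustrative volumes: 80 percent low-stakes, 20 percent high-stakes.
workloads = [
    Workload("ticket_tagging", 400_000_000),
    Workload("faq_answer", 400_000_000),
    Workload("contract_review", 200_000_000),  # hypothetical high-stakes task
]

# Baseline: fire every query at the frontier model.
all_frontier = monthly_cost(workloads, router=lambda _: "frontier")
# Routed: apply the map above.
routed = monthly_cost(workloads)

print(f"all frontier: ${all_frontier:,.0f}  routed: ${routed:,.0f}")
```

Swap in your own prices and volumes before drawing any conclusion from the totals. The point of the sketch is that the routing rule is a few lines of code, and that the savings depend almost entirely on how much of your volume honestly belongs in the low-stakes set.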

Find your superstars. Bench everyone else. And here’s the part that makes the discipline easy this year where it was hard last year: the bench got absurdly cheap. The open models collapsing the price floor are good now — genuinely good, often Chinese, a fraction of the cost, and improving every quarter. The gap between “the best model in the world” and “a free model that’s plenty good enough for tagging tickets” has never mattered less for the workloads that make up most of your volume.

You don’t have to take my word for any of this anymore, which is the real news in it. Microsoft made the OpenAI model optional in Copilot this week. Optional. The company that bet the firm on OpenAI, that wired GPT into every product it ships, quietly turned the default into a choice and started testing a fine-tuned version of DeepSeek for the cheap lane. The largest software company on earth just shipped the routing map as a product feature. When your biggest distributor stops treating your model as the default, you have your answer about whether the model was ever the moat. It wasn’t. The routing was.

One Warning About Where You Swing It

Now the turn, because there’s a way to take this thesis and hurt yourself with it.

The ruthlessness that’s correct for your token budget is a trap when you point it at your people. Gartner came out this week with a prediction worth taping to your monitor: by 2027, half of the companies that replaced their customer-service teams with AI are expected to hire the humans back. Not because the AI failed in some dramatic way. Because the executives confused two things that live on different lines of the P&L — cutting payroll and adding value. They ran the headcount down, watched the cost line drop, called it strategy, and then discovered that “cheaper” and “better” were never the same word. The customers noticed. The quality dropped. The rehiring starts.

So hold both ideas at once, which is the entire discipline. Route your tokens like Blake at the chalkboard — coffee’s for closers, the frontier is for the workloads that close, everything else rides the cheap bench, no sentiment. Then walk into the conversation about your team and throw that exact instinct out the window, because the company that fires its way to a quarter of margin is the company rehiring at a premium in eighteen months. The agents make your best people ten times more productive. That’s an argument for finding your superstars and protecting them, not for clearing the floor.

The model isn’t the moat. We’ve been saying it for a month, and this week Satya Nadella’s product team said it for us by making OpenAI optional, and Ali Ghodsi said it for us by calling the spend unsustainable, and the Databricks-Snowflake gap said it loudest of all by tripling in ninety days while everyone argued about benchmarks. The moat was never the smartest model. It’s knowing which of your problems deserve the expensive answer.

Put the coffee down. Figure out which of your workloads are actually closing. Route accordingly — and don’t fire the salesmen who do.

The full Signal/Noise, with the routing map and the eight questions that tell you which workloads earn a frontier token, is the running argument we’ve been building since June 11. This is where it goes next.
