Signal/Noise
2025-12-05
While AI companies race to build bigger models and grab headlines with trillion-dollar valuations, the real action is happening in the unglamorous business of making AI actually work reliably at scale. The gap between AI demos and production reality is creating a hidden infrastructure play that will determine which companies survive the inevitable consolidation.
The Great AI Reality Check: When Silicon Valley Dreams Meet Production Nightmares
Beneath the venture capital euphoria and billion-dollar AI startups lies an uncomfortable truth: most AI systems are brittle, unreliable, and nowhere near production-ready. Anthropic’s internal research reveals that even their own engineers can only “fully delegate” 0-20% of their work to Claude, despite claiming massive productivity gains. Meanwhile, coding agents—supposedly the poster child for AI automation—are failing spectacularly when faced with real-world complexity. They break when context windows overflow, fumble basic refactoring, and lack the operational awareness to handle production environments. This isn’t a temporary growing pain; it’s a fundamental architecture problem. The AI industry has optimized for demo-ability over deployability, creating systems that wow in controlled settings but crumble under real-world pressure. The companies that recognize this gap and build boring, reliable infrastructure will capture disproportionate value as the market matures. Look for businesses focused on data quality, model reliability, and operational monitoring—the plumbing that makes AI actually work.
The Data Gold Rush: How Training Data Became the New Oil (And Why It’s Getting Dirty)
The AI training data market has exploded from virtually nothing to a multi-billion dollar industry, with companies like Micro1 crossing $100M ARR in eight months by connecting domain experts with AI labs hungry for high-quality human feedback. But this gold rush is creating its own problems. Academic researchers are warning of a “slop problem”—low-quality, AI-generated content polluting training datasets and degrading model performance. Meanwhile, the race for specialized human trainers has created a new gig economy where Harvard professors earn $100/hour grading AI outputs. This isn’t sustainable. As models become more capable, the bar for useful human feedback rises exponentially. Companies are already struggling to find experts who can meaningfully improve frontier models. The winning strategy isn’t just accumulating more data—it’s building systems that can identify and filter high-quality training signals while maintaining data integrity at scale. The firms that solve this curation problem will control the chokepoint between raw human expertise and AI capability.
The Platform Wars Are Over Before They Started
While OpenAI panics about ChatGPT’s “code red” competitive situation and races to build AI agents, the real platform battle is being won by the infrastructure layer. Nvidia’s position remains unassailable not because of GPU performance, but because they control the entire stack from silicon to software. Their CUDA ecosystem creates switching costs that make even trillion-dollar competitors think twice about alternatives. Meanwhile, Google’s Gemini 3 launch signals a different strategy: embedding AI so deeply into existing workflows that users never have to choose a “primary” AI assistant. This isn’t about building the best chatbot; it’s about becoming invisible infrastructure. Meta’s poaching of Apple’s top designers reveals another angle—the winners will be companies that make AI feel like a natural extension of existing tools rather than a separate application. The consumer AI platform war was decided before it began: the platforms that already own distribution (Google, Apple, Microsoft) will win by making AI a feature, not a product.
Questions
- If AI coding agents can’t handle production complexity, what does this mean for the $7 trillion infrastructure buildout everyone is betting on?
- When training data quality becomes the limiting factor, do we end up with a few AI monopolies controlling the best datasets?
- Is the current AI bubble actually two bubbles—one for capabilities that will deflate, and another for infrastructure that will grow?