Models as Chips: The Infrastructure Layer of AI (And Why You Shouldn't Build There)
Foundation models are commoditizing faster than CPUs did. Learn why competing at the model layer is a losing strategy and where to focus instead.
In 1997, Intel's Pentium II processor cost $636. Today, a chip with 10,000 times the performance costs roughly the same. That's not a technology story—it's an economics story. And it's happening again, right now, with AI models.
When Satya Nadella said "any advantage in model quality disappears fast, prices collapse," he wasn't making a prediction. He was describing what's already happened. GPT-4-class pricing has fallen by roughly 90% since launch. Claude's cost per token is a fraction of what it was at launch. Gemini is practically giving away inference for high-volume customers.
This isn't a race to the bottom—it's a race to commodity. And if you're building at the model layer, you're about to learn a very expensive lesson.
The Commoditization Curve
The Pattern We've Seen Before
Every computing platform follows the same arc:
- Innovation Phase: A breakthrough creates massive capability gaps. Early movers charge premium prices. (Think: Intel's 386, NVIDIA's CUDA, OpenAI's GPT-3)
- Competition Phase: Others catch up. The gap narrows. Prices start falling. (AMD's Ryzen, TPUs, Claude and Gemini matching GPT-4)
- Commodity Phase: Performance differences become marginal. Competition shifts to price. Margins collapse. (x86 chips today, cloud compute, and soon—foundation models)
Foundation models are already in the competition phase. Most developers can't tell the difference between GPT-4, Claude 3.5, and Gemini Ultra for 90% of tasks. The remaining 10% won't justify a 10x price premium for long.
The Numbers Tell the Story
Consider the pricing trajectory for GPT-4-class models:
| Date | Model | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
|---|---|---|---|
| Mar 2023 | GPT-4 | $30.00 | $60.00 |
| Nov 2023 | GPT-4 Turbo | $10.00 | $30.00 |
| May 2024 | GPT-4o | $5.00 | $15.00 |
| Jan 2025 | GPT-4o | $2.50 | $10.00 |
That's a reduction of more than 90% on input tokens and over 80% on output tokens in under two years. And these are list prices—volume customers are getting 50-80% discounts on top of that.
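A quick back-of-the-envelope calculation, using only the list prices and dates from the table above, shows how fast that decline compounds:

```python
# Rough annualized decline for GPT-4-class input-token prices, using the list
# prices and dates from the table above (a back-of-the-envelope sketch only).
from datetime import date

start_price, end_price = 30.00, 2.50                    # $ per 1M input tokens
years = (date(2025, 1, 1) - date(2023, 3, 1)).days / 365.25

total_drop = 1 - end_price / start_price                        # total reduction
annual_decline = 1 - (end_price / start_price) ** (1 / years)   # compounded per year

print(f"Total drop: {total_drop:.0%}, annualized: ~{annual_decline:.0%} per year")
# -> Total drop: 92%, annualized: ~74% per year
```

A ~74% compounded annual price decline is steeper than anything the chip industry ever sustained.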
The pattern holds across providers. Anthropic's Claude pricing has followed a similar trajectory. Google is aggressively undercutting both. Amazon's Bedrock is turning models into a price war. Open-source alternatives like Llama and Mistral are pushing the floor toward zero.
Why This Is Inevitable
The commoditization of models isn't a bug—it's a feature of how this technology works.
Training costs are front-loaded. Training GPT-4 cost an estimated $100 million. Training GPT-5 might cost $1 billion. But once trained, the marginal cost of inference is just compute. As competition increases, inference pricing races toward marginal cost.
Improvements are incremental. GPT-3 to GPT-4 was a leap. GPT-4 to GPT-5 will be an improvement. Each generation delivers diminishing returns while costing exponentially more to develop. The innovation premium shrinks.
Knowledge diffuses. Techniques that were novel in 2023—RLHF, chain-of-thought, constitutional AI—are now table stakes. Every new lab starts with what took OpenAI years to discover. The gap between leaders and followers keeps shrinking.
Hardware gets cheaper. NVIDIA's H100 margins won't last forever. AMD, Intel, and custom silicon (TPUs, Trainium) are all pushing inference costs down. When inference is cheap, model pricing power evaporates.
The Trap of Model-Layer Thinking
"We'll Fine-Tune Our Way to Differentiation"
This is the most common trap. Teams spend months fine-tuning foundation models on proprietary data, believing this creates competitive advantage.
The math doesn't work.
Fine-tuning costs include:
- Data preparation: 100-1000 hours of expert time
- Training compute: $10K-$100K per iteration
- Evaluation and iteration: 3-6 month cycles
- Maintenance: Re-training with each base model update
Meanwhile, the base model you fine-tuned is improving faster than your fine-tune. Every 6 months, a new version releases that's better at your task than your specialized version was. You're on a treadmill, spending resources to stay in place.
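To make the treadmill concrete, here is a rough cost sketch. Every figure below is an illustrative assumption drawn from the ranges listed above (expert rates, cycle counts, and eval budgets will vary widely by team):

```python
# Back-of-the-envelope comparison: annual cost of maintaining a fine-tune vs.
# prompting and retrieval against the vendor's current model.
# All figures are illustrative assumptions based on the ranges cited above.

EXPERT_HOURLY_RATE = 150                               # $/hour (assumption)

fine_tune_per_cycle = {
    "data_preparation": 500 * EXPERT_HOURLY_RATE,      # ~500 expert hours
    "training_compute": 50_000,                        # midpoint of $10K-$100K
    "evaluation_and_iteration": 25_000,                # eval overhead per cycle
}
cycles_per_year = 2            # re-train with each major base-model update

prompting_per_year = {
    "prompt_and_retrieval_engineering": 200 * EXPERT_HOURLY_RATE,
    "evaluation": 15_000,
}

fine_tune_annual = sum(fine_tune_per_cycle.values()) * cycles_per_year
prompting_annual = sum(prompting_per_year.values())

print(f"Fine-tuning treadmill: ${fine_tune_annual:,.0f}/year")
print(f"Prompting + retrieval: ${prompting_annual:,.0f}/year")
print(f"Ratio: ~{fine_tune_annual / prompting_annual:.0f}x")
# -> roughly $300K/year vs. $45K/year under these assumptions, a ~7x gap
```

Change any of these inputs and the ratio moves, but the structure of the problem doesn't: the fine-tune cost recurs with every base-model release, while the prompting investment largely carries over.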
"We'll Build a Smaller, Specialized Model"
Another common trap: building domain-specific models that are smaller and cheaper to run.
This worked in 2022. It doesn't work in 2025.
The economics have flipped. When GPT-4 inference cost $60 per million output tokens, building a smaller model that cost $1 per million made sense. When GPT-4-class inference costs $2-5 per million, the 80% cost savings on a model that's 50% worse isn't worth the engineering investment.
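A simple break-even check makes the point. The numbers below are illustrative assumptions (a $500K/year engineering budget and the ~80% per-token savings described above), not measured costs:

```python
# Break-even sketch for the "smaller, specialized model" route.
# All figures are illustrative assumptions, not measured costs.
engineering_cost_per_year = 500_000      # build + maintain the specialized model
frontier_price = 3.00                    # $ per 1M output tokens (GPT-4-class)
specialized_price = 0.60                 # $ per 1M output tokens (80% cheaper)

savings_per_million_tokens = frontier_price - specialized_price
breakeven_millions = engineering_cost_per_year / savings_per_million_tokens

print(f"Break-even: ~{breakeven_millions:,.0f}M output tokens/year "
      f"(about {breakeven_millions / 1_000:.0f} billion tokens)")
# -> ~208 billion output tokens per year before the savings cover the build
```

Unless you are serving hundreds of billions of tokens a year, the savings never repay the engineering investment, and that's before accounting for the quality gap.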
Worse, general-purpose models are getting better at specialized tasks faster than specialized models can improve. Medical AI trained on PubMed is being outperformed by GPT-4 with good prompting. Legal analysis models are losing to Claude with system prompts. The specialists are losing to well-orchestrated generalists.
"We'll Train From Scratch on Proprietary Data"
This is the most expensive trap. Teams with unique data assets—enterprise data, specialized corpora, proprietary datasets—believe they should train custom models.
Consider the economics:
- Training a GPT-4-class model: $50-100 million
- Training a specialized model from scratch: $5-10 million (minimum)
- Fine-tuning an existing model: $50K-500K
- Using skills with retrieval augmentation: $5K-50K
For the same investment as training one custom model, you could build and iterate on 100-1000 skill implementations. The iteration speed difference is 100x. The learning rate difference is even larger.
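The arithmetic behind that claim, using the low end of the from-scratch estimate above:

```python
# How many retrieval-augmented skills could the budget for one from-scratch
# specialized model fund? Uses the cost ranges listed above.
specialized_model_cost = 5_000_000               # low end of the $5-10M estimate
skill_cost_low, skill_cost_high = 5_000, 50_000  # per skill, with retrieval

print(f"{specialized_model_cost // skill_cost_high}"
      f"-{specialized_model_cost // skill_cost_low} skills "
      f"for the price of one specialized model")
# -> 100-1000 skills for the price of one specialized model
```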
Where Model Layer Companies Will Survive
This isn't to say the model layer is worthless—just that it's not where most teams should compete. A few types of organizations can thrive here:
The Hyperscalers
OpenAI, Anthropic, Google, Meta, and maybe a few others can compete at the model layer because they have:
- Billions in capital
- Access to unique training data (the internet, user interactions, proprietary corpora)
- Distribution advantages (integrations with Windows, Android, AWS, etc.)
- Talent concentration (the best researchers want to work on frontier models)
If you're not in this category, you're competing against organizations that can outspend you 1000:1.
The Open-Source Plays
Meta (with Llama), Mistral, and similar open-source players compete on different economics. They monetize through:
- Enterprise support and customization
- Cloud infrastructure (running models on their clouds)
- Strategic positioning (commoditizing competitors' margins)
These are viable businesses, but they're infrastructure businesses with infrastructure margins.
The Hardware-Integrated
Companies building AI accelerators (NVIDIA, AMD, custom silicon vendors) have natural adjacency to the model layer. They can offer optimized models for their hardware, creating switching costs. But this is a hardware business, not an AI business.
The Data Monopolists
Organizations with truly unique data—Bloomberg for financial data, Elsevier for scientific literature, Epic for medical records—can potentially create defensible model-layer businesses. But even here, the value increasingly comes from the data access layer (skills and retrieval) rather than the model layer.
The Right Mental Model
Instead of thinking about models as products, think about them as utilities.
You don't build your competitive advantage on having a better electrical grid connection than your competitors. You plug into the grid and build value on top.
Similarly, you don't build AI competitive advantage on having a better foundation model. You plug into the best available models and build value at higher layers.
The Utilities Analogy
Consider how successful software companies treat infrastructure:
| Infrastructure | How Leaders Treat It | Wrong Approach |
|---|---|---|
| Cloud compute | Use AWS/GCP/Azure, don't build data centers | Build private data centers "for control" |
| Databases | Use Postgres/MongoDB, optimize queries | Build custom database engines |
| Authentication | Use Auth0/Clerk, focus on features | Build custom auth "for security" |
| AI Models | Use GPT-4/Claude, build skills | Train custom models "for differentiation" |
The pattern is consistent: successful companies treat commoditized infrastructure as utilities and focus engineering on differentiated value creation.
What to Do Instead
If competing at the model layer is a losing strategy, where should you focus?
Invest in the Agent Layer
Agents—the orchestration systems that manage context, tools, and user experience—are the new operating systems. Understanding how to build effective agents, or how to build for agent platforms, is the highest-leverage skill in AI engineering today.
Learn how agents:
- Manage context windows effectively
- Orchestrate tool calls and workflows
- Handle errors and edge cases gracefully
- Integrate with existing systems
This knowledge doesn't depreciate when models improve. It becomes more valuable.
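To make "agent layer" concrete, here is a minimal, model-agnostic agent loop sketch covering the items above: bounded context, tool orchestration, and error handling. The `llm` callable and tool registry are hypothetical stand-ins, not any specific framework's API:

```python
# Minimal agent loop sketch: manage context, dispatch tool calls, handle errors.
# `llm` is any function that takes a message list and returns a dict like
# {"content": str, "tool_call": {"name": str, "args": dict} | None}.
from typing import Callable

def run_agent(llm: Callable[[list[dict]], dict],
              tools: dict[str, Callable[[dict], str]],
              user_request: str,
              max_steps: int = 10,
              max_context_messages: int = 40) -> str:
    messages = [{"role": "user", "content": user_request}]

    for _ in range(max_steps):
        # Context management: keep the window bounded (naive truncation here;
        # real agents summarize or retrieve instead of simply dropping turns).
        if len(messages) > max_context_messages:
            messages = [messages[0]] + messages[-(max_context_messages - 1):]

        reply = llm(messages)
        tool_call = reply.get("tool_call")
        if tool_call is None:
            return reply["content"]              # the model answered directly

        name, args = tool_call["name"], tool_call.get("args", {})
        try:
            result = tools[name](args)           # orchestrate the tool call
        except KeyError:
            result = f"Error: unknown tool '{name}'"
        except Exception as exc:                 # surface failures to the model
            result = f"Error: {name} failed with {exc!r}"

        messages.append({"role": "assistant", "content": reply.get("content", "")})
        messages.append({"role": "tool", "name": name, "content": result})

    return "Stopped: step budget exhausted without a final answer."
```

Nothing in this loop depends on which model sits behind `llm`, which is exactly why the orchestration knowledge outlives any particular model generation.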
Build at the Skill Layer
Skills—modular capabilities that plug into agents—are where defensible value is created. A skill that automates expense report processing doesn't care whether it's powered by GPT-4 or Claude or Gemini. It cares about:
- Understanding the domain (expense policies, approval workflows, fraud patterns)
- Integrating with systems (expense management software, accounting systems)
- Delivering reliable results (correctly categorized expenses, flagged anomalies)
This is where your domain expertise matters. This is where iteration creates compounding advantages. This is where you should build.
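As a sketch of what "skill" means in practice, here is a hypothetical expense-report skill. The names (`ExpenseReportSkill`, `submit_to_accounting`) are invented for illustration; the point is that the domain knowledge and system integration live in your code, while the model is an injected, swappable dependency:

```python
# Sketch of a model-agnostic skill: domain logic and integrations in plain code,
# the model injected as a dependency. All names here are hypothetical.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ExpenseReportSkill:
    llm: Callable[[str], str]                      # any provider's completion call
    policy_text: str                               # domain knowledge, not weights
    submit_to_accounting: Callable[[dict], None]   # integration with existing systems

    def process(self, receipt_text: str) -> dict:
        prompt = (
            "Categorize this expense and flag any policy violations.\n"
            f"Policy:\n{self.policy_text}\n\nReceipt:\n{receipt_text}\n"
            "Reply exactly as: category | amount | flags"
        )
        # Naive parse for brevity; a real skill would validate the model output.
        category, amount, flags = [p.strip() for p in self.llm(prompt).split("|")]
        record = {"category": category, "amount": amount, "flags": flags}
        if not flags or flags.lower() == "none":
            self.submit_to_accounting(record)      # deliver into the real workflow
        return record
```

Swap `llm` for a different provider tomorrow and the policy knowledge, the parsing, and the accounting integration (the parts that took real work) are untouched.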
Embrace Model Portability
Design your systems to be model-agnostic. If you've built on GPT-4 today, you should be able to switch to Claude tomorrow or a better model next year without rewriting your application.
Practical implications:
- Use abstraction layers (LangChain, LiteLLM, or custom adapters)
- Don't rely on model-specific features unless necessary
- Test across multiple models regularly
- Build evaluation frameworks that work across providers
Model portability isn't just risk management—it's leverage. When you can switch models easily, you can always use the best price/performance option.
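One way to get that leverage is a thin custom adapter layer. The sketch below defines a single interface and placeholder provider classes; in practice each class would wrap the vendor's SDK, or you'd use an off-the-shelf layer such as LiteLLM instead:

```python
# Minimal adapter-layer sketch: one interface, swappable providers.
# The provider classes are placeholders; wrap the real vendor SDKs inside them.
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, system: str, user: str) -> str: ...

class OpenAIChat:
    def __init__(self, model: str = "gpt-4o"):
        self.model = model
    def complete(self, system: str, user: str) -> str:
        raise NotImplementedError  # wrap the OpenAI SDK call here

class AnthropicChat:
    def __init__(self, model: str = "claude-3-5-sonnet-latest"):
        self.model = model
    def complete(self, system: str, user: str) -> str:
        raise NotImplementedError  # wrap the Anthropic SDK call here

def summarize(model: ChatModel, text: str) -> str:
    # Application code depends only on the ChatModel interface, so switching
    # providers is a one-line change at the call site.
    return model.complete(system="You are a concise summarizer.",
                          user=f"Summarize:\n{text}")

# summary = summarize(OpenAIChat(), report_text)   # or AnthropicChat()
```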
Focus on Data Advantages
If you're going to invest in any model-adjacent capability, invest in data.
Not data for training (that's the model layer trap) but data for retrieval and context. Build:
- High-quality knowledge bases in your domain
- Feedback loops that capture what works and what doesn't
- User interaction data that informs product decisions
- Benchmark datasets that let you evaluate model fitness
This data makes your skills more effective on any model. It's a durable advantage that compounds over time.
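A minimal sketch of what that data asset looks like in code: a domain knowledge base queried at request time, plus a feedback log that becomes evaluation data. Retrieval here is naive keyword overlap purely for illustration; production systems would use embeddings, but the asset itself is model-independent either way:

```python
# Sketch: a domain knowledge base for retrieval-augmented context, plus a
# feedback log that compounds over time. Keyword-overlap scoring is a toy
# stand-in for embedding search; the data asset works with any model.
import json
import time

knowledge_base = [
    {"id": "kb-1", "text": "Expenses over $500 require VP approval."},
    {"id": "kb-2", "text": "Meals are reimbursable up to $75 per day."},
]

def retrieve(query: str, k: int = 2) -> list[str]:
    q = set(query.lower().split())
    scored = sorted(knowledge_base,
                    key=lambda d: len(q & set(d["text"].lower().split())),
                    reverse=True)
    return [d["text"] for d in scored[:k]]

def log_feedback(query: str, answer: str, accepted: bool,
                 path: str = "feedback.jsonl") -> None:
    # Every interaction becomes benchmark data for the next model you evaluate.
    with open(path, "a") as f:
        f.write(json.dumps({"ts": time.time(), "query": query,
                            "answer": answer, "accepted": accepted}) + "\n")

context = "\n".join(retrieve("Can I expense a $90 team dinner?"))
# -> feed `context` to whichever model wins on price/performance this quarter
```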
The Timeline
How long until models are fully commoditized? History suggests 3-5 years for complete commoditization, but functional commoditization—where model choice doesn't matter for most applications—is already here.
2025: Functional Equivalence
For 90% of business applications, GPT-4, Claude 3.5, and Gemini Ultra are interchangeable. Price and reliability, not capability, drive choice.
2026: Open-Source Parity
Open-source models reach GPT-4-class performance. Self-hosting becomes viable for enterprises. Cloud pricing continues collapsing.
2027: The Utility Phase
Foundation models are priced like cloud compute—pennies per request, commodity margins. Differentiation is impossible at this layer.
2028 and Beyond
The model layer becomes invisible infrastructure, like TCP/IP or DNS. Nobody talks about their model choice because it doesn't matter.
Conclusion
The message is simple: don't compete at the model layer unless you're one of the handful of companies in the world with the resources to do so.
Models are chips. They're essential infrastructure, but infrastructure is not where software value is captured. Intel doesn't capture the value of every Windows application. NVIDIA doesn't capture the value of every CUDA-accelerated workload. OpenAI won't capture the value of every GPT-4-powered skill.
The value flows to higher layers—to agents that orchestrate, and to skills that solve real problems.
If you're a developer, an entrepreneur, or a business leader trying to figure out where to invest in AI, this is your answer: not here. Look up the stack. Look at agents. Look at skills. Look at where models are consumed, not where they're produced.
That's where the opportunity is. That's where you should build.
Next in this series: Agents as Operating Systems: The Orchestration Layer Developers Need to Master