Models as Chips: The Infrastructure Layer of AI (And Why You Shouldn't Build There)
Foundation models are commoditizing faster than CPUs did. Learn why competing at the model layer is a losing strategy and where to focus instead.
In 1997, Intel's Pentium II processor cost $636. Today, a chip with 10,000 times the performance costs roughly the same. That's not a technology story—it's an economics story. And it's happening again, right now, with AI models.
When Satya Nadella said "any advantage in model quality disappears fast, prices collapse," he wasn't making a prediction. He was describing what's already happened. GPT-4-class pricing has fallen by roughly 90% since launch. Claude's cost per token is a fraction of what it was at launch. Gemini is practically giving away inference for high-volume customers.
This isn't a race to the bottom—it's a race to commodity. And if you're building at the model layer, you're about to learn a very expensive lesson.
The Commoditization Curve
The Pattern We've Seen Before
Every computing platform follows the same arc:
- Innovation Phase: A breakthrough creates massive capability gaps. Early movers charge premium prices. (Think: Intel's 386, NVIDIA's CUDA, OpenAI's GPT-3)
- Competition Phase: Others catch up. The gap narrows. Prices start falling. (AMD's Ryzen, TPUs, Claude and Gemini matching GPT-4)
- Commodity Phase: Performance differences become marginal. Competition shifts to price. Margins collapse. (x86 chips today, cloud compute, and soon—foundation models)
Foundation models are already in the competition phase. Most developers can't tell the difference between GPT-4, Claude 3.5, and Gemini Ultra for 90% of tasks. The remaining 10% won't justify a 10x price premium for long.
The Numbers Tell the Story
Consider the pricing trajectory for GPT-4-class models:
| Date | Model | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
|---|---|---|---|
| Mar 2023 | GPT-4 | $30.00 | $60.00 |
| Nov 2023 | GPT-4 Turbo | $10.00 | $30.00 |
| May 2024 | GPT-4o | $5.00 | $15.00 |
| Jan 2025 | GPT-4o | $2.50 | $10.00 |
That's a reduction of more than 90% on input tokens and over 80% on output tokens in under two years. And these are list prices—volume customers are getting 50-80% discounts on top of that.
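A quick back-of-the-envelope calculation, using only the list prices and dates from the table above, shows how fast that decline compounds:

```python
# Rough annualized decline for GPT-4-class input-token prices, using the list
# prices and dates from the table above (a back-of-the-envelope sketch only).
from datetime import date

start_price, end_price = 30.00, 2.50                    # $ per 1M input tokens
years = (date(2025, 1, 1) - date(2023, 3, 1)).days / 365.25

total_drop = 1 - end_price / start_price                        # total reduction
annual_decline = 1 - (end_price / start_price) ** (1 / years)   # compounded per year

print(f"Total drop: {total_drop:.0%}, annualized: ~{annual_decline:.0%} per year")
# -> Total drop: 92%, annualized: ~74% per year
```

A ~74% compounded annual price decline is steeper than anything the chip industry ever sustained.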
The pattern holds across providers. Anthropic's Claude pricing has followed a similar trajectory. Google is aggressively undercutting both. Amazon's Bedrock is turning models into a price war. Open-source alternatives like Llama and Mistral are pushing the floor toward zero.
Why This Is Inevitable
The commoditization of models isn't a bug—it's a feature of how this technology works.
Training costs are front-loaded. Training GPT-4 cost an estimated $100 million. Training GPT-5 might cost $1 billion. But once trained, the marginal cost of inference is just compute. As competition increases, inference pricing races toward marginal cost.
Improvements are incremental. GPT-3 to GPT-4 was a leap. GPT-4 to GPT-5 will be an improvement. Each generation delivers diminishing returns while costing exponentially more to develop. The innovation premium shrinks.
Knowledge diffuses. Techniques that were novel in 2023—RLHF, chain-of-thought, constitutional AI—are now table stakes. Every new lab starts with what took OpenAI years to discover. The gap between leaders and followers keeps shrinking.
Hardware gets cheaper. NVIDIA's H100 margins won't last forever. AMD, Intel, and custom silicon (TPUs, Trainium) are all pushing inference costs down. When inference is cheap, model pricing power evaporates.
The Trap of Model-Layer Thinking
"We'll Fine-Tune Our Way to Differentiation"
This is the most common trap. Teams spend months fine-tuning foundation models on proprietary data, believing this creates competitive advantage.
The math doesn't work.
Fine-tuning costs include:
- Data preparation: 100-1000 hours of expert time
- Training compute: $10K-$100K per iteration
- Evaluation and iteration: 3-6 month cycles
- Maintenance: Re-training with each base model update
Meanwhile, the base model you fine-tuned is improving faster than your fine-tune. Every 6 months, a new version releases that's better at your task than your specialized version was. You're on a treadmill, spending resources to stay in place.
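To make the treadmill concrete, here is a rough cost sketch. Every figure below is an illustrative assumption drawn from the ranges listed above (expert rates, cycle counts, and eval budgets will vary widely by team):

```python
# Back-of-the-envelope comparison: annual cost of maintaining a fine-tune vs.
# prompting and retrieval against the vendor's current model.
# All figures are illustrative assumptions based on the ranges cited above.

EXPERT_HOURLY_RATE = 150                               # $/hour (assumption)

fine_tune_per_cycle = {
    "data_preparation": 500 * EXPERT_HOURLY_RATE,      # ~500 expert hours
    "training_compute": 50_000,                        # midpoint of $10K-$100K
    "evaluation_and_iteration": 25_000,                # eval overhead per cycle
}
cycles_per_year = 2            # re-train with each major base-model update

prompting_per_year = {
    "prompt_and_retrieval_engineering": 200 * EXPERT_HOURLY_RATE,
    "evaluation": 15_000,
}

fine_tune_annual = sum(fine_tune_per_cycle.values()) * cycles_per_year
prompting_annual = sum(prompting_per_year.values())

print(f"Fine-tuning treadmill: ${fine_tune_annual:,.0f}/year")
print(f"Prompting + retrieval: ${prompting_annual:,.0f}/year")
print(f"Ratio: ~{fine_tune_annual / prompting_annual:.0f}x")
# -> roughly $300K/year vs. $45K/year under these assumptions, a ~7x gap
```

Change any of these inputs and the ratio moves, but the structure of the problem doesn't: the fine-tune cost recurs with every base-model release, while the prompting investment largely carries over.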
"We'll Build a Smaller, Specialized Model"
Another common trap: building domain-specific models that are smaller and cheaper to run.
This worked in 2022. It doesn't work in 2025.
The economics have flipped. When GPT-4 inference cost $60 per million output tokens, building a smaller model that cost $1 per million made sense. When GPT-4-class inference costs $2-5 per million, the 80% cost savings on a model that's 50% worse isn't worth the engineering investment.
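A simple break-even check makes the point. The numbers below are illustrative assumptions (a $500K/year engineering budget and the ~80% per-token savings described above), not measured costs:

```python
# Break-even sketch for the "smaller, specialized model" route.
# All figures are illustrative assumptions, not measured costs.
engineering_cost_per_year = 500_000      # build + maintain the specialized model
frontier_price = 3.00                    # $ per 1M output tokens (GPT-4-class)
specialized_price = 0.60                 # $ per 1M output tokens (80% cheaper)

savings_per_million_tokens = frontier_price - specialized_price
breakeven_millions = engineering_cost_per_year / savings_per_million_tokens

print(f"Break-even: ~{breakeven_millions:,.0f}M output tokens/year "
      f"(about {breakeven_millions / 1_000:.0f} billion tokens)")
# -> ~208 billion output tokens per year before the savings cover the build
```

Unless you are serving hundreds of billions of tokens a year, the savings never repay the engineering investment, and that's before accounting for the quality gap.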
Worse, general-purpose models are getting better at specialized tasks faster than specialized models can improve. Medical AI trained on PubMed is being outperformed by GPT-4 with good prompting. Legal analysis models are losing to Claude with system prompts. The specialists are losing to well-orchestrated generalists.
"We'll Train From Scratch on Proprietary Data"
This is the most expensive trap. Teams with unique data assets—enterprise data, specialized corpora, proprietary datasets—believe they should train custom models.
Consider the economics:
- Training a GPT-4-class model: $50-100 million
- Training a specialized model from scratch: $5-10 million (minimum)
- Fine-tuning an existing model: $50K-500K
- Using skills with retrieval augmentation: $5K-50K
For the same investment as training one custom model, you could build and iterate on 100-1000 skill implementations. The iteration speed difference is 100x. The learning rate difference is even larger.
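The arithmetic behind that claim, using the low end of the from-scratch estimate above:

```python
# How many retrieval-augmented skills could the budget for one from-scratch
# specialized model fund? Uses the cost ranges listed above.
specialized_model_cost = 5_000_000               # low end of the $5-10M estimate
skill_cost_low, skill_cost_high = 5_000, 50_000  # per skill, with retrieval

print(f"{specialized_model_cost // skill_cost_high}"
      f"-{specialized_model_cost // skill_cost_low} skills "
      f"for the price of one specialized model")
# -> 100-1000 skills for the price of one specialized model
```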
Where Model Layer Companies Will Survive
This isn't to say the model layer is worthless—just that it's not where most teams should compete. A few types of organizations can thrive here:
The Hyperscalers
OpenAI, Anthropic, Google, Meta, and maybe a few others can compete at the model layer because they have:
- Billions in capital
- Access to unique training data (the internet, user interactions, proprietary corpora)
- Distribution advantages (integrations with Windows, Android, AWS, etc.)
- Talent concentration (the best researchers want to work on frontier models)
If you're not in this category, you're competing against organizations that can outspend you 1000:1.
The Open-Source Plays
Meta (with Llama), Mistral, and similar open-source players compete on different economics. They monetize through:
- Enterprise support and customization
- Cloud infrastructure (running models on their clouds)
- Strategic positioning (commoditizing competitors' margins)
These are viable businesses, but they're infrastructure businesses with infrastructure margins.
The Hardware-Integrated
Companies building AI accelerators (NVIDIA, AMD, custom silicon vendors) have natural adjacency to the model layer. They can offer optimized models for their hardware, creating switching costs. But this is a hardware business, not an AI business.
The Data Monopolists
Organizations with truly unique data—Bloomberg for financial data, Elsevier for scientific literature, Epic for medical records—can potentially create defensible model-layer businesses. But even here, the value increasingly comes from the data access layer (skills and retrieval) rather than the model layer.
The Right Mental Model
Instead of thinking about models as products, think about them as utilities.
You don't build your competitive advantage on having a better electrical grid connection than your competitors. You plug into the grid and build value on top.
Similarly, you don't build AI competitive advantage on having a better foundation model. You plug into the best available models and build value at higher layers.
The Utilities Analogy
Consider how successful software companies treat infrastructure:
| Infrastructure | How Leaders Treat It | Wrong Approach |
|---|---|---|
| Cloud compute | Use AWS/GCP/Azure, don't build data centers | Build private data centers "for control" |
| Databases | Use Postgres/MongoDB, optimize queries | Build custom database engines |
| Authentication | Use Auth0/Clerk, focus on features | Build custom auth "for security" |
| AI Models | Use GPT-4/Claude, build skills | Train custom models "for differentiation" |
The pattern is consistent: successful companies treat commoditized infrastructure as utilities and focus engineering on differentiated value creation.
What to Do Instead
If competing at the model layer is a losing strategy, where should you focus?
Invest in the Agent Layer
Agents—the orchestration systems that manage context, tools, and user experience—are the new operating systems. Understanding how to build effective agents, or how to build for agent platforms, is the highest-leverage skill in AI engineering today.
Learn how agents:
- Manage context windows effectively
- Orchestrate tool calls and workflows
- Handle errors and edge cases gracefully
- Integrate with existing systems
This knowledge doesn't depreciate when models improve. It becomes more valuable.
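To make "agent layer" concrete, here is a minimal, model-agnostic agent loop sketch covering the items above: bounded context, tool orchestration, and error handling. The `llm` callable and tool registry are hypothetical stand-ins, not any specific framework's API:

```python
# Minimal agent loop sketch: manage context, dispatch tool calls, handle errors.
# `llm` is any function that takes a message list and returns a dict like
# {"content": str, "tool_call": {"name": str, "args": dict} | None}.
from typing import Callable

def run_agent(llm: Callable[[list[dict]], dict],
              tools: dict[str, Callable[[dict], str]],
              user_request: str,
              max_steps: int = 10,
              max_context_messages: int = 40) -> str:
    messages = [{"role": "user", "content": user_request}]

    for _ in range(max_steps):
        # Context management: keep the window bounded (naive truncation here;
        # real agents summarize or retrieve instead of simply dropping turns).
        if len(messages) > max_context_messages:
            messages = [messages[0]] + messages[-(max_context_messages - 1):]

        reply = llm(messages)
        tool_call = reply.get("tool_call")
        if tool_call is None:
            return reply["content"]              # the model answered directly

        name, args = tool_call["name"], tool_call.get("args", {})
        try:
            result = tools[name](args)           # orchestrate the tool call
        except KeyError:
            result = f"Error: unknown tool '{name}'"
        except Exception as exc:                 # surface failures to the model
            result = f"Error: {name} failed with {exc!r}"

        messages.append({"role": "assistant", "content": reply.get("content", "")})
        messages.append({"role": "tool", "name": name, "content": result})

    return "Stopped: step budget exhausted without a final answer."
```

Nothing in this loop depends on which model sits behind `llm`, which is exactly why the orchestration knowledge outlives any particular model generation.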
Build at the Skill Layer
Skills—modular capabilities that plug into agents—are where defensible value is created. A skill that automates expense report processing doesn't care whether it's powered by GPT-4 or Claude or Gemini. It cares about:
- Understanding the domain (expense policies, approval workflows, fraud patterns)
- Integrating with systems (expense management software, accounting systems)
- Delivering reliable results (correctly categorized expenses, flagged anomalies)
This is where your domain expertise matters. This is where iteration creates compounding advantages. This is where you should build.
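As a sketch of what "skill" means in practice, here is a hypothetical expense-report skill. The names (`ExpenseReportSkill`, `submit_to_accounting`) are invented for illustration; the point is that the domain knowledge and system integration live in your code, while the model is an injected, swappable dependency:

```python
# Sketch of a model-agnostic skill: domain logic and integrations in plain code,
# the model injected as a dependency. All names here are hypothetical.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ExpenseReportSkill:
    llm: Callable[[str], str]                      # any provider's completion call
    policy_text: str                               # domain knowledge, not weights
    submit_to_accounting: Callable[[dict], None]   # integration with existing systems

    def process(self, receipt_text: str) -> dict:
        prompt = (
            "Categorize this expense and flag any policy violations.\n"
            f"Policy:\n{self.policy_text}\n\nReceipt:\n{receipt_text}\n"
            "Reply exactly as: category | amount | flags"
        )
        # Naive parse for brevity; a real skill would validate the model output.
        category, amount, flags = [p.strip() for p in self.llm(prompt).split("|")]
        record = {"category": category, "amount": amount, "flags": flags}
        if not flags or flags.lower() == "none":
            self.submit_to_accounting(record)      # deliver into the real workflow
        return record
```

Swap `llm` for a different provider tomorrow and the policy knowledge, the parsing, and the accounting integration (the parts that took real work) are untouched.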
Embrace Model Portability
Design your systems to be model-agnostic. If you've built on GPT-4 today, you should be able to switch to Claude tomorrow or a better model next year without rewriting your application.
Practical implications:
- Use abstraction layers (LangChain, LiteLLM, or custom adapters)
- Don't rely on model-specific features unless necessary
- Test across multiple models regularly
- Build evaluation frameworks that work across providers
Model portability isn't just risk management—it's leverage. When you can switch models easily, you can always use the best price/performance option.
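One way to get that leverage is a thin custom adapter layer. The sketch below defines a single interface and placeholder provider classes; in practice each class would wrap the vendor's SDK, or you'd use an off-the-shelf layer such as LiteLLM instead:

```python
# Minimal adapter-layer sketch: one interface, swappable providers.
# The provider classes are placeholders; wrap the real vendor SDKs inside them.
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, system: str, user: str) -> str: ...

class OpenAIChat:
    def __init__(self, model: str = "gpt-4o"):
        self.model = model
    def complete(self, system: str, user: str) -> str:
        raise NotImplementedError  # wrap the OpenAI SDK call here

class AnthropicChat:
    def __init__(self, model: str = "claude-3-5-sonnet-latest"):
        self.model = model
    def complete(self, system: str, user: str) -> str:
        raise NotImplementedError  # wrap the Anthropic SDK call here

def summarize(model: ChatModel, text: str) -> str:
    # Application code depends only on the ChatModel interface, so switching
    # providers is a one-line change at the call site.
    return model.complete(system="You are a concise summarizer.",
                          user=f"Summarize:\n{text}")

# summary = summarize(OpenAIChat(), report_text)   # or AnthropicChat()
```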
Focus on Data Advantages
If you're going to invest in any model-adjacent capability, invest in data.
Not data for training (that's the model layer trap) but data for retrieval and context. Build:
- High-quality knowledge bases in your domain
- Feedback loops that capture what works and what doesn't
- User interaction data that informs product decisions
- Benchmark datasets that let you evaluate model fitness
This data makes your skills more effective on any model. It's a durable advantage that compounds over time.
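A minimal sketch of what that data asset looks like in code: a domain knowledge base queried at request time, plus a feedback log that becomes evaluation data. Retrieval here is naive keyword overlap purely for illustration; production systems would use embeddings, but the asset itself is model-independent either way:

```python
# Sketch: a domain knowledge base for retrieval-augmented context, plus a
# feedback log that compounds over time. Keyword-overlap scoring is a toy
# stand-in for embedding search; the data asset works with any model.
import json
import time

knowledge_base = [
    {"id": "kb-1", "text": "Expenses over $500 require VP approval."},
    {"id": "kb-2", "text": "Meals are reimbursable up to $75 per day."},
]

def retrieve(query: str, k: int = 2) -> list[str]:
    q = set(query.lower().split())
    scored = sorted(knowledge_base,
                    key=lambda d: len(q & set(d["text"].lower().split())),
                    reverse=True)
    return [d["text"] for d in scored[:k]]

def log_feedback(query: str, answer: str, accepted: bool,
                 path: str = "feedback.jsonl") -> None:
    # Every interaction becomes benchmark data for the next model you evaluate.
    with open(path, "a") as f:
        f.write(json.dumps({"ts": time.time(), "query": query,
                            "answer": answer, "accepted": accepted}) + "\n")

context = "\n".join(retrieve("Can I expense a $90 team dinner?"))
# -> feed `context` to whichever model wins on price/performance this quarter
```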
The Timeline
How long until models are fully commoditized? History suggests 3-5 years for complete commoditization, but functional commoditization—where model choice doesn't matter for most applications—is already here.
2025: Functional Equivalence
For 90% of business applications, GPT-4, Claude 3.5, and Gemini Ultra are interchangeable. Price and reliability, not capability, drive choice.
2026: Open-Source Parity
Open-source models reach GPT-4-class performance. Self-hosting becomes viable for enterprises. Cloud pricing continues collapsing.
2027: The Utility Phase
Foundation models are priced like cloud compute—pennies per request, commodity margins. Differentiation is impossible at this layer.
2028 and Beyond
The model layer becomes invisible infrastructure, like TCP/IP or DNS. Nobody talks about their model choice because it doesn't matter.
Conclusion
The message is simple: don't compete at the model layer unless you're one of the handful of companies in the world with the resources to do so.
Models are chips. They're essential infrastructure, but infrastructure is not where software value is captured. Intel doesn't capture the value of every Windows application. NVIDIA doesn't capture the value of every CUDA-accelerated workload. OpenAI won't capture the value of every GPT-4-powered skill.
The value flows to higher layers—to agents that orchestrate, and to skills that solve real problems.
If you're a developer, an entrepreneur, or a business leader trying to figure out where to invest in AI, this is your answer: not here. Look up the stack. Look at agents. Look at skills. Look at where models are consumed, not where they're produced.
That's where the opportunity is. That's where you should build.
Next in this series: Agents as Operating Systems: The Orchestration Layer Developers Need to Master