Everything is Model[s]
Andrej Karpathy's "Software 2.0" thesis was prescient. He saw that neural networks would replace traditional code in many domains. But there's a subtle adjustment needed to his vision: the future isn't one model doing everything. It's models all the way down.
The Nerd-Snipe Trap
Here's a classic nerd-snipe: look at GPT-4, finetune a local model on SQL queries, and watch it outperform GPT-4. Engineers love this. They write blog posts. They share benchmarks. They feel clever.
Six months later, GPT-5 crushes their finetuned model on their subset of SQL without any specialized training. The cycle repeats.
This pattern reveals something important, but not what most people think. The lesson isn't that finetuning is useless. It's that we're asking the wrong question entirely.
The Real Nerd-Snipe
The subtler trap is believing that one superintelligent model will eventually do everything - that Claude 5 or GPT-6 will handle every task from writing poetry to applying code edits to calculating tips. It's true that they will be capable of it; it's false that they should be doing it.
This misunderstands both economics and intelligence itself.
You don't use Claude to add 5+2. Not because it can't - it obviously can - but because it's wasteful. A calculator is faster, cheaper, and more reliable for arithmetic - an ideal tool for an intelligent system to use. This isn't a bug in our current AI systems. It's how intelligence will self-organize.
Intelligence Organizes
Look at any human organization. The CEO doesn't write code. The senior engineer doesn't handle payroll. The accountant doesn't design the product roadmap.
This isn't because they can't. A [good] CEO could probably figure out the codebase. The senior engineer could learn QuickBooks. But it would be a catastrophic waste of their specialized intelligence.
Human intelligence naturally forms hierarchies of task assignment that optimize for cost and capability. Artificially intelligent systems will look slightly different - if you could have 10 million Ilya Sutskevers do your tasks, you would. Frontier models will be able to do everything, but as they saturate a task at 99%+ accuracy, an inference-optimized specialized model will take that task over, freeing the frontier model to allocate its tokens to reasoning.
The Model Hierarchy
As frontier models grow to 5 trillion parameters and beyond, they'll leave behind a wake of tasks that aren't worth their computational cost as they move upmarket. Just as you wouldn't ask the CEO to format a spreadsheet, you won't ask a frontier model to apply straightforward code edits.
This is where specialized models like Fast Apply come in. They're not competing with Claude or Gemini. They're handling the tasks that frontier models shouldn't waste cycles on.
Consider the hierarchy:
- Frontier models: Novel problem solving, complex reasoning, creative work
- Specialized models: Domain-specific tasks with clear patterns
- Tiny models: Simple transformations, formatting, basic operations
Each layer handles what it's optimized for. The system as a whole is more capable than any single model could be.
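In code, the hierarchy reduces to a routing decision. Here's a minimal sketch in Python - the task categories are illustrative assumptions, not a real taxonomy:

```python
from enum import Enum

class Tier(Enum):
    FRONTIER = "frontier"        # novel problem solving, complex reasoning
    SPECIALIZED = "specialized"  # domain-specific tasks with clear patterns
    TINY = "tiny"                # simple transformations, formatting

def route(task_kind: str) -> Tier:
    """Dispatch each task to the cheapest tier that handles it reliably."""
    if task_kind in {"format", "lint_fix", "case_transform"}:
        return Tier.TINY
    if task_kind in {"apply_edit", "sql_generation"}:
        return Tier.SPECIALIZED
    return Tier.FRONTIER
```

The point isn't the dispatch table itself. It's that the default case is the expensive one, and every task you can name and measure gets pushed down a tier.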
Why Fast Apply Will Last
Fast Apply does one thing: it takes original code and an update snippet, then merges them at 4500+ tokens per second. It's not trying to understand the meaning of life or write poetry. It's applying edits.
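To make that contract concrete, here's the shape of the task. The `fast_apply` function below is a hypothetical stand-in for the model, not Morph's actual API:

```python
def fast_apply(original_code: str, update_snippet: str) -> str:
    """Hypothetical stand-in: merge an update snippet into the
    original source and return the complete merged file."""
    raise NotImplementedError("placeholder for the real model backend")

original = """\
def total(items):
    return sum(items)
"""

# The snippet expresses only the change; unchanged code is elided.
update = """\
def total(items, tax=0.0):
    return sum(items) * (1 + tax)
"""

# merged = fast_apply(original, update)
# -> the full file, with the new signature and body applied
```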
Could GPT-5 do this? Of course. But why would you use a 5 trillion parameter model for a task that a specialized model handles perfectly? We've become so used to software tasks being swallowed by the generalist model that we forget hierarchies of intelligence arise naturally.
The economics are clear:
- Frontier model: $20 per million tokens, 100 tokens/second
- Fast Apply: $1.20 per million tokens, 4500 tokens/second
For applying code edits, the frontier model costs roughly 17x more and runs 45x slower. It gains you nothing for this specific task.
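Worked through for a single edit, assuming a 2,000-token merged output (an illustrative size):

```python
EDIT_TOKENS = 2_000  # assumed output size for one code edit

def cost_usd(tokens: int, price_per_mtok: float) -> float:
    return tokens / 1_000_000 * price_per_mtok

def latency_s(tokens: int, tokens_per_sec: float) -> float:
    return tokens / tokens_per_sec

# Frontier: $0.04 per edit, 20 seconds
print(cost_usd(EDIT_TOKENS, 20.00), latency_s(EDIT_TOKENS, 100))
# Fast Apply: $0.0024 per edit, ~0.44 seconds
print(cost_usd(EDIT_TOKENS, 1.20), latency_s(EDIT_TOKENS, 4500))
```

Pennies either way for a single edit. But at agent scale - thousands of edits a day - the gap compounds into real money and, more importantly, real latency.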
The Cambrian Explosion of Models
We're about to see an explosion of specialized models. Not because frontier models are failing, but because they're succeeding. As they push into genuinely novel capabilities, they create economic space for specialized models to handle routine tasks.
This mirrors the evolution of human expertise. As our collective intelligence grew, we didn't all become generalists. We specialized more deeply. The existence of brilliant surgeons didn't eliminate the need for nurses. It created it.
Software 2.0, Revised
Karpathy was right that neural networks would eat software. But it's not one neural network eating everything. It's an ecosystem of models, with frontier models at the top of the market and inference-optimized models at the bottom.
The future tech stack won't be traditional code at the bottom and one superintelligent model at the top. It'll be models all the way down:
- Frontier models for reasoning, architecture, and design
- Inference-optimized models for implementation
Everything is models. Plural.
How to Not Get Nerd-Sniped
- Finetuning a model that hits 99%+ accuracy on a niche task and wins on inference cost and speed? Useful.
- Beating GPT-4o 78% vs. 75% on MySQL queries? Not useful. Prepare for the next frontier model to beat you.
The Efficiency Imperative
This isn't just about cost. It's about speed and reliability. A model optimized for applying code edits can use specialized architectures, training data, and inference optimizations that a general-purpose model can't.
Fast Apply achieves 4500 tokens per second not by being smarter than GPT-4, but by being dumber in exactly the right way. It knows how to merge code. That's it. This constraint is its strength.
Building for the Model Hierarchy
If you're building AI systems today, think in hierarchies. Don't try to solve everything with one model. Instead:
- Use frontier models for genuinely novel tasks
- Build or use specialized models only for tasks where a smaller model can match 98%+ of frontier performance - and keep a fallback to the frontier model for the cases it can't, as sketched below
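A minimal sketch of that fallback pattern - `specialized`, `frontier`, and `is_valid` are hypothetical callables, and the validity check stands in for whatever verification your task allows (unit tests, schema checks, a verifier model):

```python
from typing import Callable

def with_fallback(
    task: str,
    specialized: Callable[[str], str],  # cheap, fast model call
    frontier: Callable[[str], str],     # expensive, capable model call
    is_valid: Callable[[str], bool],    # task-specific verification
) -> str:
    """Try the specialized model first; escalate only when it fails."""
    result = specialized(task)
    if is_valid(result):
        return result      # the 98%+ common case: fast and cheap
    return frontier(task)  # the rare hard case: pay for intelligence
```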
The winners in the AI age won't be those with access to the biggest models. They'll be those who understand when to use which model.
The Permanent Middle Layer
Fast Apply and models like it aren't transitional technologies waiting to be replaced by AGI. They're the permanent middle layer of our AI stack. Just as we still use neural net spam filters despite having LLMs, we'll still use specialized models despite having superintelligence.
The future isn't one model doing everything. It's a huge model at the top with a fleet of specialized models as tools at its disposal.
That's the real Software 2.0: not neural networks replacing code, but how artificial intelligence will organize itself. Everything is models.
At Morph, we're building for this multi-model future as frontier models move upmarket. Fast Apply is just the beginning—a specialized model that does one thing exceptionally well, leaving frontier models free to tackle the problems only they can solve.