The Bitter Lesson and Worse is Better
There are two ideas, decades apart, that I think about all the time. Two great ‘papers’:
The Bitter Lesson (Rich Sutton)
Worse is Better / The New Jersey Approach (Richard Gabriel)
At first glance, they seem to come from different worlds—machine learning vs. programming language design. But together, they form a powerful lens for understanding why modern AI is evolving the way it is.
And more importantly: why the systems that win are rarely the ones that are most “correct.”
The Bitter Lesson: Compute Beats Cleverness
The Bitter Lesson argues something uncomfortable but historically consistent:
Over the long run, general methods that leverage computation outperform methods that rely on human-designed structure.
In other words:
Hand-crafted features lose to learned representations
Domain-specific cleverness loses to scale
Carefully engineered systems lose to brute-force learning with enough data
Examples, from the essay and since:
Chess → search + compute beat human heuristics
Vision → deep learning beat feature engineering
NLP → LLMs beat symbolic systems
The takeaway:
Don’t overdesign. Scale wins.
Worse is Better: Simplicity Beats Perfection
Gabriel’s “Worse is Better” (aka the New Jersey approach) makes a different but equally pragmatic argument (and having grown up outside of NYC and now living in New Jersey, I appreciate the subtext):
Systems that are simpler, more usable, and easier to adopt will beat more elegant, “correct” systems.
This is why:
C beat Lisp
Unix philosophy beat more theoretically pure OS designs
Simple APIs win over “perfect” abstractions
The tradeoff:
You sacrifice completeness and elegance
You gain adoption, iteration speed, and survival
The system that’s easier to use wins, not the one that is most beautiful.
In the startup world this is the way as well, IMHO: make it work, then make it pretty.
The Juxtaposition: Two Paths to the Same Outcome
At first, these ideas feel orthogonal:
Bitter Lesson
Scale and compute win
Avoid human bias in design
Favor generality
Worse is Better
Simplicity and usability win
Accept imperfection in design
Favor practicality
But they converge on the same meta-principle:
Don’t over-engineer early. Build systems that can grow.
The Modern AI Pattern: Foundation Models + Adaptation
Today’s AI stack is where these two ideas collide and reinforce each other.
Old mindset (pre-LLM era)
Build custom models per problem
Design features carefully
Optimize architecture upfront
Lock into frameworks early
New mindset
Wait for foundation models
Build thin layers on top
Adapt via prompting, RAG, fine-tuning
Iterate rapidly instead of perfecting upfront
This is both:
The Bitter Lesson → Let large-scale pretraining do the heavy lifting
Worse is Better → Use the simplest interface (prompting, APIs) to unlock value
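To make "thin layers on top" concrete, here is a minimal sketch of what such a layer might look like. Everything in it is an assumption for illustration: the Model and Retriever types, the naive keyword retriever, and the stub_model that stands in for a real API call are all hypothetical names, not any vendor's interface. The point is only that the layer owns the prompt and the retrieval, and the model is something you pass in.

```python
from typing import Callable, Sequence

# Hypothetical types for this sketch: a model is any function from a prompt
# string to a completion string; a retriever maps (query, k) to k passages.
Model = Callable[[str], str]
Retriever = Callable[[str, int], Sequence[str]]


def keyword_retriever(docs: Sequence[str]) -> Retriever:
    """Naive retriever: rank documents by the number of words shared with the query."""
    def retrieve(query: str, k: int) -> Sequence[str]:
        words = set(query.lower().split())
        ranked = sorted(
            docs,
            key=lambda d: len(words & set(d.lower().split())),
            reverse=True,
        )
        return ranked[:k]
    return retrieve


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Assemble retrieved passages and the question into a single prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question: str, model: Model, retrieve: Retriever, k: int = 2) -> str:
    """The whole 'thin layer': retrieve, build a prompt, call the model."""
    return model(build_prompt(question, retrieve(question, k)))


def stub_model(prompt: str) -> str:
    """Placeholder for a real model call: returns the top-ranked context passage."""
    for line in prompt.splitlines():
        if line.startswith("- "):
            return line[2:]
    return ""


if __name__ == "__main__":
    docs = [
        "Unix favors small composable tools",
        "Lisp machines were elegant",
        "C is portable",
    ]
    print(answer("what does unix favor", stub_model, keyword_retriever(docs)))
```

Because the model and the retriever are plain function arguments, swapping in a stronger model or a better retriever is a one-line change at the call site, which is the property the "new mindset" list is after.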
An old startup guy I know, Carlos Cashman, used to preach essentially:
Don’t worry about your startup codebase, you’re gonna rewrite it anyway if you’re successful.
At first it sounded absurd, but I’ve lived it a few times now.
Why “Good Enough + Adaptable” Wins
The winning systems today share three traits:
1. They are easy to use
Prompting an LLM > building a custom NLP pipeline
API calls > training from scratch
2. They learn continuously
Instead of encoding rules:
You refine prompts
You improve retrieval
You fine-tune when needed
3. They avoid premature rigidity
Hard-coded systems:
Break when assumptions change
Are expensive to modify
LLM-based systems:
Are inherently flexible
Can adapt to new tasks with minimal changes
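One way to picture "refine prompts instead of encoding rules" is to treat prompts as data and score them against a few labeled examples. The sketch below is illustrative only: toy_model is a deliberately simplistic stand-in (not a real model) that gives a clean one-word label only when the prompt asks for one, and the templates and examples are made up for the demonstration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical type for this sketch: prompt string in, completion string out.
Model = Callable[[str], str]
Example = Tuple[str, str]  # (input text, expected label)


def accuracy(template: str, model: Model, examples: Sequence[Example]) -> float:
    """Fraction of examples where the model output equals the expected label."""
    hits = sum(
        model(template.format(text=text)).strip().lower() == label
        for text, label in examples
    )
    return hits / len(examples)


def best_template(templates: Sequence[str], model: Model,
                  examples: Sequence[Example]) -> str:
    """Pick the prompt template that scores highest on the labeled examples."""
    return max(templates, key=lambda t: accuracy(t, model, examples))


def toy_model(prompt: str) -> str:
    """Toy stand-in, NOT a real model: only answers in one word when asked to."""
    label = "positive" if "love" in prompt else "negative"
    if "one word" in prompt:
        return label
    return f"I think the sentiment is {label}."


# Two prompt variants for the same task; the second adds an output constraint.
TEMPLATES = [
    "What is the sentiment of: {text}",
    "Classify the sentiment of: {text}\nAnswer with one word, positive or negative.",
]

EXAMPLES = [
    ("I love this", "positive"),
    ("This is broken", "negative"),
]

if __name__ == "__main__":
    for t in TEMPLATES:
        print(accuracy(t, toy_model, EXAMPLES), repr(t))
```

Changing the system's behavior here means editing a string and re-running the scoring loop, not rewriting rules, and the same loop can be rerun unchanged when a newer model is dropped in.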
Flexibility > Correctness
The New Jersey approach tolerates imperfection.
The Bitter Lesson rewards generality.
Together, they suggest:
The best system is not the most correct—it’s the one that can improve itself fastest over time.
This is a subtle but profound shift:
From designing the right system
To designing a system that can become right
Over-cooking the Design
In today’s AI landscape, over-engineering early can hurt you:
You lock into assumptions that foundation models will obsolete
You spend time solving problems compute will solve anyway
You reduce your ability to pivot as models improve
Meanwhile, teams that:
Use simple abstractions
Ride model improvements
Iterate quickly
end up leapfrogging more “thoughtful” systems.
A Practical Heuristic for Builders
When designing AI systems today, ask:
“Am I solving this with structure that scale will replace?”
And:
“Am I making this harder than it needs to be to adopt and iterate?”
If yes, you’re probably fighting both lessons.
Final Thought
The Bitter Lesson tells us:
Don’t outsmart compute.
Worse is Better tells us:
Don’t out-design usability.
Together, they tell us:
Build systems that are simple enough to use today, and flexible enough to get better tomorrow.
That’s not just a philosophy—it’s the blueprint behind modern AI.