The Bitter Lesson and Worse Is Better

There are two ideas, decades apart, that I think about all the time. Two great “papers.”

At first glance, they seem to come from different worlds—machine learning vs. programming language design. But together, they form a powerful lens for understanding why modern AI is evolving the way it is.

And more importantly: why the systems that win are rarely the ones that are most “correct.”

The Bitter Lesson: Compute Beats Cleverness

Rich Sutton’s “The Bitter Lesson” argues something uncomfortable but historically consistent:

Over the long run, general methods that leverage computation outperform methods that rely on human-designed structure.

In other words:

  • Hand-crafted features lose to learned representations

  • Domain-specific cleverness loses to scale

  • Carefully engineered systems lose to brute-force learning with enough data

In the article:

  • Chess → search + compute beat human heuristics

  • Vision → deep learning beat feature engineering

  • NLP → LLMs beat symbolic systems

The takeaway:
Don’t overdesign. Scale wins.

Worse is Better: Simplicity Beats Perfection

Gabriel’s “Worse is Better” (aka the New Jersey approach) makes a different—but equally pragmatic—argument. Growing up outside of NYC and living in New Jersey, I appreciate the subtext here as well:

Systems that are simpler, more usable, and easier to adopt will beat more elegant, “correct” systems.

This is why:

  • C beat Lisp

  • Unix philosophy beat more theoretically pure OS designs

  • Simple APIs win over “perfect” abstractions

The tradeoff:

  • You sacrifice completeness and elegance

  • You gain adoption, iteration speed, and survival

The system that’s easier to use wins, not the one that is most beautiful.

In the startup world this is the way as well, IMHO: make it work, then make it pretty.

The Juxtaposition: Two Paths to the Same Outcome

At first, these ideas feel orthogonal:

Bitter Lesson

  • Scale and compute win

  • Avoid human bias in design

  • Favor generality

Worse is Better

  • Simplicity and usability win

  • Accept imperfection in design

  • Favor practicality

They converge on the same meta-principle:

Don’t over-engineer early. Build systems that can grow.

The Modern AI Pattern: Foundation Models + Adaptation

Today’s AI stack is where these two ideas collide and reinforce each other.

Old mindset (pre-LLM era)

  • Build custom models per problem

  • Design features carefully

  • Optimize architecture upfront

  • Lock into frameworks early

New mindset

  • Wait for foundation models

  • Build thin layers on top

  • Adapt via prompting, RAG, fine-tuning

  • Iterate rapidly instead of perfecting upfront
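The “thin layer” mindset above can be sketched in a few lines. This is a minimal illustration, not any particular vendor’s API: the only app-specific code is a prompt template, and the foundation model (which would sit behind a hypothetical `call_model` function, omitted here) does the heavy lifting.

```python
# Sketch of the "thin layer" pattern: the app-specific logic is just a
# few-shot prompt template. Swapping tasks means editing strings, not
# designing features or retraining a custom model.

def build_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a few-shot prompt instead of building a bespoke model."""
    lines = [f"Task: {task}", ""]
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")  # the model's completion goes here
    return "\n".join(lines)

prompt = build_prompt(
    task="Classify sentiment as positive or negative.",
    examples=[("I loved it", "positive"), ("Total waste of money", "negative")],
    query="Best purchase I've made all year",
)
```

The point of the sketch: iterating on this system means editing a template, which is orders of magnitude cheaper than the old per-problem model pipeline.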

This is both:

  • The Bitter Lesson → Let large-scale pretraining do the heavy lifting

  • Worse is Better → Use the simplest interface (prompting, APIs) to unlock value

An old startup guy I know, Carlos Cashman, used to preach, essentially:

  • don’t worry about your startup codebase, you’re gonna rewrite it anyway if you’re successful

At first it sounded absurd, but I’ve lived it a few times now.

Why “Good Enough + Adaptable” Wins

The winning systems today share three traits:

1. They are easy to use

Prompting an LLM > building a custom NLP pipeline
API calls > training from scratch

2. They learn continuously

Instead of encoding rules:

  • You refine prompts

  • You improve retrieval

  • You fine-tune when needed
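The “improve retrieval” knob deserves a concrete picture. Here is a deliberately toy sketch using word overlap as the relevance score; real systems use embedding similarity, but the shape is the same, and the key property holds either way: you can swap in a better scorer and the whole system improves without touching the model.

```python
# Toy retrieval step for a RAG-style system: rank documents by word overlap
# with the query, keep the top k to stuff into the prompt. The scorer is a
# stand-in for embedding similarity; improving it improves the system
# without any retraining.

def score(query: str, doc: str) -> float:
    """Jaccard overlap between query words and document words."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

docs = [
    "refunds are processed within 5 business days",
    "our office is closed on public holidays",
    "to request a refund contact support",
]
top = retrieve("how do I get a refund", docs)
```

Note that even this toy version exposes the brittleness you iterate away: “refund” doesn’t match “refunds,” which is exactly the kind of gap a better retriever closes.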

3. They avoid premature rigidity

Hard-coded systems:

  • Break when assumptions change

  • Are expensive to modify

LLM-based systems:

  • Are inherently flexible

  • Can adapt to new tasks with minimal changes

Flexibility > Correctness

The New Jersey approach tolerates imperfection.
The Bitter Lesson rewards generality.

Together, they suggest:

The best system is not the most correct—it’s the one that can improve itself fastest over time.

This is a subtle but profound shift:

  • From designing the right system

  • To designing a system that can become right

Over-cooking the Design

In today’s AI landscape, over-engineering early can hurt you:

  • You lock into assumptions that foundation models will obsolete

  • You spend time solving problems compute will solve anyway

  • You reduce your ability to pivot as models improve

Meanwhile, teams that:

  • Use simple abstractions

  • Ride model improvements

  • Iterate quickly

end up leapfrogging more “thoughtful” systems.

A Practical Heuristic for Builders

When designing AI systems today, ask:

“Am I solving this with structure that scale will replace?”

And:

“Am I making this harder than it needs to be to adopt and iterate?”

If yes, you’re probably fighting both lessons.

Final Thought

The Bitter Lesson tells us:

Don’t outsmart compute.

Worse is Better tells us:

Don’t out-design usability.

Together, they tell us:

Build systems that are simple enough to use today, and flexible enough to get better tomorrow.

That’s not just a philosophy—it’s the blueprint behind modern AI.

Derek

Startup CTO, Software Hacker
