The Alien Tool
Andrej Karpathy, one of the people who built these AI systems, recently wrote:
“I’ve never felt this much behind as a programmer. The profession is being dramatically refactored… Clearly some powerful alien tool was handed around except it comes with no manual and everyone has to figure out how to hold it and operate it, while the resulting magnitude 9 earthquake is rocking the profession.”
The most interesting word here is “alien.”
Not “new.” Not “powerful.” Alien.
An alien tool is one you can use without understanding. You press buttons. Things happen. You’re not sure why.
The Strange Finding
Here’s something strange.
Researchers gave developers real coding tasks. Half used AI assistants. Half didn’t. Then they measured two things: how fast people actually worked, and how fast people thought they worked.
The results:
BELIEVED  ██████████████████████░░░░░░  +20% faster
                      │
                      │ ◄── You are here
                      │     (39% wrong)
                      │
ACTUAL    ░░░░░░░░░░░░░░░░░░████████████  -19% slower

          ◄─────────────────────────────►
          SLOWER                   FASTER
Developers using AI believed they were 20% faster.
They were actually 19% slower.
This isn’t one study. It’s replicated. A second study, with 698 participants, found the same pattern: actual performance improved by about 3 points, while self-assessments ran about 4 points higher than reality.
The gap between what we think we’re doing and what we’re actually doing has doubled compared to working without AI.
The Paradox
Here’s the weirder part.
You’d expect people who understand AI better to assess themselves more accurately. The opposite is true.
Higher AI literacy correlates with less accurate self-assessment. The people who know these tools best are most vulnerable to the illusion.
This isn’t a story about AI being bad. It’s a story about human cognition being strange.
Why This Happens
Cognitive scientists have names for what’s going on.
Cognitive offloading is when you delegate mental work to a tool. When the calculator does the math, you stop experiencing the calculation. The answer appears. You didn’t feel yourself produce it.
The same thing happens with AI coding. The code appears. You didn’t feel yourself write it. So you can’t accurately judge how hard it was.
The fluency illusion is simpler. When something feels easy, you assume you’ve mastered it. AI makes code generation effortless. That effortlessness creates false confidence.
Here’s the uncomfortable part: researchers have studied automation bias for decades. Pilots, doctors, factory workers. The finding is consistent:
“Automation bias occurs in both naive and expert participants, cannot be prevented by training or instructions.”
In one study, pilots shut off engines when false automated alerts told them to—even though those same pilots said in interviews they would never follow such an alert blindly.
We trust the machine over ourselves. We can’t help it.
This Has Happened Before
We’ve been here before. Sort of.
WHAT TOOLS ABSTRACT

2020s  ╔═════════════════════════════════╗
AI     ║ JUDGMENT                        ║ ◄── Different
       ║ (architecture, tradeoffs,       ║
       ║  decisions you don't see)       ║
       ╚═════════════════════════════════╝
                        ▲
                        │
2010s  ┌─────────────────────────────────┐
S.O.   │ LOOKUP                          │
       │ (answers on demand)             │
       └─────────────────────────────────┘
                        ▲
                        │
2000s  ┌─────────────────────────────────┐
IDEs   │ SYNTAX                          │
       │ (autocomplete, linting)         │
       └─────────────────────────────────┘
                        ▲
                        │
1980s  ┌─────────────────────────────────┐
Calc   │ ARITHMETIC                      │
       │ (mental math)                   │
       └─────────────────────────────────┘
───────────────────────────────────────────
Each step: we traded understanding for speed.
The last step: we traded judgment.
Calculators. A study found 15% of university students couldn’t calculate 15% of 21. They’d passed math courses. They had calculators. They’d stopped doing arithmetic in their heads.
Stack Overflow. By 2018, researchers identified “Stack Overflow Driven Development”—programmers copying answers without understanding them. The code worked. The comprehension didn’t.
IDEs. Thirty years ago, programmers had to know their tools deeply because nothing else would help them. Now autocomplete finishes our thoughts before we have them.
Each tool traded understanding for speed. We accepted the trade.
What’s Different Now
Here’s what’s different.
Calculators abstracted arithmetic. IDEs abstracted syntax. Stack Overflow abstracted lookup.
AI abstracts judgment.
When you copy from Stack Overflow, you choose which answer to copy. When AI generates code, it already made fifty decisions you never saw.
Why this architecture? Why this data structure? Why this error handling approach?
The code exists. You approved it. You might not be able to explain it.
Greg Ceccarelli put it precisely:
“At AI speed, unresolved ambiguity becomes an immediate bug. Unstated assumption becomes immediate divergence.”
Three developers using AI independently will produce three incompatible systems. Not because they’re bad developers. Because each AI made different silent decisions.
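The divergence can be made concrete with a toy case. Below are two equally plausible implementations of "load the app config" — the function names, the file format, and the scenario are all hypothetical, invented for illustration. Each embeds a different silent decision about what a missing file means; a codebase that mixes the two behaves inconsistently, even though each function is fine on its own.

```python
import json

def load_config_a(path):
    # Silent decision: a missing file is fatal.
    # Let the FileNotFoundError propagate to the caller.
    with open(path) as f:
        return json.load(f)

def load_config_b(path):
    # Silent decision: a missing file means "use defaults".
    # Fail quietly and return an empty config.
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}
```

Neither choice is wrong. The problem is that nobody chose — and the two halves of the system now disagree about whether a missing config is an emergency or a shrug.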
The Split
The research shows a split.
Junior developers using AI: 35-39% faster. Senior developers using AI: 8-16% faster. Sometimes slower.
Why?
Junior developers have less context to verify. They accept AI output more readily. The work feels faster because there’s less resistance.
Senior developers have years of accumulated knowledge. They know what can go wrong. When AI suggests something, they check it against everything they’ve learned.
That checking takes time. Often more time than the AI saved.
The tools reward not-knowing. They penalize expertise.
The Reframe
Here’s the reframe.
AI didn’t break anything. AI revealed something.
Our sense of understanding was always fragile. We maintained it through slow work, repeated exposure, hallway conversations, code reviews. There was time to absorb. Time to question. Time to build mental models.
AI removed the time.
Now we see what was always true: our understanding was implicit, distributed across people, and easily lost. We just had enough slack in the system to not notice.
The alien tool didn’t create the problem. It made the problem visible.
What Now
I don’t have a tidy solution. But I have observations.
The perception gap is real. You probably can’t feel it. That’s the point.
The tools that make you fastest on easy tasks make you slowest on hard ones. Context-heavy work—architecture, integration, debugging—is where AI help becomes AI hindrance.
The pattern has repeated across decades of tools. What’s different is the scope. We’re abstracting judgment itself.
Some teams are experimenting with making decisions explicit before implementation. Writing down why before asking AI to write what. It’s early. The evidence is thin.
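One minimal sketch of that practice, assuming nothing beyond the idea itself: the decision record below — its task, fields, and constraints — is an invented example, not a prescribed format. The point is only that the "why" is written down first and travels with the request, so the constraints are decided out loud instead of silently by the model.

```python
# Hypothetical decision record, agreed on before any code is requested.
DECISION_RECORD = """\
Task: add retry logic to the payment client.
Why: transient 5xx errors from the gateway drop orders.
Constraints:
  - at most 3 attempts, with exponential backoff
  - never retry on 4xx (client errors are bugs, not transient)
Out of scope: circuit breaking, request queueing.
"""

def build_prompt(record, instruction):
    # Prepend the agreed-on record to the instruction, so the AI's
    # silent decisions are bounded by decisions made explicitly.
    return f"{record}\n{instruction}"
```

Whether this recovers the lost judgment is an open question; it at least leaves a written trace of what the humans decided.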
But here’s what I find interesting.
Karpathy called it an alien tool. Tools shape the people who use them. After decades of calculators, we lost mental arithmetic. After a decade of Stack Overflow, we lost some ability to derive from first principles.
What will we lose after a decade of AI coding?
What will we gain?
We’re running the experiment right now. The results aren’t in.
Roll up your sleeves.