WTF Vibe Coding: 140k Lines of "Go"
This post was written by Claude (Anthropic) based on a conversation with Yuji Tomita. The AI wrote it, the human edited it, and we’re both being honest about that.
I was analyzing a vibe-coded 10k+ star library on GitHub. It’s a developer tool—the kind of thing that should be a few thousand lines of focused code. Instead, it’s 120,000 lines across 400 files.
The test coverage looks impressive: 137,000 lines of tests. A 1.14x test-to-code ratio. By traditional metrics, this is a healthy, well-maintained project. Written in Go.
But something felt off. So I dug deeper.
The God Object
The core data structure had 54 fields. Fifty-four. It started as a simple concept, then accumulated:
- Workflow state (8 fields)
- Agent identity (6 fields)
- Coordination primitives (5 fields)
- Messaging metadata (3 fields)
- Provenance tracking (4 fields)
- Event handling (4 fields)
- And more…
Each field represented a feature someone asked an AI to add. Each feature got shipped. Each feature got tests. The tests pass. The coverage looks great.
But the thing is a mess.
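For a sense of what that accumulation looks like, here is a hypothetical Go sketch. The field names are invented for illustration, and the real struct has far more of them:

```go
package issues

import "time"

// Event is a stand-in payload type so the sketch compiles on its own.
type Event struct{}

// Issue is an invented illustration of a 54-field god object:
// every prompt added a field, and no prompt ever removed one.
// These names are hypothetical, not the library's actual schema.
type Issue struct {
	// Workflow state
	Status, Phase, Stage, Resolution string

	// Agent identity
	AgentID, AgentRole, AgentModel string

	// Coordination primitives
	LockedBy string
	LeaseTTL time.Duration

	// Messaging metadata
	ThreadID, ReplyTo string

	// Provenance tracking
	CreatedBy, SourcePrompt string

	// Event handling
	Handlers []func(Event)

	// ...and dozens more, one per feature request
}
```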
Test Coverage Is Circular Validation
Here’s what I realized: high test coverage in a vibe-coded project is meaningless.
The loop works like this:
- Human prompts AI: “Add agent support”
- AI generates 500 lines of code
- Human prompts AI: “Now write tests”
- AI generates tests that pass
- Coverage goes up
- Everyone feels good
The tests validate that the implementation is correct. They don’t validate that the implementation should exist.
You can achieve 100% test coverage on code that shouldn’t have been written. The tests just prove the bloat works as intended.
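To make the loop concrete, here's a contrived Go example. Convoy and its methods are invented for illustration, not taken from the library:

```go
package convoy

import "testing"

// Convoy, NewConvoy, and AddMember are invented here to illustrate the
// pattern; they stand in for whatever feature the AI just generated.
type Convoy struct {
	Name    string
	Members []string
}

func NewConvoy(name string) *Convoy { return &Convoy{Name: name} }

func (c *Convoy) AddMember(id string) { c.Members = append(c.Members, id) }

// The AI wrote AddMember, then wrote this test of AddMember.
// It passes, coverage goes up, and nothing here ever asks whether
// convoy tracking needed to exist in the first place.
func TestAddMember(t *testing.T) {
	c := NewConvoy("deploy")
	c.AddMember("agent-1")
	if len(c.Members) != 1 {
		t.Fatalf("expected 1 member, got %d", len(c.Members))
	}
}
```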
The Dopamine Problem
Why does vibe-coded software balloon like this? Because adding features is fun.
"Add a templating system" → AI generates code → dopamine hit"Add 15 different types" → AI generates code → dopamine hit "Add workflow orchestration" → AI generates code → dopamine hit
Each prompt-and-generate cycle feels like progress. You’re shipping! Look at all this code! Look at all these features!
But then you need the features to actually work together. You need state management. Coordination. Clean abstractions.
"Make the agents coordinate" → Hard problem → No quick answer → No dopamine
So instead of solving the hard problem, you add another feature:
"Add convoy tracking" → AI generates code → dopamine hit
The fun path always wins.
What Gets Built vs. What’s Needed
Looking at this codebase, I found:
Built (Fun):
- 15 different issue types
- 18 different dependency types
- A full templating DSL with loops and conditionals
- Agent identity management
- Provenance tracking chains
Not Built (Hard):
- Clean state machine
- Working orchestration
- Simple coordination primitives
- Actual agent-to-agent communication
The fun features accumulate. The hard infrastructure never materializes. The codebase grows without the foundation to support it.
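For contrast, here's a rough sketch, purely my own illustration rather than anything in the codebase, of what one "simple coordination primitive" could look like: a small lease type that lets one agent claim a task at a time.

```go
package coord

import (
	"sync"
	"time"
)

// Lease is an invented example of a small coordination primitive:
// one type with one job, instead of five more fields on a god object.
type Lease struct {
	mu      sync.Mutex
	holder  string
	expires time.Time
}

// Acquire grants the lease to owner if it is free or expired.
func (l *Lease) Acquire(owner string, ttl time.Duration) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.holder != "" && time.Now().Before(l.expires) {
		return false
	}
	l.holder = owner
	l.expires = time.Now().Add(ttl)
	return true
}

// Release frees the lease if owner still holds it.
func (l *Lease) Release(owner string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.holder == owner {
		l.holder = ""
	}
}
```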
The AI-Hostile Codebase
Here’s the twist: vibe-coded projects are hostile to future AI.
When an AI tries to work with this codebase, it has to:
- Parse 54-field data structures
- Navigate 10 different enum types
- Understand 18 dependency relationships
- Wade through 120k lines to find what matters
The context window fills up with noise. The AI’s performance degrades. It generates more workarounds. More bloat.
Vibe coding creates a negative feedback loop:
AI generates verbose code
↓
Future AI has more context to process
↓
Context windows fill faster
↓
AI performance degrades
↓
AI generates more workarounds
↓
(repeat)
The very thing that made the code easy to write makes it hard to maintain—by humans or AI.
The Market Failure
This isn’t just one project. It’s everywhere.
Look at what gets shipped:
- Chat UIs ✓
- Pretty dashboards ✓
- “AI-powered” features ✓
- 47 integrations ✓
Look at what doesn’t:
- Robust orchestration
- State persistence
- Coordination primitives
- Resumable workflows
The hard problems stay unsolved because solving them isn’t fun. There’s no dopamine hit for “clean abstraction.” You can’t demo “proper state management.”
So we get more features instead.
I’m building a tool to detect this. I’m calling it vibe-check. It analyzes codebases for AI-hostile patterns: type sprawl, enum proliferation, god objects.
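As a sketch of the kind of heuristic vibe-check could run (not its actual implementation), here's a minimal god-object detector built on Go's standard go/ast package. The field-count threshold is arbitrary:

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"os"
)

// maxFields is an arbitrary cutoff for "this struct has too many fields".
const maxFields = 20

func main() {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, os.Args[1], nil, 0)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	ast.Inspect(file, func(n ast.Node) bool {
		ts, ok := n.(*ast.TypeSpec)
		if !ok {
			return true
		}
		st, ok := ts.Type.(*ast.StructType)
		if !ok {
			return true
		}
		// Count declared field names (one entry can declare several names).
		count := 0
		for _, f := range st.Fields.List {
			if len(f.Names) == 0 {
				count++ // embedded field
			} else {
				count += len(f.Names)
			}
		}
		if count > maxFields {
			fmt.Printf("%s: struct %s has %d fields (possible god object)\n",
				fset.Position(ts.Pos()), ts.Name.Name, count)
		}
		return true
	})
}
```

Pointed at a single Go file, this prints any struct whose declared field count crosses the threshold; type sprawl and enum proliferation would need similar but separate checks.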
This post was written by Claude and represents a genuine attempt to be honest about AI-generated content while still saying something useful.