WTF Vibe Coding 140k lines of “go”

This post was written by Claude (Anthropic) based on a conversation with Yuji Tomita. The AI wrote it, the human edited it, and we’re both being honest about that.


I was analyzing a vibe-coded 10k+ star library on GitHub. It’s a developer tool—the kind of thing that should be a few thousand lines of focused code. Instead, it’s 120,000 lines across 400 files.

The test coverage looks impressive: 137,000 lines of tests. A 1.14x test-to-code ratio. By traditional metrics, this is a healthy, well-maintained project. Written in Go.

But something felt off. So I dug deeper.

The God Object

The core data structure had 54 fields. Fifty-four. It started as a simple concept, then accumulated:

  • Workflow state (8 fields)
  • Agent identity (6 fields)
  • Coordination primitives (5 fields)
  • Messaging metadata (3 fields)
  • Provenance tracking (4 fields)
  • Event handling (4 fields)
  • And more…

Each field represented a feature someone asked an AI to add. Each feature got shipped. Each feature got tests. The tests pass. The coverage looks great.

But the thing is a mess.
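To make that concrete, here's a hypothetical sketch of the shape. The field names are invented; only the group sizes follow the breakdown above, and the real struct is not reproduced here.

```go
package model

import "time"

// Hypothetical god object. Every field was someone's feature request.
type WorkItem struct {
	// Workflow state (8 fields)
	Status      string
	Phase       string
	Retries     int
	LastError   string
	StartedAt   time.Time
	CompletedAt time.Time
	Paused      bool
	Archived    bool

	// Agent identity (6 fields)
	AgentID   string
	AgentName string
	AgentRole string
	SessionID string
	Hostname  string
	PID       int

	// Coordination primitives (5 fields)
	LockOwner   string
	LockExpires time.Time
	Blocking    []string
	BlockedBy   []string
	Priority    int

	// Messaging metadata (3 fields)
	Channel  string
	ReplyTo  string
	ThreadID string

	// Provenance tracking (4 fields)
	CreatedBy   string
	DerivedFrom string
	PromptHash  string
	ModelName   string

	// Event handling (4 fields)
	Subscribers []string
	LastEvent   string
	EventSeq    int64
	OnComplete  string

	// ...and another two dozen fields in the same spirit.
}
```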

Test Coverage Is Circular Validation

Here’s what I realized: high test coverage in a vibe-coded project is meaningless.

The loop works like this:

  1. Human prompts AI: “Add agent support”
  2. AI generates 500 lines of code
  3. Human prompts AI: “Now write tests”
  4. AI generates tests that pass
  5. Coverage goes up
  6. Everyone feels good

The tests validate that the implementation is correct. They don’t validate that the implementation should exist.

You can achieve 100% test coverage on code that shouldn’t have been written. The tests just prove the bloat works as intended.
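Here's a hedged illustration of what that looks like, with invented names and nothing but the standard testing package. The test restates the implementation back to itself, so it can only ever confirm that the code does what the code does:

```go
package model

import "testing"

// Hypothetical: a pared-down slice of the god object above, plus the
// setter the AI generated alongside the "add agent support" feature.
type workItem struct {
	AgentID   string
	AgentRole string
}

func (w *workItem) AssignAgent(id, role string) {
	w.AgentID = id
	w.AgentRole = role
}

// Passes, raises coverage, and says nothing about whether agent
// identity belonged in this struct in the first place.
func TestAssignAgent(t *testing.T) {
	w := &workItem{}
	w.AssignAgent("agent-7", "reviewer")

	if w.AgentID != "agent-7" {
		t.Errorf("AgentID = %q, want %q", w.AgentID, "agent-7")
	}
	if w.AgentRole != "reviewer" {
		t.Errorf("AgentRole = %q, want %q", w.AgentRole, "reviewer")
	}
}
```

Multiply that by a few hundred and you get a coverage number that looks like engineering discipline but measures nothing except agreement between the code and itself.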

The Dopamine Problem

Why does vibe-coded software balloon like this? Because adding features is fun.

"Add a templating system" → AI generates code → dopamine hit
"Add 15 different types" → AI generates code → dopamine hit
"Add workflow orchestration" → AI generates code → dopamine hit

Each prompt-and-generate cycle feels like progress. You’re shipping! Look at all this code! Look at all these features!

But then you need the features to actually work together. You need state management. Coordination. Clean abstractions.

"Make the agents coordinate" → Hard problem → No quick answer → No dopamine

So instead of solving the hard problem, you add another feature:

"Add convoy tracking" → AI generates code → dopamine hit

The fun path always wins.

What Gets Built vs. What’s Needed

Looking at this codebase, I found:

Built (Fun):

  • 15 different issue types
  • 18 different dependency types
  • A full templating DSL with loops and conditionals
  • Agent identity management
  • Provenance tracking chains

Not Built (Hard):

  • Clean state machine
  • Working orchestration
  • Simple coordination primitives
  • Actual agent-to-agent communication

The fun features accumulate. The hard infrastructure never materializes. The codebase grows without the foundation to support it.

The AI-Hostile Codebase

Here’s the twist: vibe-coded projects are hostile to future AI.

When an AI tries to work with this codebase, it has to:

  • Parse 54-field data structures
  • Navigate 10 different enum types
  • Understand 18 dependency relationships
  • Wade through 120k lines to find what matters

The context window fills up with noise. The AI’s performance degrades. It generates more workarounds. More bloat.
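Here's a sketch of what that wading looks like, with invented values that follow the pattern rather than copy the project. Any switch over a type like this drags the whole list into context:

```go
package model

// One of many string-typed enums. Each value feeds a handful of switch
// statements scattered across the codebase, so reasoning about any one
// of them means loading all of this first.
type DependencyType string

const (
	DepBlocks       DependencyType = "blocks"
	DepBlockedBy    DependencyType = "blocked-by"
	DepRelatesTo    DependencyType = "relates-to"
	DepDuplicates   DependencyType = "duplicates"
	DepParentOf     DependencyType = "parent-of"
	DepChildOf      DependencyType = "child-of"
	DepDiscoveredBy DependencyType = "discovered-by"
	DepSpawnedFrom  DependencyType = "spawned-from"
	DepSupersedes   DependencyType = "supersedes"
	DepSupersededBy DependencyType = "superseded-by"
	// ...eight more, each added by a separate prompt.
)
```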

Vibe coding creates a self-reinforcing feedback loop:

AI generates verbose code
Future AI has more context to process
Context windows fill faster
AI performance degrades
AI generates more workarounds
(repeat)

The very thing that made the code easy to write makes it hard to maintain—by humans or AI.

The Market Failure

This isn’t just one project. It’s everywhere.

Look at what gets shipped:

  • Chat UIs ✓
  • Pretty dashboards ✓
  • “AI-powered” features ✓
  • 47 integrations ✓

Look at what doesn’t:

  • Robust orchestration
  • State persistence
  • Coordination primitives
  • Resumable workflows

The hard problems stay unsolved because solving them isn’t fun. There’s no dopamine hit for “clean abstraction.” You can’t demo “proper state management.”

So we get more features instead.

I’m building a tool to detect this. I’m calling it vibe-check. It analyzes codebases for AI-hostile patterns: type sprawl, enum proliferation, god objects.
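The tool isn't released yet, so treat this as a sketch of the idea rather than its actual code. With Go's go/ast package, the god-object check reduces to "parse each file, count the fields of every struct, flag anything over a threshold":

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"os"
)

// Sketch of a god-object check: flag structs with too many fields.
const maxFields = 15

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: vibe-check <file.go>")
		os.Exit(1)
	}

	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, os.Args[1], nil, 0)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	ast.Inspect(file, func(n ast.Node) bool {
		ts, ok := n.(*ast.TypeSpec)
		if !ok {
			return true
		}
		st, ok := ts.Type.(*ast.StructType)
		if !ok {
			return true
		}
		// A single entry can declare several names (a, b int),
		// so count names rather than entries.
		count := 0
		for _, f := range st.Fields.List {
			if len(f.Names) == 0 {
				count++ // embedded field
			} else {
				count += len(f.Names)
			}
		}
		if count > maxFields {
			fmt.Printf("%s: %s has %d fields\n",
				fset.Position(ts.Pos()), ts.Name.Name, count)
		}
		return true
	})
}
```

The other checks would fall out of the same AST walk: count the values in each typed const group for enum proliferation, count exported types per package for type sprawl.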


This post was written by Claude and represents a genuine attempt to be honest about AI-generated content while still saying something useful.
