Agent Team Philosophy

Two Principles. One Quality System.

The team is built on two compounding ideas — each powerful alone, decisive together.

Philosophy 01

Quality emerges from adversarial loops

A critic agent actively tries to break what a builder just produced. The builder fixes only what failed. The critic tries again. This loop runs unattended until nothing breaks or the iteration cap is reached, compressing what would be days of real-team code review into minutes, with no human intervention required.

critic → builder → critic · unattended · no human needed

Philosophy 02

Specialization beats generalism

Twelve agents each own exactly one part of the problem. A storage agent that only sees schema files reasons better about schema than a session that has seen everything. Specialization removes context pollution, role confusion, and the quality cliff that degrades single-session output past ~200 lines.

12 specialists · clean context · role clarity
Philosophy 01 · The Adversarial Loop

How It Works

The Critic–Builder Feedback Cycle

The critic is not a reviewer — it actively constructs failure scenarios, adversarial inputs, and edge cases the builder didn't consider. When it finds something, it returns a specific ISSUES list. The builder fixes only those items. Then the critic runs again from scratch.

Repeats up to N iterations · No human intervention required
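
The cycle above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the team's actual orchestration code: `build`, `critique`, and `fix` are hypothetical callables standing in for the builder and critic agents, and `max_iterations` plays the role of N.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopResult:
    artifact: str
    iterations: int
    open_issues: List[str] = field(default_factory=list)

    @property
    def clean(self) -> bool:
        # True when the critic's last pass found nothing.
        return not self.open_issues


def adversarial_loop(
    build: Callable[[str], str],            # hypothetical: brief -> first artifact
    critique: Callable[[str], List[str]],   # hypothetical: artifact -> ISSUES list
    fix: Callable[[str, List[str]], str],   # hypothetical: artifact + issues -> new artifact
    brief: str,
    max_iterations: int = 5,
) -> LoopResult:
    """Alternate critic and builder until the critic is satisfied or the cap is hit."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")

    artifact = build(brief)
    issues: List[str] = []
    for iteration in range(1, max_iterations + 1):
        # The critic examines the whole artifact from scratch on every pass.
        issues = critique(artifact)
        if not issues:
            return LoopResult(artifact, iteration, [])
        # The builder is handed only the ISSUES list, never a request to rewrite.
        if iteration < max_iterations:
            artifact = fix(artifact, issues)
    # Cap reached: report whatever the critic's final pass still found.
    return LoopResult(artifact, max_iterations, issues)
```

When the cap is reached, the result still carries the critic's last ISSUES list, so a caller can decide whether to escalate to a human instead of silently accepting the artifact.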

Philosophy 01 · Why It's Different

What Makes It Different

The adversarial loop isn't human review and it isn't automated testing — it's a third thing that combines properties of both.

Unlike human review
No calendar dependency. No reviewer fatigue. No social pressure to approve work from a colleague. The critic has no relationship with the builder — its only job is to find failures, and it runs the moment the builder finishes.
Unlike automated tests
Tests verify what you thought to test. The critic actively constructs scenarios you didn't think of — adversarial inputs, race conditions, implicit assumptions, edge cases outside the spec. It looks for what you missed, not what you covered.
Philosophy 01 · Evidence

What Breaks Without It

Skip critic
Subtle logic errors, edge case gaps, and security issues reach production review. The critic actively tries to construct failure scenarios — it is harder to satisfy than a reviewer because its job is adversarial, not approving. Without it, the builder's blind spots become the reviewer's blind spots too.
Philosophy 02 · The Problem

The Single-Agent Trap

What Goes Wrong When One Agent Does Everything

These are not hypothetical failure modes — they are the structural problems that drove the team's design.

Context Pollution

An agent that has seen schema, frontend, tests, and business requirements all in one session reasons poorly about any of them. Each new piece of context crowds out earlier reasoning. The signal-to-noise ratio drops with every tool call.

Role Confusion

A general-purpose agent asked to both design and implement makes compromises — cutting corners on design to ship faster, or over-engineering implementation to prove capability. Specialization removes this tension completely.

Quality Cliff

Single-session quality degrades predictably as tasks grow. The first 200 lines of output are good. By 500 lines, the agent is fighting its own earlier decisions. By 1000 lines, it contradicts the architecture it designed 20 minutes ago.

Philosophy 02 · The Solution

12 Agents, Each With One Job

Every agent has a single domain, a single model tier, and a single place in the sequence. No agent makes decisions outside its scope.

orchestrator: Routes and sequences, never writes code
architect: Design decisions, pre-implementation
ideator: Lateral thinking, output to human only
critic: Adversarial review, tries to break the code · 🌐 Playwright
frontend: UI, React, TypeScript, Tailwind · 🌐 Playwright
backend: API, DB queries, Auth, Supabase
storage: All storage, sole RLS owner
researcher: Web research, docs, library investigation
tester: Tests, write and verify · 🌐 Playwright
reviewer: Code review, read only, structured output · 🌐 Playwright
explorer: Codebase navigation, read only, cheap
author: Docs and changelog, last step only

Model tiers:
opus: expensive / rare
sonnet: workhorse
haiku: cheap / constant
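
One way to picture "one job per agent" is a roster plus a routing table in which every domain has exactly one owner. The sketch below is illustrative only: the agent names and job descriptions come from the list above, but the `Agent` class, the domain labels in `OWNERS`, and the `route` helper are assumptions made for the example. Model tiers are left out because the per-agent tier assignment is not stated here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Agent:
    name: str
    job: str
    read_only: bool = False  # set only where the roster explicitly says "read only"


ROSTER: Dict[str, Agent] = {a.name: a for a in [
    Agent("orchestrator", "Routes and sequences, never writes code"),
    Agent("architect", "Design decisions, pre-implementation"),
    Agent("ideator", "Lateral thinking, output to human only"),
    Agent("critic", "Adversarial review, tries to break the code"),
    Agent("frontend", "UI, React, TypeScript, Tailwind"),
    Agent("backend", "API, DB queries, Auth, Supabase"),
    Agent("storage", "All storage, sole RLS owner"),
    Agent("researcher", "Web research, docs, library investigation"),
    Agent("tester", "Tests, write and verify"),
    Agent("reviewer", "Code review, read only, structured output", read_only=True),
    Agent("explorer", "Codebase navigation, read only, cheap", read_only=True),
    Agent("author", "Docs and changelog, last step only"),
]}

# Hypothetical domain labels: each maps to exactly one owning agent.
OWNERS: Dict[str, str] = {
    "ui": "frontend",
    "api": "backend",
    "rls": "storage",
    "tests": "tester",
    "docs": "author",
}


def route(domain: str) -> Agent:
    """Return the single agent that owns a domain, or fail loudly."""
    try:
        return ROSTER[OWNERS[domain]]
    except KeyError:
        raise ValueError(f"no agent owns domain {domain!r}") from None
```

Failing on an unknown domain, rather than falling back to a general-purpose agent, is the point of the design: work either has one owner or it is not dispatched.
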
Philosophy 02 · How It Runs

Sequential. Foreground. One at a Time.

Parallel agents sound faster. They're not better. Sequential execution keeps context clean, makes dependencies explicit, and means each agent's output is available to feed directly into the next agent's brief.
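
A sequential handoff can be sketched as a plain loop in which each agent receives a fresh brief built from the task and the previous agent's output. This is a simplified illustration, assuming each agent can be modeled as a function from a brief to an output; `run_sequential` and the `AgentFn` type are hypothetical names, not part of the published setup.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: an agent is modeled as a function from brief text to output text.
AgentFn = Callable[[str], str]


def run_sequential(steps: List[Tuple[str, AgentFn]], task: str) -> Dict[str, str]:
    """Run agents one at a time, feeding each output into the next brief."""
    outputs: Dict[str, str] = {}
    previous = ""
    for name, agent in steps:
        # Fresh brief per agent: the task plus only the prior step's output,
        # not the full history of everything seen so far.
        if previous:
            brief = f"{task}\n\nInput from previous step:\n{previous}"
        else:
            brief = task
        previous = agent(brief)
        outputs[name] = previous
    return outputs
```

Because each brief is assembled explicitly, the dependency between steps is visible in the code: an agent cannot quietly rely on context it was never handed.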

Focused context beats broad context

Each agent starts with a clean context window containing only what its role requires. This is not about model capability — it's about signal quality. A storage agent that only sees schema files reasons better about schema than any agent that has seen everything.

Delegate early, not late

The cost of fixing a wrong design after implementation is far higher than the token cost of running architect before builder. When in doubt, add the pre-implementation step. An architect brief is cheap. A builder rewrite is not.

The constraint is clarity, not speed

AI generates code faster than humans can review it. The bottleneck is always intent quality — a vague brief produces vague code regardless of which agent runs it. Sequential handoffs force each brief to be explicit and complete.

Philosophy 02 · Evidence

What Breaks Without It

Skip explorer
Builders create files or patterns inconsistent with the codebase — wrong directory, wrong naming convention, wrong abstraction level. Explorer's "existing patterns to follow" section is what keeps builders consistent on multi-file tasks. Without it, every builder starts from assumptions, not facts.
Skip architect
Builders solve the wrong problem elegantly. Correct implementation of a wrong design is the most expensive kind of rework — the code works, passes tests, clears review, and still needs to be thrown away. Architect prevents this by producing a written design decision before any code is written.

Cost & Quality

When the Overhead Is Worth It

Subagents multiply token usage by 4–7× versus a single session. The multiplier is justified when focused context produces better output than one bloated session. It is not justified for simple, single-file tasks.
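
To make the multiplier concrete, the arithmetic can be written out. The 4× and 7× bounds come from the text above; the 50,000-token baseline in the usage note is a made-up figure for illustration, and `subagent_token_range` is a hypothetical helper.

```python
from typing import Tuple


def subagent_token_range(
    single_session_tokens: int,
    low: float = 4.0,   # lower bound of the stated multiplier
    high: float = 7.0,  # upper bound of the stated multiplier
) -> Tuple[int, int]:
    """Estimate the token range for a subagent run given a single-session baseline."""
    return round(single_session_tokens * low), round(single_session_tokens * high)
```

For a task that would take roughly 50,000 tokens in a single session, `subagent_token_range(50_000)` gives `(200000, 350000)`, which is the overhead the focused-context benefit has to justify.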

Lite Mode

explorer → builder

  • Everyday tasks
  • Single-domain changes
  • Simple bug fixes
  • Quick sessions that complete cleanly in one context window

Full Pipeline · /build

explorer → architect → builders → critic loop → reviewer → author

  • Features touching multiple layers
  • Where quality matters more than speed
  • When a single session would degrade mid-way
  • High-stakes changes where the adversarial loop earns its token cost
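
The two lists above amount to a small decision rule. The sketch below encodes one possible reading of it; the two sequences are taken from the text, while the `choose_pipeline` function, its parameters, and the exact thresholds are assumptions for illustration rather than the project's actual routing logic.

```python
from typing import List

LITE: List[str] = ["explorer", "builder"]
FULL: List[str] = ["explorer", "architect", "builders", "critic loop", "reviewer", "author"]


def choose_pipeline(
    layers_touched: int,            # hypothetical input: how many layers the change spans
    high_stakes: bool = False,      # hypothetical input: quality matters more than speed
    fits_one_context: bool = True,  # hypothetical input: completes cleanly in one window
) -> List[str]:
    """Pick Lite Mode for small single-domain work, the full pipeline otherwise."""
    if layers_touched > 1 or high_stakes or not fits_one_context:
        return list(FULL)
    return list(LITE)
```

A simple single-file bug fix, `choose_pipeline(1)`, stays on the two-step path; anything spanning layers or flagged high-stakes pays for the architect, critic loop, reviewer, and author.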

See It In Practice

Each pipeline has a dedicated page explaining the agents involved, the sequence, and the design decisions behind it.

Get the Setup

The full agent team — all 12 agents, slash commands, and skills — is open source and ready to deploy to any Claude Code project.
