reasonable-agents — harness-enforced adversarial review for Claude Code

The hope is the bug

A review instruction sitting in a sibling file does nothing if the turn ends before anyone runs it. And an agent that just produced a wall of confident work is the worst possible judge of whether that work holds up. Packs that ship a reviewer as prose and hope the agent reads it are betting on exactly the wrong reader.

reasonable-agents is a Claude Code plugin that moves the review out of the agent's good intentions and into the harness. Two hooks do it.

Two hooks

  *-expert subagent runs
        ↓
  H4 · pairing
     paired *-reviewer hasn't run this session
     → inject a blocking directive to run it
        ↓
  reviewer runs
        ↓
  H6 · output validation
     output vs per-persona schema: required tables + a verdict keyword
     miss → re-invoke directive naming the specific missing element
     (capped by a per-session circuit breaker)
        ↓
  review on the record          (both hooks fail open)

The harness drives the review. It still can't make the model obey — that's the next section.

H4 — pairing

After an *-expert subagent runs, if its paired *-reviewer exists and hasn't run this session, H4 injects a directive to run it before the turn proceeds. The dedup matches the structured subagent_type field, not a name mentioned in prose — so a reviewer merely talked about doesn't count as run.

H6 — output validation

After a reviewer runs, H6 checks its output against a per-persona schema: the required tables plus a verdict keyword. Miss one and it injects a re-invoke directive that names the specific thing missing, capped by a per-session circuit breaker so the loop can't run away.

Both hooks fail open. A missing schema, empty output, a parse error, a missing dependency: exit cleanly, stay out of the way. Enforcement defaults on — a switch that fails to propagate can't silently turn it off.

The skeptic caught its own bugs, three times

Not a thought experiment. The review earned its keep on the plugin's own construction — three times, on bugs the code itself never revealed:

Bare-string dedup. The first dedup matched a reviewer's name as a quoted string anywhere in the transcript, so a reviewer merely mentioned in prose counted as "already run" and silently suppressed the directive. Fixed to match the structured subagent_type field.
Kill-switch cascade. An early kill-switch used a positional default that could read as "off" on an empty value — enforcement disabled by a propagation glitch. Fixed to normalize and explicit-set-match, default-enforce.
A vacuous schema. Keying a reviewer's compatibility table on the word "Breaking" would have passed vacuously, because that word shows up in a neighboring table's cells. The schema keys on "Change" now, with a test proving the empty case still blocks.

None got caught by writing the code. All three got caught by running the skeptic against the plan. That is the product, run on itself.

Honest by design

The one line the whole project is built on:

Does the hook force the review?

No — and the package says so. Compliance with an injected directive is model-mediated, not mechanical. The hook makes the review fire; it can't make the model obey. So reasonable-agents publishes a measured compliance rate instead of calling the review 'forced' — the live pairing demo ran 10 times, the expert engaged in 9, and the model ran the reviewer in all 9.

That honesty is the point, not a disclaimer bolted on. On a tool whose whole pitch is trustworthy review, saying exactly what it can and can't guarantee is the credibility.

The starter pack

An empty harness can't be fairly evaluated, so the plugin ships six born-clean expert + reviewer + validator triples — authored fresh from public domain knowledge, not scrubbed from anyone's config:

software-architecture-review
agent-and-prompt-design
api-contract-design
security-threat-modeling
test-strategy-design
boss-fight-design

Each was dogfooded through the plugin's own chain before shipping: a clean headless session runs the expert, H4 fires, the reviewer runs, and H6 validates its natural output against its own schema. It passed first try, every time. What that proves is process — briefing-grounded, adversarially paired, schema-checked — not that the expert is correct.

Install

Requires Claude Code 2.1.109 or newer.

/plugin marketplace add ReasonEquals/reasonable-agents
/plugin install reasonable-agents@reasonable-agents
/plugin list

Turn it off any time with the REASONABLE_AGENTS_DISABLE environment variable. You can also add it from a local clone — the full quickstart, fixture demo, and per-hook kill-switches are in the README.

Status & scope

Community marketplace: submitted

This release is the enforcement chain (v0.1), the starter pack, and the receipts. For now, install from the repo above; the community-marketplace listing is pending.

You can author your own expert + reviewer + validator triad today — that's what the enforcement chain is for, and it's how all six starter triples were built. What's a later phase is automated minting (generating a triad from a researched briefing on demand), plus the eval scaffolds — both gated on whether this release earns real use.

What "validated" means, and what it doesn't. Validated here is process-validated — briefing-grounded, adversarially paired, schema-checked. It is not a domain-accuracy benchmark and not a hard gate. Reviewer lookup is user-scope only (~/.claude/agents/), dedup is best-effort across sessions, and the hooks fail open by design — they reduce missed reviews, they don't guarantee them. All of it is written down in the repo's CLAIMS.md and SECURITY.md.