METHODOLOGY

Harness Engineering.

You wrote the rules, but the agent ignores them. You ran the same prompt, but got different results.
Harness Engineering is the methodology that fixes both — at the structural level.

Core Formula

Agent = Model + Harness

Models are the ceiling of capability. The harness is what makes capability consistent.
An agent without a harness can be brilliant once and useless the next minute. With a harness, even a modest model delivers reliably.

Why Harness Engineering exists.

Without a harness

  • Rules go in the prompt. The agent reads them once, then drifts.
  • The CEO ends up doing search, writing code, running analyses: the management role collapses into the worker role.
  • Same prompt, different answer every session.
  • Each new agent makes the system more chaotic, not more capable.
  • Switch models and you start from zero.

With a harness

  • Execution steps have hard checkpoints. Skipping is a logged violation.
  • The CEO orchestrates. Specialists execute. Roles stay separate.
  • Structured workflows produce repeatable outcomes.
  • Teams scale because every agent passes the same audit standard.
  • The harness travels. Capability accumulates across model upgrades.

The five principles of Harness Engineering.

1

Two control loops, never one.

Feedforward (Guides) tells the agent what to do before it acts. Feedback (Sensors) verifies what got done after. Guides alone drift over time. Sensors alone catch problems too late. Together they keep the system on rails.

Guide: "every reply must output a dispatch table first" → Sensor: "did a dispatch record actually get logged?" → violation entry on mismatch
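
The guide/sensor pairing above can be sketched in a few lines of Python. This is a minimal illustration, not Synapse's actual code: the `Harness` class, its method names, and the record layout are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Harness:
    """Hypothetical harness that pairs one guide with one sensor."""

    dispatch_log: list = field(default_factory=list)  # records written at dispatch time
    violations: list = field(default_factory=list)    # entries written by the sensor

    # Feedforward (Guide): the instruction the agent receives before it acts.
    GUIDE = "Every reply must output a dispatch table first."

    def record_dispatch(self, task_id: str, assignee: str) -> None:
        """Called when the agent actually dispatches a task."""
        self.dispatch_log.append({"task_id": task_id, "assignee": assignee})

    def sensor_check(self, task_id: str) -> bool:
        """Feedback (Sensor): verify that a dispatch record was really logged."""
        logged = any(r["task_id"] == task_id for r in self.dispatch_log)
        if not logged:
            # Mismatch between guide and observed behavior: log a violation.
            self.violations.append({"task_id": task_id, "rule": self.GUIDE})
        return logged
```

The guide is only text; the sensor is the part that inspects what happened. Keeping them in one object makes it hard to add a rule without also deciding how it gets verified.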
2

The execution chain doesn't bend.

Every task moves through a fixed sequence: receive goal → tier → propose → dispatch → execute → QA → deliver. Skipping a step is a structural failure. The harness defines the checkpoints, and a checkpoint does not accept "this one didn't need it" as a reason to skip.

① tier → ② dispatch (mandatory) → ③ execute → ④ QA (mandatory) → ⑤ deliver
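
One way to make the chain unbendable is to refuse any step that arrives out of order. The sketch below assumes the seven-step sequence named above; `TaskRun`, `ChainViolation`, and the step identifiers are hypothetical names chosen for illustration.

```python
# Step names mirror the sequence in the text; the identifiers are hypothetical.
CHAIN = ("receive_goal", "tier", "propose", "dispatch", "execute", "qa", "deliver")


class ChainViolation(Exception):
    """Raised when a step is skipped or run out of order."""


class TaskRun:
    def __init__(self):
        self.completed = []   # steps finished so far, in order
        self.violations = []  # logged attempts to bend the chain

    def advance(self, step):
        """Accept `step` only if it is the next one in CHAIN."""
        position = len(self.completed)
        expected = CHAIN[position] if position < len(CHAIN) else None
        if step != expected:
            message = f"expected {expected!r}, got {step!r}"
            self.violations.append(message)
            raise ChainViolation(message)
        self.completed.append(step)

    @property
    def delivered(self):
        return tuple(self.completed) == CHAIN
```

Calling `advance("deliver")` right after `advance("execute")` raises `ChainViolation` and leaves a violation entry, so a skipped QA step cannot pass silently.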
3

Roles separate, routing decides.

Management roles never execute. Execution roles never manage. Routing rules send each task to the most capable specialist, not the most convenient one.

code → harness_engineer / ai_systems_dev  |  strategy → strategist / decision_advisor
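
A routing table like the one above could be expressed as plain data plus two small checks. This is a sketch under assumptions: the role names for code and strategy come from the example line, while `MANAGEMENT_ROLES`, `route`, and `check_can_execute` are hypothetical.

```python
# Task type -> specialists, taken from the example routing line.
ROUTES = {
    "code": ["harness_engineer", "ai_systems_dev"],
    "strategy": ["strategist", "decision_advisor"],
}

# Assumption: management roles are listed explicitly so they can be blocked
# from execution. The role name here is illustrative.
MANAGEMENT_ROLES = {"ceo"}


def route(task_type):
    """Return the specialists for a task type; unknown types are an error."""
    if task_type not in ROUTES:
        raise ValueError(f"no routing rule for task type {task_type!r}")
    return ROUTES[task_type]


def check_can_execute(role):
    """Management roles never execute."""
    if role in MANAGEMENT_ROLES:
        raise PermissionError(f"{role!r} is a management role and may not execute")
```

Failing loudly on an unknown task type is a deliberate choice in this sketch: a missing rule should surface as a gap in the routing table rather than fall back to whoever is most convenient.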
4

Five decision tiers.

L1 auto-execute → L2 expert review → L3 Lysander call → L4 strategic-alignment notice to the operator → L5 value judgment by the operator. Specialists act before Lysander; the operator is involved only at L4 (notified) and L5 (makes the call). The share of tasks resolved below L3 varies with the task mix and is not strictly tracked.

L5 triggers: external contracts / budgets > $1M / value judgments with no objective optimum / operator-flagged items
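
The L5 triggers and the operator's involvement can be written down as a small check. The sketch below covers only what the text states (the four L5 triggers and the notify/call split); how tasks are sorted among L1 to L4 is not described here, so it is left out. The `Task` fields and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task attributes matching the four L5 triggers."""

    external_contract: bool = False
    budget_usd: float = 0.0
    no_objective_optimum: bool = False  # a value judgment with no objective optimum
    operator_flagged: bool = False


L5_BUDGET_THRESHOLD_USD = 1_000_000  # "budgets > $1M"

# The operator is involved only at L4 (notify) and L5 (call).
OPERATOR_INVOLVEMENT = {4: "notify", 5: "call"}


def is_l5(task):
    """True if any L5 trigger from the text applies."""
    return (
        task.external_contract
        or task.budget_usd > L5_BUDGET_THRESHOLD_USD
        or task.no_objective_optimum
        or task.operator_flagged
    )


def operator_role(tier):
    """Return how the operator is involved at a tier, or None if not at all."""
    return OPERATOR_INVOLVEMENT.get(tier)
```

Note that the budget comparison is strict, following the "> $1M" wording: a budget of exactly $1M does not trigger L5 in this sketch.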
5

Capability is auditable.

Every agent has an explicit capability description; to rate B-grade or above, the description must reference a methodology. Auto-scoring sets the passing line at 90. When an agent falls below the line, its capability gets upgraded, not its score.

A-grade: "E2E testing framework using pytest plus Playwright" · C-grade (failing): "testing" or "quality management"
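
The passing line itself is simple to express. The sketch below shows only the gate and its consequence; how a description is scored is not described here, so the score is taken as an input. `AuditResult`, `audit`, and the action labels are hypothetical and do not reflect the contents of `hr_base.py`.

```python
from dataclasses import dataclass

PASS_LINE = 90  # the 90-or-better passing line


@dataclass
class AuditResult:
    passed: bool
    action: str  # what happens next; labels are illustrative


def audit(score):
    """Gate an already-computed capability score against the passing line."""
    if score >= PASS_LINE:
        return AuditResult(passed=True, action="none")
    # Below the line, the remedy is to improve the capability description,
    # never to adjust the score.
    return AuditResult(passed=False, action="upgrade_capability")
```

The function has no parameter for overriding the score, which mirrors the rule that a failing agent gets a better capability definition rather than a better number.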

Synapse is Harness Engineering, made operational.

Synapse isn't a rules document. It's the five principles built into a working multi-agent team:

CLAUDE.md = harness configuration (Guides + Sensors + Constraints)
organization.yaml = roles and routing rules
hr_base.py = capability-audit engine (90-or-better passing line)
decision_rules.yaml = the five-tier decision framework
The 14 Skills = execution-chain checkpoints, made triggerable
evolution_engine = self-improvement loop

Read it as a methodology, then clone the repo to see exactly how each principle compiles into a file.