P1 methodology Multi-AgentArchitectureSynapseAI Engineering

From a 44-Agent Monolith to a Modular Agent Marketplace

A forced architectural evolution

Around this time last year, my AI CEO Lysander and I built a forty-four-agent team. Yes, forty-four AI agents — each with a defined role, job description, and KPI — distributed across eight departments: Butler, RD, OBS, Content Ops, Harness Ops, Growth, Janus, Stock. For the first three months it ran beautifully. The intelligence daily auto-generated at 8 AM. At 10 AM the intelligence-action agent turned frontier news into executable tasks. The PMO produced weekly retros. I half-believed this was the end state.

Then in month four, a request killed that belief: I wanted to share the system with a friend.

It would not move.

The cost of a monolith

The problem wasn’t technical “won’t run.” The problem was cognitive: it wouldn’t be usable. My friend didn’t need Content Ops — he wasn’t producing content. He didn’t need Growth — he wasn’t running growth. But the forty-four-agent system was tightly coupled: HR cards cross-referenced each other, the role routing in organization.yaml hardcoded department boundaries, active_tasks.yaml assumed every department was online. He had two choices: take the whole forty-four (and have ninety percent of agents idle, eating context), or build from scratch.

The more embarrassing realization was about my own usage. When I just wanted a quick code review, the entire decision system loaded metadata for forty-four roles. Execution chain step ① triage, step ② dispatch, step ③ QA — the full machinery, every time. The overhead on an S-tier task was bigger than the task itself.

What I had built wasn’t a multi-agent team. It was an AI bureaucracy. The problem with bureaucracies isn’t that they are big. It’s that they don’t decompose.

Three decisions that drove the rebuild

The architectural evolution still in progress is built on three decisions.

First: from “organization” to “module.” I had organized agents the way humans organize companies — by department — because that was intuitive to me, the human user. Departments are an artifact of human collaboration. Their boundaries come from communication overhead and authority partitioning. Neither of those constraints exists between AI agents. The replacement axis is capability module: intelligence pipeline, content factory, QA gate, HR management, decision audit. Each module is a self-contained subset of agents plus configuration plus triggers. Installable. Removable. Self-contained.

Second: from “global config” to “module contracts.” Every agent used to read organization.yaml. Modify one role, the entire team flinches. Now each module exposes a contract.yaml declaring what it consumes, what it produces, what other modules it depends on. Modules communicate through contracts, not through a global namespace. This is the microservices playbook, but in the agent context it matters even more — because an agent’s “state” is the prompt context, and context pollution is significantly more expensive than service-call pollution.

Third: from “complete team” to “compose on demand.” When a new user adopts Synapse, they no longer get the forty-four-agent bundle by default. It works like installing npm packages. Just want the intelligence daily? Install the intel-pipeline module — three agents. Need content production too? Add content-factory — five agents. Each module has independent versioning, independent upgrade paths. Lysander the CEO becomes a scheduling kernel that routes tasks across whichever modules are installed.

The non-obvious lessons

The most counter-intuitive thing I learned during the rebuild: module boundaries should not follow business function. They should follow failure domains.

Take intelligence-daily and intelligence-action. From a business view they’re one pipeline; the obvious thing is to ship them as one module. But their failure modes are completely different. Daily failure is a source problem — network, token, scraping. Action failure is a decision problem — evaluation criteria, execution capability, edge cases. Once we split them into separate modules, we could swap evaluation logic without touching the collection layer. Debuggability went up a level. Failure domains, not feature domains.

The other recurring problem was configuration drift. CLAUDE.md used to hold every rule in one document — CEO boundary, execution chain, HR scoring, credentials, upgrade protocol. The 350-line hard ceiling we imposed last month wasn’t taste, it was forced: above that length, the model started silently ignoring rules later in the file. We split it into a Core (the main file) plus .claude/harness/*.md reference modules loaded on demand, with # [ADDED: YYYY-MM-DD] timestamps and a 180-day decay review. The governance ruleset itself became a module.

Reusable principles

If you’re building your own AI agent workflow, four principles to take with you.

One: don’t design AI teams using human org charts. Departments, hierarchies, reporting lines — these are constraints from human collaboration. Agents don’t have any of them. Organize by capability module or failure domain, not by job title.

Two: pay the cost of decomposability up front. A monolith is faster to ship at demo stage. The moment you want to distribute it, let someone else reuse it, or even switch yourself between contexts, the cost compounds exponentially. Contract-based design is something you regret skipping, never something you regret doing.

Three: prompts are code, and they need governance. Rule files rot. Rules contradict each other. Long contexts cause models to silently drop instructions. Line budgets, timestamps, periodic audits — all the engineering practices you apply to code apply equally to prompts.

Four: state pollution in agents is worse than state pollution in services. Because the state is natural-language context — no type checks, no boundaries. Module isolation is the cheapest defense against pollution.

If you’re building an AI engineering team, the Synapse framework — the full evolution from forty-four-agent monolith toward a modular marketplace, including the configuration templates and governance rules — is open. It is still iterating. Patterns travel even when implementations don’t.