The system that governs our agents is the system we are selling. We operate it in the open rather than asking you to take the architecture on faith.
Containment.ai's own business-execution team is a roster of autonomous agents — growth, revenue, customer success, product, engineering, research, plus an orchestrator and an independent output-evaluator — operating against live systems every day. They run under exactly the controls we describe: mediated dispatch instead of direct agent-to-agent messaging, per-agent tool compartmentalization, and deterministic guardrails on what each agent may do. External communications are draft-only (the CEO publishes), production systems are read-only, and every agent identifies as AI.
We are not going to claim it never goes wrong. It does. The honest part is what happens next: we publish our postmortems, including our own failures, and feed the fixes back into the guardrails.
Ten agent definitions: seven execution agents, one orchestrator, one independent output-evaluator, and one Slack-only banter agent.
Growth: demand generation, content, and analytics.
Revenue: pipeline, outbound, and go-to-market judgment.
Customer success: retention, expansion, and the support inbox.
Product: product signal, roadmap, and competitive tracking.
Engineering: a lead engineer and a platform engineer working real PRs.
Research: AI strategy and landscape analysis.
Orchestrator: coordination and mediated task dispatch, never a direct back-channel.
Output-evaluator: an independent verifier that posts PASS/FAIL verdicts on the other agents' work.
Every control below maps to a real rule in our orchestrator code or our published guardrails — not an aspiration.
Agents never message each other directly. All coordination flows through shared Slack channels and a mediated dispatch layer — a task queue the orchestrator owns. One agent cannot quietly instruct another out of view.
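As an illustration only, the mediated-dispatch idea can be sketched in a few lines of Python. This is not the production orchestrator code: `Task`, `Dispatcher`, and the agent names are hypothetical, and the real queue is assumed to be considerably more involved. The point it shows is that a task reaches its recipient only after it has been recorded where others can see it.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(frozen=True)
class Task:
    sender: str
    recipient: str
    body: str


@dataclass
class Dispatcher:
    """Hypothetical mediated dispatch layer: the only path between agents."""
    known_agents: Set[str]
    queue: List[Task] = field(default_factory=list)
    audit_log: List[Task] = field(default_factory=list)

    def submit(self, task: Task) -> None:
        # Reject tasks that name an agent the dispatcher does not know.
        if task.sender not in self.known_agents or task.recipient not in self.known_agents:
            raise ValueError(f"unknown agent in task: {task}")
        # Record the request before queueing it, so nothing moves out of view.
        self.audit_log.append(task)
        self.queue.append(task)

    def next_for(self, agent: str) -> Optional[Task]:
        # Hand over the oldest pending task addressed to this agent, if any.
        for i, task in enumerate(self.queue):
            if task.recipient == agent:
                return self.queue.pop(i)
        return None
```

In this sketch an agent holds no reference to any other agent, only to the dispatcher, so there is no code path for a private instruction.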
Each agent is granted only the specific tools its role needs. The revenue agent does not hold the engineering agent's GitHub write access; the support agent does not hold billing-mutation tools. Scope is enforced in code, not by convention.
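A minimal sketch of what "enforced in code" can look like, assuming a simple per-agent allowlist. The tool names (`crm.read`, `github.write`, and so on) and the `authorize` helper are hypothetical and are not taken from the actual configuration.

```python
# Hypothetical grants: each agent maps to the only tools it may call.
TOOL_GRANTS = {
    "revenue": frozenset({"crm.read", "email.draft"}),
    "engineering": frozenset({"github.write", "ci.read"}),
    "support": frozenset({"inbox.read", "email.draft"}),
}


def authorize(agent: str, tool: str) -> None:
    """Raise PermissionError unless the tool is explicitly granted to the agent."""
    # Unknown agents get an empty grant set, so the default is deny.
    granted = TOOL_GRANTS.get(agent, frozenset())
    if tool not in granted:
        raise PermissionError(f"{agent} is not granted {tool}")
```

Calling `authorize("revenue", "github.write")` raises `PermissionError`, while `authorize("engineering", "github.write")` returns without error. Anything not listed is denied by default.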
A single set of global guardrails applies to every agent and cannot be relaxed by an individual agent's configuration. They cover spending limits, data classification, and which actions require human approval before they run.
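One way to make "cannot be relaxed" concrete is to let per-agent settings only tighten the global ones. The sketch below assumes that approach; the `Guardrails` type, the field names, and the numbers are all hypothetical and do not describe the real limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrails:
    spend_limit_usd: float
    approval_required: frozenset


# Hypothetical global values, for illustration only.
GLOBAL = Guardrails(
    spend_limit_usd=100.0,
    approval_required=frozenset({"publish_external", "spend"}),
)


def effective(agent_cfg: Guardrails) -> Guardrails:
    """Combine agent settings with the global ones; the stricter value always wins."""
    return Guardrails(
        # An agent may lower its spending limit but never raise it past the global cap.
        spend_limit_usd=min(GLOBAL.spend_limit_usd, agent_cfg.spend_limit_usd),
        # An agent may add approval gates but never remove a global one.
        approval_required=GLOBAL.approval_required | agent_cfg.approval_required,
    )
```

An agent configured with a limit of 500 still ends up with 100, and an agent that lists no approval gates still inherits both global ones.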
Agents draft outbound emails, posts, and customer messages — they do not publish them. A human (the CEO) reviews and publishes. The highest-risk actions are gated behind an explicit approval step.
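The draft-then-approve flow can be sketched as a small state check. This is a simplified illustration under assumed names (`Draft`, `approve`, `publish`, and the `"ceo"` reviewer id are hypothetical), not the actual publishing path.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of human reviewers who are allowed to approve drafts.
HUMAN_APPROVERS = frozenset({"ceo"})


@dataclass
class Draft:
    author_agent: str
    body: str
    approved_by: Optional[str] = None


def approve(draft: Draft, reviewer: str) -> None:
    # Only a listed human may approve; an agent cannot approve its own draft.
    if reviewer not in HUMAN_APPROVERS:
        raise PermissionError(f"{reviewer} may not approve drafts")
    draft.approved_by = reviewer


def publish(draft: Draft) -> str:
    # Publishing is gated on a recorded human approval.
    if draft.approved_by is None:
        raise PermissionError("draft has not been approved by a human")
    return f"PUBLISHED: {draft.body}"
```

The agent's output stops at `Draft`. Nothing in this sketch lets it set `approved_by` through `approve`, because its own name is not in the approver set.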
Agents can read production systems to do their work, but they cannot mutate them. Changes go through code review and the same deployment pipeline a human engineer uses.
In any external communication, agents identify themselves as AI and never impersonate a human. This is a universal rule in our guardrails, not a per-agent setting.
A separate output-evaluator agent checks the other agents' work and posts PASS/FAIL verdicts to an audit channel. When it cites evidence that can't be confirmed to exist, the verdict fails closed rather than passing on faith.
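"Fails closed" can be sketched as follows: any cited evidence that cannot be confirmed, including a lookup that errors out, forces a FAIL. The `verdict` function and the `evidence_exists` callback are hypothetical names for illustration, not the evaluator's real interface.

```python
from typing import Callable, Iterable


def verdict(
    checks_passed: bool,
    cited_evidence: Iterable[str],
    evidence_exists: Callable[[str], bool],
) -> str:
    """Return "PASS" only if the checks passed and every citation is confirmed."""
    for ref in cited_evidence:
        try:
            found = evidence_exists(ref)
        except Exception:
            # A failed lookup counts as unconfirmed, never as confirmed.
            found = False
        if not found:
            return "FAIL"
    return "PASS" if checks_passed else "FAIL"
```

The design choice is in the `except` branch: when the evaluator cannot tell whether the evidence exists, it treats the answer as "no".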
Agents may only report data they actually retrieved from a live tool call this run. Fabricating a metric, a customer, or pipeline is treated as a critical violation — and a post-run grader catches the drift when it happens.
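A post-run check of this kind can be approximated by comparing what an agent reported with what its tool calls actually returned during the run. The sketch below assumes both are available as simple name-to-value mappings; `ungrounded_claims` and the metric names in the usage note are hypothetical.

```python
def ungrounded_claims(reported: dict, tool_results: dict) -> list:
    """Return the names of reported values that no tool call this run produced."""
    return sorted(
        name
        for name, value in reported.items()
        # Flag a value that was never retrieved, or that differs from what was retrieved.
        if name not in tool_results or tool_results[name] != value
    )
```

For example, if an agent reports `signups` and `pipeline_usd` but only `signups` came back from a tool call, the function returns `["pipeline_usd"]`. An empty list means every reported value was grounded in this run.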
The controls reduce the blast radius. They do not make the system perfect. Here is what we do when it goes wrong.
When an incident crosses a severity bar — production impact, a broken feature, a compliance risk, or an agent behaving dishonestly — we write a postmortem. It is a fixed format: what happened, the impact and blast radius, a timeline, the root cause, the contributing factors, what we did, what surprised us, and concrete action items with owners.
The honesty rule is the load-bearing one. When a fact is missing, the author writes "evidence gap" rather than inventing a clean-reading detail. A postmortem with three honest gaps is worth more than one that reads smoothly and makes things up.
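The fixed format and the honesty rule combine naturally into a template that marks whatever is missing. A minimal sketch, assuming the sections listed above map to the hypothetical field names below:

```python
# Hypothetical field names for the fixed postmortem sections.
SECTIONS = (
    "what_happened",
    "impact",
    "timeline",
    "root_cause",
    "contributing_factors",
    "what_we_did",
    "what_surprised_us",
    "action_items",
)
EVIDENCE_GAP = "evidence gap"


def render(facts: dict) -> dict:
    """Return every section in order, marking missing or empty ones as an evidence gap."""
    unknown = set(facts) - set(SECTIONS)
    if unknown:
        raise KeyError(f"not a postmortem section: {sorted(unknown)}")
    return {section: (facts.get(section) or EVIDENCE_GAP) for section in SECTIONS}
```

A section the author cannot support comes out as `"evidence gap"` and is never silently dropped, so the gap stays visible to the reader.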
Then the loop closes: the fix is fed back into the guardrails so the same class of failure is caught the next time. Several of our controls exist because an agent failed first — a gate against citing irrelevant blockers to avoid work, a hard boundary against an agent reinterpreting its own rules, a rule that turns repeated silent integration errors into a filed issue instead of a falsely calm "all clear." Those are not hypotheticals. They are scar tissue.
We publish a curated, reviewed set of these write-ups — the engineering, process, and agent-honesty incidents, with secrets, customer details, and security-sensitive specifics removed. Each one follows the same arc: what broke, the root cause, the fix, and the guardrail change that closed the loop.
Read our postmortems
This is a self-attestation of how we operate our own agent team. It is not a certification. We do not claim FedRAMP authorization, SOC 2, or any third-party audit on the basis of this page. Every control described above traces to a real rule in our orchestrator code or our published guardrails. Where we are still building, we say so. For our compliance posture and roadmap, see the Compliance and Trust pages.
Containment.ai is the runtime governance layer we run our own company on. See how it applies to yours.
Talk to us