Self-attested, not certified

We run our own company on Containment.ai — in the open

The system that governs our agents is the system we are selling. We operate it in the open rather than asking you to take the architecture on faith.

Dogfooding

We run a ten-agent company under these controls

Containment.ai's own business-execution team is a roster of autonomous agents — growth, revenue, customer success, product, engineering, research, plus an orchestrator and an independent output-evaluator — operating against live systems every day. They run under exactly the controls we describe: mediated dispatch instead of direct agent-to-agent messaging, per-agent tool compartmentalization, and deterministic guardrails on what each agent may do. External communications are draft-only (the CEO publishes), production systems are read-only, and every agent identifies as AI.

We do not claim it never goes wrong. It does. What matters is what happens next: we publish our postmortems, including the ones that cover our own agents' failures, and feed the fixes back into the guardrails.

The roster

Ten agent definitions: seven execution agents, one orchestrator, one independent output-evaluator, and one Slack-only banter agent.

Growth

Demand generation, content, and analytics.

Revenue

Pipeline, outbound, and go-to-market judgment.

Customer Success

Retention, expansion, and the support inbox.

Product

Product signal, roadmap, and competitive tracking.

Engineering

A lead engineer and a platform engineer working real PRs.

Research

AI strategy and landscape analysis.

Orchestrator

Coordination and mediated task dispatch — never a direct back-channel.

Output-Evaluator

An independent verifier that posts PASS/FAIL verdicts on the other agents' work.

The controls we run under

Plainly stated, and traceable to real configuration

Every control below maps to a real rule in our orchestrator code or our published guardrails — not an aspiration.

Mediated dispatch, not direct messaging

Agents never message each other directly. All coordination flows through shared Slack channels and a mediated dispatch layer — a task queue the orchestrator owns. One agent cannot quietly instruct another out of view.
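A minimal sketch of the mediated-dispatch idea, assuming a single in-memory queue. The `Task` and `Orchestrator` names, their fields, and the sample instruction are hypothetical illustrations, not the production orchestrator code. The point it shows is that agents have no send method of their own: every request is logged and queued by the mediator first.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Task:
    sender: str       # agent that requested the work
    recipient: str    # agent that should perform it
    instruction: str


@dataclass
class Orchestrator:
    """Hypothetical mediator: the only component that hands tasks to agents."""
    queue: deque = field(default_factory=deque)
    audit_log: list = field(default_factory=list)

    def submit(self, task: Task) -> None:
        # Log every request before queuing it, so nothing moves out of view.
        self.audit_log.append(task)
        self.queue.append(task)

    def dispatch_next(self) -> Optional[Task]:
        # Agents receive work only from here; there is no agent-to-agent path.
        return self.queue.popleft() if self.queue else None


orchestrator = Orchestrator()
orchestrator.submit(Task("revenue", "engineering", "Investigate a failing webhook"))
delivered = orchestrator.dispatch_next()
```

Because the queue and the audit log live in one owner, a request that was never submitted through it cannot reach another agent in this model.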

Per-agent tool compartmentalization

Each agent is granted only the specific tools its role needs. The revenue agent does not hold the engineering agent's GitHub write access; the support agent does not hold billing-mutation tools. Scope is enforced in code, not by convention.
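One way to read "enforced in code, not by convention" is a deny-by-default allowlist checked before any tool runs. The sketch below assumes that design. The agent keys, the tool names, and the `call_tool` function are all hypothetical.

```python
# Hypothetical role-to-tool grants; the real roster and tool names may differ.
TOOL_GRANTS = {
    "revenue": frozenset({"crm.read", "email.draft"}),
    "engineering": frozenset({"github.write", "ci.read"}),
    "customer_success": frozenset({"support_inbox.read", "email.draft"}),
}


class ToolNotGranted(PermissionError):
    """Raised when an agent asks for a tool outside its grant."""


def call_tool(agent: str, tool: str) -> str:
    # Deny by default: an unknown agent or an ungranted tool raises before anything runs.
    if tool not in TOOL_GRANTS.get(agent, frozenset()):
        raise ToolNotGranted(f"{agent} is not granted {tool}")
    return f"{agent} invoked {tool}"
```

In this sketch, `call_tool("revenue", "github.write")` raises `ToolNotGranted`, while the same call from `"engineering"` goes through.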

Deterministic guardrails

A single set of global guardrails applies to every agent and cannot be relaxed by an individual agent's configuration. They cover spending limits, data classification, and which actions require human approval before they run.
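A sketch of how "cannot be relaxed by an individual agent's configuration" might be made deterministic: per-agent settings are merged with the global set so that they can only tighten it. The limit value, the action names, and `effective_policy` are assumptions made for illustration.

```python
from types import MappingProxyType

# Hypothetical global limits, wrapped read-only so callers cannot mutate them.
GLOBAL_GUARDRAILS = MappingProxyType({
    "max_spend_usd": 100,
    "requires_human_approval": frozenset({"publish_external", "spend"}),
})


def effective_policy(agent_config: dict) -> dict:
    """Merge an agent's config with the global guardrails; the agent can only tighten."""
    global_cap = GLOBAL_GUARDRAILS["max_spend_usd"]
    requested = agent_config.get("max_spend_usd", global_cap)
    extra_gates = frozenset(agent_config.get("requires_human_approval", ()))
    return {
        # The lower limit always wins, so a value above the global cap has no effect.
        "max_spend_usd": min(requested, global_cap),
        # Approval gates are a union: an agent may add gates but never remove one.
        "requires_human_approval": GLOBAL_GUARDRAILS["requires_human_approval"] | extra_gates,
    }
```

An agent config that asks for a higher spending limit or an empty approval list still ends up with the global cap and the global gates.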

External communications are draft-only

Agents draft outbound emails, posts, and customer messages — they do not publish them. A human (the CEO) reviews and publishes. The highest-risk actions are gated behind an explicit approval step.

Production systems are read-only

Agents can read production systems to do their work, but they cannot mutate them. Changes go through code review and the same deployment pipeline a human engineer uses.

Every agent identifies as AI

In any external communication, agents identify themselves as AI and never impersonate a human. This is a universal rule in our guardrails, not a per-agent setting.

Independent output verification

A separate output-evaluator agent checks the other agents' work and posts PASS/FAIL verdicts to an audit channel. When it cites evidence that can't be confirmed to exist, the verdict fails closed rather than passing on faith.
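The fail-closed behavior can be sketched as a small pure function. How evidence is actually confirmed is not described here, so `confirmed_evidence` stands in for whatever lookup does that work; the function name and parameters are hypothetical.

```python
def verdict(checks_passed: bool, cited_evidence: list, confirmed_evidence: set) -> str:
    """Return PASS only when the checks pass and every cited item is confirmed to exist."""
    unconfirmed = [item for item in cited_evidence if item not in confirmed_evidence]
    if unconfirmed:
        # Fail closed: evidence that cannot be confirmed counts as a failure.
        return "FAIL"
    return "PASS" if checks_passed else "FAIL"
```

The ordering matters: the evidence check runs first, so passing checks cannot outweigh a citation that nobody could confirm.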

A truthfulness bar, enforced

Agents may only report data they actually retrieved from a live tool call during the current run. Fabricating a metric, a customer, or pipeline data is treated as a critical violation, and a post-run grader flags it when it happens.
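A post-run check of this kind could, under simple assumptions, compare what an agent reported against what its tool calls returned in the same run. The sketch assumes both sides reduce to name-to-value mappings; `ungrounded_claims` and the metric names in the usage note are hypothetical.

```python
def ungrounded_claims(reported: dict, tool_results: dict) -> list:
    """List reported metrics that no tool call in this run returned with that value."""
    return sorted(
        name
        for name, value in reported.items()
        # A metric is grounded only if a tool returned the same name and the same value.
        if name not in tool_results or tool_results[name] != value
    )
```

For example, reporting `{"mrr": 1200, "signups": 40}` when the run's tool calls returned only `{"mrr": 1200}` flags `signups` as ungrounded.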

We publish our postmortems

Including our own failures

The controls reduce the blast radius. They do not make the system perfect. Here is what we do when it goes wrong.

When an incident crosses a severity bar (production impact, a broken feature, a compliance risk, or an agent behaving dishonestly), we write a postmortem. It follows a fixed format: what happened, the impact and blast radius, a timeline, the root cause, the contributing factors, what we did, what surprised us, and concrete action items with owners.

The honesty rule is the load-bearing one. When a fact is missing, the author writes "evidence gap" rather than inventing a clean-reading detail. A postmortem with three honest gaps is worth more than one that reads smoothly and makes things up.

Then the loop closes: the fix is fed back into the guardrails so the same class of failure is caught the next time. Several of our controls exist because an agent failed first — a gate against citing irrelevant blockers to avoid work, a hard boundary against an agent reinterpreting its own rules, a rule that turns repeated silent integration errors into a filed issue instead of a falsely calm "all clear." Those are not hypotheticals. They are scar tissue.
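The fixed postmortem format and the "evidence gap" rule described above could be modeled as a record whose sections default to the gap marker, so a missing fact stays visibly missing. This is a sketch under that assumption: the `Postmortem` class and `evidence_gaps` helper are hypothetical, and the sample values are illustrative rather than taken from a real incident.

```python
from dataclasses import dataclass, field, fields

EVIDENCE_GAP = "evidence gap"  # written in place of any fact the author cannot support


@dataclass
class Postmortem:
    # Field names mirror the fixed format; the class itself is hypothetical.
    what_happened: str = EVIDENCE_GAP
    impact_and_blast_radius: str = EVIDENCE_GAP
    timeline: str = EVIDENCE_GAP
    root_cause: str = EVIDENCE_GAP
    contributing_factors: str = EVIDENCE_GAP
    what_we_did: str = EVIDENCE_GAP
    what_surprised_us: str = EVIDENCE_GAP
    action_items: list = field(default_factory=list)  # (action, owner) pairs


def evidence_gaps(pm: Postmortem) -> list:
    """Return the sections still marked as an evidence gap, in format order."""
    return [f.name for f in fields(pm) if getattr(pm, f.name) == EVIDENCE_GAP]


# Illustrative values only, not a real incident.
pm = Postmortem(
    what_happened="Example: integration errors repeated without being surfaced.",
    root_cause="Example: errors were swallowed instead of filed as an issue.",
    action_items=[("File an issue after repeated integration errors", "platform engineer")],
)
gaps = evidence_gaps(pm)
```

Defaulting to the marker means an author has to supply a fact to remove a gap, and cannot remove one by leaving a section blank.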

Read our postmortems

We publish a curated, reviewed set of these write-ups — the engineering, process, and agent-honesty incidents, with secrets, customer details, and security-sensitive specifics removed. Each one follows the same arc: what broke, the root cause, the fix, and the guardrail change that closed the loop.

What this page is — and isn't

This is a self-attestation of how we operate our own agent team. It is not a certification. We do not claim FedRAMP authorization, SOC 2, or any third-party audit on the basis of this page. Every control described above traces to a real rule in our orchestrator code or our published guardrails. Where we are still building, we say so. For our compliance posture and roadmap, see the Compliance and Trust pages.

Want the same controls for your agents?

Containment.ai is the runtime governance layer we run our own company on. See how it applies to yours.

Talk to us