On January 12, 2026, the U.S. Secretary of War signed a memo directing the Chief Digital and Artificial Intelligence Office (CDAO) to enable seven "pace-setting projects" across warfighting, intelligence, and enterprise — including one called Agent Network, an explicit push for AI-enabled battle management and decision support. The memo's posture is plain: speed wins, and the risks of not moving fast outweigh the risks of imperfect alignment.
Four months later, on May 1, 2026, CISA, NSA, and the cyber arms of Australia, Canada, New Zealand, and the United Kingdom published the first coordinated multinational guidance on the Careful Adoption of Agentic AI Services — written for the same sectors the DoD memo just energized.
The two documents are not in tension. They are sequential. One says deploy agents now. The other says here are the five ways those agents fail in a mission environment. If you're a DoD prime, aerospace OEM, defense-AI vendor, or anyone whose roadmap now includes "agentic" anything, both documents are operative — and the gap between agent-side controls and the LLM data boundary is the one you're going to have to staff yourself.
What CISA and the Five Eyes actually published
The May 1 release is the third installment in an evolving Five Eyes series, following Guidelines for Secure AI System Development (2023) and Deploying AI Systems Securely (2024). The new guide focuses on agentic systems — components that "fundamentally rely on an AI model, such as an LLM, to interpret and reason about the state of the world and can autonomously make decisions and take actions."
CISA's announcement is direct about who it's for. "Critical infrastructure and defense sectors are increasingly deploying agentic AI systems to support mission-critical systems," the agency wrote in the press release. "However, these systems can introduce additional cybersecurity risks, such as an expanded attack surface, privilege creep, behavioral misalignment, and obscure event records."
The guidance organizes those risks into five categories: behavioral risks (agents acting unpredictably under prompt injection or data poisoning), structural risks (cascading failures across interconnected systems), accountability risks (opacity that prevents tracing decisions), plus privilege escalation and design/configuration failures. The Crowell & Moring client alert is blunt about why it matters operationally: it "may well inform DoD and broader U.S. government AI cybersecurity procurement requirements currently in the works."
CISA's actionable recommendations are tight: avoid broad or unrestricted access to sensitive data or critical systems; begin with low-risk, non-sensitive use cases; account for agentic AI security in your security model and risk posture. NSA, FBI, the Australian Cyber Security Centre, and the UK NCSC are co-authors. This is procurement-relevant language for any vendor selling into a CDAO pace-setting project.
What the agent-side controls actually cover
Read the five risk categories with one filter — can identity controls and lifecycle governance solve this on their own? — and the answer is partial. Identity and privilege abuse maps cleanly onto enterprise IAM, agent-identity platforms, and the CISA recommendation to avoid unrestricted access. Design and configuration failures map onto the kind of lifecycle governance that vets agent definitions before publication.
The other three categories — behavioral misalignment, structural brittleness, and accountability gaps — don't yield to the same controls. They're failure modes that surface at runtime, when the agent is interpreting content from a fetched document, a tool result, a memory store, or a chat with a human, and that content changes what the agent does next. CISA's own list of risks names "expanded attack surface" and "obscure event records" before it names anything else.
The complementary-layer architecture the industry is converging toward — the CSA AI Agent Resource Management (AARM) Protocol-Gateway pattern, the OWASP Top 10 for Agentic Applications 2026, and Entra Agent ID-class identity platforms — is improving fast on the agent side of that boundary. Containment.AI is aligned with, and working toward Core conformance with, the CSA AARM Protocol-Gateway pattern as it stabilizes; an AGT adapter is in design. None of those layers fully cover one risk surface: what data the agent is allowed to put in front of the model at the moment of the call.
The data-boundary layer the guidance doesn't enforce
CISA's behavioral-risk category — agents acting "unpredictably" under prompt injection or data poisoning — is a data-boundary problem. So is the accountability risk of "obscure event records." When a maintenance-planning agent at an aerospace OEM ingests a vendor advisory containing an injected instruction, the model is doing exactly what it was trained to do with the input it received. The input was untrusted. Identity controls don't see the input.
The same is true on the way out. When a defense contractor's analyst pastes a controlled unclassified information (CUI) excerpt into ChatGPT to "summarize this," the question is not "did the model decide correctly?" It is "is this payload allowed to cross this boundary, for this user, under this regulation?" That question is not answerable inside the model. It is answerable only at the data-boundary where content crosses the LLM call.
That is the layer Containment.AI focuses on. The proxy and the browser extension inspect what is about to leave the organization at the prompt and response boundaries, evaluate it against the customer's configured policy (CUI handling, FedRAMP boundary scope, NIST AI RMF profile), and allow, redact, or block the data movement in real time — with an audit trail that survives a contracting officer's question six months later. That doesn't replace identity governance or the AARM Protocol-Gateway pattern. Those layers govern what an agent does next. The data-boundary layer governs what crosses the LLM at all. The CISA guidance is explicit that both classes of control are required; it doesn't tell you which products fill which slot.
What to do this week if you're a defense-sector buyer
Three concrete steps, scoped to the CISA recommendations and the DoD AI Strategy posture:
- Map your existing controls to the five CISA risk categories. Most defense-sector stacks already have meaningful coverage on privilege escalation and design/configuration failures, especially in identity-first platforms. Coverage on behavioral, structural, and accountability risks is typically thinner and harder to demonstrate to a procuring officer.
- Inventory your agent-data boundaries before the next pace-setting-project demand signal. For every agent in pilot or production, list where untrusted content enters (vendor docs, OT telemetry, RAG, customer chat) and where data leaves (API, CRM, email, model call). The Defense Innovation Unit's MYSTIC DEPOT solicitation for a government-specific AI evaluation harness is going to ask similar questions in writing.
- Decide where runtime data-boundary enforcement lives in your stack. Identity controls govern the agent. Lifecycle controls govern the agent definition. Something needs to govern what crosses the LLM call in real time. "The model will refuse" is not a defensible answer in front of NSA's AI Security Center or a CDAO program officer.
The CISA guidance is currently advisory. The Crowell alert notes that proposed GSA Schedule AI contract clauses would already require contractors to preserve incident-relevant logs for at least 90 calendar days, and the trajectory from voluntary commitment to procurement-grade requirement has been short for every prior cybersecurity framework. Defense-sector buyers who solve the data-boundary problem before that conversion happens will not be the ones explaining to a contracting officer why their agent leaked CUI through a third-party model call.