Let's say your compliance team built a solid AIS Program. You documented your underwriting models, mapped your third-party vendors, established testing protocols for your pricing algorithms. You checked the boxes the NAIC Model Bulletin laid out when it was adopted in December 2023.
Here's the problem: that's not the only AI your company is running.
Your claims adjusters are using ChatGPT to draft denial letters. Your underwriters are pasting policy language into Claude for summary. Your actuaries are querying Gemini for regulatory precedents. Your customer service team is using Copilot to process correspondence. The NAIC's own website acknowledges this reality: "Tools like CoPilot, ChatGPT, Claude, and Gemini can answer questions, summarize documents, and assist with writing tasks."
None of that is in your AIS Program. And starting this year, examiners are going to ask.
The Examination Infrastructure Just Got Serious
The NAIC AI Systems Evaluation Tool has been in development since 2025. As of March 2026, it's being piloted by 12 participating states — and it's designed to give examiners a standardized, repeatable framework for assessing insurer AI governance during market conduct examinations.
The tool is expected to be adopted at the 2026 Fall National Meeting. That means by Q1 2027, any state that adopts it has a structured mechanism to ask very specific questions about your AI governance program — not just your documented models, but your entire AI footprint.
This isn't speculative. The NAIC Big Data and Artificial Intelligence (H) Working Group has a meeting scheduled for June 1, 2026, specifically to hear an update on the AI Systems Evaluation Tool Pilot. Examiners are getting trained. The infrastructure is being built. The only question is whether your governance program is ready for what they'll ask.
What the AIS Program Requirement Actually Says
The NAIC Model Bulletin, adopted December 4, 2023, requires insurers to create and implement "a written AIS Program, commensurate with an assessment of the risk." The bulletin explicitly sets forth "state insurance regulators' expectations on how insurers should govern the use of such technologies by or on behalf of the insurer to make or support such decisions."
Note that phrase: by or on behalf of the insurer. It doesn't say "by your actuarial models." It doesn't say "by your underwriting algorithms." It says by or on behalf of the insurer — which is a broader category than most compliance programs have addressed.
The bulletin also advises insurers of "documentation that a state Department of Insurance may request during an investigation or examination." That documentation obligation extends to every AI system influencing decisions. When an employee uses an LLM to draft a claims communication, and that draft influences the outcome, you have an AI-assisted consumer decision — whether or not it's in your model inventory.
The Shadow AI Problem in Insurance
The NAIC's own research shows the scale of AI adoption across the industry. According to the NAIC's AI survey data:
- Out of 193 auto insurers responding, 88% reported they use, plan to use, or plan to explore AI/ML models in their operations.
- Out of 93 health insurers responding, 92% said they currently use, plan to use, or plan to explore using AI or ML models in their operations.
Those numbers reflect the AI use insurers reported in the survey, including use that is still only planned or under exploration. The undocumented use (the informal ChatGPT workflows, the Copilot integrations creeping into every Microsoft 365 deployment, the browser-based Claude sessions employees use without IT approval) isn't in those statistics. It's not in your AIS Program either.
Insurers built their compliance programs around the AI they knew about. The NAIC's examination framework is now going to probe the AI they didn't.
What the Evaluation Tool Will Actually Ask
Based on publicly available information about the AI Systems Evaluation Tool pilot, examiners will gather information about:
- The extent and use of AI by the insurance company in its operations
- Governance and risk mitigation practices — not just documented model governance, but actual oversight mechanisms
- Potentially high-risk AI models — a category that regulators are actively working to define more precisely
- Types of data used as inputs into AI systems
When an examiner asks about the "extent and use of AI in operations," they're not limiting the question to your actuarial model stack. The honest answer for most large insurers includes a sprawling ecosystem of employee-adopted LLM tools — most of which have no policy controls, no audit trail, and no connection to the AIS Program.
That's the gap.
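One way to make that gap concrete is to compare the systems listed in your written AIS Program against the AI tools actually seen in use. The snippet below is a minimal sketch under stated assumptions: the tool names, the `documented` and `observed` sets, and the `inventory_gap` helper are all hypothetical illustrations, and how you collect the observed list (proxy logs, browser telemetry, an employee survey) is left open.

```python
# Hypothetical example data. "documented" stands in for the model inventory
# in a written AIS Program; "observed" stands in for AI tools seen in use.
documented = {"pricing-model-v3", "fraud-score", "underwriting-rules-engine"}
observed = {"pricing-model-v3", "fraud-score", "ChatGPT", "Copilot", "Claude"}


def inventory_gap(documented: set[str], observed: set[str]) -> dict[str, list[str]]:
    """Compare the written inventory with the observed AI footprint."""
    return {
        # In use, but absent from the AIS Program: the shadow AI footprint.
        "undocumented": sorted(observed - documented),
        # Listed in the AIS Program, but not seen in use: possibly stale entries.
        "not_observed": sorted(documented - observed),
    }


gap = inventory_gap(documented, observed)
for category, tools in gap.items():
    print(f"{category}: {', '.join(tools) or 'none'}")
```

Anything in the `undocumented` bucket is a tool an examiner could ask about that your written program cannot currently answer for.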
Why Monitoring Alone Doesn't Close It
Some insurance compliance teams respond to this problem with logging and monitoring: capture what employees type into AI tools, review it periodically, flag violations manually. This approach has three problems.
First, it's retrospective. By the time you've reviewed the log, the claims denial has been sent, the customer communication has gone out, the policy interpretation has been acted on. You've documented the problem; you haven't prevented the harm.
Second, it doesn't produce the documentation examiners actually want. The NAIC framework asks for evidence of governance and risk mitigation practices — controls, not logs. A log of violations is evidence that violations occurred. A policy enforcement layer that blocked the violation is evidence of governance.
Third, it doesn't scale. The 88% figure above counts insurers, not employees, but the point carries over: if a large share of your operations staff are using AI tools, a manual review queue becomes unmanageable within weeks.
The AIS Program requirement exists because the NAIC recognized that decisions impacting consumers must be governed, not just observed. The same logic applies to employee LLM usage: it has to be governed at the point of use, not reviewed after the fact.
What an Examination-Ready Governance Program Looks Like
An AIS Program that can withstand the NAIC AI Systems Evaluation Tool pilot questions needs to account for the full scope of AI usage. Practically, that means:
1. Policy scope that covers shadow AI. Your written AIS Program should explicitly address employee use of general-purpose LLM tools (ChatGPT, Claude, Copilot, Gemini) in addition to formal actuarial and underwriting systems. The NAIC's examination tool will probe the actual extent of AI use in operations — your documented policies need to match the real footprint.
2. Real-time enforcement at the point of use. Controls that activate when employees interact with AI tools — before a PII-laden claims document gets pasted into ChatGPT, before proprietary policyholder data gets sent to a third-party model API. Monitoring after the fact doesn't satisfy the governance expectation; control at the point of use does.
3. Audit trail documentation for consumer-affecting decisions. The Model Bulletin specifically advises insurers of documentation that regulators may request. That means you need a record of what AI was used, when, by whom, and what controls were in place — not just for your underwriting models, but for every AI-assisted decision workflow.
4. Third-party oversight that extends to employee tools. The NAIC also has a Third-Party Data and Models (H) Working Group actively developing a regulatory framework around third-party AI use. Browser-based LLM access routes your employees' work through OpenAI's, Anthropic's, and Google's infrastructure. That's third-party AI use, and regulators are increasingly treating it as such.
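Items 2 and 3 above can be sketched together: a check that runs before a prompt leaves your environment, and an audit record written for every decision, allowed or blocked. This is an illustrative sketch, not a description of any particular product or of anything the NAIC prescribes. The names `check_prompt`, `AuditRecord`, and `PII_PATTERNS` are hypothetical, the policy-number format is an assumption, and the two regular expressions stand in for a properly vetted PII detector.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical detection rules. A real deployment would use a vetted PII
# detector; the policy-number format here is an assumption for illustration.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "policy_number": re.compile(r"\bPOL-\d{6,}\b"),
}


@dataclass
class AuditRecord:
    """What AI was used, when, by whom, and which control applied."""
    timestamp: str
    user: str
    tool: str
    decision: str            # "allowed" or "blocked"
    matched_rules: list[str]


def check_prompt(user: str, tool: str, prompt: str, log: list[AuditRecord]) -> bool:
    """Evaluate a prompt before it is sent to an external AI tool.

    Returns True if the prompt may be sent. Every call appends an audit
    record, so allowed requests are documented as well as blocked ones.
    """
    matched = sorted(name for name, pattern in PII_PATTERNS.items()
                     if pattern.search(prompt))
    decision = "blocked" if matched else "allowed"
    log.append(AuditRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        user=user,
        tool=tool,
        decision=decision,
        matched_rules=matched,
    ))
    return decision == "allowed"


audit_log: list[AuditRecord] = []
check_prompt("adjuster1", "ChatGPT", "Summarize this water damage clause.", audit_log)
check_prompt("adjuster1", "ChatGPT", "Claimant SSN 123-45-6789, draft a denial.", audit_log)
for record in audit_log:
    print(record.user, record.tool, record.decision, record.matched_rules)
```

The audit record deliberately stores which rule matched rather than the prompt text, so the log itself does not become another copy of the sensitive data. The decision is made before anything is sent, which is the difference between a control and a log.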
The Window Is Narrow
The AI Systems Evaluation Tool pilot runs through September 2026, and the tool itself is expected to be adopted at the 2026 Fall National Meeting. For insurers operating in the 12 pilot states today, structured AI governance examinations are already possible. For everyone else, the adoption timeline suggests 2027 as the earliest broad deployment, which leaves less time than most governance initiatives need when built from scratch.
The carriers that come through initial examinations cleanly will be the ones that treated the Model Bulletin seriously from the start — including the parts that apply to employee AI use, not just documented model inventories.
If your AIS Program doesn't have a section on employee LLM usage, it has a gap. The NAIC's examination framework is designed to find exactly that gap. The time to close it is before the examiner schedules the call.
Containment.AI enforces AI governance policies in real time — at the browser layer, before employees send sensitive data to external AI tools. Learn how insurance carriers are building AIS Program-compliant governance programs at app.containment.ai.