Twenty-four states and the District of Columbia have adopted the NAIC Model Bulletin on the Use of Artificial Intelligence Systems by Insurers since it cleared the NAIC executive committee in December 2023. Four additional states have enacted insurance-specific AI regulation or guidance that references or parallels its standards. Together, these jurisdictions account for the overwhelming majority of domestic commercial and personal lines premium volume. What none of them has issued is a standardized template specifying exactly what carriers must file to prove their AI programs satisfy the bulletin's requirements. The NAIC Big Data and Artificial Intelligence (H) Working Group, the BDAI Working Group, took a concrete step toward one in spring 2026. That is the gap the working group is now closing, and its July 22, 2026 public meeting is where the compliance report structure takes its next formal step.
The Model Bulletin's Intentional Ambiguity and Its Practical Limits
The December 2023 Model Bulletin tells carriers to maintain a written AI program covering five functional domains: AI governance and accountability, risk management and internal controls, third-party vendor oversight, consumer transparency and notice, and responsiveness to regulatory inquiries. Within each domain it describes the type of information a regulator may request during an investigation or examination. It stops there. The bulletin specifies no format, no required data fields, no threshold for what constitutes a sufficiently documented governance program, and no definition of how a carrier demonstrates compliance as opposed to merely asserting it.
That ambiguity is a standard feature of principles-based regulation, not an oversight. Principles-based frameworks give carriers flexibility to build programs proportionate to their size and risk profile, avoid locking in documentation standards that technology will outpace within a few years, and allow regulators to adapt their expectations as AI practice evolves. For emerging technology regulation, that flexibility has real value. But it also creates an enforcement gap: a carrier can point to a written AI policy, a quarterly governance committee meeting, and a vendor management clause in its procurement contracts and claim bulletin compliance without producing a single data point on how any specific AI system performs, what data trained it, or whether its outputs have been tested for disparate impact across protected classes.
The way the 12 pilot states have approached AI examinations so far shows that the documentation gap is not hypothetical. State examiners arriving at a carrier for an AI governance review have found that "we have a policy" is the consistent answer to questions that the Model Bulletin's framework implies should produce specific evidence: bias testing results, model version histories, consumer complaint resolution records tied to AI-driven decisions, and written explainability protocols for adverse actions. A compliance report form with defined fields forces carriers to generate that evidence rather than gesturing toward it.
The Compliance Report Structure the NAIC Staff Circulated at the Spring 2026 Meeting
At the NAIC Spring 2026 National Meeting in Denver, NAIC staff presented a draft model compliance report structure to the BDAI Working Group. The document represents the most operationally specific output of any NAIC AI governance working group to date, because it translates the bulletin's five domain descriptions into nine disclosure components that a carrier's legal, actuarial, and compliance functions would complete and attest to.
The nine components are: an executive summary establishing the scope and geographic reach of the carrier's AI program; a board and senior management attestation confirming oversight responsibility; a models and data sources inventory covering both internal training data and external purchased data, with separate disclosure of selection bias controls and design constraints for each; a risk assessment framework section describing the carrier's methodology for classifying each AI system by risk tier; a model cards section providing structured technical documentation for each in-scope AI system; a corporate governance structure narrative showing the reporting chain from model developers to board-level oversight; a model drift and validation methodology section covering monitoring frequency, performance thresholds, and retraining triggers; a protected class inference and bias testing section reporting methodology and results by system; and a consumer complaint process disclosure documenting how AI-influenced adverse decisions are flagged, explained, and appealed.
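As a rough illustration of how a compliance function might track a dry run against the nine components listed above, the sketch below encodes them as a simple checklist. The snake_case identifiers and the `missing_components` helper are assumptions made for illustration; the draft structure as described here does not define field names or a file format.

```python
# The nine disclosure components, in the order the draft structure lists them.
# Identifiers are illustrative labels, not NAIC field names.
COMPONENTS = (
    "executive_summary",
    "board_and_senior_management_attestation",
    "models_and_data_sources_inventory",
    "risk_assessment_framework",
    "model_cards",
    "corporate_governance_structure",
    "model_drift_and_validation",
    "protected_class_inference_and_bias_testing",
    "consumer_complaint_process",
)


def missing_components(draft: dict) -> list:
    """Return components with no content yet, in the draft structure's order."""
    return [name for name in COMPONENTS if not str(draft.get(name, "")).strip()]


# Hypothetical dry run: only two sections have been drafted so far.
dry_run = {
    "executive_summary": "Scope: personal auto and homeowners, 14 states.",
    "corporate_governance_structure": "Model owners report to the CRO.",
}
print(len(missing_components(dry_run)))  # 7 of 9 components still empty
```

A checklist like this only shows which sections are empty. Whether a completed section is adequate is a judgment the form itself does not automate.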
The executive summary and board attestation components introduce an accountability layer the bulletin itself leaves implicit. Under the draft structure, a senior officer signs the compliance report, creating a named accountability chain from the carrier's AI program to a specific executive who vouches for its completeness and accuracy. That is a meaningful structural change from a written policy that sits in a compliance manual. Board attestation of AI governance documentation is already standard in public company proxy disclosures for carriers with exchange-listed securities, as AIG's March 2026 proxy demonstrated with its Global AI Policy, AI Advisory Council, and four-tier governance hierarchy. The compliance report structure would make equivalent attestation a regulatory filing requirement rather than an investor relations choice.
The Four-Exhibit Evaluation Tool and Its Relationship to the Compliance Report
The 12-state AI Systems Evaluation Tool pilot running from March through September 2026 is the regulatory counterpart to the compliance report, and the two instruments serve distinct but complementary functions. The compliance report is what carriers would file proactively, produced by the carrier's own compliance function to demonstrate an adequate AI program. The evaluation tool is what state examiners use during market conduct reviews and financial examinations to assess whether carriers can actually produce the evidence behind those representations. Understanding how the four exhibits map onto the compliance report's nine components clarifies what the full documentation package requires.
Exhibit A of the evaluation tool requires carriers to quantify their AI footprint: how many systems, across which business functions, affecting what volume of decisions. This maps to the models and data sources inventory in the compliance report. The scope question is harder than it sounds. Vendors embed machine learning components in underwriting workbenches, claims triage platforms, fraud detection systems, and customer service tools, and carriers often do not centrally track which of those embed AI under the NAIC definition. The NAIC's tool explicitly covers "vendor-embedded models, automated decision components inside larger platforms, and machine learning features that nobody in the organization categorizes as AI." A carrier using three vendor platforms and two internally developed models may in practice have twelve systems in scope once embedded components are counted.
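The arithmetic behind that last example can be sketched as follows. The platform names, the embedded component names, and the counting convention (each embedded component and each internal model counts as one system, and the host platform is not counted separately) are all hypothetical; the NAIC tool may count differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorPlatform:
    name: str
    embedded_ai_components: tuple  # ML features shipped inside the platform


def systems_in_scope(platforms, internal_models) -> int:
    """Count each embedded component and each internal model as one system."""
    embedded = sum(len(p.embedded_ai_components) for p in platforms)
    return embedded + len(internal_models)


# Hypothetical carrier: three vendor platforms and two internal models.
platforms = [
    VendorPlatform(
        "underwriting_workbench",
        ("risk_score", "document_extraction", "referral_routing", "appetite_match"),
    ),
    VendorPlatform(
        "claims_triage",
        ("severity_prediction", "fraud_flag", "photo_estimating"),
    ),
    VendorPlatform(
        "customer_service",
        ("chat_intent", "call_summarization", "next_best_action"),
    ),
]
internal_models = ["auto_pricing_glm", "homeowners_underwriting_score"]

print(systems_in_scope(platforms, internal_models))  # 4 + 3 + 3 + 2 = 12
```

The point of the sketch is that the count depends on someone having enumerated the embedded components in the first place, which is the part most carriers have not done.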
Exhibit B addresses governance risk assessment, with a flexible format allowing carriers to respond via narrative or checklist. It maps to the compliance report's corporate governance structure, risk assessment framework, and board attestation components. The key distinction the NAIC pilot states have identified in early reviews is between governance infrastructure and governance theater. A policy document describing an AI risk committee, and the committee's quarterly meeting minutes, are not evidence that the committee functionally assesses AI system risk. Evidence of governance infrastructure requires the outputs the committee produces: meeting records that reference specific system performance metrics, escalation records for systems that breached risk thresholds, documented remediation steps for identified gaps. Exhibit B is designed to surface whether carriers have generated those outputs, not merely established the structures that should generate them.
Exhibit C is the most operationally demanding component of the evaluation tool and the one where carrier documentation gaps are most pronounced. It applies to high-risk AI systems: those used in underwriting, pricing, claims handling, fraud detection, and any other function where AI outputs materially affect consumer access to coverage or cost of coverage. For each high-risk system, carriers must produce documentation on model design and architecture, training data composition and vintage, validation procedures and performance metrics, bias testing methodology and results by protected class, and sample case files demonstrating how AI outputs contributed to specific underwriting or claims decisions. Third-party vendor models are fully within Exhibit C scope, with the carrier responsible for producing documentation it must obtain from the vendor. A carrier that purchased a GLM-based pricing model in 2021 and a gradient-boosted underwriting score in 2023 needs Exhibit C documentation for both, including the vendor's bias testing results, regardless of whether the vendor is contractually obligated to produce them.
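One way to picture the Exhibit C exercise is as a gap check per high-risk system. The sketch below is a minimal illustration under stated assumptions: the five documentation items and the four named functions come from the description above, while the record layout, the `exhibit_c_gaps` helper, and the documents listed as on file are hypothetical.

```python
# Functions named above as high-risk. The tool also reaches any other function
# that materially affects access to or cost of coverage, which this set omits.
HIGH_RISK_FUNCTIONS = {"underwriting", "pricing", "claims handling", "fraud detection"}

EXHIBIT_C_ITEMS = (
    "model design and architecture",
    "training data composition and vintage",
    "validation procedures and performance metrics",
    "bias testing methodology and results by protected class",
    "sample case files",
)


def exhibit_c_gaps(systems):
    """Map each high-risk system to the documentation items it cannot produce."""
    gaps = {}
    for system in systems:
        if system["function"] not in HIGH_RISK_FUNCTIONS:
            continue  # out of Exhibit C scope in this simplified sketch
        missing = [i for i in EXHIBIT_C_ITEMS if i not in system["documents_on_file"]]
        if missing:
            gaps[system["name"]] = missing
    return gaps


# Hypothetical holdings for the two vendor models in the example above.
systems = [
    {
        "name": "vendor_pricing_glm_2021",
        "function": "pricing",
        "documents_on_file": {"model design and architecture", "sample case files"},
    },
    {
        "name": "vendor_underwriting_gbm_2023",
        "function": "underwriting",
        "documents_on_file": set(EXHIBIT_C_ITEMS),
    },
]

for name, missing in exhibit_c_gaps(systems).items():
    print(name, "->", len(missing), "items missing")
```

Because vendor models are in scope, the `documents_on_file` set for a purchased model reflects what the carrier has actually obtained from the vendor, not what the vendor holds.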
Exhibit D addresses data details: sources, quality controls, representativeness testing, and discrimination risk assessment at the data layer. It maps to the models and data sources component of the compliance report, specifically the fields requiring disclosure of selection bias in internal datasets and design constraints in external data purchases. The NAIC has identified aerial imagery, social media data, and purchasing behavior data as categories warranting explicit discrimination risk analysis at the data layer, separate from the bias testing of model outputs required under Exhibit C.
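The exhibit-to-component relationships described above can be collected into a small lookup, which is one way a compliance team might see which report sections an examiner's exhibit request will test. This is a sketch under stated assumptions: the identifiers are illustrative labels, and the Exhibit C entry is left empty because no mapping for it is given here.

```python
# Exhibit-to-component mapping as described in the text.
# Component identifiers are illustrative labels, not NAIC field names.
EXHIBIT_TO_COMPONENTS = {
    "A": ("models_and_data_sources_inventory",),
    "B": (
        "corporate_governance_structure",
        "risk_assessment_framework",
        "board_and_senior_management_attestation",
    ),
    "C": (),  # no mapping is stated for Exhibit C; deliberately left unfilled
    "D": ("models_and_data_sources_inventory",),
}


def exhibits_for(component: str) -> list:
    """Return the exhibits whose requests touch a given report component."""
    return sorted(
        exhibit
        for exhibit, components in EXHIBIT_TO_COMPONENTS.items()
        if component in components
    )


print(exhibits_for("models_and_data_sources_inventory"))  # ['A', 'D']
```

Read in this direction, the inventory section is the one component that two separate exhibits draw on, which is consistent with the emphasis on building the model registry first.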
Market Conduct Exams and the Scope for Follow-Up Examination
The 12 pilot states are deploying the evaluation tool across market conduct examinations, financial examinations, and financial analysis contexts, rather than creating a new standalone AI examination track. That multi-track deployment is significant because it means AI governance documentation can surface in exams that carriers already undergo for other reasons. A property and casualty market conduct exam triggered by complaint volume in personal auto could now incorporate Exhibit C requests for the carrier's pricing and underwriting AI systems. A life insurance financial examination focused on reserve adequacy and RBC ratios could include Exhibit B and D requests for the carrier's underwriting and claims triage models.
Whether the tool functions as an information-gathering instrument or a compliance enforcement framework depends on how each state's examination process handles the findings. The pilot states have not committed to a uniform outcome framework: some have indicated that examination findings from the tool will inform risk profiling for future examination scheduling, while others have reserved the right to treat material documentation deficiencies as examination findings requiring corrective action. That ambiguity is deliberate while the pilot runs. The NAIC will collect pilot data through September 2026, update the tool in October, expose it for public comment, and bring it to a vote at the Fall National Meeting in November 2026. The version adopted in November will include clearer guidance on how findings translate to examination outcomes, based on what the 12 pilot states learn over the next three months.
The pilot also tests whether the tool can be applied consistently across the insurance market without creating disproportionate burden on smaller carriers. The "principle of proportionality" built into the tool's design directs examiners to prioritize high-risk AI systems and scale documentation expectations to the carrier's size and AI deployment complexity. A regional personal lines carrier using a vendor's credit-based insurance score and a rules-based claims routing system faces a very different documentation burden than a national multiline carrier with internally developed ML models across pricing, underwriting, fraud, and claims. The pilot will produce data on whether proportionality, in practice, ends up confining documentation requirements to the largest carriers while leaving AI governance at smaller insurers unexamined.
What Leading Reinsurers Already Ask in AI Governance Questionnaires
Set the draft compliance report categories circulating ahead of the July 22 working group meeting beside the AI governance questionnaires that leading reinsurers have built into cedant due diligence over the past 18 months, and a striking parallel emerges. The fields the NAIC compliance report would require carriers to complete in the model cards section, the bias testing section, and the governance attestation section closely mirror what Munich Re, Swiss Re, and Gen Re now request when reviewing cedant AI programs for treaty placement and pricing purposes.
Reinsurers have commercial incentives the NAIC does not: they bear a share of the adverse development that flows from underwriting model errors, and they need to price that risk. The governance questionnaires they have developed ask cedants to identify their AI systems by business function, document the training data vintage and refreshment cadence for each system, produce bias testing results segmented by protected class and geography, and describe the human review protocols that override or qualify model outputs in the underwriting and claims workflow. A carrier whose reinsurance treaty requires annual AI governance certification is already building much of the documentation infrastructure the NAIC compliance report would require. The compliance report form standardizes that infrastructure across the market, rather than leaving it as a bespoke arrangement between each carrier and its reinsurance panel.
The parallel matters because it signals where documentation norms are converging independently of the regulatory calendar. Carriers that have negotiated AI governance disclosure into their reinsurance treaties have a head start on the compliance report framework. Carriers that have not should expect both the regulatory and the commercial pressure to arrive on roughly the same timeline.
Carrier Documentation Gaps: Where the Compliance Report Finds Friction
The gap between what the draft compliance report would require and what most carriers can currently produce has three distinct dimensions, each requiring a different operational response.
The first is the model inventory gap. Exhibit A and the compliance report's models and data sources section both require a comprehensive accounting of every AI system in scope. Most carriers do not maintain a centralized registry of their AI and machine learning systems that captures vendor-embedded components alongside internally developed models. Without a registry, the carrier cannot answer Exhibit A accurately, cannot establish which systems are high-risk under Exhibit C, and cannot attest in the executive summary that the compliance report covers the full scope of the carrier's AI program. Building the inventory requires engagement from actuarial, IT, underwriting, claims, and legal functions simultaneously, because AI systems in each functional area are often managed by different teams with no central ownership. This is a six-to-twelve-month build for most mid-sized carriers starting from scratch.
The second gap is vendor contract coverage. Many carrier-vendor contracts executed before 2024 predate the NAIC evaluation tool's documentation requirements and include no provisions for the carrier to access model documentation, bias testing results, training data composition, or performance data segmented by demographic group. When Exhibit C asks about a vendor's pricing model, the carrier needs the vendor's documentation. When the vendor contract does not require the vendor to provide it, the carrier has no compliance path other than contract renegotiation. The early pilot examinations have surfaced this issue consistently: carriers cite "vendor" as the reason they cannot produce Exhibit C documentation, and the pilot states have made clear that vendor relationships are the carrier's responsibility to manage, not a basis for examination deference.
The third gap is evidence generation versus policy maintenance. The compliance report's board attestation, model drift section, governance structure narrative, and consumer complaint disclosure all require continuous evidence generation rather than static policy documentation. A carrier can draft an AI risk management policy in a day. Producing the evidence that the policy is operationally active, meaning that the governance committee reviewed a specific model's drift metrics in March and documented a remediation decision in April, requires months of organizational behavior change before the documentation exists. Carriers that begin building continuous evidence generation practices now, before the compliance report form is finalized in 2027, will file with documentation that reflects genuine governance activity. Carriers that wait until the form is issued and then produce the documentation retroactively will have a harder time presenting it as credible governance infrastructure rather than examination preparation.
Timeline and the Path to 2027 Rollout
The NAIC's calendar from here is concrete. The 12-state pilot runs through September 2026. NAIC staff will update the evaluation tool based on pilot feedback in September and October, then re-expose the revised tool for public comment. The BDAI Working Group's July 22 public meeting, the one where the actuarial panel on AI governance trends and the compliance report structure update are both on the agenda, is the last major working group session before the fall national meeting cycle begins. The revised tool goes to a vote at the NAIC Fall National Meeting in November 2026, with formal adoption expected to land the tool in the hands of all state insurance departments for use in 2027 market conduct and financial examinations.
The compliance report structure follows a parallel but slightly lagged track. The draft circulated at the Spring 2026 meeting is a working document, not an exposure draft. It will be refined through the summer based on working group input and pilot findings, exposed for public comment in the fall alongside the revised evaluation tool, and likely finalized in the first half of 2027 for implementation across the 24-plus adopting states. The first carriers required to file a structured AI compliance report under a standardized template are almost certainly looking at a 2027 requirement in the pilot states, with broader rollout into 2028 as additional states adopt both the Model Bulletin and the compliance report structure.
That timeline provides a roughly 18-month window between now and widespread compliance report filing requirements. Carriers should not treat this as free time. Model inventories take six to twelve months to build accurately. Vendor contract renegotiations, particularly for multi-year vendor arrangements with embedded AI components, run on multi-quarter timelines. Bias testing infrastructure, especially for carriers using gradient-boosted tree models or neural networks where disparate impact testing requires output-based demographic analysis rather than variable-level review, requires model pipeline changes that actuarial, IT, and governance teams must coordinate. None of these workstreams compresses well under filing pressure.
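To make "output-based demographic analysis" concrete, the sketch below compares favorable-outcome rates across groups using model outputs alone, without inspecting any input variable. It is a minimal illustration, not the NAIC's or any state's prescribed methodology: the group labels are assumed to be available (in practice they are often inferred), the decisions are invented, and the 0.8 screening value is an illustrative assumption rather than an insurance regulatory standard.

```python
from collections import defaultdict


def favorable_rates(decisions):
    """decisions: iterable of (group, favorable) pairs built from model outputs."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, is_favorable in decisions:
        totals[group] += 1
        favorable[group] += 1 if is_favorable else 0
    return {group: favorable[group] / totals[group] for group in totals}


def impact_ratios(rates, reference_group):
    """Each group's favorable-outcome rate divided by the reference group's rate."""
    reference_rate = rates[reference_group]
    if reference_rate == 0:
        raise ValueError("reference group has no favorable outcomes")
    return {group: rate / reference_rate for group, rate in rates.items()}


# Hypothetical outputs: 8 of 10 favorable for group_a, 6 of 10 for group_b.
decisions = (
    [("group_a", True)] * 8
    + [("group_a", False)] * 2
    + [("group_b", True)] * 6
    + [("group_b", False)] * 4
)

REVIEW_THRESHOLD = 0.8  # illustrative screening value, not a regulatory standard

ratios = impact_ratios(favorable_rates(decisions), reference_group="group_a")
flagged = sorted(group for group, ratio in ratios.items() if ratio < REVIEW_THRESHOLD)
print(flagged)  # ['group_b']
```

The calculation is simple. The pipeline work lies in capturing model outputs alongside group labels for every decision, which is why this workstream needs actuarial, IT, and governance teams working together.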
Why the Compliance Report Matters More Than the Model Law Debate
Most trade press coverage of NAIC AI regulatory developments has focused on whether the working group will escalate from a principles-based Model Bulletin to an enforceable Model Law, a step that would require state legislative adoption and impose statutory penalties for noncompliance. That question is genuinely open: the 33 comment letters on the NAIC's 2025 Model Law request for information revealed deep industry disagreement on scope, vendor liability, and thresholds for what constitutes a covered AI system. A Model Law, if it advances, is several years from widespread state adoption.
The compliance report structure is the nearer-term and more operationally significant development. It does not require a Model Law. It does not require new state legislation. It operates within the existing examination authority that state insurance departments already have under adopted versions of the Model Bulletin and existing market conduct statute. A state insurance department that has adopted the Model Bulletin can, during a market conduct examination, request that a carrier complete the compliance report form as evidence of its AI program's adequacy. Once the NAIC standardizes the form, states will use it whether or not a Model Law passes, because the form gives examiners a structured way to conduct AI governance reviews that commissioners and market conduct analysts can deploy without building their own frameworks from scratch.
The compliance report also functions as a compliance mapping tool that carriers can use internally before any regulator requests it. Walking through the nine components, including the model inventory, the governance attestation, the bias testing section, and the vendor documentation fields, surfaces the gaps in a carrier's current AI program more systematically than an internal audit can. Carriers that complete a dry run of the compliance report against the draft structure circulated at the Spring 2026 meeting will identify their documentation gaps 12 to 18 months before those gaps become an examination finding. That is the actuarially sound approach to AI governance compliance: model the exposure before the loss event, not after.
Further Reading
- NAIC Weighs Jump From AI Bulletin to Enforceable Model Law
- NAIC Four-Tier AI Risk Taxonomy Redefines Insurer Compliance
- NAIC Pilot Tests AI Model Scrutiny in Rate Filings Across 12 States
- NAIC Proposes Third-Party AI Vendor Registry for Insurers
- State AI Laws Now Set Bias Audit Rules for Insurer Models
Sources
- NAIC Big Data and Artificial Intelligence (H) Working Group Committee Page
- NAIC AI Systems Evaluation Tool Pilot Project Summary (PDF)
- Mayer Brown: NAIC Spring 2026 National Meeting Highlights, Innovation and Technology Committee (April 2026)
- Fenwick: NAIC Expands AI Systems Evaluation Tool Pilot Program to 12 States (2026)
- Foley & Lardner: What To Do If You Receive an NAIC AI Systems Evaluation Tool Pilot Request
- Swept AI: NAIC AI Evaluation Tool 12-State Pilot Is Live (2026)
- Quarles: Nearly Half of States Have Now Adopted NAIC Model Bulletin on Insurers' Use of AI
- NAIC: Implementation of Model Bulletin State Map