Our review of every BDAI Working Group exposure document since the Model Bulletin’s 2023 adoption shows that the risk taxonomy presented at the Spring 2026 National Meeting in San Diego marks the first time NAIC staff have proposed concrete tier definitions with operational compliance expectations. Previous guidance, from the 2020 AI Principles through the December 2023 Model Bulletin, relied on principles-based language. The four-tier taxonomy introduced at the March 24 session of the Big Data and Artificial Intelligence (H) Working Group represents a shift toward prescriptive, risk-proportionate oversight.
This matters for several reasons. The Model Bulletin has been adopted by 24 states and the District of Columbia, with four states layering additional insurance-specific AI regulations on top. A 12-state evaluation tool pilot launched March 2, 2026, and runs through September. The BDAI Working Group has opened a 45-day comment period on a request for information regarding a potential NAIC Model Law on the Use of Artificial Intelligence in the Insurance Industry. Taken together, these developments suggest the taxonomy will not remain conceptual for long.
No other actuarial outlet has yet mapped each risk tier to specific insurance AI use cases, modeled what the compliance report structure looks like in practice for a mid-size carrier, or analyzed how the NAIC taxonomy compares to the EU AI Act’s tiered approach. That is what this article does.
The Four Risk Tiers: What NAIC Staff Proposed
The NAIC’s Senior Behavioral Data Scientist and Actuary presented a risk taxonomy with four distinct levels during the BDAI Working Group’s discussion on operationalizing the Model Bulletin. The taxonomy is designed to help regulators prioritize oversight by concentrating examination resources on high-risk AI applications while applying lighter scrutiny to low-risk, back-office systems.
| Risk Tier | NAIC Definition | Insurance Use Cases | Regulatory Treatment |
|---|---|---|---|
| Unacceptable | AI systems using subliminal manipulation or general social scoring | Systems exploiting behavioral biases to increase premium acceptance; social-credit-style scoring that conditions coverage on non-insurance behavior | Prohibited outright |
| High | Potential for significant harm if failure or misuse occurs | Automated underwriting triage with binding authority; claims denial algorithms; pricing models that set final rates without human review; fraud detection systems that trigger coverage rescission | Full compliance report, AI model card, bias testing, continuous monitoring, human-in-the-loop requirements |
| Medium | Requires transparency and user disclosure of AI interaction | Customer-facing chatbots; sentiment analysis on claims calls; emotion recognition in recorded interactions; AI-assisted (not AI-decided) claims triage | Transparency disclosure, periodic review, consumer complaint monitoring |
| Low | Minimal restrictions; deployable without enhanced oversight | Spam filters; internal document search; scheduling and workflow automation; data extraction from policy forms | Inventory tracking only |
The taxonomy draws a clear line between AI systems that make or materially influence coverage, pricing, and claims decisions (high-risk) and those that support internal operations without directly affecting policyholders (low-risk). The medium tier captures a growing category of customer-facing AI tools that do not make final decisions but shape the consumer experience in ways that require transparency.
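To make the tier logic tangible, here is a minimal classification sketch in Python. The decision flags (`uses_social_scoring`, `decides_or_influences_outcomes`, and so on) are our own shorthand for the prose definitions above; the NAIC has not published a formal decision rule.

```python
from enum import Enum

class RiskTier(Enum):
    UNACCEPTABLE = "unacceptable"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"

def classify_system(
    uses_subliminal_manipulation: bool,
    uses_social_scoring: bool,
    decides_or_influences_outcomes: bool,  # coverage, pricing, or claims decisions
    consumer_facing: bool,
) -> RiskTier:
    """Map a system's decision footprint to a proposed NAIC tier.

    Illustrative only: the taxonomy is stated in prose, and regulators
    have not reduced it to a boolean rule.
    """
    if uses_subliminal_manipulation or uses_social_scoring:
        return RiskTier.UNACCEPTABLE
    if decides_or_influences_outcomes:
        return RiskTier.HIGH
    if consumer_facing:
        return RiskTier.MEDIUM
    return RiskTier.LOW
```

The order of the checks matters: a consumer-facing system that also influences claim outcomes lands in the high tier, which is precisely the boundary question the Travelers claims assistant raises below.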
Mapping the Tiers to Live Insurance AI Deployments
The taxonomy becomes concrete when applied to the AI systems carriers are already running in production. Patterns we have seen across carrier earnings calls, patent filings, and regulatory disclosures over the past year illustrate where each tier lands.
High-risk examples in production. AIG’s underwriting platform, built on Palantir Foundry, processes over 370,000 excess and surplus submissions annually through its Lexington Insurance unit, with a target of 500,000 by 2030. The system uses LLM agents to analyze risk, build ontologies, and support underwriting decisions across a $300 million delegated authority portfolio at Lloyd’s Syndicate 2479. Under the NAIC taxonomy, any component of that pipeline that influences binding decisions or sets pricing parameters falls squarely in the high-risk tier, triggering full compliance report and model card requirements.
Similarly, predictive models used for claims severity scoring, subrogation identification, and fraud detection at carriers like Travelers, Progressive, and Allstate operate in the high-risk tier when their outputs directly influence claim payments or coverage determinations. Travelers’ deployment of nearly 10,000 AI-equipped staff via its Anthropic partnership includes analytics tools that feed into underwriting and claims workflows.
Medium-risk examples. In February 2026, Travelers launched an agentic AI claims assistant, developed with OpenAI, that handles first-notice-of-loss calls for auto damage. The system consults policies, guides customers through decisions, and escalates to live agents. Under the taxonomy, this sits at the boundary between medium and high risk. It interacts directly with consumers (medium-tier transparency requirement) but also makes real-time characterizations about damage and policy provisions that could influence claim outcomes (high-tier decision impact). How regulators draw that line will be one of the most consequential interpretive questions as the taxonomy moves toward adoption.
Low-risk examples. Internal document search tools, email routing systems, and scheduling automation fall clearly in the low tier. These systems do not interact with policyholders and do not influence coverage, pricing, or claims decisions. Under the proposed framework, carriers would need only to include them in their AI system inventory (Exhibit A of the evaluation tool) without additional compliance documentation.
The Model Compliance Report Structure
Alongside the risk taxonomy, NAIC staff proposed a standardized compliance report structure that carriers would use to document their AI governance. According to the Mayer Brown summary of the Spring 2026 meeting, the compliance report covers five core areas (a structural sketch follows below):
- Executive summaries and management oversight. Carriers must document their AI governance committee structure, reporting lines, and the senior leader accountable for the AI program. This goes beyond the Model Bulletin’s general requirement to designate a responsible person; it asks for the full organizational chart of AI oversight.
- Internal and external data source documentation. Every data source feeding an AI system must be cataloged, with provenance, quality controls, and representativeness assessments. For high-risk systems, this includes documenting whether data sources contain proxies for protected characteristics.
- Model drift and validation techniques. Carriers must describe how they monitor model performance over time, detect distributional drift, and trigger revalidation. For traditional GLMs, this is established practice. For machine learning systems that retrain on new data, the monitoring cadence and drift thresholds must be explicitly documented.
- Protected class inference and bias testing. The report requires documentation of how the carrier tests for disparate impact across protected classes, including methodology, frequency, and remediation procedures when bias is detected. This section formalizes what Colorado’s SB 21-169 already requires of insurers in that state, but extends it as a national reporting standard.
- Consumer complaint procedures. Carriers must document how consumers can challenge AI-influenced decisions, how those challenges are investigated, and what remediation is available.
For a mid-size carrier running, say, 15 to 25 AI systems across underwriting, claims, and customer service, completing this compliance report means inventorying every system, classifying each by risk tier, and producing detailed documentation for every high-risk application. The effort is not trivial. Carriers that have been treating the Model Bulletin as a principles-based exercise rather than a documentation exercise will need to substantially expand their compliance infrastructure.
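A minimal structural sketch of that carrier-level deliverable, with section names paraphrasing the five areas above (the NAIC has not published a formal schema):

```python
from dataclasses import dataclass, field

@dataclass
class ComplianceReport:
    """Skeleton of the five-section report. All field names are illustrative."""
    oversight: str                     # governance committee, reporting lines, accountable leader
    data_sources: list[dict] = field(default_factory=list)  # provenance, quality, representativeness
    drift_and_validation: str = ""     # monitoring cadence, drift thresholds, revalidation triggers
    bias_testing: str = ""             # methodology, frequency, remediation procedures
    complaint_procedures: str = ""     # challenge, investigation, and remediation pathways
```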
AI Model Cards: The Standardized Reporting Tool
The compliance report is the carrier-level document. The AI model card is the system-level document. Think of it as a nutrition label for an individual AI system: a standardized summary that tells regulators what the system does, how it was built, what data it uses, and what risks it poses.
The concept of model cards originated in academic research by Mitchell et al. (2019) and has been adopted by organizations including Google and Hugging Face for public model documentation. The NAIC’s proposal adapts the concept for regulatory reporting in insurance, adding fields specific to actuarial and consumer protection concerns.
Based on the compliance report structure and the evaluation tool’s Exhibit C (which focuses on high-risk AI systems), a model card for a high-risk insurance AI system would need to include the following (a structured sketch follows the list):
- System identification. Name, version, deployment date, business function served, and the designated responsible person.
- Intended use and scope. What decisions the system makes or influences, the lines of business affected, and the volume of decisions processed.
- Training data summary. Data sources, time period, representativeness assessment, and any known limitations or gaps.
- Performance metrics. Accuracy, precision, recall, or other relevant metrics, benchmarked against the system’s intended use case. For pricing models, this might include lift curves and Gini coefficients (a worked sketch appears below). For claims triage models, it might include false-positive and false-negative rates for fraud flagging.
- Bias testing results. Disparate impact analysis across protected classes, testing methodology, and any identified disparities with remediation actions.
- Monitoring and update cadence. How often the system is reviewed, what drift detection mechanisms are in place, and the revalidation trigger criteria.
- Third-party components. Vendor-supplied models, datasets, or infrastructure used in the system, with references to vendor governance documentation.
- Limitations and known failure modes. Conditions under which the system may produce unreliable outputs, edge cases, and compensating controls.
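Assembled as structured data, a card built from the fields above might look like the following sketch. Every field name and value is illustrative; the NAIC has not published a card template.

```python
# Illustrative model card for a hypothetical homeowners pricing model.
# Field names follow the list above; none are an official NAIC schema.
model_card = {
    "system_identification": {
        "name": "HO Pricing GBM",
        "version": "3.2",
        "deployment_date": "2025-07-01",
        "business_function": "homeowners rate setting",
        "responsible_person": "VP, Pricing Analytics",
    },
    "intended_use": "Sets indicated rate relativities for HO-3 policies in six states.",
    "training_data": {
        "sources": ["internal policy/claims history 2018-2024", "third-party property data"],
        "known_limitations": "Sparse experience for coastal counties.",
    },
    "performance_metrics": {"gini": 0.41, "lift_top_decile": 2.3},
    "bias_testing": {
        "method": "adverse impact ratio by proxy-inferred protected class",
        "last_run": "2026-01-15",
        "findings": "No ratio below 0.80 threshold; see remediation log.",
    },
    "monitoring": {"drift_metric": "PSI", "review_cadence": "quarterly"},
    "third_party_components": ["vendor geocoding service"],
    "known_failure_modes": ["unreliable for unusual construction types"],
}
```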
For actuaries, the model card requirement has direct ASOP implications. ASOP No. 56 (Modeling) already requires actuaries to understand the model, its intended purpose, and its limitations. A standardized model card would serve as the primary documentation artifact for demonstrating that understanding to both regulators and the Actuarial Board for Counseling and Discipline.
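Since the performance-metrics field above cites Gini coefficients, a worked sketch helps. This uses a standard Lorenz-curve formulation on simulated data; it is one common definition, not an NAIC-prescribed metric.

```python
import numpy as np

def gini(actual_losses: np.ndarray, predicted_risk: np.ndarray) -> float:
    """Gini coefficient for a pricing model: twice the area between the
    Lorenz curve of actual losses (policies ordered from least to most
    risky by the model) and the 45-degree line."""
    order = np.argsort(predicted_risk)
    cum_losses = np.cumsum(actual_losses[order]) / actual_losses.sum()
    cum_exposure = np.arange(1, len(order) + 1) / len(order)
    # Rectangle-rule approximation of the area between diagonal and curve.
    return float(2 * np.mean(cum_exposure - cum_losses))

rng = np.random.default_rng(0)
predicted = rng.gamma(2.0, 500.0, size=10_000)    # predicted pure premium
actual = rng.poisson(predicted).astype(float)     # simulated correlated losses
print(f"Gini: {gini(actual, predicted):.3f}")     # better segmentation -> higher value
```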
The 12-State Evaluation Tool Pilot: Testing the Framework in Practice
The risk taxonomy and compliance report structure are conceptual. The AI Systems Evaluation Tool pilot is the operational testing ground. Launched March 2, 2026, the pilot runs through September 2026 across 12 states: California, Colorado, Connecticut, Florida, Iowa, Louisiana, Maryland, Pennsylvania, Rhode Island, Vermont, Virginia, and Wisconsin.
The evaluation tool uses four exhibits that align with the proposed compliance framework (an illustrative inventory sketch follows the table):
| Exhibit | Focus Area | What Regulators Are Looking For |
|---|---|---|
| A: AI Usage Inventory | Quantify AI deployment across the organization | Total number of AI systems, functions affected, consumer complaints received, future AI plans. Includes vendor-embedded models and ML features that carriers may not formally categorize as “AI” |
| B: Governance Risk Assessment | Evaluate oversight structures and risk management | Committee structures, accountability chains, documentation practices, policy frameworks. A quarterly committee meeting and a policy document alone are insufficient |
| C: High-Risk AI System Details | Deep dive into systems affecting underwriting, claims, and pricing | Model design, training data, validation procedures, performance metrics, and bias testing for each high-risk system |
| D: AI Data Details | Data sources, quality, and discrimination risk | Rate-setting data provenance, proxies for protected characteristics, social media data usage, aerial imagery applications, and data quality controls |
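Exhibit A is essentially a census, so carriers may find it useful to assemble something like the following internally before responding. The field names and the vendor name are assumptions; the pilot’s full template is not public.

```python
from collections import Counter

# Illustrative Exhibit A-style rows; keys and "Acme AI" are hypothetical.
# Vendor-embedded models and unlabeled ML features belong here too.
inventory = [
    {"system": "HO pricing model", "function": "pricing",          "tier": "high",   "vendor": None},
    {"system": "Fraud detection",  "function": "claims",           "tier": "high",   "vendor": "Acme AI"},
    {"system": "Service chatbot",  "function": "customer service", "tier": "medium", "vendor": "Acme AI"},
    {"system": "Document search",  "function": "back office",      "tier": "low",    "vendor": None},
]
print(Counter(row["tier"] for row in inventory))
# Counter({'high': 2, 'medium': 1, 'low': 1})
```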
Participating states are sending information requests to selected carriers and holding weekly coordination calls. Companies chosen for the pilot span different product lines, sizes, and organizational types. The pilot’s stated purpose is to test whether the evaluation tool helps regulators understand insurer AI governance and improves their ability to assess AI implementation during market conduct exams, financial exams, and general regulatory inquiries.
States are prioritizing high-risk AI systems for examination, consistent with the proposed taxonomy. As Swept AI’s analysis of the pilot notes, regulators focus on “high-risk AI systems that could cause serious consumer or financial issues, while paying less attention to low-risk back-office systems.” The risk taxonomy gives regulators the classification framework to operationalize that prioritization.
For a detailed analysis of how the pilot is unfolding, including the joint industry letter objecting to the pilot structure and the resource disparities between large and small carriers, see our coverage of the 12-state pilot and industry pushback.
From Bulletin to Model Law: The Legislative Path
The Model Bulletin adopted in December 2023 is guidance, not law. It provides a framework that state insurance departments can issue as a bulletin, but it carries no independent enforcement mechanism beyond whatever authority a state’s existing statutes provide. The four-tier risk taxonomy gains teeth only if it is codified in a model law or regulation that states can adopt with statutory force.
The BDAI Working Group has taken a concrete step in that direction. It has exposed a request for information on a potential NAIC Model Law on the Use of Artificial Intelligence in the Insurance Industry, with a 45-day comment period for stakeholder submissions. This follows the 33 comment letters the Working Group received in response to its earlier RFI on the model law concept, which revealed deep fault lines around scope, vendor liability, and company-size thresholds.
The proposed model law would likely incorporate the risk taxonomy as the organizing principle for regulatory requirements, with escalating obligations at each tier. Based on the Spring 2026 discussion and the compliance report structure, the legislative framework would include:
- Mandatory AI system inventory. All carriers using AI would be required to maintain a current inventory of deployed systems, classified by risk tier.
- Tiered compliance obligations. High-risk systems would trigger full compliance reporting, model cards, and periodic regulatory examination. Medium-risk systems would require transparency disclosures and periodic review. Low-risk systems would need only inventory tracking.
- Third-party vendor accountability. Consistent with the NAIC’s parallel work on a third-party AI vendor registry, the model law would likely require carriers to document vendor-supplied AI components and maintain governance over those components regardless of the vendor’s own compliance posture.
- Consumer protection provisions. Requirements for consumer notice when AI influences coverage or claims decisions, complaint procedures, and remediation pathways.
- Examination authority. Explicit statutory authority for state insurance departments to examine carrier AI systems, request model cards and compliance reports, and take enforcement action for non-compliance.
The timeline from RFI to adopted model law is measured in NAIC meeting cycles. The earliest a model law draft could be exposed for public comment would be the Summer 2026 National Meeting, with potential adoption no earlier than the Spring 2027 meeting. Individual state adoption would follow on its own timeline, as it has for the Model Bulletin. For context, the Model Bulletin took roughly 14 months from adoption to reach 24 states.
NAIC Taxonomy vs. EU AI Act: Convergence and Divergence
The NAIC’s four-tier approach mirrors the EU AI Act’s risk classification in structure but differs in important ways that matter for global carriers operating in both jurisdictions.
| Dimension | NAIC Taxonomy | EU AI Act |
|---|---|---|
| Number of tiers | Four (unacceptable, high, medium, low) | Four (unacceptable, high, limited, minimal) |
| Scope | Insurance-specific | Cross-sector, all industries |
| Insurance classification | Risk-based by use case (underwriting, claims, chatbots) | Life and health insurance AI explicitly classified as high-risk under Annex III; P&C classification depends on use case |
| Enforcement mechanism | State-level adoption of model law; examination authority | EU-wide regulation with member state market surveillance authorities |
| Documentation standard | Model compliance report + AI model cards | Technical documentation, conformity assessment, EU database registration |
| Unacceptable tier | Subliminal manipulation, social scoring | Broader: includes real-time biometric identification, emotion recognition in workplaces/education, predictive policing |
| Effective dates | Pilot through September 2026; model law adoption 2027 at earliest | Phased: prohibited AI banned February 2025; high-risk obligations August 2026 |
The convergence in structure is not accidental. NAIC staff have been tracking the EU AI Act’s development since its proposal in 2021, and the tiered approach reflects a growing international consensus that risk-proportionate regulation is more effective than one-size-fits-all requirements. For global carriers like AIG, Zurich, and Allianz that operate in both U.S. and European markets, the structural similarity means a single AI governance infrastructure can serve both regulatory regimes with adaptations at the documentation level.
The key divergence is in insurance-specific classification. The EU AI Act explicitly classifies “AI systems intended to be used for risk assessment and pricing in relation to natural persons in the case of life and health insurance” as high-risk under Annex III. The NAIC taxonomy is broader in scope: it covers all insurance lines and classifies by the nature of the AI decision (binding authority, consumer interaction, back-office support) rather than by line of business. A P&C pricing model would clearly be high-risk under the NAIC taxonomy but might not fall under the EU Act’s specific Annex III provision.
What Compliance Looks Like for a Mid-Size Carrier
To make the framework concrete, consider a regional P&C carrier with $800 million in written premium, operating in six states, three of which are in the evaluation tool pilot. The carrier uses AI in four areas: a predictive model for homeowners pricing, a machine learning system for claims fraud detection, a customer service chatbot, and an internal document processing tool.
Step 1: Inventory and classify. Under the taxonomy, the carrier maps its four AI systems to risk tiers, as the sketch after this list makes concrete:
- Homeowners pricing model: High risk (directly sets rates)
- Claims fraud detection: High risk (triggers coverage investigations and potential rescission)
- Customer service chatbot: Medium risk (customer-facing, requires transparency disclosure)
- Document processing tool: Low risk (internal operations only)
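Reusing the illustrative `classify_system` function from the taxonomy section above (the same hypothetical flags, not an official decision rule), Step 1 reduces to a few documented judgment calls:

```python
# Flags: subliminal manipulation, social scoring, decides/influences
# outcomes, consumer-facing. Each value is a judgment call to document.
systems = {
    "homeowners_pricing":  classify_system(False, False, True,  False),
    "fraud_detection":     classify_system(False, False, True,  False),
    "service_chatbot":     classify_system(False, False, False, True),
    "document_processing": classify_system(False, False, False, False),
}
for name, tier in systems.items():
    print(f"{name}: {tier.value}")
# homeowners_pricing: high
# fraud_detection: high
# service_chatbot: medium
# document_processing: low
```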
Step 2: Produce model cards for high-risk systems. The carrier needs two detailed model cards: one for the pricing model and one for the fraud detection system. Each model card documents the system’s purpose, data sources, training methodology, performance metrics, bias testing results, monitoring cadence, vendor components, and known limitations. For a vendor-supplied fraud detection model, this requires getting documentation from the vendor that many carriers have not historically demanded.
Step 3: Complete the compliance report. The carrier assembles the five-section compliance report covering management oversight, data documentation, model drift monitoring, bias testing, and consumer complaint procedures. The two high-risk systems require detailed treatment in each section. The chatbot requires a transparency section. The document processing tool requires only its inventory entry.
Step 4: Respond to the evaluation tool. If the carrier operates in a pilot state, it may receive a regulatory information request structured around the four exhibits. Exhibit A captures the full AI inventory. Exhibit B assesses the governance framework. Exhibit C asks for detailed information on each high-risk system. Exhibit D examines the data sources across all systems.
Step 5: Establish ongoing monitoring. The compliance report is not a one-time exercise. High-risk systems require continuous monitoring for drift, periodic bias retesting, and documentation updates when models are retrained or data sources change. The carrier needs to build this into its operational cadence, not treat it as an annual filing exercise.
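In code, Step 5 can start as simply as a scheduled drift statistic. A minimal sketch using the population stability index, a common industry choice rather than anything the NAIC mandates:

```python
import numpy as np

def psi(baseline: np.ndarray, current: np.ndarray, n_bins: int = 10) -> float:
    """Population Stability Index between the score distribution at
    validation (baseline) and in production (current). A common rule of
    thumb: < 0.10 stable, 0.10-0.25 monitor, > 0.25 investigate."""
    edges = np.quantile(baseline, np.linspace(0, 1, n_bins + 1))
    current = np.clip(current, edges[0], edges[-1])   # keep scores in range
    b_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    c_pct = np.histogram(current, bins=edges)[0] / len(current)
    b_pct, c_pct = np.clip(b_pct, 1e-6, None), np.clip(c_pct, 1e-6, None)
    return float(np.sum((c_pct - b_pct) * np.log(c_pct / b_pct)))

rng = np.random.default_rng(7)
baseline = rng.normal(0.0, 1.0, 50_000)       # fraud scores at validation
current = rng.normal(0.5, 1.0, 50_000)        # scores after a mean shift
print(f"PSI: {psi(baseline, current):.3f}")   # near the 0.25 "investigate" threshold
```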
For carriers that have invested in structured AI governance programs since the Model Bulletin’s adoption, much of this infrastructure may already exist. For those that treated the bulletin as a checkbox exercise rather than an operational program, the gap between current practice and the taxonomy’s requirements is substantial.
Practical Steps Insurers Should Take Now
The taxonomy is not yet binding, but the direction of travel is clear. Carriers that wait for formal adoption will be scrambling. Those that begin now will have a compliance advantage when the model law arrives. Based on the Spring 2026 proposals and the evaluation tool pilot’s requirements, insurers should consider the following steps:
- Complete a full AI system inventory. Identify every AI and ML system in production, including vendor-embedded models that may not carry the “AI” label internally. Include systems used in underwriting, pricing, claims, customer service, and back-office operations. This inventory maps directly to Exhibit A of the evaluation tool.
- Classify each system by risk tier. Apply the four-tier taxonomy to every system in the inventory. Pay particular attention to systems at tier boundaries, especially customer-facing tools that also influence coverage or claims decisions. Document the rationale for each classification.
- Begin drafting model cards for high-risk systems. Start with systems that have the most direct impact on policyholders: pricing models, underwriting triage, claims decision systems, and fraud detection. For vendor-supplied systems, begin requesting the documentation needed to complete the model card. If vendors cannot provide it, that gap is itself a finding that needs to be addressed.
- Audit bias testing practices. Evaluate whether current bias testing covers all protected classes, uses appropriate methodologies, runs at sufficient frequency, and documents remediation actions. The compliance report’s “Protected Class Inference and Bias Testing” section will require specific evidence, not just policy statements (a minimal computation sketch follows this list).
- Review third-party vendor contracts. Ensure contracts provide the audit rights, documentation access, and incident notification requirements that the compliance report and model card will demand. The NAIC has been clear that “we bought it from a vendor” does not relieve carriers of their compliance obligations. For carriers relying on vendor-supplied AI systems, the proposed vendor registry framework adds additional dimensions to vendor management.
- Build monitoring infrastructure. Establish drift detection, performance monitoring, and escalation triggers for high-risk systems. The compliance report requires documentation of monitoring cadence and revalidation criteria, so this cannot be an ad hoc process.
- Engage the comment process. The 45-day comment period on the model law RFI is an opportunity to shape the final framework. Carriers, actuarial firms, and professional organizations should submit substantive comments on scope, proportionality, and implementation timelines. The 33 comment letters from the previous RFI demonstrate that industry engagement materially influences the Working Group’s direction.
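On the bias-testing point above, the most common starting calculation is an adverse impact ratio. A minimal sketch, assuming binary favorable outcomes and proxy-inferred group labels; the 0.80 threshold is the EEOC four-fifths rule of thumb, not an NAIC requirement:

```python
import numpy as np

def adverse_impact_ratio(favorable: np.ndarray, group: np.ndarray,
                         protected: str, reference: str) -> float:
    """Ratio of favorable-outcome rates: protected class vs. reference
    group. A ratio below 0.80 is a common flag for further review."""
    rate_protected = favorable[group == protected].mean()
    rate_reference = favorable[group == reference].mean()
    return float(rate_protected / rate_reference)

# Hypothetical underwriting approvals by (proxy-inferred) group.
approved = np.array([1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1])
groups = np.array(["A"] * 6 + ["B"] * 6)
print(f"AIR: {adverse_impact_ratio(approved, groups, 'B', 'A'):.2f}")
# AIR: 0.75 -- below the 0.80 rule of thumb; investigate and document.
```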
Implications for Actuaries
The risk taxonomy and compliance report structure have direct implications for every actuarial role that touches AI systems.
Pricing actuaries filing rates with AI-derived risk relativities will need to demonstrate that the underlying system has been classified, documented in a model card, and tested for disparate impact. State regulators examining rate filings in pilot states now have a structured framework for asking detailed questions about AI components in the pricing pipeline.
Reserving actuaries using AI-assisted claims estimates need to understand where those estimates fall in the risk taxonomy. If claims triage or severity scoring models are classified as high-risk, the documentation requirements affect the appointed actuary’s ability to opine on reserve adequacy, because ASOP No. 56’s requirement to understand models and their limitations now has a concrete documentation standard to reference.
Chief actuaries and compliance officers responsible for the overall AI governance program need to treat the taxonomy as a planning document for the compliance infrastructure buildout. The gap between the Model Bulletin’s principles-based expectations and the taxonomy’s prescriptive requirements is where regulatory risk lives.
Consulting actuaries advising carriers on AI governance should be incorporating the four-tier framework into their client recommendations now, even in non-pilot states. When the model law arrives, clients who have already classified their systems and begun model card development will be materially better positioned than those starting from scratch.
ASOP No. 56 does not yet reference the NAIC risk taxonomy, but the Actuarial Standards Board has a history of updating standards to reflect new regulatory frameworks. Actuaries should expect that ASOP No. 56 guidance on AI system documentation will eventually align with the model card structure, making early adoption both a compliance and professional-standards investment.
The Timeline Ahead
The next 12 months will determine whether the risk taxonomy becomes the organizing principle for U.S. insurance AI regulation or remains an advisory framework. The key dates and milestones:
- Through September 2026: 12-state evaluation tool pilot runs. Regulators gather data on how the tool works in practice, carriers respond to information requests, and feedback shapes refinements.
- 45-day comment period (current): Stakeholders submit comments on the model law RFI. The volume and substance of these comments will influence whether the NAIC proceeds to a model law draft.
- September to October 2026: Evaluation tool updated based on pilot feedback and re-exposed for public review.
- November 2026 (Fall National Meeting): Anticipated formal adoption of the evaluation tool. The risk taxonomy’s integration into the adopted tool would give it immediate practical effect across every state that uses the tool for market conduct examinations.
- 2027: Earliest potential exposure of a model law draft incorporating the risk taxonomy, compliance report structure, and model card requirements. State adoption would follow on individual timelines.
Carriers operating in pilot states face the most immediate pressure, but the framework’s national scope means every insurer using AI should be planning for compliance. The evaluation tool, once adopted, will be available to every state insurance department, and the taxonomy gives regulators a common vocabulary for examining AI governance regardless of whether their state has adopted the Model Bulletin, additional regulations, or an eventual model law.
Why This Matters
The NAIC’s four-tier risk taxonomy represents the most significant structural development in U.S. insurance AI regulation since the Model Bulletin’s December 2023 adoption. It moves the conversation from “insurers should govern AI responsibly” to “here is how regulators will classify, examine, and enforce AI governance requirements.”
For the insurance industry, the taxonomy provides clarity that has been missing. Carriers can now map their AI deployments to specific tiers, understand the compliance obligations associated with each tier, and plan their governance investments accordingly. The model compliance report and AI model cards give that planning a concrete deliverable structure.
For the actuarial profession, the taxonomy creates a direct link between regulatory expectations and professional practice. Actuaries who validate models, file rates, opine on reserves, or advise on governance will be working within this framework for the foreseeable future. Understanding it now, before it becomes binding, is the professional equivalent of studying for an exam that has not yet been scheduled but is certain to arrive.
Further Reading
- NAIC AI Evaluation Pilot 2026: The 12-State Test and Industry Pushback
- From Model Bulletin to Model Law: The 33 Comment Letters Shaping NAIC AI Regulation
- NAIC Proposes Third-Party AI Vendor Registry for Insurers
- NAIC Flags Agentic AI as Insurance’s Next Governance Gap
- The AI Governance Gap in Actuarial Practice
- Algorithmic Bias in Insurance Underwriting 2026
Sources
- NAIC Big Data and Artificial Intelligence (H) Working Group
- NAIC BDAI Working Group Meeting Materials, March 24, 2026
- Mayer Brown: NAIC Spring 2026 National Meeting Highlights, H Committee Update (April 2026)
- Alston & Bird: Key AI, Cybersecurity, and Privacy Takeaways from the NAIC 2026 Spring Meeting
- Sidley Austin: NAIC Spring 2026 Regulatory Update (April 2026)
- Plante Moran: How the NAIC AI Model Bulletin Is Evolving (March 2026)
- Fenwick: NAIC Expands AI Systems Evaluation Tool Pilot Program to 12 States
- Swept AI: NAIC AI Systems Evaluation Tool: 12-State Pilot Is Live
- NAIC Model Bulletin on the Use of Artificial Intelligence Systems by Insurers (December 2023)
- NAIC Insurance Topics: Artificial Intelligence
- Debevoise & Plimpton: NAIC 2026 Spring National Meeting (April 2026)
- EU AI Act: Article 6, Classification Rules for High-Risk AI Systems
- Crowell & Moring: NAIC Intensifies AI Regulatory Focus
- InsuranceNewsNet: NAIC 2026 AI Evaluation Pilot Moves Ahead as Industry Balks