Tracking AI regulatory developments across the EU, US states, and the NAIC over the past 18 months reveals a pattern: the three are converging on similar fairness metrics but diverging sharply on enforcement mechanisms. The latest example arrived on May 7, 2026, when the European Parliament and Council reached a provisional political agreement under the Digital Omnibus package to defer the Annex III high-risk AI compliance deadline from August 2, 2026 to December 2, 2027. For insurers deploying AI in life and health underwriting, that is 16 additional months before mandatory conformity assessments, bias testing protocols, and human oversight mechanisms must be operational.

The deferral does not diminish the underlying regulatory substance. Every technical requirement in Articles 9 through 15 of the EU AI Act (Regulation (EU) 2024/1689) remains intact. A peer-reviewed MDPI Risks study analyzing 12.4 million quote-bind-claim observations from four pan-European insurers has quantified pricing distortions of up to 7% above actuarial value for protected groups, driven not by explicit protected attributes but by socio-economic proxies embedded in model features. Those distortions inflate loss-ratio volatility in exactly the quantiles that dominate Solvency II capital calibration. The compliance problem, in other words, is also a capital problem.

This article maps the Omnibus deferral against the unchanged compliance requirements, the MDPI bias quantification data, the Forvis Mazars analysis of the emerging compliance actuary role, EIOPA’s parallel governance opinion, and the US regulatory landscape where the NAIC 12-state pilot and Colorado SB 26-189 are advancing on their own timelines.

What Annex III Classifies as High-Risk in Insurance

The EU AI Act does not regulate all insurance AI. Its high-risk classification is narrower than many summaries suggest, though where it applies, the requirements are comprehensive.

Annex III, Category 5, titled “Access to and enjoyment of essential private services and public services and benefits,” captures two insurance-relevant sub-categories. Category 5(a) covers AI systems for creditworthiness assessment and credit scoring of natural persons, catching insurers who use credit-based insurance scores. Category 5(c) is the provision that pulls the actuarial profession directly into scope: “AI systems intended to be used for risk assessment and pricing in relation to natural persons in the case of life and health insurance.”

P&C insurance pricing and underwriting are not classified as high-risk under Annex III. A predictive model for auto insurance rating or homeowners risk scoring does not trigger high-risk obligations, though it may face limited-risk transparency requirements under Article 50 if it interacts directly with consumers. This distinction concentrates the compliance burden on life and health actuaries, not P&C pricing units.

The Harvard Data Science Review analysis by Hacker and Eber (July 2025) highlights a critical scope expansion: the Act captures “a wide spectrum of technologies, encompassing advanced deep learning, more traditional machine learning frameworks, and decision-making systems.” Traditional GLMs feeding into underwriting decisions may be in scope, not only gradient-boosted models and neural networks. For actuarial teams accustomed to treating GLMs as transparent-by-design, this represents a documentation obligation that did not previously exist.

The May 7 Omnibus Deferral: 16 Extra Months, Same Requirements

The Digital Omnibus agreement restructures the EU AI Act timeline without altering the substance of the obligations:

| Obligation | Original Date | New Date | Deferral |
|---|---|---|---|
| Annex III high-risk AI (standalone), including insurance underwriting | August 2, 2026 | December 2, 2027 | 16 months |
| Annex I high-risk AI (product safety) | August 2, 2027 | August 2, 2028 | 12 months |
| Article 50(2) watermarking and synthetic content | August 2, 2026 | December 2, 2026 | 4 months |
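
The deferral lengths in the timeline can be checked with simple date arithmetic. The sketch below is illustrative only: the helper `months_between` is a hypothetical name, and it assumes both dates fall on the same day of the month, as they do here.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (assumes the same day-of-month)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# Original and new application dates as listed in the timeline above.
timeline = {
    "Annex III high-risk AI (standalone)": (date(2026, 8, 2), date(2027, 12, 2)),
    "Annex I high-risk AI (product safety)": (date(2027, 8, 2), date(2028, 8, 2)),
    "Article 50(2) watermarking": (date(2026, 8, 2), date(2026, 12, 2)),
}

deferrals = {name: months_between(old, new) for name, (old, new) in timeline.items()}
for name, months in deferrals.items():
    print(f"{name}: {months} months")
```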

Several provisions remain in force on their original timeline. Article 5 prohibited AI practices have applied since February 2, 2025. Article 4 AI literacy obligations, which require insurers to ensure a “sufficient level of AI literacy” among their staff, have also been in force since that date. The GPAI model requirements under Articles 51 through 56 have applied since August 2, 2025.

Two important caveats accompany this deferral. First, the agreement is provisional and must still receive formal endorsement from the European Parliament and Council before it becomes binding. Second, the self-assessment registration requirement under Article 6(3) was preserved despite the European Commission’s initial proposal to eliminate it. Insurers deploying AI systems that they classify as non-high-risk under Annex III must still register those systems in the EU database and document their reasoning.

For carriers that had been scrambling toward the August 2026 deadline, the deferral provides planning runway. For those that had not yet started, it removes the near-term urgency without reducing the ultimate scope of work. As a practical matter, 16 months is not generous for an insurer that needs to inventory all production AI systems, classify each against Annex III criteria, build conformity assessment documentation, implement bias testing pipelines, and establish human oversight workflows.

Quantifying Algorithmic Bias: The MDPI Pricing Distortion Data

The most concrete quantification of the bias problem that Annex III targets comes from Mahajan, Agarwal, and Gupta’s peer-reviewed study in MDPI Risks (Vol. 13, No. 9, Article 160, September 2025). Their dataset spans 12.4 million quote-bind-claim observations from four pan-European insurers covering 2019 Q1 through 2024 Q4, making it one of the largest empirical studies of algorithmic pricing bias in insurance.

The headline finding: protected groups pay up to 7% above actuarial value due to algorithmic bias. This distortion persists even after explicit protected attributes (gender, ethnicity proxy, disability status, and postcode deprivation index) are excluded from model training. SHAP (Shapley Additive Explanations) analysis reveals that the vast majority of unfair uplift originates from socio-economic proxies, particularly occupation and urban density, that correlate with protected characteristics without being flagged as such.
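
A first-pass proxy screen can be much simpler than the SHAP analysis the study used. The sketch below swaps in a plain Pearson correlation between each candidate feature and a protected-group indicator. It is not the study's method. The function names, the 0.3 flagging threshold, and the toy data are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def flag_proxies(features, protected, threshold=0.3):
    """Return {feature: correlation} for features with |correlation| >= threshold."""
    flagged = {}
    for name, values in features.items():
        r = pearson(values, protected)
        if abs(r) >= threshold:
            flagged[name] = round(r, 3)
    return flagged


# Hypothetical toy data: 1 = member of the protected group, 0 = reference group.
protected = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "urban_density": [0.9, 0.8, 0.7, 0.9, 0.3, 0.2, 0.4, 0.1],
    "policy_term_years": [10, 20, 10, 20, 10, 20, 10, 20],
}

flagged = flag_proxies(features, protected)
print(flagged)  # only urban_density is flagged in this toy data
```

A linear correlation screen misses non-linear and interaction-driven proxies, which is one reason attribution methods such as SHAP are used on the fitted model rather than on the raw features alone.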

The study tested XGBoost models alongside benchmark GLMs for mortality, morbidity, and lapse risk. The researchers evaluated three debiasing approaches against the fairness thresholds that courts and regulators recognize:

| Fairness Metric | Threshold | What It Measures |
|---|---|---|
| Disparate impact ratio | 0.80 to 1.25 | Ratio of favorable outcome rates between protected and reference groups |
| Statistical parity difference | ±0.10 | Absolute difference in positive outcome rates across groups |
| Equalized odds gap | ±0.10 | Difference in true positive and false positive rates across groups |
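
To make the three definitions concrete, here is a minimal sketch that computes them from binary outcomes. It makes several assumptions: a prediction of 1 is the favorable outcome, group 1 is the protected group, and the equalized odds gap is taken as the larger of the absolute TPR and FPR differences (conventions vary). The function names and toy data are hypothetical.

```python
def rate(flags):
    """Share of 1s in a non-empty list of 0/1 flags."""
    return sum(flags) / len(flags)


def fairness_metrics(y_true, y_pred, group):
    """group: 1 = protected, 0 = reference; y_pred: 1 = favorable outcome.

    Assumes each group contains both true classes, so no rate divides by zero.
    """
    def group_rates(g):
        pairs = [(t, p) for t, p, m in zip(y_true, y_pred, group) if m == g]
        selection = rate([p for _, p in pairs])
        tpr = rate([p for t, p in pairs if t == 1])
        fpr = rate([p for t, p in pairs if t == 0])
        return selection, tpr, fpr

    sel_p, tpr_p, fpr_p = group_rates(1)
    sel_r, tpr_r, fpr_r = group_rates(0)
    return {
        "disparate_impact_ratio": sel_p / sel_r,
        "statistical_parity_difference": sel_p - sel_r,
        "equalized_odds_gap": max(abs(tpr_p - tpr_r), abs(fpr_p - fpr_r)),
    }


# Hypothetical toy data: ten protected applicants followed by ten reference applicants.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0] * 2
y_pred = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0] + [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
group = [1] * 10 + [0] * 10

metrics = fairness_metrics(y_true, y_pred, group)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this toy data the selection rates are 0.40 and 0.50, so the disparate impact ratio sits on the 0.80 boundary and the statistical parity difference on the -0.10 boundary, while the 0.20 gap in true positive rates breaches the equalized odds threshold.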

A critical finding: only adversarial debiasing closed the gap below the AI Act materiality threshold without destroying predictive power. Simpler approaches, including reweighting and threshold adjustment, either failed to achieve the required fairness metrics or degraded model discrimination to the point where pricing accuracy suffered. The study also established an economic breakeven: when the supervisory detection probability exceeds 8.9%, proactive debiasing is strictly cheaper than the expected fine plus incremental Solvency Capital Requirement (SCR), even before accounting for reputational damage.
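
The logic behind a breakeven detection probability can be sketched as a simple expected-cost comparison. This is not the study's derivation. It assumes the incremental SCR is costed with a flat cost-of-capital rate, and every figure below is a hypothetical placeholder rather than a number from the paper.

```python
def breakeven_detection_probability(debias_cost, fine_if_detected,
                                    scr_increase, cost_of_capital):
    """Detection probability above which proactive debiasing is cheaper.

    Not debiasing costs: p * fine_if_detected + scr_increase * cost_of_capital.
    Debiasing costs:     debias_cost.
    Setting the two equal and solving for p gives the breakeven, clamped to [0, 1].
    """
    capital_cost = scr_increase * cost_of_capital
    p = (debias_cost - capital_cost) / fine_if_detected
    return min(max(p, 0.0), 1.0)


# Hypothetical inputs in EUR, chosen for illustration only.
p_star = breakeven_detection_probability(
    debias_cost=2_000_000,
    fine_if_detected=15_000_000,
    scr_increase=10_000_000,
    cost_of_capital=0.06,
)
print(f"Breakeven detection probability: {p_star:.1%}")
```

With these placeholder inputs the breakeven lands a little above 9%. The point of the sketch is the structure of the comparison: a larger fine or a larger capital charge lowers the detection probability at which debiasing pays for itself.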

From Bias to Capital: The Solvency II Connection

The MDPI study’s most original contribution extends the bias literature from detection to valuation. Under current Solvency II practice, algorithmic bias appears only in qualitative “conduct-risk” footnotes, leaving boards without a quantitative framework to balance debiasing costs against capital consequences.

Mahajan, Agarwal, and Gupta derive a closed-form mapping from three legal fairness metrics (statistical parity difference, disparate impact ratio, equalized odds gap) to changes in the Solvency II Standard-Formula SCR. The mechanism is straightforward: the 7% pricing distortion inflates loss-ratio volatility for protected group segments. That volatility feeds directly into the quantiles that calibrate the SCR. Integration with Solvency II Quantitative Reporting Template S.25 filings enables direct feed-through from loss-ratio volatility to capital.
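
The direction of the effect can be illustrated without the study's closed-form mapping. The sketch below is a deliberately crude stand-in: it assumes a normally distributed loss ratio and uses the 99.5% one-year confidence level to which the Solvency II SCR is calibrated. The premium volume and volatility figures are hypothetical.

```python
from statistics import NormalDist

# Standard normal quantile at the 99.5% confidence level (about 2.576).
Z_995 = NormalDist().inv_cdf(0.995)


def capital_at_995(premium, loss_ratio_sd):
    """Unexpected loss at 99.5% under a normal approximation (illustrative only)."""
    return Z_995 * loss_ratio_sd * premium


# Hypothetical segment: EUR 100m premium, loss-ratio volatility rising from 8% to 9%.
base = capital_at_995(100_000_000, 0.08)
inflated = capital_at_995(100_000_000, 0.09)

print(f"Base capital:     {base:,.0f}")
print(f"Inflated capital: {inflated:,.0f}")
print(f"Increase:         {inflated / base - 1:.1%}")
```

Under this approximation capital scales linearly with volatility, so a one-point rise from 8% to 9% lifts the requirement by 12.5%. The Standard Formula is more involved, but the sensitivity to tail volatility runs in the same direction.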

The practical implication is that bias is not merely a fairness concern to be managed through compliance checklists. It is a capital efficiency problem. An insurer running biased pricing models holds more capital than one that has debiased its models, all else being equal. The study demonstrates that “capital-efficient fairness” is attainable, meaning that debiasing can simultaneously improve regulatory compliance and reduce capital requirements.

For appointed actuaries preparing Solvency II own risk and solvency assessments (ORSAs), this creates a new modeling dimension. The standard ORSA stress-test battery should now include scenarios that quantify the capital impact of algorithmic bias detection and remediation across the underwriting portfolio.

The Compliance Actuary: Bridging Model Validation and AI Governance

Forvis Mazars, through Gary Stakem (Director of Actuarial and Risk Services, Forvis Mazars Ireland), has positioned actuaries as the natural AI Act compliance officers for insurance undertakings. The argument is structural: actuaries already own model validation for pricing and reserving systems, possess quantitative skills in statistical testing, and understand the regulatory environment in which these models operate.

The compliance actuary role, as Forvis Mazars describes it, encompasses three governance pillars:

  • Auditability: maintaining technical documentation for all high-risk AI systems per Article 11, including system architecture, training methodology, data demographics, performance metrics, and known failure modes
  • Bias detection: systematic testing across protected categories using the fairness metrics specified by EIOPA and implied by the Act’s fundamental rights protections
  • Fairness validation: ongoing monitoring of model outputs against defined thresholds, with escalation workflows when metrics breach acceptable ranges
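
The fairness validation pillar can be sketched as a threshold check with an escalation rule. The thresholds below are the ones reported in the MDPI study. The escalation tiers and function names are hypothetical and would need to reflect each insurer's own governance structure.

```python
# Acceptable ranges taken from the fairness thresholds cited in the MDPI study.
THRESHOLDS = {
    "disparate_impact_ratio": (0.80, 1.25),
    "statistical_parity_difference": (-0.10, 0.10),
    "equalized_odds_gap": (-0.10, 0.10),
}


def find_breaches(metrics, thresholds=THRESHOLDS):
    """Return the metrics whose values fall outside their acceptable range."""
    breached = {}
    for name, value in metrics.items():
        low, high = thresholds[name]
        if not (low <= value <= high):
            breached[name] = value
    return breached


def escalation_level(breached):
    """Hypothetical rule: one breach goes to the model owner, more go to committee."""
    if not breached:
        return "none"
    if len(breached) == 1:
        return "model-owner review"
    return "governance committee"


# Hypothetical monitoring snapshot for one underwriting model.
snapshot = {
    "disparate_impact_ratio": 0.76,
    "statistical_parity_difference": -0.08,
    "equalized_odds_gap": 0.04,
}

breached = find_breaches(snapshot)
print(breached, "->", escalation_level(breached))
```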

Article 4 of the AI Act already requires insurers to ensure a “sufficient level of AI literacy” among their staff. Forvis Mazars argues that actuaries satisfy this requirement because the role demands experience with complex models and large datasets, understanding of both regulatory and commercial environments, stakeholder communication abilities, and familiarity with model oversight practices. The gap, however, is that traditional actuarial model validation (focused on predictive accuracy and reserve adequacy) does not cover the fairness, transparency, and human oversight dimensions that the Act requires. Closing that gap is the core challenge for the compliance actuary role.

EIOPA’s Parallel AI Governance Framework

In August 2025, EIOPA published its Opinion on AI Governance and Risk Management (EIOPA-BoS-25-360), signed by Chairperson Petra Hielkema. The Opinion addresses the complementary space: AI systems used by insurers that are not prohibited or high-risk under the AI Act, while explicitly acknowledging the Act’s Annex III classification of life and health insurance AI.

The Opinion establishes six governance pillars: fairness and ethics, data governance, documentation and record keeping, transparency and explainability, human oversight, and accuracy, robustness and cybersecurity. Paragraph 3.30 specifically names the actuarial function as “responsible for the controls on AI systems that fall under its responsibilities (e.g. for coordination of technical provisions calculation, opinion on the overall underwriting policy).”

EIOPA’s Annex I sets out five fairness metrics for insurance AI: demographic parity, calibration, equalized odds, equalized opportunities, and individual fairness. The Opinion includes an explicit caution that “some of the group fairness metrics could contradict the concept/metric of actuarial fairness in insurance underwriting, where customers bearing the same risk are charged the same price.” This tension between statistical group fairness and actuarial individual fairness is not resolved by the Act and will require judgment calls by compliance teams.
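
A toy portfolio shows how the tension arises. In the sketch below every applicant with the same expected claim cost pays the same premium, so actuarial fairness holds, yet the two groups differ on a demographic parity measure because their risk mix differs. The groups, claim costs, loading, and the "low premium" cutoff are all hypothetical.

```python
# Each applicant: (group, expected annual claim cost in EUR). Hypothetical data.
applicants = [
    ("A", 400), ("A", 400), ("A", 900), ("A", 900),
    ("B", 400), ("B", 900), ("B", 900), ("B", 900),
]


def premium(expected_claim, loading=0.25):
    """Risk-based premium: depends only on expected cost, never on group."""
    return expected_claim * (1 + loading)


def low_price_rate(group, cutoff=600):
    """Share of a group offered a premium at or below the cutoff."""
    prices = [premium(claim) for g, claim in applicants if g == group]
    return sum(price <= cutoff for price in prices) / len(prices)


# Actuarial fairness: one premium per risk level, regardless of group.
prices_by_risk = {}
for _, claim in applicants:
    prices_by_risk.setdefault(claim, set()).add(premium(claim))
same_risk_same_price = all(len(prices) == 1 for prices in prices_by_risk.values())

# Group fairness: difference in the rate of favorable (low-price) outcomes.
parity_gap = low_price_rate("B") - low_price_rate("A")

print(f"Same risk, same price: {same_risk_same_price}")
print(f"Demographic parity gap (B minus A): {parity_gap:+.2f}")
```

Here the parity gap is -0.25, well outside a ±0.10 band, even though no applicant is priced differently from another with the same risk. Closing that gap would require charging different prices for identical risks, which is the contradiction EIOPA flags.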

A key principle in paragraph 3.11 holds insurers “ultimately responsible for the AI systems that they use, regardless of whether the AI systems are developed in-house or in collaboration with third party service providers.” For carriers relying on vendor models for underwriting or pricing, this means compliance obligations cannot be contracted away. The insurer must validate, monitor, and document third-party AI systems as if they were built internally. This parallels the approach the NAIC is developing through its proposed third-party AI vendor registry.

EIOPA will review supervisory convergence two years after publication, setting a 2027 checkpoint that now aligns closely with the deferred December 2027 Annex III compliance deadline.

US Regulatory Convergence: NAIC Pilot and Colorado’s Pivot

While the EU defers its deadline, US insurance AI regulation continues to advance on separate timelines. The landscape divides into two tracks: the NAIC’s voluntary evaluation framework and state-level legislative mandates.

NAIC 12-State AI Evaluation Pilot. The NAIC AI Systems Evaluation Tool pilot launched March 2, 2026, with 12 participating states: California, Colorado, Connecticut, Florida, Iowa, Louisiana, Maryland, Pennsylvania, Rhode Island, Vermont, Virginia, and Wisconsin. The pilot runs through September 2026, with expected formal adoption at the Fall National Meeting in November 2026. The evaluation tool uses four exhibits: Exhibit A quantifies AI usage, Exhibit B assesses governance risk, Exhibit C details high-risk AI systems, and Exhibit D captures AI data specifics. The tool applies a “principle of proportionality,” prioritizing examination of high-risk AI systems.

Roughly 24 states have adopted the NAIC Model Bulletin on AI (originally adopted in 2023), establishing that existing insurance laws apply regardless of whether decisions are made by humans, algorithms, or third-party vendors. The NAIC’s March 2026 Issue Brief explicitly supports state-based oversight and opposes federal preemption, a position that creates regulatory fragmentation for carriers operating across multiple states.

Colorado SB 26-189. Colorado’s AI regulatory trajectory illustrates how quickly the enforcement approach can shift. SB 26-189, passed May 9, 2026, replaces the landmark SB 24-205 (the Colorado AI Act) with a fundamentally different framework. The original law required algorithmic discrimination avoidance, mandatory bias mitigation programs, and annual impact assessments. The replacement eliminates those obligations in favor of a layered disclosure and documentation framework focused on developer-deployer relationships rather than algorithmic outcomes.

Critically for insurers, SB 26-189 includes a safe harbor: insurers subject to Colorado Section 10-3-1104.9 (governing use of external consumer data and algorithms) are deemed in compliance with the new bill for insurance practices, except for employment-related decisions. However, as we analyzed in detail, the insurance-specific bias testing requirements under SB 21-169 and Regulation 10-1-1 remain in force with a July 1, 2026 deadline. Colorado now has a dual-track compliance problem: the general AI law shifted to disclosure, but the insurance-specific regulations still require the four-fifths rule, proxy variable audits, intersectional testing, and counterfactual analysis.

Managing Dual Compliance Across Jurisdictions

Multinational carriers face a convergence of regulatory timelines that the Omnibus deferral does not simplify. The EU AI Act applies extraterritorially to any provider placing AI on the EU market, any deployer using AI within the EU, and any provider or deployer outside the EU whose system’s output is used within the EU. A US carrier with European subsidiary operations writing life or health business cannot opt out by hosting models domestically.

The three regulatory frameworks share structural similarities but differ on enforcement. All three converge on the concept that insurers bear responsibility for AI system outcomes regardless of whether the system was built in-house or by a vendor. All three reference similar fairness metrics (disparate impact ratio, equalized odds). All three require some form of human oversight and documentation.

The divergence is in enforcement teeth. The EU AI Act carries penalties of up to 35 million EUR or 7% of global annual turnover for prohibited practices under Article 5, and up to 15 million EUR or 3% for breaches of most other obligations, including those attached to high-risk systems. Colorado’s insurance-specific regulations operate through the state DOI’s market conduct examination authority. The NAIC evaluation tool is voluntary during the pilot phase and depends on individual state adoption for enforcement power. For carriers managing compliance across all three, the EU standard effectively becomes the floor because it is the most prescriptive and carries the highest penalties.
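
For an undertaking, Article 99 of the AI Act sets each fine ceiling as the higher of a fixed amount and a share of worldwide annual turnover. The sketch below applies that rule to the 35 million EUR and 7% figures quoted above. The function name and turnover values are hypothetical, and the different treatment of SMEs is ignored.

```python
def penalty_ceiling(turnover_eur, fixed_cap_eur=35_000_000, turnover_pct=7):
    """Maximum fine for an undertaking: the higher of the fixed cap and the turnover share.

    Integer arithmetic (whole euros, whole percent) avoids floating-point rounding.
    """
    return max(fixed_cap_eur, turnover_eur * turnover_pct // 100)


# Hypothetical carriers with EUR 2bn and EUR 100m in worldwide annual turnover.
large_carrier = penalty_ceiling(2_000_000_000)
small_carrier = penalty_ceiling(100_000_000)

print(f"Large carrier ceiling: EUR {large_carrier:,}")
print(f"Small carrier ceiling: EUR {small_carrier:,}")
```

For the larger carrier the turnover share dominates at 140 million EUR, while for the smaller one the fixed 35 million EUR cap is the binding ceiling. These are upper bounds, not expected fines.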

| Dimension | EU AI Act (Annex III) | NAIC Evaluation Tool | Colorado SB 21-169 |
|---|---|---|---|
| Compliance date | December 2, 2027 (deferred) | November 2026 adoption target | July 1, 2026 |
| Scope | Life and health underwriting/pricing AI | All insurer AI systems (proportionality-based) | Auto and health insurance algorithms |
| Bias testing | Required (conformity assessment) | Documented in Exhibit C | Four-fifths rule, proxy audit, intersectional testing |
| Penalties | Up to €35M or 7% global turnover | State examination authority | DOI enforcement actions |
| Vendor responsibility | Provider/deployer dual obligations | Insurer bears ultimate responsibility | Insurer bears ultimate responsibility |

What Actuarial Teams Should Do Now

The Omnibus deferral provides planning runway, not a reason to delay. Patterns we have observed across recent regulatory cycles suggest that carriers who treat extended deadlines as permission to defer gap analysis consistently underestimate the implementation timeline. The following steps apply regardless of whether the final enforcement date lands in December 2027 or shifts again:

1. Inventory and classify. Identify every AI system that touches life or health underwriting, pricing, or coverage eligibility. Include vendor-provided models. Classify each against Annex III criteria. For systems that fall into the grey zone (traditional GLMs that the Harvard Data Science Review analysis suggests may be in scope), document the classification rationale.

2. Establish bias testing pipelines. Implement the three fairness metrics from the MDPI study (disparate impact ratio, statistical parity difference, equalized odds gap) as automated monitoring dashboards. The 0.80 to 1.25 disparate impact ratio threshold is the benchmark across jurisdictions. Test using SHAP-based explainability to identify proxy discrimination from socio-economic variables, not just explicit protected attributes.

3. Build technical documentation templates. Article 11 requires documentation before deployment and continuous updates thereafter. Start with system architecture, training data demographics, performance metrics across protected groups, known failure modes, and validation results. Align documentation with EIOPA’s six governance pillars to satisfy both frameworks simultaneously.

4. Design human oversight workflows. Article 14 requires high-risk systems to be designed so that natural persons can effectively oversee them while in use, including the ability to disregard or override an output and to interrupt the system. Map current escalation paths in underwriting workflows and identify where automated decisions execute without human review. The compliance actuary role that Forvis Mazars describes is the natural owner of these workflows.

5. Run the capital analysis. Use the MDPI closed-form mapping to estimate the SCR impact of current pricing distortions. If adversarial debiasing reduces loss-ratio volatility in protected group segments, the capital benefit may partially or fully offset the compliance investment. Build this business case before requesting budget for compliance infrastructure.

6. Align with NAIC and state requirements. For US carriers with EU operations, design compliance architecture that satisfies the EU standard while also meeting NAIC evaluation tool documentation requirements and, where applicable, Colorado’s insurance-specific bias testing obligations. The EU framework is the superset; building to that standard generally covers US requirements as well.
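
Step 1 lends itself to a simple inventory record with a first-pass triage rule. The sketch below is one possible shape for that, not a legal classification. The class names, field names, and triage rule are hypothetical. It looks only at the Annex III 5(c) wording on life and health risk assessment and pricing, so systems it marks as outside 5(c) may still be caught elsewhere, for example under 5(a).

```python
from dataclasses import dataclass
from enum import Enum


class RiskClass(Enum):
    HIGH_RISK = "high-risk candidate (Annex III 5(c))"
    NEEDS_REVIEW = "grey zone: document the classification rationale"
    OUT_OF_5C = "outside Annex III 5(c)"


@dataclass
class AISystem:
    name: str
    line_of_business: str          # e.g. "life", "health", "auto"
    purpose: str                   # e.g. "pricing", "risk_assessment", "eligibility"
    affects_natural_persons: bool
    vendor_provided: bool = False  # vendor models belong in the inventory too


def classify(system: AISystem) -> RiskClass:
    """First-pass triage against the 5(c) wording; legal review still required."""
    in_life_health = system.line_of_business in {"life", "health"}
    if not (in_life_health and system.affects_natural_persons):
        return RiskClass.OUT_OF_5C
    if system.purpose in {"pricing", "risk_assessment"}:
        return RiskClass.HIGH_RISK
    return RiskClass.NEEDS_REVIEW


# Hypothetical inventory entries.
inventory = [
    AISystem("term-life mortality GLM", "life", "pricing", True),
    AISystem("health eligibility rules engine", "health", "eligibility", True,
             vendor_provided=True),
    AISystem("auto telematics score", "auto", "pricing", True),
]

report = {system.name: classify(system) for system in inventory}
for name, risk in report.items():
    print(f"{name}: {risk.value}")
```

Keeping the triage result next to each inventory record gives the documentation trail that steps 1 and 3 call for, and the grey-zone bucket marks where a written rationale is needed.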

Why This Matters

The EU AI Act’s treatment of insurance AI as high-risk reflects a broader regulatory consensus that automated decisions about insurance coverage and pricing carry material consequences for consumers. The Omnibus deferral adjusts the timeline but not the destination. By December 2027, insurers deploying AI in life and health underwriting across EU markets will need conformity assessments, bias testing infrastructure, technical documentation, and human oversight mechanisms that most have not yet built.

The MDPI data adds a quantitative dimension that moves the discussion beyond compliance checklists. A 7% pricing distortion is not a theoretical risk; it is a measured outcome from 12.4 million observations across four carriers. That distortion has capital consequences under Solvency II and regulatory consequences under the AI Act. The governance gap in actuarial practice that we have documented separately is precisely the space where the compliance actuary role needs to develop.

For actuarial teams, the 16-month deferral is an opportunity to build compliance infrastructure methodically rather than reactively. The carriers that use this window to implement bias testing pipelines, establish documentation standards, and develop the compliance actuary function will be positioned not only for EU enforcement but for the parallel NAIC and state-level requirements that continue to advance on their own timelines.
