A review of governance frameworks across five regulatory regimes over the past year (NAIC model bulletins, the EU AI Act, IFoA technical guidance, IAIS consultation papers, and state-level DOI rules) shows a consistent pattern: regulators frame AI risks as problems to be managed through better process controls. The IFoA and LFBF’s June 2026 report breaks from that consensus. Its core argument is that generative AI’s most consequential risks for financial services are not incidental flaws to be patched through governance improvements but structural features of the technology itself, baked into the same architecture that makes GenAI useful.
Published on June 2, 2026, “It’s Still Not Magic: Framing the Risks Facing Financial Services in the Gen AI Era” updates its influential 2019 predecessor with findings specific to generative AI adoption in insurance and banking. The report draws on a survey of 78 senior financial services practitioners and observers and introduces what the authors call “uncomfortable tensions”: paradoxes where the same capabilities that drive value simultaneously create risks that cannot be fully resolved, only managed. For American actuaries working under ASOP Nos. 12, 23, and 56, the implications are direct: the IFoA framework challenges the assumption that sufficient governance controls can reduce GenAI deployment risk to an acceptable baseline.
The Survey Numbers: Severity and Acceleration
The report’s survey findings establish the urgency. Among senior financial services practitioners, 70% agreed that AI risks are among the greatest facing their sector over the next five years. That figure alone is notable, but the acceleration metric sharpens it: 75% said those risks had increased substantially since generative AI became widely available, beginning roughly with the public launch of ChatGPT in late 2022 and the rapid enterprise deployment wave that followed through 2024 and 2025.
The top three concerns identified by respondents were cyber threats, misleading outputs, and knowledge gaps among staff deploying AI. Each of these maps to a distinct governance failure mode. Cyber threats reflect the technology’s capacity to scale attacks against financial services infrastructure, a concern reinforced by the IMF’s May 2026 finding that AI-enabled cyberattacks now pose systemic stability risks. Misleading outputs capture the hallucination problem: fluent, confident, and wrong. Knowledge gaps reflect a workforce reality where the people deploying and overseeing AI systems frequently lack the technical understanding to evaluate whether those systems are performing as intended.
These findings arrive alongside the Bank of England and FCA’s joint 2024 survey, which found that 95% of insurance firms in the UK were already using AI, the highest rate of any financial services subsector. The median number of AI use cases per firm is expected to more than double over the next three years, from 9 to 21, with foundation models already accounting for 17% of all AI use cases. The IFoA report’s risk findings, in other words, describe a technology that is already deeply embedded, not one still awaiting adoption.
The Nine-Risk Framework: Outcomes, Operating Environment, and System
The report’s central analytical contribution is a nine-risk framework organized across three layers: outcomes (direct effects on customers and decisions), operating environment (how AI changes the organizational and competitive landscape), and system (how AI creates ecosystem-wide vulnerabilities). This layered structure traces how AI risk propagates from individual customer interactions through firm-level operations to systemic financial stability concerns. The nine risks cluster around six named challenges: fairness, trust, truthfulness, security, governance, and concentration/contagion, with additional concerns flagged around workforce disruption, environmental impact, and AI asset bubble risk.
At the outcomes layer, the framework addresses risks such as biased or discriminatory outputs, misleading or hallucinated information reaching customers or internal decision-makers, and erosion of the quality controls that traditionally separated analysis from action. For actuaries, this layer maps directly to the concerns embedded in ASOP No. 12’s proposed revision, which adds an entirely new Section 3.4 on “Potential for Unintended Bias” and requires actuaries to consider whether AI-driven risk classification systems produce unintended disparate impacts.
The operating environment layer captures how GenAI restructures competitive dynamics and internal processes. When AI systems can generate plausible analysis at scale, the traditional bottleneck of qualified human review becomes the binding constraint. The report notes that only a minority of UK financial services leaders feel confident that their governance approach is keeping pace with AI innovation, pointing to a structural mismatch between the speed of deployment and the capacity for meaningful oversight.
The system layer addresses concentration risk, infrastructure dependencies, and correlated failures. Because most enterprise GenAI deployments rely on a small number of foundation model providers (primarily OpenAI, Anthropic, and Google), a vulnerability in a widely used system could cascade across dozens of institutions simultaneously. This echoes the IMF’s May 2026 analysis, which warned that AI attackers could discover and exploit common vulnerabilities across interconnected financial institutions, turning a localized breach into a systemic event.
Uncomfortable Tensions: The Core Thesis
The most analytically distinctive contribution of the report is the “uncomfortable tensions” framework. Rather than presenting risks and mitigations as a standard two-column compliance checklist, the authors argue that several of GenAI’s core risks are the direct consequence of the same features that make the technology valuable. These are not bugs to be patched; they are trade-offs to be navigated.
The report identifies several specific tensions:
Fluency versus reliability. GenAI’s ability to produce confident, well-structured prose is precisely what makes its errors dangerous. A model that hedges every statement would be less useful but more honest. The same linguistic fluency that allows an underwriting assistant to draft clear policy summaries also allows it to fabricate plausible-sounding claims data or loss development factors that a reviewing actuary might not catch without independent verification.
Inclusion versus exclusion. The same analytical power that allows insurers to identify previously unserved market segments can sharpen exclusionary pricing. The report frames this as a structural tension: the machinery that widens inclusion can simultaneously sharpen exclusion, and the difference depends on implementation choices that governance frameworks struggle to monitor at scale.
Human-in-the-loop versus human control. The report draws an important distinction: placing a human in the review chain is not the same as giving that human meaningful control over outcomes. When AI systems process information at volumes and speeds that exceed human cognitive capacity, the “human in the loop” becomes a formality rather than a safeguard. This tension is especially relevant for actuarial reserving and pricing workflows where GenAI-assisted analysis may generate outputs faster than the reviewing actuary can meaningfully evaluate them.
Concentration as feature, not flaw. The economies of scale that make foundation models effective require concentration: massive datasets, enormous compute infrastructure, and a small number of providers. The resulting vendor concentration is not a market inefficiency that competition will resolve; it is a structural characteristic of how the technology works. For insurers, this means that diversifying AI vendors does not necessarily diversify AI risk, because the underlying models and training data overlap substantially.
As Keyur Patel, the LFBF research associate who authored the report, put it: “The same characteristics that make AI useful in financial services also create many of the risks that make it so difficult to govern.” The question for firms, Patel argued, is not simply whether risks can be mitigated but how much risk they are prepared to accept in exchange for AI’s benefits.
From 2019 to 2026: What Changed
The 2019 predecessor report, “It’s Not Magic,” examined AI risks in financial services before generative models had entered mainstream enterprise use. That report focused on machine learning and predictive analytics: bias in credit scoring models, explainability challenges in gradient-boosted decision trees, and data quality concerns in supervised learning pipelines. The risks it identified were real but bounded. They could, in principle, be addressed through model validation, data governance, and human review.
The 2026 update argues that generative AI has changed the risk calculus in three fundamental ways. First, the output modality shifted: instead of producing numerical predictions that can be backtested against observed outcomes, GenAI produces unstructured text, code, and analysis that resists traditional validation methods. An actuary can compare a GLM’s predicted loss ratios against emerged experience; evaluating whether a GenAI-drafted reserve analysis contains subtle errors in reasoning is a qualitatively different challenge.
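The contrast with numerical validation can be made concrete. What follows is a minimal sketch of an actual-versus-expected check of the kind an actuary might run on a GLM. The segment names, loss ratios, and the 10% tolerance are illustrative assumptions, not figures from the report.

```python
from dataclasses import dataclass


@dataclass
class SegmentResult:
    segment: str                 # hypothetical segment label
    predicted_loss_ratio: float  # model prediction
    emerged_loss_ratio: float    # observed experience


def backtest(results, tolerance=0.10):
    """Return segments whose emerged/predicted ratio deviates from 1 by more than tolerance."""
    flagged = []
    for r in results:
        ae_ratio = r.emerged_loss_ratio / r.predicted_loss_ratio
        if abs(ae_ratio - 1.0) > tolerance:
            flagged.append((r.segment, round(ae_ratio, 3)))
    return flagged


# Illustrative data only.
results = [
    SegmentResult("personal_auto", 0.65, 0.67),
    SegmentResult("homeowners", 0.60, 0.72),
    SegmentResult("commercial_property", 0.55, 0.54),
]
flagged = backtest(results)
print(flagged)
```

A check like this works because both sides of the comparison are numbers. A GenAI-drafted reserve narrative offers no equivalent quantity to divide by, which is the validation gap the report describes.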
Second, the scope of deployment expanded. In 2019, AI use cases in insurance were concentrated in underwriting segmentation and claims triage. By 2026, GenAI touches customer communications, regulatory filings, compliance monitoring, internal audit, and strategic analysis. The attack surface has grown correspondingly.
Third, the concentration dynamics intensified. In 2019, firms deploying machine learning models typically built or commissioned proprietary systems. In 2026, most enterprise GenAI deployments depend on a handful of foundation model APIs, creating correlated risk exposures that did not exist seven years earlier.
Mapping the IFoA Framework Against U.S. Actuarial Standards
The IFoA report was written for a UK audience operating under a principles-based regulatory framework where the FCA and PRA set outcomes expectations rather than prescriptive rules. American actuaries operate under a different but equally relevant governance architecture: the Actuarial Standards of Practice (ASOPs) maintained by the Actuarial Standards Board, supplemented by state-level regulations and the NAIC model bulletin on AI governance. The IFoA’s structural-tensions thesis creates specific compliance questions under three ASOPs.
ASOP No. 56: Modeling
ASOP No. 56 requires actuaries to evaluate a model’s appropriateness for its intended use, assess data quality and model structure, perform validation testing, and ensure appropriate governance and controls. The IFoA report’s fluency-versus-reliability tension directly challenges ASOP No. 56’s validation framework. When a GenAI model produces text-based actuarial analysis rather than numerical outputs, what constitutes “validation testing”? Traditional backtesting methods assume quantitative outputs that can be measured against observed experience. GenAI outputs resist this comparison.
The American Academy of Actuaries’ October 2024 guidance on GenAI professionalism attempts to bridge this gap by arguing that existing ASOPs, particularly No. 56, apply directly to GenAI use even though they were developed before the technology existed. The guidance explicitly states that an actuary cannot use a GenAI result without validation and claim, “That’s what the model told me.” But the IFoA report suggests that this standard may be aspirational rather than operationally achievable when GenAI is embedded across pricing, reserving, and regulatory compliance workflows that collectively produce outputs faster than human reviewers can meaningfully assess.
ASOP No. 23: Data Quality
ASOP No. 23 requires actuaries to make reasonable efforts to determine the definition of each data element and to identify questionable values or inconsistent relationships. When GenAI systems produce intermediate analytical outputs that feed downstream actuarial calculations, the data quality assessment becomes recursive: the actuary must evaluate not just the input data but the AI-generated intermediate products that shape the final analysis. The IFoA report’s misleading-outputs concern maps directly to this challenge. GenAI can produce data summaries, trend analyses, and contextual interpretations that look authoritative but contain fabricated or distorted information, and ASOP No. 23’s data quality framework was not designed to catch this failure mode.
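One partial response is to treat any AI-generated summary as data in its own right and reconcile it against the source records before it feeds downstream work. The sketch below assumes a hypothetical `ai_summary` mapping of accident year to reported paid losses. The field names and the 0.5% tolerance are assumptions for illustration, not requirements drawn from ASOP No. 23.

```python
def reconcile_summary(source_claims, ai_summary, rel_tolerance=0.005):
    """Compare AI-reported totals by accident year with totals recomputed from source records."""
    recomputed = {}
    for claim in source_claims:
        year = claim["accident_year"]
        recomputed[year] = recomputed.get(year, 0.0) + claim["paid"]

    discrepancies = {}
    for year in sorted(set(recomputed) | set(ai_summary)):
        actual = recomputed.get(year)
        reported = ai_summary.get(year)
        if actual is None or reported is None:
            # Year present on only one side: a fabricated or omitted figure.
            discrepancies[year] = (reported, actual)
        elif abs(reported - actual) > rel_tolerance * abs(actual):
            discrepancies[year] = (reported, actual)
    return discrepancies


# Illustrative data only.
source_claims = [
    {"accident_year": 2023, "paid": 120_000.0},
    {"accident_year": 2023, "paid": 80_000.0},
    {"accident_year": 2024, "paid": 150_000.0},
]
ai_summary = {2023: 200_000.0, 2024: 175_000.0, 2025: 40_000.0}
discrepancies = reconcile_summary(source_claims, ai_summary)
print(discrepancies)
```

This catches only figures that can be recomputed. Contextual interpretations and trend narratives have no source total to reconcile against and still need independent review.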
ASOP No. 12: Risk Classification
The proposed revision of ASOP No. 12 adds Section 3.4 on “Potential for Unintended Bias,” responding to regulatory pressure from the NAIC model bulletin, the Colorado AI Act, and similar state initiatives. The IFoA report’s inclusion-versus-exclusion tension complicates the compliance picture. If the same analytical capabilities that improve risk segmentation can simultaneously sharpen discriminatory outcomes, then bias testing cannot be a one-time exercise performed at model deployment. It requires continuous monitoring as the model encounters new data distributions and edge cases, a requirement that current actuarial practice is not structured to deliver at scale.
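To make the shift from one-time testing to continuous monitoring concrete, the sketch below recomputes a simple disparity ratio for each reporting period and flags any period that falls below a threshold. The two-group setup, the "favorable outcome" counts, and the 0.8 threshold are placeholders. Neither the proposed ASOP No. 12 revision nor the IFoA report prescribes a metric or a cutoff.

```python
def favorable_rate(favorable, total):
    """Share of favorable outcomes for one group in one period."""
    return favorable / total


def monitor_disparity(periods, threshold=0.8):
    """Flag periods where the lower group rate divided by the higher one falls below threshold."""
    alerts = []
    for label, group_a, group_b in periods:
        rate_a = favorable_rate(*group_a)
        rate_b = favorable_rate(*group_b)
        higher = max(rate_a, rate_b)
        if higher == 0:
            continue  # no favorable outcomes in either group; ratio undefined
        ratio = min(rate_a, rate_b) / higher
        if ratio < threshold:
            alerts.append((label, round(ratio, 2)))
    return alerts


# Illustrative data only: (favorable outcomes, total decisions) per group.
periods = [
    ("2026-Q1", (90, 100), (85, 100)),
    ("2026-Q2", (88, 100), (66, 100)),
]
alerts = monitor_disparity(periods)
print(alerts)
```

The same model passes in the first period and fails in the second, which is the pattern a deployment-time test would miss.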
| IFoA “Uncomfortable Tension” | Relevant U.S. ASOP | Compliance Gap |
|---|---|---|
| Fluency vs. reliability | ASOP No. 56 (Modeling) | No validation framework for text-based actuarial outputs; backtesting assumes quantitative predictions |
| Misleading outputs | ASOP No. 23 (Data Quality) | Data quality checks not designed for AI-generated intermediate products that look authoritative but may contain fabrications |
| Inclusion vs. exclusion | ASOP No. 12 (Risk Classification) | One-time bias testing insufficient; same model can widen inclusion and sharpen exclusion depending on data distribution shifts |
| Human-in-the-loop vs. human control | ASOP No. 56 (Modeling) | Review requirements assume human cognitive capacity to evaluate outputs; GenAI volume can exceed that capacity |
| Concentration risk | NAIC Model Bulletin / State DOIs | Vendor diversification does not equal risk diversification when underlying foundation models share architecture and training data |
The IFoA Practice Board’s Evolving Role
The report’s publication coincides with the maturation of the IFoA’s AI, Data Science and Emerging Technologies Practice Board, launched in November 2025 as a member-led initiative to investigate the impact of AI on actuarial practice. The board oversees multiple research working parties, including groups focused on explainable AI (XAI) techniques and guidance notes for embedding XAI practices in actuarial work that relies on AI systems.
IFoA President Paul Sweeting FIA called AI “a defining force of our time” and emphasized that actuaries must leverage “technical skill, communication and professional oversight” to ensure AI functions properly within the profession’s ethical and regulatory boundaries.
This institutional development matters because the IFoA is the first major actuarial professional body to establish a dedicated practice board, as distinct from a committee or working group, specifically for AI governance. The SOA and CAS have data science working groups and continuing education initiatives, but neither has elevated AI to a practice-board-level governance function with the authority to issue technical guidance and shape professional standards. The IFoA’s structural choice reflects the “It’s Still Not Magic” report’s core argument: if GenAI risks are structural rather than incidental, then the institutional response must be structural too.
For American actuaries, the IFoA Practice Board’s output is worth tracking for two reasons. First, its research working parties are producing technical guidance on explainability, fairness testing, and model governance that will likely influence the ASB’s own standard-setting agenda. Second, multinational carriers and reinsurers operating under both IFoA and U.S. actuarial standards will face pressure to harmonize their internal AI governance frameworks, creating de facto convergence even without formal regulatory coordination.
What the Trade Press Missed
Coverage of the IFoA report across Insurance Edge, Insurance Business UK, Corporate Adviser, and Money Age has focused almost exclusively on the headline survey statistics: 70% identifying AI as a top risk, 75% saying risks have increased. These are newsworthy numbers, but they obscure the report’s more important analytical contribution.
The “uncomfortable tensions” framework is not a risk register. It is a structural argument about the limits of governance. Traditional risk management assumes that identified risks can be reduced to acceptable levels through appropriate controls. The IFoA report argues that for certain categories of GenAI risk, the controls and the capabilities are the same thing. You cannot make a language model less fluent to make it less prone to hallucination without also making it less useful for the tasks you deployed it to perform. You cannot reduce concentration risk without fragmenting the scale economies that make foundation models effective.
This has practical implications for how carriers frame their AI governance disclosures. When an insurer tells regulators that it has implemented “robust AI governance frameworks,” the IFoA thesis suggests that statement may be structurally misleading, not because the carrier is acting in bad faith, but because certain categories of GenAI risk cannot be governed to near-zero through any framework. The honest disclosure would be: “We have implemented governance controls that reduce, but cannot eliminate, the structural risks inherent in the technology we have chosen to deploy.” That framing is materially different for regulators, investors, and the actuaries signing opinions on the adequacy of reserves calculated with AI assistance.
Why This Matters for U.S. Actuarial Practice
The IFoA report’s structural-tensions thesis intersects with several active U.S. regulatory and standard-setting processes. The NAIC’s model bulletin on AI governance, now being extended to cover agentic AI, assumes that appropriate governance controls can manage AI deployment risk. The proposed ASOP No. 12 revision adds bias-testing requirements that assume bias can be detected and corrected through periodic review. The Federal Reserve’s SR 26-2, which replaced the 15-year-old SR 11-7 model risk framework, explicitly excluded generative and agentic AI from its scope, leaving a regulatory vacuum for carriers deploying these technologies.
If the IFoA is right that certain GenAI risks are structural rather than manageable, then the current U.S. regulatory approach has a foundational gap. Failures in carriers’ AI audit layers, documented across multiple recent deployments, suggest the gap may already be operational: carriers are building AI systems faster than their governance frameworks can evaluate them, and the governance frameworks themselves may lack the conceptual vocabulary to describe what they are failing to catch.
For actuaries specifically, the implications flow through three channels:
Appointed actuary opinions. When GenAI assists in reserve calculations, the appointed actuary’s opinion on reserve adequacy implicitly depends on the quality of AI-generated analysis. If that analysis contains subtle errors that resist traditional validation, the opinion’s reliability is structurally compromised in ways that current disclosure requirements do not capture.
Rate filing support. Actuaries supporting rate filings that incorporate AI-driven segmentation must now consider the IFoA’s inclusion-versus-exclusion tension. A model that improves loss prediction accuracy for the portfolio may simultaneously produce discriminatory outcomes for specific subgroups, and the ASOP No. 12 revision requires consideration of unintended bias without providing a clear standard for how much bias is too much.
Vendor due diligence. Actuaries evaluating AI vendors or internal AI tools should incorporate the concentration-risk insight into their assessments. When multiple carriers deploy AI systems built on the same foundation model, a vulnerability in that model creates correlated risk exposures that compound across the market. The Hartford’s algorithmic impact assessment framework provides one model for how carriers can begin to address this, but the industry lacks a standardized approach.
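The concentration point in the vendor due diligence channel can be illustrated with a Herfindahl-Hirschman index computed twice over the same deployments: once by vendor and once by underlying foundation model. This is a minimal sketch. The vendor and model names are hypothetical, and the report does not prescribe this metric.

```python
from collections import Counter


def hhi(shares):
    """Herfindahl-Hirschman index on shares summing to 1 (1.0 means fully concentrated)."""
    return sum(s * s for s in shares)


def concentration(deployments, key):
    """HHI of deployments grouped by the given key."""
    counts = Counter(d[key] for d in deployments)
    total = sum(counts.values())
    return hhi(c / total for c in counts.values())


# Hypothetical portfolio: four vendors, but three resell the same model.
deployments = [
    {"vendor": "VendorA", "foundation_model": "ModelX"},
    {"vendor": "VendorB", "foundation_model": "ModelX"},
    {"vendor": "VendorC", "foundation_model": "ModelX"},
    {"vendor": "VendorD", "foundation_model": "ModelY"},
]
vendor_hhi = concentration(deployments, "vendor")
model_hhi = concentration(deployments, "foundation_model")
print(vendor_hhi, model_hhi)
```

Measured by vendor the portfolio looks evenly spread, while measured by foundation model it is far more concentrated. An assessment that stops at the vendor level would miss that difference.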
Looking Forward: Risk Acceptance, Not Risk Elimination
The IFoA report’s most discomfiting implication is that the insurance industry’s traditional approach to risk management (identify, quantify, mitigate, and monitor) may not fully apply to GenAI. Some risks can be reduced but not eliminated because they are embedded in the technology’s architecture. The question shifts from “How do we manage this risk?” to “How much residual risk are we willing to accept in exchange for the productivity gains?”
That is an uncomfortable question for a profession built on quantifying and pricing risk. But it is precisely the question the report argues the industry must answer honestly rather than obscure behind governance frameworks that promise more control than they can deliver.
The uncomfortable tensions will not resolve themselves. They will need to be navigated case by case, carrier by carrier, and regulatory regime by regulatory regime. For actuaries, the practical task is not to reject GenAI deployment but to insist on honest risk disclosures that acknowledge structural limitations rather than claiming governance adequacy that the technology’s architecture makes impossible to guarantee.