Adversarial Self-Critique Rewrites AI Underwriting Governance

Joyjit Roy and Samaresh Kumar Singh submitted arXiv preprint 2602.13213 on January 21, 2026, presenting a framework that adds one agent to the standard agentic underwriting pipeline and changes everything downstream of it. Their system places a dedicated critic between the intake agent and the human reviewer, giving the decision a second layer of machine-generated scrutiny before it lands on an underwriter’s desk. Across 500 expert-validated commercial lines cases, that single architectural change reduced the hallucination rate from 11.3% to 3.8% and lifted decision accuracy from 92% to 96%. About fourteen weeks later, on April 28, Duck Creek Technologies launched its insurance-native Agentic AI Platform, delivering orchestration and governance tooling to P&C carriers at scale. And on August 2, 2026, the EU AI Act’s Annex III classification of insurance underwriting AI as high-risk takes full legal effect, activating Article 9’s documented-decision-rationale mandate for every carrier operating within EU jurisdiction. The paper arrived at precisely the moment the industry needed it.

Trade press covered Duck Creek’s platform launch and the EU AI Act timeline separately. No outlet has mapped the adversarial self-critique architecture to the specific governance requirements those two developments create, or analyzed the ASOP No. 56 documentation gaps that August 2026 is about to expose. The gap matters because the governance challenge in agentic underwriting is not that human reviewers are missing. Most carriers deploying agentic systems have human-in-the-loop controls. The problem is that single-agent systems leave the human reviewing a conclusion without the reasoning chain that produced it, and no governance standard accepts a conclusion without evidence.

The Architecture: Intake, Critic, Arbiter

Roy and Singh’s adversarial self-critique architecture assigns three functional roles to the agentic underwriting pipeline. An intake agent processes the submission, extracting structured risk data from unstructured documents and producing a preliminary recommendation. A critic agent then receives that recommendation and applies adversarial pressure: it attempts to find logical gaps, unsupported inferences, and factual inconsistencies in the intake agent’s reasoning before the recommendation advances. An arbiter layer, which may be automated or human depending on carrier configuration, resolves cases where the two agents reach conflicting assessments and escalates to a human reviewer when the conflict cannot be resolved algorithmically.

The critic’s role is the structural innovation. In conventional single-agent systems, the AI produces a recommendation and passes it to a human reviewer without any internal check on the reasoning chain that generated it. The human sees the conclusion, not the intermediate reasoning. When the conclusion is wrong because the underlying logic was hallucinated or factually unsupported, the human is working from contaminated input with no signal that contamination occurred. The underwriter who approves 200 AI-assisted submissions per day cannot interrogate the model’s reasoning on each one from first principles; the system’s efficiency value depends precisely on not doing that.

The critic does not re-examine the submission independently. It examines the intake agent’s reasoning and attempts to refute it. That constraint matters for understanding what the critic catches and what it does not. The critic is blind to facts the intake agent did not notice; it can only challenge the inferences drawn from the facts the intake agent reported. A critic that finds the intake agent’s reasoning sound advances the case. A critic that identifies material problems sends both its challenge and the original recommendation to the arbiter, producing a structured artifact: the intake agent said X, the critic found problems with X for reasons Y and Z, the resolution was A. When the agents agree, the decision advances with a record of the critic’s concurrence. When they disagree, the human reviewer receives both positions and a documented basis for the disagreement.

That attached record is the governance artifact that changes the compliance picture.
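
The record described above can be sketched as a small data structure. This is a minimal illustration, not the schema from Roy and Singh's paper: every class, field, and function name here is hypothetical, and the arbiter rule (escalate whenever any challenge is flagged as material) is a simplifying assumption that skips the algorithmic conflict resolution step.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Outcome(Enum):
    ADVANCED_WITH_CONCURRENCE = "advanced_with_concurrence"
    ESCALATED_TO_HUMAN = "escalated_to_human"


@dataclass
class IntakeAssessment:
    recommendation: str   # e.g. "accept", "decline", "refer" (hypothetical labels)
    reasons: List[str]    # the reasoning chain the critic will try to refute


@dataclass
class Challenge:
    reason_index: int     # which intake reason the critic is challenging
    objection: str        # why the critic considers that reason unsupported
    material: bool        # hypothetical flag: would the flaw change the outcome?


@dataclass
class DecisionRecord:
    submission_id: str
    intake: IntakeAssessment
    challenges: List[Challenge] = field(default_factory=list)
    outcome: Optional[Outcome] = None


def arbitrate(record: DecisionRecord) -> DecisionRecord:
    """Assumed rule: escalate on any material challenge, else record concurrence."""
    if any(c.material for c in record.challenges):
        record.outcome = Outcome.ESCALATED_TO_HUMAN
    else:
        record.outcome = Outcome.ADVANCED_WITH_CONCURRENCE
    return record


# Illustrative case: the critic disputes the first intake reason.
record = arbitrate(DecisionRecord(
    submission_id="SUB-0001",
    intake=IntakeAssessment("accept", ["clean loss history", "sprinklered building"]),
    challenges=[Challenge(0, "loss runs cover only two of five years", True)],
))
print(record.outcome.value)
```

Because the intake reasons, the critic's objections, and the outcome live in one record, the human reviewer and any later auditor see the same object.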

The Hallucination Numbers: What 11.3% Looks Like at Scale

An 11.3% hallucination rate in a single-agent underwriting system means that roughly one in nine decisions contains AI-generated content that is factually unsupported, internally inconsistent, or simply fabricated. For a carrier processing 10,000 commercial lines submissions annually, that is approximately 1,130 decisions carrying materially unreliable AI reasoning before a human reviewer has the opportunity to catch the error, assuming the error is visible in the recommendation rather than buried in the reasoning chain that produced it.

The reduction to 3.8% under adversarial self-critique does not eliminate hallucination. It reduces it from a systemic problem to a residual one: nearly four cases in every hundred still contain hallucinated content the critic did not catch. Human review remains essential, and the escalation logic of the arbiter layer is not a feature that can be disabled as AI systems mature. But the direction and magnitude of the change matter for governance documentation. A carrier that can demonstrate a reduction from 11.3% to 3.8% across its agentic underwriting pipeline holds a verifiable performance benchmark, a calibration target for the critic layer, and a documented improvement trajectory that regulators and boards can evaluate. A carrier operating a single-agent system holds none of those artifacts, because single-agent systems generate no self-critique record by design.

The accuracy improvement from 92% to 96% across 500 expert-validated cases is the companion metric. Decision accuracy in this context measures how often the agentic system’s recommendation matches the determination a qualified underwriter would reach on the same submission. A 92% baseline is commercially respectable; 96% is materially better. The four-point improvement translates to roughly 400 additional correct recommendations per 10,000 submissions, each of which represents either avoided mispricing, avoided adverse selection, or avoided coverage gap that would have surfaced at claim time. The critic agent earned its latency cost in that test population.
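
The scale arithmetic in this section is easy to reproduce. The sketch below applies the rates reported for the 500-case experiment to the hypothetical 10,000-submission carrier used above; it assumes, without evidence, that the experimental rates would transfer unchanged to a carrier's own submission population.

```python
def expected_cases(rate: float, volume: int) -> int:
    """Expected number of cases at a given rate, rounded to whole cases."""
    return round(rate * volume)


VOLUME = 10_000  # hypothetical annual commercial lines submissions

hallucinated_single = expected_cases(0.113, VOLUME)  # single-agent baseline
hallucinated_critic = expected_cases(0.038, VOLUME)  # with adversarial self-critique
correct_single = expected_cases(0.92, VOLUME)
correct_critic = expected_cases(0.96, VOLUME)

# Relative reduction in hallucination rate, as a fraction of the baseline.
relative_reduction = (0.113 - 0.038) / 0.113

print(hallucinated_single, hallucinated_critic)   # 1130 380
print(correct_critic - correct_single)            # 400
print(f"{relative_reduction:.1%}")                # 66.4%
```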

A review of model governance documentation at carriers deploying agentic underwriting in commercial lines shows that the audit trail for a three-agent decision is structurally different from a single-model output and requires documentation templates most carriers have not yet built. Single-agent systems produce a recommendation and a rationale. Three-agent systems produce a recommendation, a critique, a resolution, and where the agents disagreed, an escalation record. The difference is not cosmetic. The EU AI Act’s Article 9 requirements and ASOP No. 56’s modeling standards both depend on documentation of the decision process, not just the decision.

EU AI Act Article 9: The Documentation Mandate for High-Risk AI

The EU AI Act (Regulation 2024/1689) classifies insurance underwriting AI as a high-risk system under Annex III, the category covering AI used to assess and price risk in relation to life, health, and property casualty insurance. High-risk classification activates Article 9 obligations that apply identically to single-agent and multi-agent systems; the architecture does not change the regulatory requirement. What the adversarial self-critique architecture changes is how easily those requirements can be satisfied in practice, because the requirements assume the kind of decision artifact that adversarial critique generates automatically and that single-agent systems must produce through separate documentation overlays.

Article 9 requires providers of high-risk AI systems to establish, implement, document, and maintain a risk management system throughout the system’s lifecycle as a continuous iterative process. The regulation is specific about what “documented” means: technical documentation must cover the system’s architecture, the algorithms used, the risk assessments performed, the test results obtained, and the known limitations of the system. For each automated determination the system produces, the documentation must support a retrospective reconstruction of the reasoning that led to that determination. Regulators enforcing Article 9 do not accept a risk score and a timestamp as sufficient documentation; they need the decision chain.

For a three-agent architecture, that chain exists in the system’s operational logs as a natural byproduct of how the critic and arbiter layers function. The intake agent’s assessment, the critic’s challenge, and the arbiter’s resolution are timestamped artifacts produced in real time. For a single-agent system, no such chain exists; the system produced a recommendation, and if a human approved it, what the human actually reviewed and why they agreed with the AI remains undocumented in most carrier deployments. That documentation gap is what August 2, 2026 exposes.
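
A minimal sketch of how such a chain could be logged and later reconstructed follows. It assumes an append-only log of JSON lines with one event per pipeline stage; the event fields, stage names, and function names are all hypothetical and not taken from the paper or from any vendor platform.

```python
import io
import json
from datetime import datetime, timezone


def log_event(stream, submission_id: str, stage: str, payload: dict) -> None:
    """Append one timestamped event as a single JSON line."""
    event = {
        "submission_id": submission_id,
        "stage": stage,  # assumed stage names: "intake", "critic", "arbiter"
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }
    stream.write(json.dumps(event) + "\n")


def reconstruct_chain(stream, submission_id: str) -> list:
    """Return every logged event for one submission, in the order written."""
    stream.seek(0)
    events = [json.loads(line) for line in stream if line.strip()]
    return [e for e in events if e["submission_id"] == submission_id]


# An in-memory stream stands in for a real append-only store.
log = io.StringIO()
log_event(log, "SUB-0001", "intake", {"recommendation": "accept"})
log_event(log, "SUB-0002", "intake", {"recommendation": "decline"})
log_event(log, "SUB-0001", "critic", {"challenge": "loss history inference unsupported"})
log_event(log, "SUB-0001", "arbiter", {"resolution": "escalated_to_human"})

chain = reconstruct_chain(log, "SUB-0001")
print([event["stage"] for event in chain])
```

The point of the sketch is that retrospective reconstruction becomes a query over records the pipeline already wrote, rather than a separate documentation exercise.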

The Annex III high-risk scope covers systems that materially influence a pricing or coverage determination, regardless of whether a human formally approves the final output. For agentic underwriting systems where an intake agent delivers a decision-ready submission that an underwriter acts on without deep interrogation, the regulatory exposure is real even when the human technically signs off on each case. EU enforcement will look at the substance of the AI’s role in the decision, not just the formality of human sign-off. Carriers whose underwriters approve AI recommendations at 200-per-day rates without documented review of the reasoning chain behind each one are not insulated from Article 9 liability by that approval.

The Gap Single-Agent Systems Cannot Bridge

Human-in-the-loop is the standard safety feature at carriers deploying agentic underwriting today, and it is structurally insufficient for high-risk AI governance under the EU AI Act, the NAIC’s evolving evaluation framework, or ASOP No. 56. The NAIC’s Big Data and Artificial Intelligence Working Group identified at its Spring 2026 National Meeting in San Diego that human-in-the-loop escalation does not address the intermediate steps in an agentic chain where errors can compound before reaching the human decision point. Roy and Singh’s adversarial self-critique framework is the published research answer to that specific problem.

When a single-agent system passes a recommendation to a human reviewer, the human receives an output. The model’s internal reasoning is opaque; the underwriter sees the recommendation and exercises professional judgment about whether to follow it. If the underlying AI reasoning was hallucinated, the underwriter’s judgment operates on false premises and may or may not catch the error, depending on whether the error is visible in the output or buried in the inference chain that produced it. In a regulatory audit, the carrier cannot produce evidence that the AI reasoning was sound on any individual case, because the single-agent system generated no artifact to that effect. It produced a recommendation, not a reasoned argument subject to challenge.

The adversarial self-critique architecture produces that artifact automatically. The critic’s challenge is a structured document: the intake agent assessed the submission as X for reasons A, B, and C; the critic identified that reason B depends on an unsupported inference about the applicant’s loss history; the arbiter escalated the case to a human reviewer with both positions attached. When the carrier faces an Article 9 documentation request from a European regulator, an Exhibit C response in the NAIC’s 12-state evaluation pilot, or an ASOP No. 56 compliance review, that artifact chain answers the question. The single-agent system cannot.

Duck Creek’s AI Assurance layer, launched April 28, 2026, addresses part of this problem at the platform level: decision traceability, auditability, observability, compliance controls, and explainability for every AI-driven action within the Duck Creek ecosystem. CEO Hardeep Gulati framed the platform’s ambition: “Agentic AI will redefine how insurance operates, enabling carriers to move from manual, fragmented processes to orchestrated end-to-end decisioning.” Platform-level audit trails capture what agents did and when. The adversarial self-critique architecture captures why an agent reached a conclusion and whether that conclusion survived structured challenge. The two layers are complementary, not substitutable. Platform governance without reasoning-level critique produces timestamped decisions that regulators cannot evaluate for soundness. Reasoning-level critique without platform governance produces valuable artifacts that may not be systematically logged or retrievable.

ASOP No. 56 and the Documentation Templates That Do Not Exist Yet

ASOP No. 56 (Modeling) requires actuaries to understand the model, its intended purpose, and its limitations, and to document that understanding in a way that supports the model’s use in the actuarial work product. For a three-agent underwriting system, the documentation scope is substantially larger and structurally different from anything the standard was written to cover.

The intake agent has a defined scope: process submission documents, extract structured risk data, produce a preliminary recommendation. Documenting its limitations under ASOP No. 56 follows a recognizable pattern: training data characterization, known failure modes, conditions under which outputs are reliable, performance metrics on representative test populations. The 11.3% hallucination baseline from Roy and Singh’s experiment is the kind of metric ASOP No. 56 documentation should capture for a production intake agent, adjusted for the carrier’s specific submission population and commercial lines mix.

The critic agent introduces a complication with no direct precedent in existing ASOP templates. Its output is not a risk assessment; it is an assessment of an assessment. Documenting the critic’s limitations means characterizing what categories of error the critic will fail to catch, how critic performance varies by submission type and complexity, and whether the critic introduces systematic biases of its own. A critic trained primarily on commercial property submissions may perform less reliably on professional liability or management liability submissions, where the documentation structure and relevant risk factors differ substantially. A critic calibrated against a general commercial lines training population may be poorly positioned to challenge specialty or surplus lines reasoning. ASOP No. 56 requires actuaries using models based on algorithms or data to evaluate whether the model is appropriate for the intended purpose; the critic’s purpose is evaluating another model’s reasoning, which is a qualitatively different task than any actuarial model validation template currently addresses.

The arbiter layer requires its own documentation block. What is the threshold for escalation? When both agents speak in natural language rather than probability scores, how is disagreement quantified and compared against an escalation threshold? What is the documented false-negative rate for escalation, meaning the share of cases where material AI errors existed but the critic did not challenge the intake agent, so no escalation occurred? Those parameters are the governance inputs that actuaries signing off on agentic underwriting systems must understand and document. They are parameters that most carriers have not yet formalized, because the adversarial self-critique architecture is new enough that no carrier has had production experience to draw on. Roy and Singh’s 500-case experiment is the only published calibration reference available as of mid-2026.
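
The false-negative escalation rate defined above can be estimated from an expert-labeled validation sample. The sketch below is illustrative only: the function name and input layout are hypothetical, and the sample counts are invented to show the calculation rather than drawn from any carrier or from the paper.

```python
from typing import Iterable, Tuple


def false_negative_escalation_rate(cases: Iterable[Tuple[bool, bool]]) -> float:
    """Share of cases with a material error that the critic did not challenge.

    Each case is (has_material_error, critic_challenged), where the first
    value comes from expert review of the intake agent's reasoning.
    """
    with_error = [challenged for has_error, challenged in cases if has_error]
    if not with_error:
        raise ValueError("sample contains no cases with material errors")
    missed = sum(1 for challenged in with_error if not challenged)
    return missed / len(with_error)


# Invented sample: 10 cases with material errors (8 challenged, 2 missed)
# and 90 clean cases, 5 of which the critic challenged anyway.
sample = (
    [(True, True)] * 8
    + [(True, False)] * 2
    + [(False, True)] * 5
    + [(False, False)] * 85
)

fn_rate = false_negative_escalation_rate(sample)
print(f"{fn_rate:.0%}")  # 20%
```

Note that the denominator is the number of cases with material errors, not the whole sample, so the estimate is only as good as the expert labeling and the number of erroneous cases the sample happens to contain.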

The documentation gap is practical, not theoretical. An appointed actuary who signs off on a rate filing or a reserve opinion that relies on agentic underwriting outputs, in any state where the carrier operates, will face an actuarial standard compliance question that current templates cannot fully answer. The actuary must document their basis for concluding that the model is appropriate for its intended use. For a three-agent system where the critic’s performance on the carrier’s specific submission mix has never been independently validated, that documentation is either incomplete or it must be built from scratch. The actuaries who build those templates now will set the standard that peers and regulators reference for the next decade.

Calibrating the Critic: Agreement Rates as a Governance KPI

The emerging industry standard for AI governance metrics is the agreement rate: how often does the AI reach the same conclusion a qualified human professional would? AIG’s disclosure of an 88% AI-adjuster fraud alignment rate using Anthropic’s Claude during Q1 2026 earnings established a carrier-level baseline for what voluntary governance disclosure looks like. For three-agent underwriting systems with adversarial self-critique, agreement rates appear in two distinct places in the governance framework, and the interaction between them defines the system’s calibration.

The first agreement rate is between the AI system’s final recommendation and the human reviewer’s ultimate decision. This is the metric AIG reported for fraud detection and the metric the NAIC’s evaluation pilot will eventually require carriers to document for high-risk AI applications. In a three-agent system, the final recommendation that reaches the human is already critic-filtered; the agreement rate at the human review stage should be higher than in a single-agent system, because the critic has already removed the most obviously flawed recommendations before the human sees them.

The second agreement rate is between the intake agent and the critic agent, measured as the share of cases where the critic concurs with the intake agent’s assessment without raising a material challenge. This internal concurrence rate is the governance calibration parameter that determines how much human oversight the system actually provides. If the critic agrees with the intake agent 98% of the time, it is escalating two cases per hundred for additional review, which may be appropriate for a carrier processing standard commercial renewals with low complexity variance. If it agrees only 70% of the time on the same submission population, either the critic is overreaching or the intake agent is generating unreliable recommendations at a rate that warrants far more human attention than the carrier anticipated.

Carriers implementing adversarial self-critique must develop line-of-business-specific concurrence rate baselines and document the methodology used to set them. A concurrence rate calibrated on commercial property submissions cannot be applied to professional liability or specialty surplus lines submissions without validation on those populations specifically. The calibration documentation belongs in the ASOP No. 56 model documentation alongside the intake agent validation and the critic performance characterization.
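
A line-of-business concurrence baseline can be computed directly from the critic's logged outcomes. In the sketch below the line-of-business labels and case counts are invented, chosen only to reproduce the 98% and 70% figures discussed above; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def concurrence_by_line(cases: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Concurrence rate per line of business.

    Each case is (line_of_business, critic_concurred), where concurrence means
    the critic raised no material challenge to the intake agent's assessment.
    """
    totals = defaultdict(int)
    concurred = defaultdict(int)
    for line, critic_concurred in cases:
        totals[line] += 1
        concurred[line] += int(critic_concurred)
    return {line: concurred[line] / totals[line] for line in totals}


# Invented data: 50 commercial property cases, 20 professional liability cases.
cases = (
    [("commercial_property", True)] * 49
    + [("commercial_property", False)] * 1
    + [("professional_liability", True)] * 14
    + [("professional_liability", False)] * 6
)

rates = concurrence_by_line(cases)
for line, rate in sorted(rates.items()):
    print(f"{line}: {rate:.0%} concurrence, {1 - rate:.0%} escalated")
```

Keeping the rates separate by line makes it visible when a baseline set on one population is being applied to another.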

An additional calibration problem concerns direction of disagreement. When the critic challenges the intake agent’s recommendation, is it more often finding that the intake agent was too permissive (approving risks it should have declined or priced more conservatively) or too restrictive (declining or surcharging risks it should have accepted)? Systematic critic bias in either direction affects the carrier’s loss ratio and its distribution relationships. A critic that consistently pushes intake agent recommendations toward more restrictive outcomes may be protecting underwriting quality or it may be introducing its own systematic bias that regulators following Colorado’s SB 21-169 requirements or the NAIC’s fairness testing framework will identify as discriminatory. That directionality analysis belongs in the governance documentation alongside the aggregate concurrence rate.
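
One simple way to start the directionality analysis is to tally the critic's challenges by direction. This sketch assumes each challenge has already been labeled as finding the intake agent too permissive or too restrictive; the labels, function name, and counts are hypothetical. A production analysis would also break the tally down by segment and test whether any imbalance is statistically meaningful.

```python
from collections import Counter
from typing import Dict, Iterable

DIRECTIONS = ("too_permissive", "too_restrictive")  # assumed labels


def challenge_direction_shares(directions: Iterable[str]) -> Dict[str, float]:
    """Share of critic challenges in each direction."""
    counts = Counter(directions)
    total = sum(counts[d] for d in DIRECTIONS)
    if total == 0:
        raise ValueError("no labeled challenges to analyze")
    return {d: counts[d] / total for d in DIRECTIONS}


# Invented data: 40 challenges, 30 of which found the intake agent too permissive.
# A "too_permissive" finding pushes the final outcome in the restrictive direction.
labels = ["too_permissive"] * 30 + ["too_restrictive"] * 10

shares = challenge_direction_shares(labels)
print(shares)  # {'too_permissive': 0.75, 'too_restrictive': 0.25}
```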

High-Volume Commercial Lines: The Latency Tradeoff

The operational objection to multi-agent architectures in high-volume commercial lines is latency. Leading carriers have already compressed underwriting timelines from multiple business days to under 15 minutes for standard commercial submissions, and the efficiency case for agentic underwriting depends partly on maintaining that speed. A second agent reviewing every intake agent recommendation before it advances adds processing time, the magnitude of which Roy and Singh did not quantify in their 500-case experiment.

Capgemini’s research on agentic AI in underwriting places the function second among insurer agentic deployment priorities at 68%, behind only customer service at 70%. That ranking reflects genuine commercial appetite for underwriting automation even where governance overhead is added. The latency question is therefore not whether to add adversarial critique but where in the submission workflow to place it. Duck Creek’s Agentic Underwriting Workbench applies AI agents first to submission intake and triage, prioritizing high-value opportunities before the full review cycle begins. The adversarial self-critique layer would sit downstream of that initial triage, applying critic review to submissions that have already been scored for priority. That sequencing reduces the latency exposure relative to applying critic review to every incoming submission in arrival order regardless of complexity.

The right implementation structure for most carriers maps the NAIC’s own risk taxonomy onto the critic-agent deployment decision. The NAIC’s Spring 2026 National Meeting produced a four-tier risk classification: unacceptable, high, medium, and low risk. AI systems producing automated coverage denials, significant surcharges, or premium decisions above a defined threshold almost certainly fall in the high-risk tier. Standard renewal processing for in-force accounts within appetite may fall in the medium tier. Mandatory critic-agent review for high-tier decisions, optional for medium-tier, and streamlined intake-only processing for low-tier decisions delivers Article 9 compliance at the decision types that regulation most directly targets while preserving the throughput that makes agentic underwriting commercially viable.
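
The tiered routing just described can be sketched as a small lookup. Everything specific here is an assumption made for illustration: the decision-type labels, the premium threshold, and the example of a low-tier decision are hypothetical, and the unacceptable tier is omitted because systems in that tier would not be deployed at all. Unknown decision types default to the high tier so that gaps in the criteria fail toward more review.

```python
from enum import Enum


class Tier(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


# Hypothetical tier-assignment criteria; a carrier would document its own.
HIGH_TYPES = {"coverage_denial", "significant_surcharge"}
MEDIUM_TYPES = {"renewal_within_appetite"}
LOW_TYPES = {"certificate_request"}
PREMIUM_THRESHOLD = 250_000  # assumed threshold, in the carrier's currency

REVIEW_MODE = {
    Tier.HIGH: "mandatory_critic",
    Tier.MEDIUM: "optional_critic",
    Tier.LOW: "intake_only",
}


def assign_tier(decision_type: str, premium: float) -> Tier:
    """Map a decision to a risk tier; unknown types default to HIGH."""
    if decision_type in HIGH_TYPES or premium >= PREMIUM_THRESHOLD:
        return Tier.HIGH
    if decision_type in MEDIUM_TYPES:
        return Tier.MEDIUM
    if decision_type in LOW_TYPES:
        return Tier.LOW
    return Tier.HIGH


def review_mode(decision_type: str, premium: float) -> str:
    """Return the critic review mode for a decision."""
    return REVIEW_MODE[assign_tier(decision_type, premium)]


print(review_mode("coverage_denial", 40_000))
print(review_mode("renewal_within_appetite", 40_000))
print(review_mode("renewal_within_appetite", 400_000))
```

Writing the criteria as explicit constants also gives the actuary a concrete object to document: the tier-assignment rules are the code.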

That tiered approach also solves a documentation scope problem. ASOP No. 56 documentation for a system that applies critic review to every submission is far more extensive than documentation for a system that applies critic review selectively based on defined risk criteria. Carrying the full documentation burden uniformly across all submission types is unnecessary and resource-intensive. Tiered deployment with documented tier-assignment criteria gives actuaries a bounded and defensible scope for model validation.

What Actuaries in Model Governance Roles Need to Do Now

Two things happen simultaneously as agentic underwriting scales through 2026. The regulatory requirement becomes more specific, and the technical scope of what actuaries must understand and document grows beyond what existing ASOP templates were designed to address. The adversarial self-critique architecture from arXiv 2602.13213 is the most structurally complete published answer to the governance requirement as of mid-2026. Carriers that adopt the architecture or its functional equivalent gain a documented decision chain that satisfies Article 9’s continuous risk management mandate without requiring manual documentation overlays on top of a system not built to produce them.

Actuaries validating three-agent underwriting systems need four documentation elements that no current industry template provides. First, a critic performance specification: what is the critic’s measured error catch rate on a representative sample of the carrier’s submission population, and how does that rate vary by line of business and submission complexity? Second, a calibration methodology for the concurrence threshold: what concurrence rate between intake and critic agents triggers escalation, how was that threshold derived, and what is the documented rationale for setting it at that level for each deployment context? Third, a directionality analysis: does the critic exhibit systematic bias toward more restrictive or more permissive outcomes across protected class and geographic segments? Fourth, a false-negative escalation rate: what share of cases with material errors in the intake agent’s reasoning were advanced without critic challenge, and how was that rate measured?

Those four elements are the actuarial governance infrastructure whose absence defined the AI governance gap in actuarial practice well before multi-agent architectures became the carrier deployment standard. The adversarial self-critique framework makes building that infrastructure technically tractable in a way that single-agent systems do not. The EU AI Act’s August 2 deadline is not a distant compliance target. It is the deadline by which carriers operating in EU markets need the infrastructure in place. Actuaries whose carriers have not begun building it yet are six weeks from a gap that will not narrow by waiting.

Carriers that build the documentation templates now, on the architecture that Roy and Singh specified and that Duck Creek’s platform makes operationally deployable, will define the governance standard that regulators reference and peers replicate. Carriers that wait for explicit regulatory guidance will be measured against a standard they had no role in designing, at a compliance cost substantially higher than building it from a published technical foundation that is already available.

Sources

Agentic AI for Commercial Insurance Underwriting with Adversarial Self-Critique (January 2026) — Joyjit Roy and Samaresh Kumar Singh, arXiv preprint 2602.13213
Duck Creek Launches Insurance-Native Agentic AI Platform (April 28, 2026) — PR Newswire
EU AI Act Article 9: Risk Management System — EU Artificial Intelligence Act (Regulation 2024/1689)
EU AI Act Annex III: High-Risk AI Systems — EU Artificial Intelligence Act classification framework
Agentic AI in Underwriting: The Future of Insurance Decision Making at Scale — Capgemini
Agentic AI in Insurance Underwriting: Six Use Cases — hyperexponential
NAIC Big Data and AI Working Group Spring 2026 Materials (March 24, 2026) — National Association of Insurance Commissioners
CAS AI Primer: A Practical Guide for Actuaries (2026) — Casualty Actuarial Society