Inside EXL’s Insurance LLM Patent: Domain-Specific AI for Claims, Regulatory Reporting, and Property Risk

When EXL announced its Insurance LLM in September 2024, the company claimed it was the first industry-specific large language model built exclusively for insurance claims and underwriting tasks. The product launch, developed in partnership with NVIDIA using the NeMo framework and H100 GPUs, came with a bold performance claim: 20 to 30% higher accuracy on insurance-specific tasks than general-purpose models including GPT-4, Claude, and Gemini. Less than a year later, the patent behind that model was granted, and its claims reveal an architecture considerably more complex than a simple fine-tuned LLM.

In this article, we examine three of EXL's most insurance-specific patents in detail. US 12,399,924 (the Insurance LLM) protects a multi-domain signal evaluation system that combines data anonymization, domain categorization, ensemble model architectures, and guidance artifact generation. US 12,468,696 (Regulatory Reporting Assist.AI) protects an AI-powered platform for statutory compliance that includes flux analysis, historical query repositories, and automated validation against regulatory schemas. US 12,387,271 (Property Insights) protects a multi-model pipeline for property risk prediction using aerial imagery, object distance calculations, and location attribute analysis.

Together, these three patents represent EXL's bid to own the analytical intelligence layer of insurance operations, sitting above the document extraction pipeline we analyzed in our previous article and below the carrier-facing decision points where actuaries, underwriters, and claims adjusters make final calls.

Patent US 12,399,924: The Insurance LLM Architecture

Granted on August 26, 2025, this patent carries the clinical title "Robust methods for multi-domain signal evaluation systems" but is, at its core, an insurance claims adjudication and underwriting decision support system. The patent was filed on March 6, 2025, claims priority to an Indian application filed September 11, 2024, and names nine inventors led by Gaurav Iyer and Arturo Devesa.

What the independent claims protect: Claim 1 describes a method that receives a digital artifact containing unstructured alphanumeric signal data associated with a "decision logic condition" (the patent's abstracted term for scenarios like insurance claims), identifies non-compliant signals that violate compliance parameters, generates masking elements mapped to those non-compliant signals, creates a second digital artifact substituting the masked signals, determines signal domain categories for the masked artifact, generates a composite alphanumeric signal using a machine learning model based on the domain categories, and generates guidance artifacts for display that include both a human-readable narrative recommending specific actions and predicted outcomes of those actions.

Translated from patent language to insurance operations: the system ingests a claims file, identifies and anonymizes protected information while preserving analytical context, classifies the content by domain (medical, legal, financial), generates a summarized analysis, and presents an adjuster with recommended actions and predicted outcomes.

The anonymization architecture: The patent dedicates significant detail to its data de-identification approach, which is more sophisticated than simple redaction. Rather than replacing protected information with blank spaces or generic markers, the system uses SHA-256 hashing to convert identified PII and PHI entities into consistent n-digit hash values (16 bits or fewer). The consistency is key: the same entity produces the same hash value across different sections of the document, which means the anonymized version preserves relational context. If a patient name appears in both a medical summary and a billing record, the hash value is identical in both locations, allowing downstream models to recognize that the same individual is referenced without ever accessing the actual name.

The patent describes a four-step anonymization pipeline: PII/PHI identification (using HIPAA safe harbor entity lists and healthcare-specific detection tools), PII/PHI anonymization (SHA-256 conversion to consistent hash values), PII/PHI validation (human validators check for leakage), and post-processing (regex-based pattern matching catches any remaining exposed information). This layered approach acknowledges a reality of insurance data processing: a single missed identifier in a medical record can create a compliance violation, so redundant detection layers are necessary.
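
To make the consistency property concrete, here is a minimal sketch of the anonymization and post-processing steps. It is an illustration under stated assumptions, not the patented implementation: the salt, the 12-digit token width, the `[ID_...]` marker format, and the function names are all hypothetical, and the identification step is replaced by a hard-coded entity list.

```python
import hashlib
import re

# Hypothetical salt and token width; neither value comes from the patent.
SALT = "demo-salt"
TOKEN_DIGITS = 12

def mask_token(entity: str) -> str:
    """Map an entity to a stable n-digit token derived from SHA-256."""
    normalized = " ".join(entity.lower().split())
    digest = hashlib.sha256((SALT + normalized).encode("utf-8")).hexdigest()
    # Reduce the 256-bit digest to a fixed number of decimal digits.
    return str(int(digest, 16) % 10**TOKEN_DIGITS).zfill(TOKEN_DIGITS)

def anonymize(text: str, entities: list[str]) -> str:
    """Replace every occurrence of each detected entity with its token."""
    # Longest entities first, so a full name is masked before a first name.
    for entity in sorted(entities, key=len, reverse=True):
        marker = f"[ID_{mask_token(entity)}]"
        text = re.sub(re.escape(entity), marker, text, flags=re.IGNORECASE)
    return text

# Post-processing: a regex pass for identifiers the detector may have missed.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def post_process(text: str) -> str:
    return SSN_PATTERN.sub(lambda m: f"[ID_{mask_token(m.group())}]", text)

medical = "Patient John Smith was treated for a fractured wrist."
billing = "Invoice for JOHN SMITH, SSN 123-45-6789."
detected = ["John Smith"]  # stand-in for the PII/PHI identification step

masked_medical = post_process(anonymize(medical, detected))
masked_billing = post_process(anonymize(billing, detected))
```

Both masked strings carry the same token for the patient, so a downstream model can link the medical summary to the billing record without seeing the name. The human validation step has no code equivalent here and is omitted.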

The ensemble model architecture: The patent describes what is effectively a mixture-of-experts system where multiple specialized models are each trained on different aspects of insurance operations. The description names specific domains: claims processing, fraud detection, risk assessment, and customer support. A gating network dynamically selects the most relevant expert model for each incoming task or data input.

Having tracked the evolution of the Insurance LLM since its September 2024 launch, we think this architecture goes a long way toward explaining the claimed 20 to 30% accuracy improvement over general-purpose models. Rather than fine-tuning a single model on all insurance tasks (which risks overfitting to dominant task types at the expense of less common ones), the system routes each task to a specialist. A claims reconciliation query goes to the claims expert. A fraud pattern analysis goes to the fraud expert. An underwriting risk assessment goes to the risk expert. The gating network handles the routing.
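
The routing idea can be sketched in a few lines. This is a toy under loud assumptions: a real gating network is learned, whereas the keyword sets, expert names, and scoring below are hypothetical stand-ins chosen only to show softmax weights and top-1 expert selection.

```python
import math

# Hypothetical keyword profiles standing in for a learned gating network.
EXPERT_KEYWORDS = {
    "claims": {"claim", "reconciliation", "settlement", "adjuster"},
    "fraud": {"fraud", "suspicious", "staged", "duplicate"},
    "risk": {"underwriting", "risk", "exposure", "premium"},
    "support": {"policyholder", "billing", "question", "renewal"},
}

def gate(query: str) -> dict[str, float]:
    """Return softmax routing weights over the experts for a query."""
    tokens = set(query.lower().split())
    # Raw score: number of query tokens that overlap each expert's profile.
    scores = {name: len(tokens & kws) for name, kws in EXPERT_KEYWORDS.items()}
    exps = {name: math.exp(score) for name, score in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}

def route(query: str) -> str:
    """Top-1 routing: pick the expert with the largest gate weight."""
    weights = gate(query)
    return max(weights, key=weights.get)

chosen = route("possible staged accident and duplicate invoice fraud")
```

Here `chosen` is the fraud expert because three query tokens match its profile and none match the others. A production gate would score learned embeddings rather than keywords, and might blend several experts instead of picking one.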

The patent description also details the training pipeline: Low-Rank Adaptation (LoRA) weights merged with original model weights for inference, tensor parallelism to distribute computation across GPUs, supervised fine-tuning targeting 1 to 5% of total parameters, and adaptive batching to optimize throughput. EXL's product disclosures indicate this training draws on more than 10 years of proprietary, domain-specific labeled data, a dataset that competitors would struggle to replicate because it comes from EXL's own history of processing claims on behalf of carrier clients.
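
For readers unfamiliar with LoRA, the merge step follows the standard formulation W' = W + (alpha / r) * B * A, where B and A are the low-rank adapter matrices. The sketch below uses plain Python lists and tiny matrices; the rank, scaling factor, and dimensions are our assumptions for illustration and are not taken from the patent.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(w, b, a, alpha, r):
    """Return W + (alpha / r) * B @ A, the merged weight used at inference."""
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]

def adapter_fraction(d_out, d_in, r):
    """Adapter parameters as a share of the frozen matrix they adapt."""
    return r * (d_out + d_in) / (d_out * d_in)

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (2 x 2)
B = [[1.0], [2.0]]             # adapter B (2 x 1), rank r = 1
A = [[0.5, 0.0]]               # adapter A (1 x 2)

merged = merge_lora(W, B, A, alpha=2.0, r=1)
share = adapter_fraction(4096, 4096, 64)
```

With these toy values `merged` is `[[2.0, 0.0], [2.0, 1.0]]`, and a rank-64 adapter on a 4096 by 4096 matrix trains about 3.1% as many parameters as the matrix holds, which is the order of magnitude the 1 to 5% figure suggests. Merging before inference means the adapter adds no extra latency at serving time.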

The guidance artifact system: Claims 2 and 3 extend the core method with interactive capabilities. Claim 2 protects real-time query response: when a user (the patent names claim adjusters, nurses, underwriters, case managers, actuaries, and others) submits a natural-language question about the decision logic condition, the system generates a narrative response using both the composite signal and the domain categories. Claim 3 protects cost analysis: when a user selects a displayed guidance artifact, the system accesses historical expenditure records and generates an approximate cost analysis report for the recommended action.

For claims operations, this means an adjuster viewing a bodily injury claim could receive a recommended settlement range alongside a predicted outcome probability, then drill into a cost analysis showing historical expenditure data for similar claims. The patent explicitly mentions "financial expense, consumer satisfaction level, regulatory compliance, risk assessment" as categories of predicted outcomes, which maps directly to the metrics that claims management teams track.
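
A cost analysis of this kind reduces, at its simplest, to summarizing historical expenditure for the selected action. The sketch below assumes a flat list of hypothetical (action, paid amount) records and a hypothetical `cost_report` helper; the patent does not specify the record layout or the statistics reported.

```python
from statistics import mean, median

# Hypothetical historical expenditure records: (recommended action, paid USD).
HISTORY = [
    ("physical_therapy", 4200.0),
    ("physical_therapy", 5100.0),
    ("physical_therapy", 3900.0),
    ("surgery_referral", 28500.0),
    ("surgery_referral", 31000.0),
]

def cost_report(action: str, history=HISTORY) -> dict:
    """Summarize historical spend for one recommended action."""
    costs = sorted(amount for act, amount in history if act == action)
    if not costs:
        return {"action": action, "n": 0}
    return {
        "action": action,
        "n": len(costs),
        "low": costs[0],
        "median": median(costs),
        "mean": round(mean(costs), 2),
        "high": costs[-1],
    }

report = cost_report("physical_therapy")
```

For the three physical therapy records, the report gives a range of 3,900 to 5,100 with a median of 4,200 and a mean of 4,400.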

What this patent does NOT protect: The patent does not claim the specific training data, the NVIDIA infrastructure, or the LoRA fine-tuning methodology (all of which are described but not independently claimed). It does not protect specific decision logic rules for claims adjudication. It does not claim the gating network architecture in a standalone fashion. And it does not describe or claim a method for evaluating whether its own recommendations are accurate over time, which is a notable absence for a system making actionable claims recommendations.

Patent US 12,468,696: The Regulatory Reporting Platform

Granted on November 11, 2025, this patent protects EXL's Regulatory Reporting Assist.AI product and targets a workflow that is central to actuarial practice: statutory financial reporting and regulatory compliance. The patent names only two inventors, Suresh Murjani and Prashant Poddar, suggesting a more focused development effort than the nine-inventor Insurance LLM patent.

What the independent claims protect: Claim 1 describes a system that receives a natural-language query about a first set of digital artifacts (financial statements from a current reporting period) associated with a monitored system, retrieves a second mapped set of artifacts (prior period statements), calculates a performance differential report showing changes in operational performance characteristics between periods, determines historical dialogue records containing prior queries that satisfy a similarity threshold to the current query along with their corresponding formatted responses, inputs the differential report and historical records into a generative ML model to produce a narrative response, and transmits the response for display.

In insurance reporting terms: an analyst asks "Why did the loss ratio increase in Q4?", the system pulls current and prior quarter statutory statements, calculates the flux between them, finds prior regulatory queries about similar loss ratio movements and their accepted explanations, and generates a narrative response informed by both the numerical analysis and the institutional knowledge embedded in historical regulatory interactions.

The flux analysis engine: The patent's differential analysis is more than simple period-over-period subtraction. Claims 2 through 6 describe a system that identifies data fluctuation patterns between periods, retrieves user-validated narratives describing fluctuation patterns from prior differential reports, generates supplemental narrative responses describing the identified patterns, detects anomalous fluctuations that exceed tolerance thresholds, retrieves historical records of similar anomalies including their identified sources within prior reporting artifacts, and uses machine learning to identify candidate sources of the current anomaly. Claim 6 adds a visualization layer: a hierarchical map of interactive visual elements where anomalous patterns are marked with distinctive visual indicators.
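
A stripped-down version of the flux calculation and tolerance check might look like the following. The line-item names, the sample values, and the 3.0 tolerance are hypothetical; the z-score test is one of the example techniques the patent description mentions, not necessarily the one EXL deploys.

```python
from statistics import mean, pstdev

def flux_report(current: dict, prior: dict) -> dict:
    """Period-over-period change for every line item present in both periods."""
    report = {}
    for item in current.keys() & prior.keys():
        change = current[item] - prior[item]
        pct = change / abs(prior[item]) if prior[item] else None
        report[item] = {"change": change, "pct_change": pct}
    return report

def is_anomalous(history: list[float], latest: float, z_tolerance: float = 3.0) -> bool:
    """Flag a fluctuation whose z-score against prior fluctuations exceeds the tolerance."""
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_tolerance

prior_q = {"loss_ratio": 0.62, "expense_ratio": 0.30}
current_q = {"loss_ratio": 0.71, "expense_ratio": 0.31}
flux = flux_report(current_q, prior_q)

# Hypothetical quarter-over-quarter changes observed in earlier periods.
past_changes = [0.01, -0.02, 0.00, 0.02, -0.01]
loss_flag = is_anomalous(past_changes, flux["loss_ratio"]["change"])
expense_flag = is_anomalous(past_changes, flux["expense_ratio"]["change"])
```

The nine-point jump in the loss ratio is flagged, while the one-point move in the expense ratio is not. In the patented system, a flagged item would then trigger retrieval of similar historical anomalies and their identified sources.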

The detailed description makes the insurance context explicit, referencing statutory reporting cycles, NAIC compliance requirements, expense analysis, underwriting insights, reinsurance insights, investment insights, financial ratios (combined ratio, loss ratio, expense ratio, operating ratio), capital adequacy metrics (total adjusted capital, total capital required, policyholder surplus ratio), and detailed flux analysis for line items of financial statements. This is not a generic reporting tool repurposed for insurance. It was designed from the ground up for the statutory reporting workflow.

The historical query repository: One of the most strategically significant features is the system's ability to learn from prior regulatory interactions. Claim 7 specifies that the historical response set includes content elements corresponding to portions of prior performance differential reports, and that the generative model uses both the current and historical differential reports when generating responses. This means the system accumulates institutional knowledge about how regulators have questioned financial results in the past and what explanations were accepted.

For statutory actuaries, this has immediate practical implications. Every reporting cycle involves explaining material changes to regulators, and the quality of those explanations often depends on institutional memory held by experienced staff. A system that indexes prior regulatory queries, accepted responses, and the specific differential reports that prompted them creates a searchable knowledge base of regulatory interaction history. When a new analyst needs to explain a capital adequacy movement, the system can surface how similar movements were explained and accepted in prior periods.
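
The retrieval step can be illustrated with a simple similarity threshold over stored queries. We use token-level Jaccard similarity purely as a stand-in, since the claims as summarized here do not name a similarity measure; the stored queries, responses, and the 0.5 threshold are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two queries, from 0.0 to 1.0."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

# Hypothetical historical dialogue records: (prior query, accepted response).
HISTORICAL_DIALOGUE = [
    ("why did the loss ratio increase in q2", "Hypothetical accepted response A"),
    ("explain the change in policyholder surplus", "Hypothetical accepted response B"),
]

def similar_records(query: str, threshold: float = 0.5):
    """Return prior records whose query similarity meets the threshold, best first."""
    scored = [(jaccard(query, q), q, resp) for q, resp in HISTORICAL_DIALOGUE]
    return sorted((rec for rec in scored if rec[0] >= threshold), reverse=True)

matches = similar_records("why did the loss ratio increase in q4")
```

Only the earlier loss ratio question clears the threshold, so its accepted response would be passed to the generative model alongside the current differential report. A production system would more plausibly compare embeddings than raw tokens.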

The validation framework: Claim 8 describes automated validation against compliance parameters, with automatic error resolution by referencing historical correction records. The system identifies validation parameters not satisfied by the current financial data, searches for prior instances where similar validation failures occurred and were resolved, and applies the historical correction pattern to the current data using a machine learning model. Claim 9 adds a validation model set where individual models are each configured to evaluate compliance for specific parameter subsets, and the system predicts compliance labels for current data using the appropriate model.

The description provides concrete examples: verifying that the asset and liability pages balance, checking that proceeds from bond sales reported in the cash flow statement match Schedule D Part 4, and validating that unearned premium and assumed premium receivable are present where expected. These are specific statutory reporting validations that insurance company controllers and actuaries perform manually in most organizations today.
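
Rule-based versions of these three checks are straightforward to write, which helps clarify what the patent's machine learning layer adds on top (predicting compliance labels and proposing corrections from history). The field names and the rounding tolerance below are hypothetical, and the sketch covers only detection, not automatic resolution.

```python
def validate(statement: dict, tolerance: float = 0.5) -> list[str]:
    """Return the names of validation checks the statement fails."""
    failures = []

    # Presence checks for fields expected on the statement.
    for field in ("unearned_premium", "assumed_premium_receivable"):
        if statement.get(field) is None:
            failures.append(f"missing:{field}")

    # Total assets should equal total liabilities plus surplus.
    assets = statement.get("total_assets", 0.0)
    liabilities_and_surplus = (
        statement.get("total_liabilities", 0.0) + statement.get("surplus", 0.0)
    )
    if abs(assets - liabilities_and_surplus) > tolerance:
        failures.append("balance_sheet_out_of_balance")

    # Bond sale proceeds in the cash flow should tie to Schedule D Part 4.
    cashflow = statement.get("cashflow_bond_proceeds", 0.0)
    schedule_d = statement.get("schedule_d_part4_bond_proceeds", 0.0)
    if abs(cashflow - schedule_d) > tolerance:
        failures.append("bond_proceeds_mismatch")

    return failures

clean = {
    "unearned_premium": 1200.0, "assumed_premium_receivable": 80.0,
    "total_assets": 5000.0, "total_liabilities": 3800.0, "surplus": 1200.0,
    "cashflow_bond_proceeds": 450.0, "schedule_d_part4_bond_proceeds": 450.0,
}
broken = dict(clean, surplus=1100.0, assumed_premium_receivable=None,
              schedule_d_part4_bond_proceeds=400.0)

clean_failures = validate(clean)
broken_failures = validate(broken)
```

The clean statement passes every check, while the broken one fails the presence, balance, and bond proceeds checks.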

What this patent does NOT protect: The patent does not claim the specific NAIC statutory reporting rules or validation criteria. It does not protect the data structures of statutory financial statements. It does not claim specific anomaly detection algorithms (z-score, moving average, and exponential smoothing are described as examples). And it does not describe integration with any specific statutory reporting software platform (such as Statutory Automated Reporting or similar products), leaving open the question of how the system interfaces with carriers' existing reporting infrastructure.

Patent US 12,387,271: The Property Risk Prediction System

Granted on August 12, 2025, this patent protects EXL's Property Insights product and addresses the property and casualty underwriting inspection workflow. The patent was filed in February 2023 (the earliest of the three insurance-specific patents) and claims priority to a provisional application from February 2022, indicating that this capability was under development well before the Insurance LLM.

What the independent claims protect: Claim 1 describes a system that receives a location identifier via a GUI, queries a local data store to determine whether stored images satisfy a data recency criterion, queries a remote data store if local images are outdated, extracts image-based attributes using a first ML model (cognitive image analysis), determines distances between objects in the image using a second ML model (object recognition), trains a third ML model using training data that correlates location attributes to image-based predictors and distance values, generates event predictions using the trained model, retrains the model using generated predictions and reference feedback, and displays prediction information.

The system claim (Claim 6) adds specificity about the multi-data-store architecture: a remote data store provides aerial imagery and location identifiers, while a local data store provides images, location identifiers, and location attributes (policy information, claims history, weather data).

The multi-model pipeline: The patent describes three distinct ML models working in sequence. The cognitive image analysis model extracts property attributes from aerial photographs: roof type and material (clay, asphalt, shingle), siding type (vinyl, wood), structures (sheds, garages, barns), vegetation density, proximity to water bodies, number of chimneys, windows, and doors. The object recognition model calculates distances between objects of interest: property to vegetation (critical for wildfire risk), property to fire station (response time proxy), property to water bodies (flood risk and firefighting resource), property to nearest road, and distances between residence features such as chimney to window (fire spread risk). The prediction model combines these with location attributes (historical claims data, policy information, weather history) to generate risk scores, severity estimates (predicted dollar costs), and frequency projections for specific environmental events.
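
One plausible way to turn detected objects into distances, assuming the object recognition model outputs pixel bounding boxes and the imagery has a known ground resolution, is a centroid-to-centroid calculation. The box format, the 0.3 meters-per-pixel resolution, and the function name are our assumptions; the patent's own distance method may differ.

```python
import math

def object_distance_m(box_a, box_b, meters_per_pixel):
    """Centroid-to-centroid ground distance between two bounding boxes.

    Boxes are (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    ax, ay = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bx, by = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    return math.hypot(ax - bx, ay - by) * meters_per_pixel

house = (100, 100, 200, 200)        # hypothetical detection
tree_line = (400, 500, 500, 600)    # hypothetical detection

house_to_vegetation_m = object_distance_m(house, tree_line, meters_per_pixel=0.3)
```

The centroids are 500 pixels apart, which is roughly 150 meters at this resolution. Edge-to-edge distance would be a better wildfire proxy than centroid distance for large objects, which is one reason to treat this only as a sketch.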

Comparative location analysis: Claims 2 through 4 describe a method for improving predictions by analyzing similar properties. The system computes Euclidean distances across attribute vectors to identify similar locations, determines object distances for each similar location from stored imagery, and provides these comparative distances to the prediction model alongside the target property's data. For severity predictions specifically (Claim 4), the system extracts claims attributes from insurance claims associated with similar locations and incorporates them into the model input. This means a property risk assessment for a home in a wildfire-prone area would consider not just the target property's vegetation distance and roof material but also the claims history and damage costs of similar properties in the region.
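
The similarity search itself is a nearest-neighbor lookup over attribute vectors. The sketch below assumes each location has already been encoded as a numeric vector scaled to comparable ranges (unscaled features would let one attribute dominate the distance); the location names and values are hypothetical.

```python
import math

def most_similar(target, candidates, k=2):
    """Return the k candidate names closest to the target by Euclidean distance."""
    ranked = sorted(candidates.items(), key=lambda item: math.dist(target, item[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical scaled attributes: (vegetation density, roof age, slope).
target_property = (0.2, 0.8, 0.5)
portfolio = {
    "A": (0.25, 0.75, 0.5),
    "B": (0.9, 0.1, 0.2),
    "C": (0.2, 0.6, 0.5),
}

neighbors = most_similar(target_property, portfolio, k=2)
```

Locations A and C are returned, in that order. Their object distances and claims attributes would then be supplied to the prediction model alongside the target property's own data.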

The data recency optimization: A distinctive feature of this patent is its focus on reducing network traffic. Rather than always querying remote aerial imagery providers (like GEOX), the system first checks whether sufficiently recent images exist in the local data store. Only if local images fail the recency criterion does the system query the remote provider. Claim 15 describes an attribute freshness mechanism: the system compares location attributes in the local store to image-based attributes from newly obtained images and updates stale local attributes when the image-based data is more recent. This keeps the local data store current without requiring constant remote queries.
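
The local-first lookup is essentially a cache with a freshness rule. In this sketch the one-year recency window, the record layout, and the `fetch_remote` callable are hypothetical stand-ins for the local data store and the remote imagery provider.

```python
from datetime import date, timedelta

MAX_AGE = timedelta(days=365)  # hypothetical recency criterion

def get_image(location_id, local_store, fetch_remote, today):
    """Serve from the local store when fresh; otherwise query remote and refresh."""
    record = local_store.get(location_id)
    if record is not None and today - record["captured"] <= MAX_AGE:
        return record, "local"
    record = fetch_remote(location_id)
    local_store[location_id] = record  # keep the local store current
    return record, "remote"

TODAY = date(2025, 8, 12)
remote_calls = []

def fake_remote(location_id):
    """Stand-in for a remote aerial imagery provider."""
    remote_calls.append(location_id)
    return {"captured": TODAY, "image": f"remote-image-{location_id}"}

store = {
    "loc-1": {"captured": date(2025, 3, 1), "image": "cached-1"},
    "loc-2": {"captured": date(2022, 1, 1), "image": "cached-2"},
}

_, source_1 = get_image("loc-1", store, fake_remote, TODAY)
_, source_2 = get_image("loc-2", store, fake_remote, TODAY)
_, source_2_again = get_image("loc-2", store, fake_remote, TODAY)
```

Only the stale record triggers a remote query, and the second request for the same location is served locally because the first one refreshed the store.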

Weather integration: Claim 16 adds weather data to the prediction pipeline. The system retrieves weather data from a coupled data store, extracts weather events satisfying a recency criterion, filters for severe weather events exceeding a severity criterion (the patent references industry standards like hurricane categories, severe thunderstorm definitions from weather.gov, and flash flood warnings), extracts weather-based attributes (wind speed, barometric pressure, temperature, precipitation, ice conditions), and feeds these into the prediction model alongside image and location attributes.
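
The two-stage filter (recency, then severity) and the attribute extraction can be sketched as follows. The event records, the five-year window, and the attribute names are hypothetical; the 58 mph wind threshold mirrors the National Weather Service criterion for a severe thunderstorm and is used here only as an example severity rule.

```python
from datetime import date, timedelta

def severe_recent_events(events, today, max_age=timedelta(days=5 * 365),
                         min_wind_mph=58.0):
    """Keep events that satisfy both the recency and the severity criteria."""
    selected = []
    for event in events:
        is_recent = today - event["date"] <= max_age
        is_severe = event["wind_mph"] >= min_wind_mph
        if is_recent and is_severe:
            selected.append(event)
    return selected

def weather_attributes(events):
    """Aggregate filtered events into model-ready weather attributes."""
    return {
        "severe_event_count": len(events),
        "max_wind_mph": max((e["wind_mph"] for e in events), default=0.0),
        "total_precip_in": sum(e["precip_in"] for e in events),
    }

EVENTS = [
    {"date": date(2024, 6, 1), "wind_mph": 72.0, "precip_in": 2.5},
    {"date": date(2024, 9, 10), "wind_mph": 35.0, "precip_in": 0.4},
    {"date": date(2015, 5, 20), "wind_mph": 90.0, "precip_in": 3.0},
]

filtered = severe_recent_events(EVENTS, today=date(2025, 8, 12))
attributes = weather_attributes(filtered)
```

Only the June 2024 event survives both filters: the September 2024 event is too mild and the 2015 event is too old.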

What this patent does NOT protect: The patent does not claim specific aerial imagery sources or providers. It does not protect the property attribute classification taxonomy (roof types, siding types, etc.). It does not claim catastrophe modeling methodologies or actuarial pricing algorithms. And it does not describe integration with homeowners rating engines or policy administration systems, meaning the predictions are generated as standalone outputs rather than feeding directly into automated pricing decisions.

How These Three Patents Compare to AIG's Approach

AIG's patent portfolio and EXL's insurance-specific patents solve fundamentally different problems at different points in the insurance value chain.

AIG's three patents focus on the front end: getting data out of incoming E&S submission documents accurately and traceably so underwriters can evaluate risks. AIG's innovation is in extraction (markdown-based separation of tables and text), quality control (chunk-level traceability and hallucination detection), and handling complexity (chain-of-thought prompting for multi-table spreadsheets). AIG processes the documents; humans make the decisions.

EXL's three insurance patents focus on the analytical layer: once data has been extracted and structured, generating actionable intelligence for claims adjudicators, regulatory reporting teams, and property underwriters. EXL's Insurance LLM does not just extract information from claims documents. It generates recommended actions with predicted outcomes. The Regulatory Reporting patent does not just pull financial data from statutory statements. It generates narrative explanations informed by historical regulatory interactions. The Property Insights patent does not just identify roof types from aerial images. It generates risk, severity, and frequency predictions for specific environmental events.

This distinction matters for actuaries because the analytical layer is where actuarial judgment has traditionally lived. Extraction is a data preparation task. Analysis and recommendation are professional judgment tasks. As AI systems move from extraction into analysis, the question of how actuarial standards of practice apply to AI-generated recommendations becomes increasingly pressing. EXL's Insurance LLM patent explicitly names actuaries as target users for its guidance artifacts. The regulatory reporting patent automates flux analysis and validation workflows where actuarial sign-off is typically required. These are not tools that sit upstream of actuarial work. They are tools that overlap with it.

The Domain-Specific LLM Question

EXL's Insurance LLM patent raises a strategic question that the entire insurance industry is grappling with: should carriers invest in domain-specific models or use general-purpose models with domain-specific prompting and retrieval?

AIG's approach implicitly answers this question in favor of general-purpose models with specialized infrastructure. AIG uses Anthropic's Claude (a general-purpose model) deployed through Palantir Foundry with proprietary prompt engineering and retrieval systems. AIG does not fine-tune Claude on insurance data. It wraps Claude in a purpose-built extraction and traceability layer.

EXL's approach answers the question differently. The Insurance LLM is fine-tuned on proprietary insurance data using supervised fine-tuning and LoRA adaptation, creating a model that has internalized insurance domain knowledge at the parameter level rather than retrieving it at inference time. Gartner's projection that more than 50% of enterprise GenAI models will be industry-specific by 2027 (up from approximately 1% in 2023) suggests the market is tilting toward EXL's view.

The actuarial implications of this divergence are significant. A general-purpose model with domain-specific retrieval (AIG's approach) can be audited by examining its prompts, retrieval sources, and chain-of-thought outputs. The model's reasoning is, in principle, traceable. A domain-specific fine-tuned model (EXL's approach) has internalized patterns from its training data into model parameters, making its reasoning less directly auditable. When EXL's Insurance LLM recommends a claims settlement amount, the recommendation reflects both the input data and the patterns absorbed during fine-tuning on a decade of proprietary claims data. Explaining why the model made a specific recommendation is a harder problem than explaining why a general-purpose model produced a specific extraction.

This is not a theoretical concern. ASOP No. 56 (Modeling) requires actuaries to understand model limitations and communicate appropriate caveats. As domain-specific LLMs move into workflows where actuarial sign-off is expected, the profession will need frameworks for evaluating and validating AI models whose reasoning is partially embedded in opaque parameter weights rather than transparent retrieval chains.

Actuarial Implications

Three specific implications stand out from this analysis.

First, the regulatory reporting patent (US 12,468,696) is the most directly relevant to daily actuarial work of any patent in the AIG, Quantiphi, or EXL portfolios. Its automated flux analysis, compliance validation, historical query repository, and anomaly detection capabilities target the statutory reporting workflow where signing actuaries spend significant time. Carriers using EXL for regulatory reporting support should evaluate how much of the analytical work currently performed by actuaries is being replicated by the patented system, and whether the system's outputs are being used to inform or to replace actuarial judgment.

Second, the Property Insights patent (US 12,387,271) feeds directly into underwriting and pricing decisions that actuaries support. Risk scores, severity predictions, and frequency projections for environmental events are actuarial outputs in all but name. If a property underwriter uses EXL's patented system to assess wildfire risk for a homeowners portfolio, the resulting risk segmentation is performing a function that actuaries have traditionally owned. The patent's comparative location analysis, which combines historical claims data with Euclidean distance-based similarity scoring, resembles in simplified form the credibility-weighted experience rating that actuaries already use.

Third, the three-patent combination creates a self-reinforcing system: the Insurance LLM generates claims recommendations, the regulatory reporting platform explains the financial impact of claims patterns to regulators, and the property analytics system informs the underwriting decisions that determine which risks enter the portfolio in the first place. For carriers deeply embedded in EXL's services ecosystem, this creates analytical dependency across multiple actuarial functions simultaneously, a concentration of capability in a single service provider that risk committees should evaluate.

Sources

U.S. Patent No. 12,399,924, "Robust methods for multi-domain signal evaluation systems," granted August 26, 2025, assigned to ExlService Holdings, Inc. patents.google.com
U.S. Patent No. 12,468,696, "Signal evaluation platform," granted November 11, 2025, assigned to ExlService Holdings, Inc. patents.google.com
U.S. Patent No. 12,387,271, "Reducing network traffic associated with generating event predictions based on cognitive image analysis systems and methods," granted August 12, 2025, assigned to ExlService Holdings, Inc. patents.google.com
EXL, "EXL launches specialized Insurance Large Language Model (LLM) leveraging NVIDIA AI Enterprise," September 26, 2024. exlservice.com
EXL, "AI-powered insurance workflows: Operationalizing LLMs with EXL Insurance LLM," white paper, 2025. exlservice.com
EXL, "EXL granted 10 new patents in the last year for AI solutions," GlobeNewswire, February 9, 2026. globenewswire.com
ExlService Holdings, Inc., "EXL Reports 2025 Fourth Quarter and Year-End Results," SEC Form 8-K, February 24, 2026. sec.gov
EXL, "EXL to create enterprise-wide data and AI applications for insurance, healthcare, banking, retail and other industries using NVIDIA AI," 2024. exlservice.com
EXL, "EXL's Large Language Models," product page. exlservice.com
U.S. Patent No. 12,437,155, "Auto-extracting tabular and textual information from digital documents for populating data retrieval systems," granted February 11, 2025, assigned to American International Group, Inc. patents.google.com