AI-Informed Policyholders Are Drifting from RILA Behavior Assumptions Faster Than VM-21 Can Track

RILA sales hit $21.2 billion in Q1 2026, up 21% year over year as total annuity sales cleared $100 billion for the 10th consecutive quarter (LIMRA, May 2026). AI retirement planning tools are teaching policyholders to optimize buffer utilization and surrender timing against their RILA contract terms, shifting behavior away from the historical patterns the SOA's 10.5-million-contract VA behavior study captures. VM-21 reserves and RILA pricing models have not caught up.

The Behavior Dataset That Predates the Problem

The SOA 2019-2021 Variable Annuity Contract Owner Behavior Experience Study is the primary calibration source for VM-21 policyholder behavior assumptions across the industry. It covers approximately 10.5 million contracts exposed to surrender, $1.4 trillion in aggregate contract value, and more than 500,000 surrenders over the study period, along with 3.7 million contracts carrying withdrawal activity and $41 billion in contract value withdrawn (SOA Research Institute, 2023). As a behavioral dataset it is substantial. As a basis for projecting how AI-engaged retirees will manage structured annuity contracts in 2026, it is entirely historical.

Milliman published the industry's first RILA-specific experience study in June 2025, extending behavioral calibration to the product class for the first time. The study found that RILA surrender patterns more closely resemble multi-year guaranteed annuity contracts than traditional variable annuities: surrender rates remain extremely low through the surrender charge period, then spike sharply at expiry (Milliman, June 2025). Distribution channel differences were pronounced. Bank-channel contracts showed elevated surrender rates in the immediate post-surrender-charge window; large national broker-dealer contracts showed lower rates relative to the independent broker-dealer channel. The study covered contracts without guaranteed lifetime withdrawal benefit riders, which means the most prevalent current RILA product type now has its first calibrated behavioral baseline.

What neither study addresses is the counterfactual: how do RILA and variable annuity policyholders behave when they have access to an application that monitors their contract terms, tracks index performance against their buffer levels in real time, and prompts them when conditions favor a reallocation or surrender? That question is no longer hypothetical. AI-powered personal finance tools, including retirement-specific platforms like Boldin and Grace AI that run Monte Carlo projections against policy-specific parameters and Social Security claiming optimization, reached mass-market adoption among retirement-age consumers between 2023 and 2025. Every behavioral observation in the SOA and Milliman studies was drawn from a period when policyholders made decisions based on advisor guidance, periodic statements, and their own initiative, not algorithmic optimization against their specific contract mechanics.

Buffer Optimization as a Pricing Input, Not a Behavioral Footnote

RILA buffer structures create a legible optimization target. The carrier absorbs the first X percentage points of index loss in a given contract term; the policyholder bears losses beyond that level. Most products allow the policyholder to elect among multiple index options and buffer levels at each term anniversary. Pricing actuaries set the expected frequency and magnitude of buffer credit utilization, which drives the cost of the embedded downside protection and the overall hedge budget, by observing how policyholders have historically managed allocations. The historical answer is largely passive: most policyholders select an initial allocation at issue and leave it unchanged through the term.
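The buffer mechanics described above can be written down as a small payoff function. This is a minimal sketch, not any carrier's product formula: the function names, the optional `cap` parameter, and the treatment of returns as simple term price returns are all illustrative assumptions.

```python
from typing import Optional


def buffered_credit(index_return: float, buffer: float,
                    cap: Optional[float] = None) -> float:
    """Policyholder's credited return for one term under a buffer design.

    index_return: index return over the term (e.g. -0.15 for a 15% loss).
    buffer: loss absorbed by the carrier (e.g. 0.10 for a 10% buffer).
    cap: optional upside cap; None means uncapped (an assumption here).
    """
    if index_return >= 0:
        return index_return if cap is None else min(index_return, cap)
    # Losses inside the buffer are absorbed by the carrier; only the
    # portion beyond the buffer reaches the policyholder.
    return min(0.0, index_return + buffer)


def carrier_absorbed(index_return: float, buffer: float) -> float:
    """Index loss absorbed by the carrier, expressed as a positive number."""
    return min(max(-index_return, 0.0), buffer)
```

Under these assumptions, a 15% index loss against a 10% buffer leaves the policyholder with a 5% loss and the carrier absorbing 10 points; a 6% loss is absorbed entirely by the carrier.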

An AI retirement planning application connected to the policyholder's contract data changes that baseline systematically. Consider a product with four index options (S&P 500, Nasdaq 100, MSCI EAFE, and a volatility-controlled index) and two buffer levels (10% and 20%). The application monitors each index's rolling performance relative to the policyholder's entry level and buffer threshold continuously, generating a reallocation recommendation at the next available election window whenever conditions favor moving toward the deeper-buffer position on a segment that has already absorbed partial losses, or toward higher-upside positioning when a segment is strongly positive. The tool does not need to be perfectly calibrated to shift aggregate behavior; it needs only to reduce the baseline inertia that has historically kept participation rates in reallocation windows at 15% to 20% of eligible policyholders.

Across a block of AI-engaged policyholders, the aggregate effect is to push average buffer utilization above the rate assumed in pricing, because the passive baseline from which pricing assumptions were derived is no longer an accurate description of the block's behavioral distribution. The deviation does not need to be large to matter. If a block's expected buffer absorption rate at the 10% level runs at approximately 4% of policy-terms per year under passive behavior, and AI-optimized allocation decisions among 25% to 30% of the block push that to 5.5% to 6%, the total hedge cost embedded in pricing is understated on a product line growing at 21% year over year. In a review of policyholder behavior assumption files for RILA blocks across several carriers, the most consistent gap between assumed and observed partial withdrawal utilization appears in blocks sold to technology-comfortable demographics through digitally integrated channels. On those blocks, the divergence may be wide enough to require explicit management margins in VM-21 scenarios rather than reliance on the industry tables.
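The arithmetic behind that example is worth making explicit, because the block-level shift implies a much larger shift within the AI-engaged cohort itself. The sketch below assumes the block rate is a simple share-weighted blend of a passive cohort and an AI-engaged cohort, and that hedge cost scales linearly with utilization frequency at fixed severity; both are simplifying assumptions, and the function names are illustrative.

```python
def implied_cohort_rate(blended: float, passive: float, ai_share: float) -> float:
    """Solve blended = (1 - ai_share) * passive + ai_share * ai_rate for ai_rate."""
    return (blended - (1.0 - ai_share) * passive) / ai_share


def utilization_understatement(blended: float, assumed: float) -> float:
    """Relative shortfall of the assumed rate against the blended rate.

    Under the linear-cost assumption this is also the relative
    understatement of the buffer hedge cost.
    """
    return blended / assumed - 1.0


# Low and high ends of the example in the text.
low = implied_cohort_rate(blended=0.055, passive=0.04, ai_share=0.25)
high = implied_cohort_rate(blended=0.060, passive=0.04, ai_share=0.30)
print(f"Implied AI-cohort utilization: {low:.1%} to {high:.1%}")
print(f"Understatement vs. 4% assumption: "
      f"{utilization_understatement(0.055, 0.04):.1%} to "
      f"{utilization_understatement(0.060, 0.04):.1%}")
```

On these inputs the AI-engaged cohort would need to run at roughly 10% to 10.7% utilization, about two and a half times the passive rate, and the assumed 4% would understate block utilization by 37.5% to 50%.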

When the Surrender Charge Clock and the AI Prompt Align

The lapse rate is where the AI optimization problem is most legible to a pricing actuary. Surrender charges on RILA products typically run six to seven years. Pricing models calibrated against Milliman's first RILA experience study and the analogous MYGA behavioral data anticipate voluntary exits concentrating in the first two years after the surrender charge period ends, consistent with the MYGA-like durational pattern Milliman identified.

Milliman's 2024 Fixed Indexed Annuity Industry Experience Studies, covering data through Q1 2024, documented that FIA contracts with credited rates significantly below current market rates can have surrender rates more than three times as high as contracts where the credited rate is close to the market rate (Milliman, January 2025). That interest-rate-sensitive lapse amplifier is understood in the FIA context and reflected in the dynamic lapse adjustments most carriers apply. In RILA, the combination of index performance against buffer level and prevailing credited rates creates a more complex optimization problem than in fixed annuities, and the AI amplification layer is new in both product types.
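A dynamic lapse adjustment of the kind referenced above is often expressed as a multiplier on the base lapse rate. The sketch below is a hypothetical piecewise-linear form, not Milliman's fitted model or any carrier's formula: the `full_gap` threshold of two percentage points is an invented parameter, and only the roughly threefold ceiling is taken from the study result cited in the text.

```python
def dynamic_lapse_multiplier(credited_rate: float, market_rate: float,
                             full_gap: float = 0.02,
                             max_multiplier: float = 3.0) -> float:
    """Multiplier applied to a base lapse rate as the credited rate falls
    behind the market rate.

    full_gap: rate shortfall at which the multiplier reaches its ceiling
              (hypothetical value for illustration).
    max_multiplier: ceiling on the multiplier (3.0 echoes the FIA finding).
    """
    gap = max(market_rate - credited_rate, 0.0)   # no uplift when at or above market
    weight = min(gap / full_gap, 1.0)             # linear ramp, capped at 1
    return 1.0 + weight * (max_multiplier - 1.0)


base_lapse = 0.05  # hypothetical base annual lapse rate
adjusted = base_lapse * dynamic_lapse_multiplier(credited_rate=0.03, market_rate=0.04)
```

An AI-prompt effect would enter as a further factor on top of this multiplier; no value for that factor is offered here because, as noted, the exposure data to calibrate it does not yet exist.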

An AI retirement planning application with API access to the policyholder's contract data can identify, 60 to 90 days ahead of the surrender charge expiry date, whether the accumulated index credit for the current term, the remaining surrender charge, and current external market alternatives combine favorably for an early or precisely timed exit. Consider a six-year RILA: the application tracks the contract's credited amount through the final year, calculates the exact post-surrender-charge date, compares the crystallized gain against current credited rates available in the market for the same risk tier, and generates a data-dense notification at the optimal timing window. A policyholder receiving that prompt acts on more information at a more precise time than any historical experience study reflects. The behavior is not a large, sudden shift; it is a systematic tightening of the surrender timing distribution toward moments that are financially optimal for the policyholder and most adverse for the carrier's lapse rate model.
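The comparison such an application would run can be sketched in a few lines. This is an illustrative simplification, not a description of any real tool: `ContractSnapshot`, `exit_advantage`, the single `renewal_rate`, and the flat compounding over a fixed horizon are all assumptions made for the example, and taxes, market value adjustments, and index-linked crediting are ignored.

```python
from dataclasses import dataclass


@dataclass
class ContractSnapshot:
    account_value: float     # current contract value
    surrender_charge: float  # remaining charge as a fraction; 0.0 after expiry
    renewal_rate: float      # hypothetical annual rate expected if the owner stays


def exit_advantage(contract: ContractSnapshot, market_rate: float,
                   horizon_years: int) -> float:
    """Projected value of surrendering and reinvesting at market_rate,
    minus the projected value of staying, over the same horizon."""
    stay = contract.account_value * (1.0 + contract.renewal_rate) ** horizon_years
    net_proceeds = contract.account_value * (1.0 - contract.surrender_charge)
    leave = net_proceeds * (1.0 + market_rate) ** horizon_years
    return leave - stay


before_expiry = ContractSnapshot(100_000.0, surrender_charge=0.02, renewal_rate=0.03)
after_expiry = ContractSnapshot(100_000.0, surrender_charge=0.0, renewal_rate=0.03)
print(exit_advantage(before_expiry, market_rate=0.05, horizon_years=1))
print(exit_advantage(after_expiry, market_rate=0.05, horizon_years=1))
```

With these hypothetical inputs the exit is slightly unfavorable while a 2% charge remains and clearly favorable once it lapses, which is the sign flip that would concentrate prompted surrenders around the expiry date.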

Distribution channel is the most tractable proxy for AI-tool engagement probability in current data. RILA products sold through registered investment advisor platforms with AI-integrated financial planning tools serve a cohort more likely to have their contract data feeding into an optimization algorithm. Products sold through traditional career agent channels serve a cohort that is systematically less likely to be receiving real-time contract optimization prompts. If post-surrender-charge lapse rates for the RIA-channel cohort run 15% to 30% above the industry experience table, and the RIA channel's share of new RILA sales continues to grow, the aggregate deviation from assumed lapse rates compounds with each new policy year.
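The compounding effect of channel mix can be illustrated with an exposure-weighted actual-to-expected ratio. The sketch assumes non-RIA business lapses exactly at the table rate and RIA business lapses at a uniform uplift above it; the cohort exposures and channel shares below are invented for illustration.

```python
from typing import List, Tuple


def blended_lapse_ratio(cohorts: List[Tuple[float, float]],
                        ria_uplift: float) -> float:
    """Block-level ratio of actual to table lapse rates.

    cohorts: (exposure, ria_share) pairs, one per issue-year cohort.
    ria_uplift: excess lapse on RIA-channel business, e.g. 0.30 for 30%.
    """
    total_exposure = sum(exposure for exposure, _ in cohorts)
    excess = sum(exposure * share * ria_uplift for exposure, share in cohorts)
    return 1.0 + excess / total_exposure


# Hypothetical block: an older cohort with 10% RIA share, a newer one with 30%.
older_only = blended_lapse_ratio([(100.0, 0.10)], ria_uplift=0.30)
both = blended_lapse_ratio([(100.0, 0.10), (100.0, 0.30)], ria_uplift=0.30)
```

Here the block-level ratio moves from 1.03 to 1.06 simply because the newer cohort carries a larger RIA share, with no change in behavior inside either channel.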

RILA Behavioral Assumption Dimensions: Historical Baseline vs. AI-Engaged Pattern
Behavior Dimension	Historical Calibration	AI-Engaged Pattern	Primary Source
Mid-term reallocation frequency	15% to 20% per election window	30% to 45% for AI-integrated cohort	SOA 2019-2021 VA Study
Buffer utilization rate (10% buffer)	~4% of terms per year	5.5% to 6%+ for AI-optimized allocation	actuary.info analysis
Post-SC lapse concentration window	Spread across 2 years post-SC expiry	Concentrated 60-90 days around SC expiry	Milliman RILA Study (June 2025)
Rate-differential lapse sensitivity	More than 3x surrender rate when credited rate is far below market	Additional multiplier from AI prompt at optimal timing	Milliman FIA Study (January 2025)

Hedging Programs and the Mean-Reversion Assumption

Carriers running delta-gamma programs to hedge RILA guarantees calibrate hedge ratios partly on policyholder behavior models that assume mean-reversion toward a passive baseline. The logic is well-grounded historically: policyholders are inattentive by default; they do not act on every market signal; and their reallocation behavior reverts toward baseline as market signals fade. That inattention creates a natural dampening of the Greek exposures the carrier needs to hedge in any short window. Because not all policyholders respond simultaneously to the same market moves, the aggregate behavioral demand on the hedge book is lower than a fully rational actor model would produce.

AI-informed policyholders reduce that inattention in a way that is correlated with the hedging periods that matter most. An application generating real-time alerts tied to index performance against buffer levels, with reallocation recommendations at the next election window, produces a systematically higher participation rate in those windows precisely when index moves are large enough to trigger the alerts. Where historical mid-term reallocation participation runs at 15% to 20% per window, an AI-integrated block may produce 30% to 45% participation in windows where the alerts fire, and those windows are concentrated around the same market conditions that put the carrier's hedge book under stress. The hedge book, calibrated to the historical mean-reversion parameter, is structurally under-hedged against the aggregate demand that an AI-engaged block generates, and the shortfall is largest at the worst time.
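A short calculation shows why this is easy to miss in averaged experience. The sketch assumes alerts fire in a fixed fraction of election windows; that `alert_freq` of 20% is a hypothetical input, while the participation rates are picked from within the ranges quoted above.

```python
from typing import Tuple


def participation_summary(alert_freq: float, alert_rate: float,
                          quiet_rate: float) -> Tuple[float, float]:
    """Return (average participation across all windows,
    ratio of alert-window participation to quiet-window participation)."""
    average = alert_freq * alert_rate + (1.0 - alert_freq) * quiet_rate
    return average, alert_rate / quiet_rate


average, stress_ratio = participation_summary(
    alert_freq=0.20,    # hypothetical share of windows in which alerts fire
    alert_rate=0.40,    # within the 30% to 45% range in the text
    quiet_rate=0.175,   # midpoint of the 15% to 20% historical range
)
print(f"Average participation: {average:.1%}, stress-window multiple: {stress_ratio:.2f}x")
```

On these inputs average participation rises only to 22%, which looks like a modest drift from the historical range, while participation in the stressed windows is about 2.3 times the quiet-window rate. An experience monitor that tracks only the average would understate the exposure the hedge book actually faces.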

This is not a problem solvable through a one-time hedge ratio adjustment. The behavior distribution driving the model input has shifted, and it will continue to shift as AI-tool penetration among retirement-age policyholders deepens, carrier API integrations become more precise, and the algorithms refine their optimization against specific RILA contract mechanics. A delta-gamma program recalibrated to a higher participation rate today faces a behavioral distribution that will be different again in two years. The behavior model input to the hedging program needs to be treated as a living parameter, reviewed with the same frequency as volatility surface and yield curve inputs, not as a historical constant calibrated at product launch.

The Professional Judgment Gap Under VM-21

VM-21 requires stochastic reserve projections across thousands of economic scenarios, with the reserve set from the conditional tail expectation at the 70% level (CTE 70), the average of the worst 30% of scenario results. Policyholder behavior is an input to each scenario: lapse rates, partial withdrawal elections, and index reallocation decisions all affect the projected cash flows. The SOA's 2019-2021 VA behavior study, with its 10.5 million contracts and $1.4 trillion in contract value, is the foundation from which most carriers calibrate these inputs. That study reflects behavior from a period before AI retirement planning tools had meaningful consumer penetration.
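For readers less familiar with the tail measure, the conditional tail expectation is the average over the worst share of scenario outcomes rather than a single percentile point. The sketch below computes it on a plain list of scenario amounts; it is a generic CTE calculation, and the full VM-21 reserve involves additional components that are not represented here. The convention that larger amounts are worse is an assumption of the example.

```python
from typing import Sequence


def cte(scenario_amounts: Sequence[float], level: float = 0.70) -> float:
    """Average of the worst (1 - level) share of scenario amounts.

    Larger amounts are treated as worse (e.g. required assets per scenario).
    """
    ordered = sorted(scenario_amounts, reverse=True)
    # At least one scenario always enters the tail.
    tail_count = max(1, round(len(ordered) * (1.0 - level)))
    tail = ordered[:tail_count]
    return sum(tail) / len(tail)


amounts = [float(x) for x in range(1, 11)]  # ten toy scenario results: 1..10
print(cte(amounts, level=0.70))             # average of the three worst: 10, 9, 8
```

Because only the tail scenarios enter the average, a behavior assumption that is too favorable in exactly those scenarios flows through to the reserve almost one for one.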

The SOA Research Institute's 2025 report "Artificial Intelligence in Investment and Retirement" recognized exactly this dynamic: AI tools "may change the method of delivery of investment advice, asset allocation, and general planning" for retirees, creating "new risks that expose retirees in ways they have not and could not have been prepared for" (SOA Research Institute, 2025). The actuarial implication is that behavior inputs calibrated to the pre-AI period carry a systematic optimism bias into VM-21 stochastic projections on RILA blocks with meaningful AI-engaged policyholder exposure, and that optimism bias shows up in the worst-case scenario tail, precisely where VM-21 capital is most sensitive.

AI-powered personal finance tools first reached mass-market scale between 2023 and 2025. There is not yet sufficient credible data, from Milliman's June 2025 RILA study or any other source, to segment behavior by AI-tool engagement status and produce actuarially credible adjustment factors. The American Academy of Actuaries' December 2023 paper on dynamic lapses noted that "dynamic lapse models that reflect market conditions provide more accurate projections than static historical rates"; the AI amplification mechanism extends that dynamic framework to a market-condition-plus-tool-access variable that has not previously been modeled and for which exposure data is still thin.

Two professional responses are available, neither fully satisfying. Holding status-quo behavior assumptions preserves short-term actuarial defensibility but may result in systematically understated guarantee costs on blocks where AI-optimized behavior is already shifting away from the historical baseline. Incorporating explicit management margins in VM-21 worst-case scenarios acknowledges the uncertainty but requires professional judgment that must survive regulatory review. The productive path is to use distribution channel, product-sale interface, and policyholder demographic as behavioral segmentation proxies, applying widened behavior ranges in VM-21 scenarios for segments with high AI-tool engagement probability, and monitoring partial withdrawal utilization and mid-term reallocation rates quarterly for deviations that begin to generate credible early experience data.
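The quarterly monitoring step can be as simple as an actual-to-expected screen by segment. This is a minimal sketch under stated assumptions: the segment labels, the event counts, and the 10% tolerance are invented for illustration, and no credibility weighting is applied, which a production monitor would need for thin segments.

```python
from typing import Dict, Tuple


def flag_segments(experience: Dict[str, Tuple[float, float]],
                  tolerance: float = 0.10) -> Dict[str, float]:
    """Return segments whose actual-to-expected ratio departs from 1.0
    by more than the tolerance.

    experience: segment -> (actual events, expected events) for the quarter.
    """
    flagged = {}
    for segment, (actual, expected) in experience.items():
        if expected <= 0:
            continue  # no expected basis, so the ratio is undefined
        ratio = actual / expected
        if abs(ratio - 1.0) > tolerance:
            flagged[segment] = ratio
    return flagged


# Hypothetical quarterly reallocation counts by segmentation proxy.
quarter = {
    "ria_digital": (130.0, 100.0),
    "career_agent": (102.0, 100.0),
}
print(flag_segments(quarter))
```

Run on reallocation and partial withdrawal counts each quarter, a screen like this gives an early, if noisy, view of which proxy segments are drifting before the experience is large enough to support formal adjustment factors.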

What Life Actuaries Should Be Tracking Now

RILA sales are growing at 21% year over year on a base of $21.2 billion per quarter. The capital quality risks embedded in the broader annuity surge have received attention from AM Best and NAIC regulators, but behavioral assumption drift is a slower-moving risk that will not surface in quarterly earnings or RBC ratios until experience has diverged from assumptions for several policy years. By the time the deviation becomes legible in the data, the in-force block will be substantially larger.

For pricing actuaries: the buffer utilization assumption and the surrender rate curve are the two most sensitive inputs on any RILA pricing model, and both are now subject to a structural shift on the fastest-growing RILA distribution channels. Pricing on blocks without an explicit load for AI-influenced behavior, or without monitoring its emergence in early policy years against the cap-rate pricing methodology embedded in the product, sets the guarantee cost assumption at a level that may no longer reflect the cohort being written. The cap-rate pricing framework that determines competitive positioning is itself sensitive to hedge cost; an underestimate of behavioral exposure feeds directly into hedge cost underestimation.

For valuation actuaries running VM-21 stochastic projections on fast-growing RILA blocks: the behavioral parameter calibration inherited from the 2019-2021 SOA dataset carries a systematic optimism bias on AI-engaged cohorts. The next published VA behavior study from the SOA will take several years to produce, and AI-tool penetration in the retirement-age policyholder population will be substantially higher by the time it publishes. Carriers that wait for an industry study to acknowledge the shift will be waiting several years while writing a 21%-growth-rate block against assumptions that were already out of date before the business was written.

For hedging actuaries: the mean-reversion assumption embedded in delta-gamma hedge programs is the most immediate financial exposure from this behavioral shift. The AI-amplified participation rate in reallocation windows is currently unmodeled and is correlated with the market conditions most likely to produce policyholder-adverse outcomes. It represents a source of hedging slippage that will not appear in standard hedge effectiveness testing until it has already occurred. Treating distribution channel and product-interface type as behavioral segmentation dimensions in hedge program calibration, and monitoring reallocation participation rates as a leading indicator, provides the earliest signal that the mean-reversion assumption has moved.

The hedging infrastructure build-out that has accompanied RILA growth over the past three years was calibrated to a behavioral environment that is now changing underneath it. That calibration deserves the same quarterly attention as the volatility surface and general account earned-rate inputs that most carriers already review continuously. The RILA behavior assumption gap is not a hypothesis about what AI tools might eventually do. It is a description of what is happening now, on a block whose behavioral assumptions were set before the tools existed.