Coalition's 2026 Cyber Claims Report, published March 5, 2026, documents a record 86% ransom refusal rate among businesses hit by ransomware in 2025, while initial demands surged 47% year over year to an average exceeding $1 million. Overall cyber claim severity fell 19% to $116,000, but ransomware-specific losses averaged $269,000, and dual-extortion attacks (70% of ransomware claims) cost more than double encryption-only incidents at $299,000 versus $138,000. This data reveals a loss distribution that is not unimodal: it has a large mass near zero, a moderate cluster around incident-response costs, and a heavy right tail from ransom payments plus dual-extortion losses. Standard single-distribution severity models systematically misfit this shape. The solution is a finite mixture distribution where a binary policyholder decision (pay or refuse) defines which severity component generates each claim.

Why Single-Distribution Models Fail

Pricing actuaries typically fit cyber severity with a single parametric distribution: log-normal, Pareto, or Weibull. These families assume a single generating process for all claims. That assumption breaks when a binary decision point splits claims into fundamentally different cost paths.

Consider the 2025 Coalition data. Of all closed cyber claims, 64% resolved with zero out-of-pocket cost to the policyholder. Among the ransomware subset, 86% of targeted businesses refused to pay, incurring incident response, forensics, notification, and limited business interruption costs but no ransom outlay. The remaining 14% that paid faced negotiated demands averaging $355,000 (a 65% reduction from initial asks) plus data recovery, extended business interruption, and regulatory exposure. Fitting a single log-normal to both paths simultaneously forces the distribution to compromise: it understates the probability mass near zero, overstates the density in the middle range, and underestimates tail thickness. The result is a model that is wrong everywhere and adequate nowhere.

Across several renewal cycles of tracking cyber loss data, the pattern has been consistent: any line where a discrete decision fundamentally alters claim magnitude produces a multimodal severity distribution. In cyber, that decision is whether to pay the ransom. In general liability, it is whether the claimant retains litigation counsel. In auto physical damage, it is total loss versus repair. Single-distribution models misfit all of these.

The Mixture Model Specification

A two-component finite mixture distribution separates the two cost paths. The severity density for a randomly drawn ransomware claim is:

f(x) = π1 · f1(x) + π2 · f2(x)

where π1 is the refusal probability (currently 0.86), π2 = 1 - π1 is the payment probability (0.14), f1 is the severity density for claims on the refusal path, and f2 is the severity density for claims on the payment path.
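The density above can be evaluated directly once component families are chosen. The following is a minimal sketch, assuming a log-normal f1 and a Pareto Type II (Lomax) f2, which are the families this article goes on to discuss. The function names and every numeric parameter other than π1 = 0.86 are illustrative placeholders, not fitted estimates.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Log-normal density with log-scale mean mu and log-scale sd sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def lomax_pdf(x, alpha, theta):
    """Pareto Type II (Lomax) density with shape alpha and scale theta."""
    if x < 0:
        return 0.0
    return (alpha / theta) * (1.0 + x / theta) ** (-(alpha + 1.0))

def mixture_pdf(x, pi1, mu1, sigma1, alpha2, theta2):
    """f(x) = pi1 * f1(x) + (1 - pi1) * f2(x)."""
    refusal_part = pi1 * lognormal_pdf(x, mu1, sigma1)
    payment_part = (1.0 - pi1) * lomax_pdf(x, alpha2, theta2)
    return refusal_part + payment_part

# Assumed values for illustration only; pi1 is the 0.86 refusal rate from the text.
PI1 = 0.86
MU1, SIGMA1 = 11.0, 1.0           # hypothetical refusal-path parameters
ALPHA2, THETA2 = 2.5, 600_000.0   # hypothetical payment-path parameters

if __name__ == "__main__":
    for loss in (10_000, 100_000, 1_000_000):
        print(loss, mixture_pdf(loss, PI1, MU1, SIGMA1, ALPHA2, THETA2))
```

Because each component integrates to one and the weights sum to one, the mixture is itself a proper density, which is what allows it to be substituted wherever a single severity distribution would otherwise be used.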

Component A: The Refusal Path

When a business refuses to pay, claim costs comprise incident response retainers ($15,000 to $50,000 typical), digital forensics ($25,000 to $100,000 depending on network complexity), breach notification and credit monitoring ($5 to $15 per affected record), crisis communications, and limited business interruption for the period required to restore from backups. Coalition's data shows encryption-only incidents (no data theft, no payment) averaged $138,000 in 2025 severity. With 86% of victims refusing payment, the refusal-path distribution is well-characterized by observed loss data.

A log-normal distribution fits this component well. Incident response costs are multiplicative (each additional compromised system adds a proportional cost increment), and the distribution is right-skewed with a comparatively light tail: even a catastrophic refusal-path claim rarely exceeds $500,000 because the ransom payment itself is absent. The parameters (μ1, σ1) can be estimated directly from the stratified subset of claims where the insured is known to have refused payment.

Component B: The Payment Path

The 14% that pay face a different cost structure: the negotiated ransom payment itself (averaging $355,000 per Coalition's data, after a typical 65% negotiation discount from initial demands), all the incident-response costs from Component A, plus extended business interruption from delayed recovery and potential regulatory penalties triggered by data theft disclosure. Coalition reports that dual-extortion attacks, which accounted for 70% of ransomware events, produced average severity of $299,000 versus $138,000 for encryption-only, confirming that data exfiltration substantially amplifies the payment-path cost.

This component requires a heavier-tailed distribution. A Pareto or a second log-normal with higher location and scale parameters captures the extended right tail. Some demands in 2025 reached $16 million, and while negotiation reduces these substantially, paid claims still produce losses in the hundreds of thousands to low millions. A Pareto Type II (Lomax) distribution with shape parameter α2 and scale parameter θ2 accommodates this tail weight while maintaining tractable maximum likelihood estimation.
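For readers less familiar with the Lomax form, the sketch below writes out its survival function, quantile function, mean, and an inverse-transform sampler. These are standard closed-form results for the Pareto Type II distribution; the function names are hypothetical, and the example parameters (α2 = 2.5, θ2 = 600,000) are assumed for illustration rather than estimated from any data.

```python
import random

def lomax_sf(x, alpha, theta):
    """Survival function S(x) = (1 + x / theta) ** (-alpha) for x >= 0."""
    return (1.0 + x / theta) ** (-alpha)

def lomax_quantile(p, alpha, theta):
    """Inverse CDF: the loss level that a fraction p of claims fall below."""
    return theta * ((1.0 - p) ** (-1.0 / alpha) - 1.0)

def lomax_mean(alpha, theta):
    """Mean is theta / (alpha - 1); it is infinite when alpha <= 1."""
    return theta / (alpha - 1.0) if alpha > 1.0 else float("inf")

def lomax_sample(rng, alpha, theta):
    """Inverse-transform draw from one uniform variate on [0, 1)."""
    return lomax_quantile(rng.random(), alpha, theta)

if __name__ == "__main__":
    alpha2, theta2 = 2.5, 600_000.0  # assumed, illustrative only
    rng = random.Random(2026)
    draws = [lomax_sample(rng, alpha2, theta2) for _ in range(50_000)]
    print("theoretical mean:", lomax_mean(alpha2, theta2))
    print("simulated mean:  ", sum(draws) / len(draws))
    print("99th percentile: ", lomax_quantile(0.99, alpha2, theta2))
```

The shape parameter controls how many moments exist: the mean requires α2 > 1 and the variance requires α2 > 2, which matters when the fitted component is later used for risk-load calculations.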

Modeling the Mixture Weight as a Time-Varying Parameter

The refusal rate is not static. Chainalysis data shows ransomware payment rates fell from roughly 78.9% in 2022 to approximately 28% of identified victims by 2025, with Coveware reporting rates as low as 20% in Q4 2025. Coalition's 86% refusal rate among its policyholders is consistent with the broader trend but reflects the specific population of cyber-insured businesses with active incident response support.

Treating π1 as a fixed constant embeds an assumption that will be wrong within one or two renewal cycles. Instead, model the refusal probability as a logistic function of time:

π1(t) = 1 / (1 + exp(-(β0 + β1 · t)))

where t indexes the accident year or quarter. Fitting β0 and β1 to the historical refusal-rate series (approximately 70% refusal two years ago, 80% one year ago, 86% now) produces a trend that can be projected forward for prospective rate indications. A logistic specification is natural because π1 is bounded between 0 and 1, and the current trajectory suggests asymptotic behavior: the refusal rate cannot exceed 100%, and the rate of increase should decelerate as the remaining "willing to pay" population shrinks to organizations with genuinely no backup alternative.
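One simple way to obtain β0 and β1 from the three refusal rates quoted above is a least-squares line through the logit-transformed rates. This is a sketch under stated assumptions: it weights each year equally, whereas a binomial likelihood weighted by claim counts would be the more rigorous choice when counts are available. The time index (t = -2, -1, 0) and the function names are hypothetical.

```python
import math

def logit(p):
    """Log-odds transform, mapping (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def fit_logistic_trend(times, rates):
    """Least-squares line through logit(rate) against time; returns (beta0, beta1)."""
    y = [logit(p) for p in rates]
    t_bar = sum(times) / len(times)
    y_bar = sum(y) / len(y)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (yi - y_bar) for t, yi in zip(times, y))
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * t_bar
    return beta0, beta1

def refusal_probability(t, beta0, beta1):
    """pi1(t) = 1 / (1 + exp(-(beta0 + beta1 * t)))."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * t)))

# Refusal rates from the text: two years ago, one year ago, now.
TIMES = [-2, -1, 0]
RATES = [0.70, 0.80, 0.86]

if __name__ == "__main__":
    b0, b1 = fit_logistic_trend(TIMES, RATES)
    print("beta0, beta1:", b0, b1)
    for t in (0, 1, 2):
        print("t =", t, "projected refusal rate:", refusal_probability(t, b0, b1))
```

With only three observations the fit is closer to an interpolation than an estimate, so the projection (roughly 91% one period ahead under these inputs) should be read as a starting point for sensitivity testing rather than a forecast.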

This trend has direct pricing consequences. As π1 rises, expected severity shifts toward the lighter-tailed refusal component, mechanically reducing the overall mean. But the payment component's severity may simultaneously increase if the remaining payers are self-selected for catastrophic data-loss situations where backups are unavailable. The mixture model captures both dynamics; a single-distribution model conflates them.

Parameter Estimation via the EM Algorithm

When individual claim outcomes (pay versus refuse) are observable in the carrier's own data, parameter estimation is straightforward: stratify the data, fit each component distribution separately using maximum likelihood, and estimate mixture weights from the observed proportions. Coalition's data, drawn from 100,000+ policyholders across five countries, permits this direct approach.
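When the pay/refuse label is available, the direct approach reduces to a few lines. The sketch below assumes log-normal components on both paths, since the log-normal maximum likelihood estimates have a closed form (a Lomax payment component would need numerical optimization instead). The claim records are hypothetical values invented for illustration, and the function names are not from any library.

```python
import math
import statistics

def fit_lognormal_mle(losses):
    """Closed-form log-normal MLE: mean and population sd of the log losses."""
    logs = [math.log(x) for x in losses]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs, mu)
    return mu, sigma

def fit_labeled_mixture(claims):
    """claims: list of (loss_amount, paid_ransom) pairs, paid_ransom a bool.

    Returns (pi1, (mu1, sigma1), (mu2, sigma2)), where pi1 is the observed
    refusal share and each component is fitted to its own stratum.
    """
    refused = [x for x, did_pay in claims if not did_pay]
    paid = [x for x, did_pay in claims if did_pay]
    pi1 = len(refused) / len(claims)
    return pi1, fit_lognormal_mle(refused), fit_lognormal_mle(paid)

# Hypothetical labeled claims, for illustration only.
CLAIMS = [
    (42_000.0, False), (95_000.0, False), (130_000.0, False),
    (61_000.0, False), (210_000.0, False), (78_000.0, False),
    (540_000.0, True), (1_250_000.0, True),
]

if __name__ == "__main__":
    pi1, refusal_fit, payment_fit = fit_labeled_mixture(CLAIMS)
    print("refusal share:", pi1)
    print("refusal (mu1, sigma1):", refusal_fit)
    print("payment (mu2, sigma2):", payment_fit)
```

In practice the payment stratum is the thin one, so its parameter estimates carry most of the uncertainty and are the natural place to apply credibility weighting against external benchmarks.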

Industry-level data presents a harder problem. Aggregated cyber loss triangles from ISO or surplus lines statistical agents do not always label individual claims by ransom-payment status. The pay/refuse indicator becomes a latent variable, and the expectation-maximization (EM) algorithm provides the standard estimation framework.

The EM algorithm iterates between two steps. In the E-step, compute the posterior probability that each observed claim xi came from component k, given current parameter estimates:

wik = πk · fk(xi; θk) / [π1 · f1(xi; θ1) + π2 · f2(xi; θ2)]

In the M-step, update each component's parameters by maximizing the weighted log-likelihood, where the weights are the posterior probabilities from the E-step. Update mixture weights as πk = (1/n) · Σi wik, the average posterior weight across the n claims. Iterate until the change in log-likelihood between iterations falls below a chosen tolerance.
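The two steps can be sketched compactly if both components are taken to be log-normal, because the weighted M-step then has closed-form updates. That is a simplifying assumption for illustration: with a Lomax payment component the M-step would need a numerical optimizer. The code works on log losses, where the 1/x Jacobian cancels in the posterior weights, and it ignores deductibles and limits. Function names and simulated parameters are hypothetical.

```python
import math
import random

def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def em_two_lognormal(losses, pi1, mu1, s1, mu2, s2, max_iter=300, tol=1e-8):
    """EM for a two-component log-normal mixture with unlabeled claims.

    Returns (pi1, (mu1, s1), (mu2, s2), loglik), where loglik is on the
    log-loss scale (it differs from the loss-scale value by a constant).
    """
    y = [math.log(x) for x in losses]
    n = len(y)
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(max_iter):
        # E-step: posterior probability that each claim is on the refusal path.
        w1 = []
        ll = 0.0
        for yi in y:
            a = pi1 * normal_pdf(yi, mu1, s1)
            b = (1.0 - pi1) * normal_pdf(yi, mu2, s2)
            w1.append(a / (a + b))
            ll += math.log(a + b)
        # M-step: weighted means and standard deviations, then the weight.
        n1 = sum(w1)
        n2 = n - n1
        mu1 = sum(w * yi for w, yi in zip(w1, y)) / n1
        mu2 = sum((1.0 - w) * yi for w, yi in zip(w1, y)) / n2
        var1 = sum(w * (yi - mu1) ** 2 for w, yi in zip(w1, y)) / n1
        var2 = sum((1.0 - w) * (yi - mu2) ** 2 for w, yi in zip(w1, y)) / n2
        s1 = max(math.sqrt(var1), 1e-6)  # floor guards against a collapsed component
        s2 = max(math.sqrt(var2), 1e-6)
        pi1 = n1 / n
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return pi1, (mu1, s1), (mu2, s2), ll

if __name__ == "__main__":
    rng = random.Random(7)
    # Simulated claims from assumed parameters, not real data.
    data = [rng.lognormvariate(11.0, 0.8) if rng.random() < 0.86
            else rng.lognormvariate(13.5, 0.8) for _ in range(1500)]
    print(em_two_lognormal(data, 0.5, 10.0, 1.0, 14.0, 1.0))
```

The starting values passed in the example are deliberately rough; replacing them with an externally informed split is the kind of initialization choice that tends to matter more as the components overlap.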

Two practical adjustments matter for insurance data. First, truncation and censoring: deductibles create left-truncation, and policy limits create right-censoring. Both the E-step posterior calculations and the M-step likelihood maximizations must account for this. Claims observed between the deductible and the limit contribute a density conditioned on exceeding the deductible, while claims capped at the limit contribute the probability of exceeding the limit, again conditioned on exceeding the deductible. Ignoring truncation biases the refusal-component parameters upward (small claims below the deductible are missing), and ignoring censoring understates the payment component's tail weight (large claims hitting limits are recorded at the limit rather than at their full value). Second, the EM algorithm is sensitive to initialization. Starting with external information, such as Coalition's published 86/14 split and component-level severity averages, typically speeds convergence and reduces the risk of settling in a poor local optimum relative to random initialization.
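The truncation and censoring adjustment amounts to changing each claim's likelihood contribution. The sketch below shows one way to write that contribution for a two-component mixture, assuming log-normal components for brevity and assuming the limit is expressed as the ground-up loss level at which coverage is exhausted. All names are hypothetical.

```python
import math

def lognormal_pdf(x, mu, sigma):
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def lognormal_sf(x, mu, sigma):
    """Survival function P(X > x); equals 1 at or below zero."""
    if x <= 0:
        return 1.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def mixture_pdf(x, pi1, comp1, comp2):
    return pi1 * lognormal_pdf(x, *comp1) + (1.0 - pi1) * lognormal_pdf(x, *comp2)

def mixture_sf(x, pi1, comp1, comp2):
    return pi1 * lognormal_sf(x, *comp1) + (1.0 - pi1) * lognormal_sf(x, *comp2)

def claim_loglik(x, deductible, limit, pi1, comp1, comp2):
    """Log-likelihood of one ground-up loss x seen through a deductible and limit.

    Uncensored (deductible < x < limit): log f(x) - log S(deductible).
    Censored (x recorded at the limit):  log S(limit) - log S(deductible).
    """
    log_s_d = math.log(mixture_sf(deductible, pi1, comp1, comp2))
    if x >= limit:
        return math.log(mixture_sf(limit, pi1, comp1, comp2)) - log_s_d
    return math.log(mixture_pdf(x, pi1, comp1, comp2)) - log_s_d

if __name__ == "__main__":
    # Assumed parameters for illustration: (mu, sigma) per component.
    pi1, refusal, payment = 0.86, (11.0, 1.0), (13.0, 1.2)
    for loss in (25_000.0, 400_000.0, 1_000_000.0):
        print(loss, claim_loglik(loss, 10_000.0, 1_000_000.0, pi1, refusal, payment))
```

Summing `claim_loglik` over all claims gives the objective that a truncation-aware fit would maximize; the same conditional densities replace the unconditional ones in the E-step weights.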

Dual Extortion as a Sub-Segmentation Variable

Within the payment component, dual-extortion status introduces a further severity differential. Coalition's data shows dual-extortion attacks (encryption plus data theft) produced average severity of $299,000 versus $138,000 for encryption-only, a 2.2x multiplier. With 70% of ransomware claims involving data exfiltration, dual extortion is the dominant attack modality rather than an edge case.

Two modeling approaches accommodate this sub-segmentation. The first nests a second mixture within the payment component:

f2(x) = π2a · f2a(x) + π2b · f2b(x)

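where f2a is the severity distribution for paid dual-extortion claims and f2b is the distribution for paid encryption-only claims, with π2a + π2b = 1. A starting value of π2a = 0.70 reflects the observed dual-extortion share across all ransomware claims; the share among payers specifically may differ and should be estimated from payment-path data where available. This produces a three-component mixture overall but adds parameters that may not be identifiable with limited payment-path sample sizes.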

The second approach, more parsimonious, treats dual-extortion status as a covariate within a single payment-component distribution. If the payment severity follows a log-normal, include a binary indicator Di (1 = dual extortion, 0 = encryption only) in the location parameter: μ2(Di) = μ20 + γ · Di. This requires fewer parameters and is estimable even when the payer sample is small. The coefficient γ captures the marginal severity increase from data exfiltration, directly interpretable as a multiplicative cost factor when exponentiated.
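For the covariate approach with a log-normal payment component, the maximum likelihood estimates reduce to group means of the log losses. The sketch below illustrates this under the assumption of a common scale parameter σ2 for both groups; the claim amounts are hypothetical values chosen for illustration, and the function name is not from any library.

```python
import math
import statistics

def fit_payment_with_dual_indicator(claims):
    """Fit log X ~ Normal(mu20 + gamma * D, sigma2 ** 2) by maximum likelihood.

    claims: list of (loss_amount, dual_flag) for payment-path claims,
    where dual_flag is 1 for dual extortion and 0 for encryption only.
    Returns (mu20, gamma, sigma2).
    """
    base = [math.log(x) for x, d in claims if d == 0]
    dual = [math.log(x) for x, d in claims if d == 1]
    mu20 = statistics.fmean(base)
    gamma = statistics.fmean(dual) - mu20
    residuals = [v - mu20 for v in base] + [v - mu20 - gamma for v in dual]
    sigma2 = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return mu20, gamma, sigma2

# Hypothetical payment-path claims, for illustration only.
PAID_CLAIMS = [
    (310_000.0, 0), (450_000.0, 0), (275_000.0, 0),
    (820_000.0, 1), (640_000.0, 1), (1_400_000.0, 1), (560_000.0, 1),
]

if __name__ == "__main__":
    mu20, gamma, sigma2 = fit_payment_with_dual_indicator(PAID_CLAIMS)
    print("mu20:", mu20, "gamma:", gamma, "sigma2:", sigma2)
    print("dual-extortion severity multiplier exp(gamma):", math.exp(gamma))
```

Because σ2 is shared, exp(γ) scales both the median and the mean of the payment-path severity by the same factor, which is what makes it usable as a rating relativity.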

Revenue-Band Segmentation and Exposure Scaling

Coalition's data reveals stark size-based differences. Businesses with over $100 million in revenue experienced claims frequency of 5.72%, nearly five times the 1.21% rate for businesses under $25 million. Average severity for the large-revenue segment was $268,000, compared with $77,000 for small businesses. These are not differences that a single severity distribution with a revenue covariate can absorb cleanly; the generating processes differ because large organizations have more endpoints, more data, more regulatory exposure, and more bargaining leverage in ransom negotiations.

Pricing actuaries working with rated exposure bases should either fit separate mixture models by revenue band or incorporate revenue as a covariate in both the mixture weight (larger firms may have different refusal rates due to better backup infrastructure, or lower rates due to higher stakes) and the component parameters. The former approach is preferable when credibility supports it; the latter when sample sizes within bands are thin.

The industry-level data shows that information technology firms experienced the highest average loss at $182,000, while financial services firms averaged $64,000, reflecting different data volumes, regulatory environments, and incident-response maturity levels. Sector-specific parameterization, or at minimum sector-level severity relativities applied to the mixture model's base parameters, improves out-of-sample fit.

From Severity Model to Aggregate Loss Distribution

The mixture severity model feeds directly into aggregate loss estimation via Monte Carlo simulation. The simulation procedure is:

  1. Draw claim count N from the fitted frequency distribution (negative binomial is standard for cyber, accommodating contagion-driven overdispersion). Coalition reports overall claims frequency of 1.54% in 2025, a 3% increase year over year.
  2. For each of the N claims, draw a uniform random variable u. If u < π1(t), draw severity from f1 (refusal component). Otherwise, draw from f2 (payment component), with dual-extortion status drawn as a Bernoulli trial within the payment path.
  3. Sum severities to produce one realization of aggregate loss. Repeat 100,000+ times to build the empirical aggregate loss distribution.
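The three steps above can be sketched with the standard library alone. The negative binomial count is drawn as a gamma-mixed Poisson, the refusal path uses a log-normal, and the payment path uses a Lomax whose scale is multiplied by a factor when the Bernoulli dual-extortion draw succeeds. That scaling is one assumed way to represent the dual-extortion effect, not the only one. Apart from the 1.54% frequency, 0.86 refusal rate, and 70% dual-extortion share quoted in the text, every parameter value is a placeholder.

```python
import math
import random

def draw_poisson(rng, lam):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count

def draw_claim_count(rng, mean, dispersion_r):
    """Negative binomial as a gamma-mixed Poisson: variance = mean + mean**2 / r."""
    lam = rng.gammavariate(dispersion_r, mean / dispersion_r)
    return draw_poisson(rng, lam)

def draw_severity(rng, p):
    """Step 2: choose the path with a uniform draw, then draw from that component."""
    if rng.random() < p["pi1"]:
        return rng.lognormvariate(p["mu1"], p["sigma1"])  # refusal path
    theta = p["theta2"]
    if rng.random() < p["dual_share"]:                    # Bernoulli dual-extortion flag
        theta *= p["dual_scale_factor"]
    u = rng.random()
    return theta * ((1.0 - u) ** (-1.0 / p["alpha2"]) - 1.0)  # Lomax inverse CDF

def simulate_aggregate(p, n_sims, seed=0):
    """Steps 1 to 3 repeated n_sims times; returns the list of aggregate losses."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        n_claims = draw_claim_count(rng, p["mean_count"], p["dispersion_r"])
        totals.append(sum(draw_severity(rng, p) for _ in range(n_claims)))
    return totals

PARAMS = {
    "mean_count": 1_000 * 0.0154,  # hypothetical 1,000-policy book at 1.54% frequency
    "dispersion_r": 2.0,           # assumed overdispersion
    "pi1": 0.86,
    "mu1": 11.0, "sigma1": 1.0,    # assumed refusal-path parameters
    "alpha2": 2.5, "theta2": 400_000.0,  # assumed payment-path parameters
    "dual_share": 0.70, "dual_scale_factor": 2.0,
}

if __name__ == "__main__":
    totals = sorted(simulate_aggregate(PARAMS, 20_000, seed=1))
    print("mean aggregate loss:", sum(totals) / len(totals))
    print("99th percentile:    ", totals[int(0.99 * len(totals))])
```

The example runs 20,000 trials to keep the pure-Python runtime short; a production run at the 100,000+ trials mentioned above would normally use a vectorized numerical library.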

This procedure produces an aggregate distribution that properly reflects the bimodality in claim-level severity. A single-distribution severity model generates aggregate distributions that are smoother and lighter-tailed than reality, underestimating the probability of extreme aggregate outcomes where multiple large ransom payments coincide.

Implications for Reinsurance Pricing and Risk Loads

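Bifurcated severity distributions have direct consequences for excess-of-loss reinsurance pricing and aggregate stop-loss structures. The variance of the mixture equals the weighted average of the within-component variances plus a between-component term, π1 · π2 · (m2 - m1)², where m1 and m2 are the component means. When the component means are far apart, this between-component term pushes the mixture's coefficient of variation (CV) above what a single blended fit would suggest, although it does not guarantee a CV above that of the heavier-tailed component taken alone. A higher CV translates to larger risk loads in reinsurance pricing, particularly for layers that attach in the region between the two component means.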
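The variance behind the CV can be split into a within-component part and a between-component part, which makes the contribution of the bifurcation visible. The sketch below computes both parts for a log-normal refusal component and a Lomax payment component. It assumes α2 > 2 so that the Lomax variance is finite, and all parameter values other than π1 = 0.86 are illustrative.

```python
import math

def lognormal_moments(mu, sigma):
    """Mean and variance of a log-normal."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    var = (math.exp(sigma ** 2) - 1.0) * mean ** 2
    return mean, var

def lomax_moments(alpha, theta):
    """Mean and variance of a Lomax; requires alpha > 2 for a finite variance."""
    mean = theta / (alpha - 1.0)
    var = theta ** 2 * alpha / ((alpha - 1.0) ** 2 * (alpha - 2.0))
    return mean, var

def mixture_moments(pi1, m1, v1, m2, v2):
    """Returns (mean, within-component variance, between-component variance)."""
    pi2 = 1.0 - pi1
    mean = pi1 * m1 + pi2 * m2
    within = pi1 * v1 + pi2 * v2
    between = pi1 * pi2 * (m2 - m1) ** 2
    return mean, within, between

def cv(mean, var):
    return math.sqrt(var) / mean

if __name__ == "__main__":
    m1, v1 = lognormal_moments(11.0, 1.0)      # assumed refusal-path parameters
    m2, v2 = lomax_moments(2.5, 600_000.0)     # assumed payment-path parameters
    mean, within, between = mixture_moments(0.86, m1, v1, m2, v2)
    print("refusal CV:", cv(m1, v1))
    print("payment CV:", cv(m2, v2))
    print("mixture CV:", cv(mean, within + between))
    print("share of variance from between-component term:", between / (within + between))
```

Comparing the three printed CVs for a given parameter set shows directly whether the mixture is more or less volatile than each component, rather than relying on a general rule.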

Gallagher Re reported a 32% risk-adjusted rate decline for cyber aggregate excess-of-loss contracts at the January 2026 renewals, driven by favorable loss experience and surplus capacity. But reinsurance pricing models that use single-distribution severity assumptions may understate the tail risk in these layers. If the refusal rate reverses, even partially (for instance, a novel ransomware strain that defeats current backup strategies), the payment component would activate for a larger share of claims, and the aggregate distribution would shift rapidly toward heavier losses. A mixture model parameterized with a time-varying π1 allows reinsurance actuaries to stress-test this scenario explicitly, by projecting π1 downward and observing the impact on excess layer expected losses.
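The stress test described above can be run analytically for a per-claim excess layer, because both component families have closed-form limited expected values. The sketch below computes the expected loss per claim to a layer under several values of π1. The layer (500,000 excess of 500,000), the stressed refusal rates, and the component parameters are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def lognormal_lev(u, mu, sigma):
    """Limited expected value E[min(X, u)] for a log-normal."""
    if u <= 0:
        return 0.0
    z = (math.log(u) - mu) / sigma
    mean = math.exp(mu + 0.5 * sigma ** 2)
    return mean * norm_cdf(z - sigma) + u * (1.0 - norm_cdf(z))

def lomax_lev(u, alpha, theta):
    """Limited expected value E[min(X, u)] for a Lomax with alpha != 1."""
    return theta / (alpha - 1.0) * (1.0 - (theta / (u + theta)) ** (alpha - 1.0))

def layer_loss_per_claim(attach, limit, pi1, refusal, payment):
    """Expected loss per claim to the layer `limit` excess of `attach`.

    refusal = (mu1, sigma1) for the log-normal; payment = (alpha2, theta2)
    for the Lomax. The layer loss is LEV(attach + limit) - LEV(attach).
    """
    top = attach + limit
    layer1 = lognormal_lev(top, *refusal) - lognormal_lev(attach, *refusal)
    layer2 = lomax_lev(top, *payment) - lomax_lev(attach, *payment)
    return pi1 * layer1 + (1.0 - pi1) * layer2

if __name__ == "__main__":
    refusal, payment = (11.0, 1.0), (2.5, 600_000.0)  # assumed parameters
    for pi1 in (0.86, 0.80, 0.70, 0.60):              # assumed stress path
        loss = layer_loss_per_claim(500_000.0, 500_000.0, pi1, refusal, payment)
        print("pi1 =", pi1, "expected layer loss per claim:", loss)
```

Multiplying the per-claim figure by expected claim count gives the layer's expected loss, so the sensitivity of a treaty's loss cost to a reversal in the refusal rate can be read off directly.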

For aggregate stop-loss structures, the bimodal claim-level distribution produces a more variable aggregate outcome at small portfolio sizes. Cedants with fewer than 500 cyber policies may find that the aggregate distribution's shape is highly sensitive to the mixture weight, because a handful of ransom payments can dominate the aggregate. This argues for higher risk loads in aggregate covers written for small cyber books, independent of the reinsurance market's current pricing trajectory.

Transferability Beyond Cyber

The mixture-model framework applies to any P&C line where a binary decision creates structural severity bifurcation. In commercial general liability, represented versus unrepresented claimants produce different severity distributions: represented claimants access litigation funding, generate higher demands, and settle at multiples of unrepresented claims. In workers' compensation, the total-versus-partial disability determination creates a severity split analogous to pay-versus-refuse. In auto physical damage, total loss declarations versus repairable vehicles produce bimodal severity that single distributions systematically smooth over.

In each case, the methodology is identical: identify the binary decision variable, estimate the mixture weight from observed data (or as a latent variable via EM), fit component distributions separately, and simulate aggregates from the fitted mixture. The specific distributional families and parameterizations change, but the framework transfers directly.

Why This Matters for the 2027 Rate Cycle

The U.S. cyber insurance market recorded its first-ever premium decline in 2024, with direct written premiums falling from $7.25 billion to $7.08 billion as competition intensified and loss ratios remained favorable. S&P Global projects premiums reaching $23 billion globally by 2026. In a softening rate environment, the accuracy of severity assumptions directly determines whether filed rates are adequate.

A single-distribution model calibrated to 2025 data, where 86% of claims follow the lighter refusal path, produces a lower expected severity than a mixture model that separately accounts for the possibility that the refusal rate could revert. The mixture framework forces the pricing actuary to make the refusal-rate trend assumption explicit and testable, rather than embedding it implicitly in a blended severity fit. That transparency is valuable for actuarial opinions, rate filings, and reinsurance negotiations.

Patterns we have observed across recent cyber rate filings suggest that regulators are increasingly comfortable with split-trend and mixture-model approaches, provided the documentation clearly explains the component structure, the data stratification, and the sensitivity of the indicated rate to the mixture weight assumption. ASOP No. 25 (Credibility Procedures) and ASOP No. 43 (Property/Casualty Unpaid Claim Estimates) provide the professional standards framework for this documentation.
