Exam FAM — Severity & Frequency Models Flashcards

Parametric loss models for SOA Exam FAM: severity families (exponential, gamma, two-parameter Pareto, lognormal, Weibull) with their pdf/cdf/survival, means, variances, raw moments, coefficient of variation and skewness; tail-weight comparison through existence of moments, hazard rate, and the mean-excess (mean residual life) function; the $(a,b,0)$ frequency class (Poisson, binomial, negative binomial, geometric) identified from the ratio $\frac{p_k}{p_{k-1}}=a+\frac{b}{k}$ and the variance-to-mean test; the $(a,b,1)$ zero-truncated and zero-modified distributions; and continuous mixtures such as the gamma mixture of Poisson giving the negative binomial — all with fully worked numeric examples.

44 cards · 6 topics · Free · fact-checked · LaTeX math

Severity distributions
Give the pdf, cdf, survival function, mean and variance of the **exponential** distribution with mean $\theta$.
$f(x)=\dfrac{1}{\theta}e^{-x/\theta}$, $F(x)=1-e^{-x/\theta}$, $S(x)=e^{-x/\theta}$ for $x>0$. $E[X]=\theta$ and $\operatorname{Var}(X)=\theta^{2}$, so the standard deviation is also $\theta$. The exponential is the building-block severity: a single **scale** parameter $\theta$, constant hazard rate $\frac{1}{\theta}$, and the memoryless property.
Moments & CV
State the raw moments $E[X^{k}]$ of an **exponential** with mean $\theta$, and use them to find the variance.
$E[X^{k}]=k!\,\theta^{k}$. So $E[X]=\theta$, $E[X^{2}]=2\theta^{2}$, $E[X^{3}]=6\theta^{3}$. Variance $=E[X^{2}]-(E[X])^{2}=2\theta^{2}-\theta^{2}=\theta^{2}$. Example: with $\theta=500$, $E[X^{3}]=6(500)^{3}=750{,}000{,}000$.
Severity distributions
Give the mean, variance, and raw moments of the **gamma** distribution with shape $\alpha$ and scale $\theta$.
$E[X]=\alpha\theta$, $\operatorname{Var}(X)=\alpha\theta^{2}$. Raw moments: $E[X^{k}]=\theta^{k}\dfrac{\Gamma(\alpha+k)}{\Gamma(\alpha)}=\theta^{k}(\alpha)(\alpha+1)\cdots(\alpha+k-1)$ for integer $k$. $\alpha$ is the **shape** (it controls skewness); $\theta$ is the **scale**. When $\alpha=1$ the gamma reduces to the exponential.
Moments & CV
For a **gamma** severity with $\alpha=3$, $\theta=500$, compute the mean, variance, standard deviation and coefficient of variation.
$E[X]=\alpha\theta=3(500)=1500$. $\operatorname{Var}(X)=\alpha\theta^{2}=3(500)^{2}=750{,}000$, so $\text{SD}=\sqrt{750{,}000}\approx 866.03$. $\text{CV}=\dfrac{\text{SD}}{E[X]}=\dfrac{866.03}{1500}\approx 0.5774$. Check: for a gamma, $\text{CV}=\dfrac{1}{\sqrt{\alpha}}=\dfrac{1}{\sqrt{3}}\approx 0.5774$.
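The arithmetic on this card can be checked with a short Python sketch. The function name `gamma_summary` is illustrative only (it is not from any library); it simply encodes $E[X]=\alpha\theta$ and $\operatorname{Var}(X)=\alpha\theta^{2}$ from the card.

```python
import math

def gamma_summary(alpha: float, theta: float) -> dict:
    """Mean, variance, SD and CV of a gamma with shape alpha and scale theta."""
    mean = alpha * theta
    variance = alpha * theta ** 2
    sd = math.sqrt(variance)
    return {"mean": mean, "variance": variance, "sd": sd, "cv": sd / mean}

summary = gamma_summary(alpha=3, theta=500)
print(summary)  # the cv entry should agree with 1 / sqrt(alpha)
```

Changing `theta` while holding `alpha` fixed leaves the `cv` entry unchanged, which is the scale-invariance the deck relies on.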
Severity distributions
Give the pdf, survival function, and mean/variance of the **two-parameter Pareto** with shape $\alpha$ and scale $\theta$.
$f(x)=\dfrac{\alpha\theta^{\alpha}}{(x+\theta)^{\alpha+1}}$, $S(x)=\left(\dfrac{\theta}{x+\theta}\right)^{\alpha}$, $F(x)=1-\left(\dfrac{\theta}{x+\theta}\right)^{\alpha}$ for $x>0$. $E[X]=\dfrac{\theta}{\alpha-1}$ (needs $\alpha>1$); $E[X^{2}]=\dfrac{2\theta^{2}}{(\alpha-1)(\alpha-2)}$ (needs $\alpha>2$). $\alpha$ is the **shape** governing tail thickness; $\theta$ is the **scale**.
Moments & CV
For a **Pareto** severity with $\alpha=3$, $\theta=2000$, compute the mean, $E[X^{2}]$, variance, and coefficient of variation.
$E[X]=\dfrac{\theta}{\alpha-1}=\dfrac{2000}{2}=1000$. $E[X^{2}]=\dfrac{2\theta^{2}}{(\alpha-1)(\alpha-2)}=\dfrac{2(2000)^{2}}{(2)(1)}=4{,}000{,}000$. $\operatorname{Var}(X)=4{,}000{,}000-1000^{2}=3{,}000{,}000$, $\text{SD}=\sqrt{3{,}000{,}000}\approx 1732.05$. $\text{CV}=\dfrac{1732.05}{1000}\approx 1.7321>1$ — a sign of a heavy-tailed severity.
Severity distributions
For a **Pareto** with $\alpha=3$, $\theta=2000$, find $P(X>4000)$.
Use the survival function $S(x)=\left(\dfrac{\theta}{x+\theta}\right)^{\alpha}$. $S(4000)=\left(\dfrac{2000}{4000+2000}\right)^{3}=\left(\dfrac{2000}{6000}\right)^{3}=\left(\dfrac{1}{3}\right)^{3}=\dfrac{1}{27}\approx 0.037037$. So about $3.70\%$ of losses exceed $4000$.
Severity distributions
Give the cdf, mean, and $E[X^{2}]$ of the **lognormal** distribution with parameters $\mu$ and $\sigma$.
If $\ln X\sim N(\mu,\sigma^{2})$ then $F(x)=\Phi\!\left(\dfrac{\ln x-\mu}{\sigma}\right)$. $E[X]=e^{\mu+\sigma^{2}/2}$ and $E[X^{k}]=e^{k\mu+k^{2}\sigma^{2}/2}$, so $E[X^{2}]=e^{2\mu+2\sigma^{2}}$. Here $\mu,\sigma$ are the **log-scale** parameters, not the mean/SD of $X$ itself. The lognormal is moderately heavy-tailed (all moments exist, but it has no MGF).
Moments & CV
For a **lognormal** with $\mu=7$, $\sigma=1$, compute the mean, variance, and coefficient of variation.
$E[X]=e^{\mu+\sigma^{2}/2}=e^{7.5}\approx 1808.04$. $E[X^{2}]=e^{2\mu+2\sigma^{2}}=e^{16}\approx 8{,}886{,}111$. $\operatorname{Var}(X)=8{,}886{,}111-1808.04^{2}\approx 5{,}617{,}093$, so $\text{SD}\approx 2370.04$. $\text{CV}=\dfrac{2370.04}{1808.04}\approx 1.3108$. Check: $\text{CV}=\sqrt{e^{\sigma^{2}}-1}=\sqrt{e^{1}-1}\approx 1.3108$.
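A minimal Python sketch of the same calculation, using only the raw-moment formula $E[X^{k}]=e^{k\mu+k^{2}\sigma^{2}/2}$ quoted in the deck. The helper name `lognormal_raw_moment` is an illustrative choice.

```python
import math

def lognormal_raw_moment(k: int, mu: float, sigma: float) -> float:
    """E[X^k] for a lognormal with log-scale parameters mu and sigma."""
    return math.exp(k * mu + 0.5 * k ** 2 * sigma ** 2)

mu, sigma = 7.0, 1.0
mean = lognormal_raw_moment(1, mu, sigma)
second_moment = lognormal_raw_moment(2, mu, sigma)
variance = second_moment - mean ** 2
cv = math.sqrt(variance) / mean

print(mean, variance, cv)
print(math.sqrt(math.exp(sigma ** 2) - 1))  # closed-form CV for comparison
```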
Severity distributions
For a **lognormal** with $\mu=7$, $\sigma=1$, find $P(X\le 3000)$. (Use $\Phi(1.01)\approx 0.8438$.)
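Standardize on the log scale: $z=\dfrac{\ln x-\mu}{\sigma}=\dfrac{\ln 3000-7}{1}$. $\ln 3000\approx 8.00637$, so $z\approx 8.00637-7=1.00637$. Rounding to $z=1.01$ and using the given table value, $P(X\le 3000)\approx\Phi(1.01)\approx 0.8438$; interpolating at the unrounded $z$ gives $\Phi(1.0064)\approx 0.8429$. Either way, about $84\%$ of losses fall at or below $3000$.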
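Without a normal table, the same probability can be sketched in Python by writing $\Phi$ through the error function, $\Phi(z)=\tfrac12\left[1+\operatorname{erf}(z/\sqrt2)\right]$. The helper names below are illustrative; only the standard-library `math` module is used.

```python
import math

def std_normal_cdf(z: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) when ln X is normal with mean mu and SD sigma."""
    z = (math.log(x) - mu) / sigma
    return std_normal_cdf(z)

prob = lognormal_cdf(3000, mu=7, sigma=1)
print(prob)  # unrounded z, so expect a value near 0.843
```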
Severity distributions
Give the cdf, survival, and mean of the **Weibull** distribution with shape $\tau$ and scale $\theta$.
$F(x)=1-e^{-(x/\theta)^{\tau}}$, $S(x)=e^{-(x/\theta)^{\tau}}$ for $x>0$. $E[X]=\theta\,\Gamma\!\left(1+\tfrac{1}{\tau}\right)$ and $E[X^{k}]=\theta^{k}\,\Gamma\!\left(1+\tfrac{k}{\tau}\right)$. $\tau$ is the **shape**: $\tau=1$ gives the exponential, $\tau>1$ gives an increasing hazard (lighter tail), $\tau<1$ a decreasing hazard (heavier tail). $\theta$ is the **scale**.
Moments & CV
For a **Weibull** with $\tau=2$, $\theta=1000$, compute the mean, variance, and coefficient of variation. (Use $\Gamma(1.5)=0.886227$, $\Gamma(2)=1$.)
$E[X]=\theta\,\Gamma(1+\tfrac12)=1000(0.886227)\approx 886.23$. $E[X^{2}]=\theta^{2}\,\Gamma(1+1)=1000^{2}(1)=1{,}000{,}000$. $\operatorname{Var}(X)=1{,}000{,}000-886.23^{2}\approx 214{,}602$, $\text{SD}\approx 463.25$. $\text{CV}=\dfrac{463.25}{886.23}\approx 0.5227<1$ — the $\tau=2$ Weibull is light-tailed relative to the exponential.
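The gamma-function values quoted on the card can be reproduced with `math.gamma`. This is a small sketch of $E[X^{k}]=\theta^{k}\,\Gamma(1+k/\tau)$; the function name is an illustrative choice.

```python
import math

def weibull_raw_moment(k: int, tau: float, theta: float) -> float:
    """E[X^k] for a Weibull with shape tau and scale theta."""
    return theta ** k * math.gamma(1 + k / tau)

tau, theta = 2.0, 1000.0
mean = weibull_raw_moment(1, tau, theta)
variance = weibull_raw_moment(2, tau, theta) - mean ** 2
cv = math.sqrt(variance) / mean
print(mean, variance, cv)
```

Setting `tau = 1.0` in the same sketch returns a CV of 1, the exponential case.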
Severity distributions
How does a pure **scale** parameter $\theta$ act on a severity distribution, and which families is $\theta$ a scale parameter for?
If $\theta$ is a scale parameter, multiplying losses by a constant $c$ (e.g. inflation) just replaces $\theta$ with $c\theta$ and leaves the **shape** parameters unchanged: $E[X^{k}]$ scales by $c^{k}$ and the CV is unchanged. In the exponential, gamma, Pareto, and Weibull, $\theta$ is a scale parameter. For the lognormal, a scale change of $c$ shifts $\mu\to\mu+\ln c$ while $\sigma$ (the shape) is unchanged.
Moments & CV
Distinguish a **scale** parameter from a **shape** parameter, and why does the **coefficient of variation** isolate shape?
A **scale** parameter stretches the distribution along the loss axis without changing its form; a **shape** parameter changes the form (skewness, tail). Multiplying $X$ by $c$ multiplies the mean and SD both by $c$, so $\text{CV}=\dfrac{\text{SD}}{\text{mean}}$ is **scale-invariant** and depends only on the shape parameters. Hence gamma CV $=\frac{1}{\sqrt{\alpha}}$, exponential CV $=1$, lognormal CV $=\sqrt{e^{\sigma^{2}}-1}$ — each free of the scale.
Moments & CV
Define the **coefficient of variation** and **skewness** of a loss random variable, and give their values for the exponential.
$\text{CV}=\dfrac{\sqrt{\operatorname{Var}(X)}}{E[X]}$ (relative dispersion). Skewness $\gamma_{1}=\dfrac{E[(X-\mu)^{3}]}{\sigma^{3}}=\dfrac{E[X^{3}]-3\mu E[X^{2}]+2\mu^{3}}{\sigma^{3}}$ (asymmetry). For the exponential with mean $\theta$: $\text{CV}=1$ and skewness $=2$. For a gamma, $\text{CV}=\frac{1}{\sqrt{\alpha}}$ and skewness $=\frac{2}{\sqrt{\alpha}}$, both shrinking as $\alpha$ grows.
Moments & CV
Compute the **skewness** of an exponential with mean $\theta$ from its raw moments.
Use $E[X]=\theta$, $E[X^{2}]=2\theta^{2}$, $E[X^{3}]=6\theta^{3}$, and $\sigma=\theta$. Third central moment $=E[X^{3}]-3\mu E[X^{2}]+2\mu^{3}=6\theta^{3}-3\theta(2\theta^{2})+2\theta^{3}=6\theta^{3}-6\theta^{3}+2\theta^{3}=2\theta^{3}$. Skewness $=\dfrac{2\theta^{3}}{\sigma^{3}}=\dfrac{2\theta^{3}}{\theta^{3}}=2$ — positive and independent of $\theta$ (a shape feature).
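The raw-moment route to skewness works for any family whose first three raw moments are known. As a hedged sketch (the function name and the test values $\theta=500$ and gamma $\alpha=4$, $\theta=250$ are illustrative choices):

```python
import math

def skewness_from_raw(m1: float, m2: float, m3: float) -> float:
    """Skewness from the first three raw moments E[X], E[X^2], E[X^3]."""
    variance = m2 - m1 ** 2
    third_central = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    return third_central / variance ** 1.5

# Exponential with mean theta: E[X^k] = k! * theta^k
theta = 500.0
exp_moments = [math.factorial(k) * theta ** k for k in (1, 2, 3)]
exp_skew = skewness_from_raw(*exp_moments)

# Gamma: E[X^k] = theta^k * alpha * (alpha + 1) * ... * (alpha + k - 1)
alpha, g_theta = 4.0, 250.0
gamma_moments = [
    g_theta * alpha,
    g_theta ** 2 * alpha * (alpha + 1),
    g_theta ** 3 * alpha * (alpha + 1) * (alpha + 2),
]
gamma_skew = skewness_from_raw(*gamma_moments)

print(exp_skew, gamma_skew)  # compare with 2 and 2 / sqrt(alpha)
```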
Moments & CV
A gamma severity has mean $1000$ and variance $250{,}000$. Find its shape $\alpha$, scale $\theta$, and coefficient of variation.
$E[X]=\alpha\theta=1000$ and $\operatorname{Var}(X)=\alpha\theta^{2}=250{,}000$. Divide: $\dfrac{\alpha\theta^{2}}{\alpha\theta}=\theta=\dfrac{250{,}000}{1000}=250$. Then $\alpha=\dfrac{1000}{\theta}=\dfrac{1000}{250}=4$. $\text{CV}=\dfrac{\sqrt{250{,}000}}{1000}=\dfrac{500}{1000}=0.5$ (equivalently $\frac{1}{\sqrt{4}}$).
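The two-equation solve above is a method-of-moments fit, which can be sketched as a tiny Python helper (the name `fit_gamma_by_moments` is illustrative):

```python
import math

def fit_gamma_by_moments(mean: float, variance: float) -> tuple:
    """Solve alpha * theta = mean and alpha * theta^2 = variance."""
    theta = variance / mean   # divide the variance equation by the mean equation
    alpha = mean / theta
    return alpha, theta

alpha, theta = fit_gamma_by_moments(mean=1000, variance=250_000)
cv = math.sqrt(250_000) / 1000
print(alpha, theta, cv)
```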
Tail weight
Define the **hazard rate** (failure rate) $h(x)$ and state it for the exponential, Weibull, and Pareto.
$h(x)=\dfrac{f(x)}{S(x)}=-\dfrac{d}{dx}\ln S(x)$ — the instantaneous loss/failure intensity given survival to $x$. **Exponential:** $h(x)=\frac{1}{\theta}$, constant. **Weibull:** $h(x)=\dfrac{\tau}{\theta}\left(\dfrac{x}{\theta}\right)^{\tau-1}$ — increasing if $\tau>1$, decreasing if $\tau<1$. **Pareto:** $h(x)=\dfrac{\alpha}{x+\theta}$, decreasing in $x$. A **decreasing** hazard signals a **heavier** tail; an **increasing** hazard a lighter one.
Tail weight
Define the **mean excess (mean residual life)** function $e(d)$ and give it for the exponential and the two-parameter Pareto.
$e(d)=E[X-d\mid X>d]=\dfrac{\int_{d}^{\infty}S(x)\,dx}{S(d)}=\dfrac{E[X]-E[X\wedge d]}{S(d)}$, the expected loss above $d$ given the loss exceeds $d$. **Exponential:** $e(d)=\theta$, constant (memorylessness). **Two-parameter Pareto:** $e(d)=\dfrac{d+\theta}{\alpha-1}$ (requires $\alpha>1$ so that the mean exists), increasing **linearly** in $d$. A mean excess that **increases** with $d$ indicates a heavy tail.
Tail weight
For a **Pareto** with $\alpha=3$, $\theta=2000$, find the mean excess loss $e(4000)$ and interpret it against the exponential.
$e(d)=\dfrac{d+\theta}{\alpha-1}$, so $e(4000)=\dfrac{4000+2000}{3-1}=\dfrac{6000}{2}=3000$. Given a loss already exceeds $4000$, the expected amount **above** $4000$ is $3000$. Because $e(d)$ rises with $d$ (here from $e(0)=1000$ to $e(4000)=3000$), the Pareto is heavy-tailed; for an exponential $e(d)=\theta$ would be flat.
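As a cross-check on the closed form, the identity $e(d)=\dfrac{E[X]-E[X\wedge d]}{S(d)}$ can be evaluated numerically, with $E[X\wedge d]=\int_{0}^{d}S(x)\,dx$ approximated by a midpoint rule. This is a sketch under assumptions: the function names and the step count are illustrative choices, not part of the deck.

```python
def pareto_survival(x: float, alpha: float, theta: float) -> float:
    """S(x) for the two-parameter Pareto."""
    return (theta / (x + theta)) ** alpha

def limited_expected_value(d: float, alpha: float, theta: float,
                           steps: int = 100_000) -> float:
    """E[X ^ d] as the integral of S(x) over [0, d], by the midpoint rule."""
    h = d / steps
    return h * sum(pareto_survival((i + 0.5) * h, alpha, theta)
                   for i in range(steps))

def mean_excess(d: float, alpha: float, theta: float) -> float:
    """e(d) = (E[X] - E[X ^ d]) / S(d); needs alpha > 1 for a finite mean."""
    mean = theta / (alpha - 1)
    return (mean - limited_expected_value(d, alpha, theta)) / pareto_survival(d, alpha, theta)

alpha, theta, d = 3, 2000, 4000
numeric = mean_excess(d, alpha, theta)
closed_form = (d + theta) / (alpha - 1)
print(numeric, closed_form)
```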
Tail weight
How does **existence of moments** classify tail weight, and rank exponential, gamma, lognormal, and Pareto?
The more moments that exist, the **lighter** the tail. A distribution with all positive moments finite is lighter-tailed than one where high moments diverge. **Pareto($\alpha$):** only $E[X^{k}]$ for $k<\alpha$ exist — heaviest. **Lognormal:** all moments finite but **no** MGF — heavy, but lighter than Pareto. **Gamma / exponential:** all moments finite **and** an MGF exists — lightest of these. So (heavy → light): Pareto $\succ$ lognormal $\succ$ gamma/exponential. (A Weibull with $\tau>1$ is even lighter than the exponential.)
Tail weight
Use the **limit of the ratio of survival functions** to compare the tails of a Pareto and an exponential.
Compare tail weight via $\displaystyle\lim_{x\to\infty}\dfrac{S_{1}(x)}{S_{2}(x)}$: if the limit is $\infty$, distribution $1$ has the heavier tail; if $0$, the lighter. Pareto $S_{1}(x)=\left(\frac{\theta}{x+\theta}\right)^{\alpha}$ decays **polynomially**; exponential $S_{2}(x)=e^{-x/\theta'}$ decays **exponentially**. $\displaystyle\lim_{x\to\infty}\dfrac{(\theta/(x+\theta))^{\alpha}}{e^{-x/\theta'}}=\infty$ (exponential beats any power), so the Pareto tail is **heavier**.
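The divergence of the ratio is easy to see numerically. The sketch below works on the log scale so that the tiny exponential survival values do not underflow; the exponential mean $\theta'=1000$ is an assumed value chosen only for illustration, and the function name is hypothetical.

```python
import math

def log_survival_ratio(x: float, alpha: float, theta: float, exp_mean: float) -> float:
    """ln[S_pareto(x) / S_exponential(x)], computed on the log scale."""
    log_pareto = alpha * math.log(theta / (x + theta))
    log_exponential = -x / exp_mean
    return log_pareto - log_exponential

xs = [10_000, 50_000, 100_000, 500_000]
log_ratios = [log_survival_ratio(x, alpha=3, theta=2000, exp_mean=1000) for x in xs]
for x, value in zip(xs, log_ratios):
    print(x, value)  # the log-ratio keeps growing, so the ratio itself diverges
```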
(a,b,0) class
State the defining recursion of the **$(a,b,0)$ class** and which four distributions belong to it.
A counting distribution on $k=0,1,2,\dots$ is in the $(a,b,0)$ class if it satisfies $\dfrac{p_{k}}{p_{k-1}}=a+\dfrac{b}{k}$ for $k=1,2,3,\dots$, with $p_{0}$ set by normalization. The **only** members are the **Poisson**, **binomial**, **negative binomial**, and **geometric** (the geometric is the negative binomial with $r=1$). The pair $(a,b)$ uniquely identifies which one.
(a,b,0) class
Give the $(a,b,0)$ parameters $a,b$, the mean, and the variance for the **Poisson($\lambda$)**.
$a=0$, $b=\lambda$ (so $\frac{p_{k}}{p_{k-1}}=\frac{\lambda}{k}$). $p_{k}=\dfrac{e^{-\lambda}\lambda^{k}}{k!}$, with $p_{0}=e^{-\lambda}$. $E[N]=\lambda$ and $\operatorname{Var}(N)=\lambda$, so the **variance-to-mean ratio is exactly $1$** — the Poisson's signature.
(a,b,0) class
Give the $(a,b,0)$ parameters, mean, and variance for the **binomial($m,q$)** and the **geometric($\beta$)**.
**Binomial($m,q$):** $a=-\dfrac{q}{1-q}$, $b=(m+1)\dfrac{q}{1-q}$. $E[N]=mq$, $\operatorname{Var}(N)=mq(1-q)$ — variance $<$ mean ($a<0$). **Geometric($\beta$):** $a=\dfrac{\beta}{1+\beta}$, $b=0$. $E[N]=\beta$, $\operatorname{Var}(N)=\beta(1+\beta)$ — variance $>$ mean. The sign of $a$ alone tells the family: $a<0$ binomial, $a=0$ Poisson, $a>0$ negative binomial/geometric.
(a,b,0) class
Give the $(a,b,0)$ parameters, mean, variance, and $p_{0}$ for the **negative binomial($r,\beta$)**.
$a=\dfrac{\beta}{1+\beta}$, $b=(r-1)\dfrac{\beta}{1+\beta}$, and $p_{0}=(1+\beta)^{-r}$. $E[N]=r\beta$ and $\operatorname{Var}(N)=r\beta(1+\beta)$, so the **variance-to-mean ratio is $1+\beta>1$**. With $r=1$ it collapses to the geometric ($b=0$). The negative binomial is the natural model for **over-dispersed** counts.
(a,b,0) class
State the **variance-to-mean ratio test** for distinguishing the $(a,b,0)$ distributions, and apply it to a sample with mean $3.0$ and variance $7.5$.
Compare $\dfrac{\operatorname{Var}(N)}{E[N]}$: **$<1$** → binomial; **$=1$** → Poisson; **$>1$** → negative binomial (the geometric is the special case $r=1$, where the mean equals $\beta$). Sample: $\dfrac{7.5}{3.0}=2.5>1$ ⇒ **negative binomial**. Match moments: $1+\beta=2.5\Rightarrow\beta=1.5$, then $r\beta=3\Rightarrow r=\dfrac{3}{1.5}=2$. So the fitted model is negative binomial with $r=2,\ \beta=1.5$.
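The test plus the moment matching can be sketched as one Python function. The name `fit_ab0_by_moments` and the tolerance `tol` for treating the ratio as equal to 1 are assumptions made for illustration; with real sample moments, exact equality almost never occurs, and a fitted binomial $m$ need not come out as an integer.

```python
def fit_ab0_by_moments(mean: float, variance: float, tol: float = 1e-9):
    """Pick the (a,b,0) family from variance/mean and match two moments."""
    ratio = variance / mean
    if abs(ratio - 1) <= tol:
        return "poisson", {"lambda": mean}
    if ratio < 1:
        q = 1 - ratio                      # from mq(1-q) / (mq) = 1 - q
        return "binomial", {"m": mean / q, "q": q}
    beta = ratio - 1                       # from r*beta*(1+beta) / (r*beta) = 1 + beta
    return "negative binomial", {"r": mean / beta, "beta": beta}

family, params = fit_ab0_by_moments(3.0, 7.5)
print(family, params)
```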
(a,b,0) class
Successive probabilities of a count satisfy $\dfrac{p_{1}}{p_{0}}=1.2$, $\dfrac{p_{2}}{p_{1}}=0.9$, $\dfrac{p_{3}}{p_{2}}=0.8$. Identify the $(a,b,0)$ distribution and its parameters.
Fit $\dfrac{p_{k}}{p_{k-1}}=a+\dfrac{b}{k}$. Using $k=1$ and $k=2$: $a+b=1.2$ and $a+\tfrac{b}{2}=0.9$. Subtract: $\tfrac{b}{2}=0.3\Rightarrow b=0.6$, then $a=0.6$. Check $k=3$: $a+\tfrac{b}{3}=0.6+0.2=0.8$ ✓. Since $a=0.6>0$, it is **negative binomial**: $a=\frac{\beta}{1+\beta}=0.6\Rightarrow\beta=1.5$, and $b=(r-1)\frac{\beta}{1+\beta}=0.6\Rightarrow r-1=1\Rightarrow r=2$.
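The same elimination can be sketched in Python. The helper name `solve_ab` is illustrative; the negative binomial back-solve uses $\beta=\dfrac{a}{1-a}$ and $r=1+\dfrac{b}{a}$, which follow from the $(a,b,0)$ parameter formulas in this deck.

```python
def solve_ab(ratio_1: float, ratio_2: float) -> tuple:
    """Solve a + b = ratio_1 and a + b/2 = ratio_2 for (a, b)."""
    b = 2 * (ratio_1 - ratio_2)
    a = ratio_1 - b
    return a, b

a, b = solve_ab(1.2, 0.9)
check_k3 = a + b / 3          # should reproduce the third given ratio, 0.8

# a > 0, so back out the negative binomial parameters
beta = a / (1 - a)
r = 1 + b / a
print(a, b, check_k3, beta, r)
```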
(a,b,0) class
Explain the **linear-plot** method: how plotting $k\cdot\dfrac{p_{k}}{p_{k-1}}$ (or $k\cdot\dfrac{n_{k}}{n_{k-1}}$) against $k$ identifies the $(a,b,0)$ distribution.
Multiply the recursion by $k$: $k\dfrac{p_{k}}{p_{k-1}}=ak+b$ — a **straight line** in $k$ with slope $a$ and intercept $b$. Estimate the ratios from observed counts $n_{k}$ and regress $k\,\dfrac{n_{k}}{n_{k-1}}$ on $k$. **Slope $a$:** $<0$ binomial, $=0$ Poisson, $>0$ negative binomial. Example (negative binomial $a=0.6,b=0.6$): the points are $k=1\!:1.2$, $k=2\!:1.8$, $k=3\!:2.4$, $k=4\!:3.0$ — a line of slope $0.6$, intercept $0.6$.
(a,b,0) class
For a **Poisson** with $\lambda=2$, compute $p_{0},p_{1},p_{2},p_{3}$ using the $(a,b,0)$ recursion.
$a=0$, $b=2$, so $p_{k}=p_{k-1}\dfrac{2}{k}$ starting from $p_{0}=e^{-2}\approx 0.135335$. $p_{1}=p_{0}\cdot\tfrac{2}{1}=2e^{-2}\approx 0.270671$. $p_{2}=p_{1}\cdot\tfrac{2}{2}=p_{1}\approx 0.270671$. $p_{3}=p_{2}\cdot\tfrac{2}{3}\approx 0.180447$. (Sum so far $\approx 0.857$; the rest is the upper tail.)
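The recursion is a one-line loop. A minimal sketch, assuming the caller supplies $p_{0}$ (the function name `ab0_probabilities` is illustrative):

```python
import math

def ab0_probabilities(a: float, b: float, p0: float, k_max: int) -> list:
    """Return [p_0, ..., p_k_max] from p_k = p_{k-1} * (a + b / k)."""
    probs = [p0]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * (a + b / k))
    return probs

lam = 2.0
poisson_probs = ab0_probabilities(a=0.0, b=lam, p0=math.exp(-lam), k_max=3)
for k, p in enumerate(poisson_probs):
    print(k, round(p, 6))
print(sum(poisson_probs))  # partial sum through k = 3
```

The same function handles the other $(a,b,0)$ members by changing `a`, `b`, and `p0`.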
(a,b,0) class
For a **negative binomial** with $r=2$, $\beta=1.5$, compute $p_{0}$ and $p_{1}$ and verify the mean and variance.
$p_{0}=(1+\beta)^{-r}=(2.5)^{-2}=\dfrac{1}{6.25}=0.16$. $a=\dfrac{\beta}{1+\beta}=0.6$, $b=(r-1)\dfrac{\beta}{1+\beta}=0.6$, so $\dfrac{p_{1}}{p_{0}}=a+b=1.2$ and $p_{1}=0.16(1.2)=0.192$. Mean $=r\beta=2(1.5)=3$; variance $=r\beta(1+\beta)=3(2.5)=7.5$ (ratio $2.5=1+\beta$). ✓
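The mean and variance can also be verified directly by summing $k\,p_{k}$ and $k^{2}p_{k}$ along the recursion. This is a sketch under an assumption: the sum is truncated at `k_max = 500`, which is adequate here because the terms decay roughly like $0.6^{k}$. The function name is illustrative.

```python
def ab0_moments(a: float, b: float, p0: float, k_max: int = 500) -> tuple:
    """Total probability, mean and variance accumulated along the recursion."""
    p, total, mean, second = p0, p0, 0.0, 0.0
    for k in range(1, k_max + 1):
        p *= a + b / k
        total += p
        mean += k * p
        second += k * k * p
    return total, mean, second - mean ** 2

r, beta = 2, 1.5
a = beta / (1 + beta)
b = (r - 1) * a
p0 = (1 + beta) ** (-r)
p1 = p0 * (a + b)

total, mean, variance = ab0_moments(a, b, p0)
print(p0, p1, total, mean, variance)
```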
(a,b,0) class
A claim count has $E[N]=1.5$ and $\operatorname{Var}(N)=1.05$. Identify the $(a,b,0)$ model and find its parameters.
Ratio $\dfrac{1.05}{1.5}=0.7<1$ ⇒ **binomial** ($a<0$). Match moments: $mq=1.5$ and $mq(1-q)=1.05$, so $1-q=\dfrac{1.05}{1.5}=0.7\Rightarrow q=0.3$. Then $m=\dfrac{1.5}{0.3}=5$. Fitted model: binomial with $m=5$, $q=0.3$ (so $a=-\frac{0.3}{0.7}\approx-0.4286$, $b=(6)\frac{0.3}{0.7}\approx 2.5714$).
(a,b,1) class
Define the **$(a,b,1)$ class** and how it generalizes $(a,b,0)$.
The $(a,b,1)$ class uses the **same** recursion $\dfrac{p_{k}}{p_{k-1}}=a+\dfrac{b}{k}$ but only for $k\ge 2$, so $p_{0}$ is no longer tied to the other probabilities and can be set separately (the remaining $p_{k}$, starting with $p_{1}$, are then rescaled to sum to $1-p_{0}$). Two important sub-types: **zero-truncated** ($p_{0}^{T}=0$, no zeros allowed) and **zero-modified** ($p_{0}^{M}$ chosen arbitrarily). They model count data with more zeros, or fewer zeros, than the base distribution allows.
(a,b,1) class
Give the **zero-truncated** probabilities $p_{k}^{T}$ in terms of the base $(a,b,0)$ probabilities $p_{k}$, and the mean.
Set the probability of zero to $0$ and rescale the rest: $p_{k}^{T}=\dfrac{p_{k}}{1-p_{0}}$ for $k=1,2,3,\dots$. The mean inflates by the same factor: $E[N^{T}]=\dfrac{E[N]}{1-p_{0}}$. Variance: $\operatorname{Var}(N^{T})=\dfrac{E[N^{2}]}{1-p_{0}}-\left(\dfrac{E[N]}{1-p_{0}}\right)^{2}$. Zero-truncation is used when a zero count is impossible (e.g. only policies with at least one claim are observed).
(a,b,1) class
For a **zero-truncated Poisson** with $\lambda=2$, compute $p_{1}^{T}$ and the truncated mean.
Base Poisson: $p_{0}=e^{-2}\approx 0.135335$, so $1-p_{0}\approx 0.864665$. And $p_{1}=2e^{-2}\approx 0.270671$. $p_{1}^{T}=\dfrac{p_{1}}{1-p_{0}}=\dfrac{0.270671}{0.864665}\approx 0.313035$. Truncated mean $=\dfrac{\lambda}{1-p_{0}}=\dfrac{2}{0.864665}\approx 2.31304$ — larger than $2$ because the zeros are removed.
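A short Python sketch of the zero-truncated Poisson, with the truncated mean checked by direct summation. The function name and the summation cutoff of 100 terms are illustrative choices.

```python
import math

def zero_truncated_poisson_pmf(k: int, lam: float) -> float:
    """p_k^T = p_k / (1 - p_0) for k >= 1, and 0 at k = 0."""
    if k < 1:
        return 0.0
    base = math.exp(-lam) * lam ** k / math.factorial(k)
    return base / (1 - math.exp(-lam))

lam = 2.0
p1_truncated = zero_truncated_poisson_pmf(1, lam)
mean_formula = lam / (1 - math.exp(-lam))
mean_summed = sum(k * zero_truncated_poisson_pmf(k, lam) for k in range(1, 100))
total = sum(zero_truncated_poisson_pmf(k, lam) for k in range(0, 100))
print(p1_truncated, mean_formula, mean_summed, total)
```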
(a,b,1) class
Give the **zero-modified** probabilities $p_{k}^{M}$ in terms of the base $p_{k}$ and the chosen $p_{0}^{M}$, and the mean.
Choose $p_{0}^{M}$ freely, then scale the positive probabilities: $p_{k}^{M}=\dfrac{1-p_{0}^{M}}{1-p_{0}}\,p_{k}$ for $k=1,2,3,\dots$. Equivalently $p_{k}^{M}=(1-p_{0}^{M})\,p_{k}^{T}$, so a zero-modified is a mixture of a point mass at $0$ and the zero-truncated distribution. Mean: $E[N^{M}]=\dfrac{1-p_{0}^{M}}{1-p_{0}}\,E[N]$. (Zero-truncation is the special case $p_{0}^{M}=0$.)
(a,b,1) class
A **zero-modified Poisson** is built from a base Poisson with $\lambda=2$ but with $p_{0}^{M}=0.4$. Find $p_{1}^{M}$ and the modified mean.
Base: $p_{0}=e^{-2}\approx 0.135335$, $1-p_{0}\approx 0.864665$, $p_{1}=2e^{-2}\approx 0.270671$. Scaling factor $c=\dfrac{1-p_{0}^{M}}{1-p_{0}}=\dfrac{0.6}{0.864665}\approx 0.693911$. $p_{1}^{M}=c\,p_{1}=0.693911(0.270671)\approx 0.187821$. Mean $=c\,\lambda=0.693911(2)\approx 1.38782$ — below the base $2$ because extra mass was placed at $0$.
(a,b,1) class
A **zero-modified Poisson** has base $\lambda=1.5$ and $p_{0}^{M}=0.5$. Find $p_{2}^{M}$.
Base: $p_{0}=e^{-1.5}\approx 0.223130$, $1-p_{0}\approx 0.776870$. $p_{1}=1.5e^{-1.5}\approx 0.334695$, and $p_{2}=p_{1}\cdot\dfrac{1.5}{2}\approx 0.251021$. Factor $c=\dfrac{1-0.5}{1-p_{0}}=\dfrac{0.5}{0.776870}\approx 0.643608$. $p_{2}^{M}=c\,p_{2}=0.643608(0.251021)\approx 0.161560$.
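Both zero-modified Poisson cards ($\lambda=2$, $p_{0}^{M}=0.4$ and $\lambda=1.5$, $p_{0}^{M}=0.5$) can be reproduced with one small sketch. The function names are illustrative.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zero_modified_poisson_pmf(k: int, lam: float, p0_mod: float) -> float:
    """p_0^M at zero; otherwise the base p_k scaled by (1 - p_0^M) / (1 - p_0)."""
    if k == 0:
        return p0_mod
    scale = (1 - p0_mod) / (1 - poisson_pmf(0, lam))
    return scale * poisson_pmf(k, lam)

def zero_modified_poisson_mean(lam: float, p0_mod: float) -> float:
    return (1 - p0_mod) / (1 - math.exp(-lam)) * lam

p1_mod = zero_modified_poisson_pmf(1, 2.0, 0.4)
mean_mod = zero_modified_poisson_mean(2.0, 0.4)
p2_mod = zero_modified_poisson_pmf(2, 1.5, 0.5)
print(p1_mod, mean_mod, p2_mod)
```

Passing `p0_mod=0.0` gives the zero-truncated probabilities, matching the special case noted in the deck.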
(a,b,1) class
Why is the **zero-modified** model so useful in claim-frequency work, and what happens to a zero-modified distribution as $p_{0}^{M}\to 0$?
Real frequency data often have **excess zeros** (many policies file no claims) or, after conditioning, **no zeros**. A plain Poisson or negative binomial cannot match either pattern, because its $p_{0}$ is determined by the same parameters that set the positive-count probabilities (for the Poisson, $p_{0}=e^{-\lambda}$ is fixed by the mean alone). Zero-modification decouples $P(N=0)$ from the rest of the shape, so you fit the zero rate and the positive-count shape separately. As $p_{0}^{M}\to 0$ the zero-modified distribution becomes the **zero-truncated** distribution (all mass moves to $k\ge 1$).
Mixtures
Define a **continuous mixture** of a counting distribution and state the resulting mean and variance via conditioning.
Let $N\mid\Lambda$ have a distribution indexed by a random parameter $\Lambda$ with its own (mixing) distribution. The unconditional pmf is $P(N=k)=\int P(N=k\mid\Lambda=\lambda)\,g(\lambda)\,d\lambda$. By the laws of total expectation/variance: $E[N]=E\!\left[E[N\mid\Lambda]\right]$, and $\operatorname{Var}(N)=E\!\left[\operatorname{Var}(N\mid\Lambda)\right]+\operatorname{Var}\!\left(E[N\mid\Lambda]\right)$. Mixing **adds** the variance term $\operatorname{Var}(E[N\mid\Lambda])$, producing **over-dispersion** and a heavier tail than the un-mixed model.
Mixtures
State the key result that the **gamma mixture of a Poisson is a negative binomial**, including the parameter map.
If $N\mid\Lambda\sim\text{Poisson}(\Lambda)$ and the mixing $\Lambda\sim\text{Gamma}(\alpha,\theta)$, then unconditionally $N\sim\text{Negative Binomial}(r=\alpha,\ \beta=\theta)$. Intuitively, the random Poisson mean injects extra variability, so the result is over-dispersed (variance $>$ mean), exactly the negative binomial's hallmark. This is the canonical motivation for the negative binomial as a claim-count model.
Mixtures
Suppose $N\mid\Lambda\sim\text{Poisson}(\Lambda)$ with $\Lambda\sim\text{Gamma}(\alpha=2,\theta=1.5)$. Find $E[N]$ and $\operatorname{Var}(N)$ two ways and confirm over-dispersion.
**By the negative-binomial map** ($r=2,\beta=1.5$): $E[N]=r\beta=3$, $\operatorname{Var}(N)=r\beta(1+\beta)=3(2.5)=7.5$. **By conditioning:** $E[N]=E[\Lambda]=\alpha\theta=3$. $\operatorname{Var}(N)=E[\operatorname{Var}(N\mid\Lambda)]+\operatorname{Var}(E[N\mid\Lambda])=E[\Lambda]+\operatorname{Var}(\Lambda)=\alpha\theta+\alpha\theta^{2}=3+2(1.5)^{2}=3+4.5=7.5$. Variance $7.5>$ mean $3$ ⇒ over-dispersed, ratio $2.5=1+\beta$. ✓
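The parameter map can also be checked at the level of probabilities by integrating the Poisson pmf against the gamma density numerically. This is a sketch under stated assumptions: a midpoint rule on $[0,60]$ stands in for the integral over $(0,\infty)$ (the integrand is negligible beyond that for these parameters), and all function names are illustrative.

```python
import math

def gamma_pdf(lam: float, alpha: float, theta: float) -> float:
    return lam ** (alpha - 1) * math.exp(-lam / theta) / (math.gamma(alpha) * theta ** alpha)

def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam ** k / math.factorial(k)

def mixed_pmf(k: int, alpha: float, theta: float,
              upper: float = 60.0, steps: int = 60_000) -> float:
    """P(N = k) as the integral of Poisson(k | lam) * gamma_pdf(lam), midpoint rule."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        total += poisson_pmf(k, lam) * gamma_pdf(lam, alpha, theta)
    return total * h

def negative_binomial_pmf(k: int, r: float, beta: float) -> float:
    coeff = math.gamma(k + r) / (math.gamma(r) * math.factorial(k))
    return coeff * (beta / (1 + beta)) ** k * (1 + beta) ** (-r)

comparison = [(k, mixed_pmf(k, 2, 1.5), negative_binomial_pmf(k, 2, 1.5))
              for k in range(4)]
for row in comparison:
    print(row)  # the two probability columns should agree closely
```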
Mixtures
Describe the **exponential mixed over its mean** and the resulting tail.
Let the severity be $X\mid\Theta\sim\text{Exponential}(\text{mean }\Theta)$ with the mean $\Theta$ random (e.g. inverse-gamma distributed). Mixing the (light-tailed, all-moments) exponential over a distribution of means produces a **heavier-tailed** severity — the classic case yields a **Pareto**. Why: each conditional exponential is light, but blending many scales — some very large — fattens the upper tail. This is the continuous-mixture route to the Pareto and a model for heterogeneous risks.
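The classic case can be checked numerically. The sketch below assumes the inverse-gamma mixing density $g(t)=\dfrac{\theta^{\alpha}t^{-\alpha-1}e^{-\theta/t}}{\Gamma(\alpha)}$, under which $U=1/\Theta$ is gamma with shape $\alpha$ and rate $\theta$, so $S(x)=E[e^{-xU}]$. The parameter values ($\alpha=3$, $\theta=2000$), the integration cutoff, and the function names are illustrative choices.

```python
import math

def mixed_exponential_survival(x: float, alpha: float, theta: float,
                               steps: int = 100_000) -> float:
    """S(x) = E[exp(-x * U)], where U = 1/Theta is gamma(shape alpha, rate theta).

    Midpoint rule on [0, 60 / theta]; beyond that the gamma density is negligible.
    """
    upper = 60.0 / theta
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        density = theta ** alpha * u ** (alpha - 1) * math.exp(-theta * u) / math.gamma(alpha)
        total += math.exp(-x * u) * density
    return total * h

def pareto_survival(x: float, alpha: float, theta: float) -> float:
    return (theta / (x + theta)) ** alpha

alpha, theta = 3, 2000
mixed = mixed_exponential_survival(4000, alpha, theta)
pareto = pareto_survival(4000, alpha, theta)
print(mixed, pareto)  # both should be close to 1/27
```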
Mixtures
Why does **mixing always increase variance** and tail weight relative to the conditional distribution?
From $\operatorname{Var}(N)=\underbrace{E[\operatorname{Var}(N\mid\Lambda)]}_{\text{average within-group var}}+\underbrace{\operatorname{Var}(E[N\mid\Lambda])}_{\ge 0}$, the second term is non-negative, so the mixture's variance is at least the average conditional variance, with equality only when $E[N\mid\Lambda]$ does not vary with $\Lambda$. The extra term reflects **heterogeneity** across the mixing parameter. Consequence: a gamma-mixed Poisson (negative binomial) has variance $>$ mean even though each conditional Poisson has variance $=$ mean; an exponential mixed over its scale acquires a heavier (e.g. Pareto) tail. Mixing is the standard mechanism for building over-dispersed, heavy-tailed actuarial models.