Exam FAM · ~10-12%

Exam FAM — Parametric & Empirical Estimation Flashcards

Estimating loss and survival models from data for SOA Exam FAM: the empirical distribution, survival, and moment estimators for individual data; the Kaplan–Meier product-limit and Nelson–Aalen estimators built from risk sets and deaths; method-of-moments and percentile-matching fits for exponential, Pareto, lognormal, and gamma models; and maximum likelihood for complete, grouped, right-censored (policy limit), and left-truncated (deductible) data — each worked through to a final estimate.

44 cards · 6 topics · Free · fact-checked · LaTeX math

Browse all 44 cards as a list
  1. Empirical estimation
    Define the **empirical distribution function** $F_n(x)$ for a complete individual data sample of size $n$.
    $F_n(x)=\dfrac{\text{number of observations} \le x}{n}$. It is a step function that jumps by $\frac{1}{n}$ at each distinct data point (or by $\frac{k}{n}$ where $k$ observations tie). It places probability mass $\frac{1}{n}$ on each sampled value and is the nonparametric estimate of the true $F(x)$ when there is no censoring or truncation.
  2. Empirical estimation
    Define the **empirical survival function** $S_n(x)$ and how it relates to $F_n(x)$.
    $S_n(x)=1-F_n(x)=\dfrac{\text{number of observations} > x}{n}$. It is a right-continuous step function starting at $1$ and dropping by $\frac{1}{n}$ at each observed value (by $\frac{k}{n}$ where $k$ observations tie), reaching $0$ at the largest data point. It estimates $S(x)=\Pr(X>x)$ from a complete sample.
  3. Empirical estimation
    Give the **empirical estimators of the mean and variance** of $X$ from a complete sample $x_1,\dots,x_n$.
    Empirical (sample) mean: $\hat\mu=\bar x=\dfrac{1}{n}\sum_{i=1}^{n}x_i$, which is $E[X]$ under the empirical distribution. Empirical variance: $\widehat{\operatorname{Var}}(X)=\dfrac{1}{n}\sum_{i=1}^{n}(x_i-\bar x)^2 = \dfrac{1}{n}\sum x_i^2 - \bar x^2$. Note this divides by $n$ (the empirical/MLE form), not $n-1$; the unbiased sample variance uses $n-1$.
  4. Empirical estimation
    For the complete loss sample $\{2,\,3,\,3,\,5,\,7\}$ (in thousands), compute the empirical mean and empirical variance.
    $n=5$. Mean: $\bar x=\dfrac{2+3+3+5+7}{5}=\dfrac{20}{5}=4$ (thousand). $\sum x_i^2 = 4+9+9+25+49 = 96$, so $\dfrac{1}{n}\sum x_i^2 = \dfrac{96}{5}=19.2$. Empirical variance $=19.2 - 4^2 = 19.2 - 16 = 3.2$ (thousand$^2$). Standard deviation $=\sqrt{3.2}\approx 1.7889$.
  5. Empirical estimation
    Using the empirical distribution of $\{2,3,3,5,7\}$, evaluate $F_5(3)$, $F_5(4)$, and $S_5(3)$.
    Probability mass $\frac{1}{5}$ on each point, with $3$ occurring twice. $F_5(3)=\Pr(X\le 3)=\dfrac{1+2}{5}=\dfrac{3}{5}=0.6$ (the $2$ and both $3$'s). $F_5(4)=\Pr(X\le 4)=\dfrac{3}{5}=0.6$ as well (no data between $3$ and $5$). $S_5(3)=1-F_5(3)=0.4$.
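A minimal Python sketch of the empirical calculations above, using the same sample $\{2,3,3,5,7\}$. The function names `empirical_cdf` and `empirical_moments` are illustrative, not from any library.

```python
def empirical_cdf(sample, x):
    """F_n(x): share of observations less than or equal to x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def empirical_moments(sample):
    """Return (mean, variance); the variance divides by n, not n - 1."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum(v * v for v in sample) / n - mean ** 2
    return mean, variance


losses = [2, 3, 3, 5, 7]  # in thousands
mean, variance = empirical_moments(losses)
f3 = empirical_cdf(losses, 3)   # counts the 2 and both 3's
f4 = empirical_cdf(losses, 4)   # no data between 3 and 5, so same as f3
s3 = 1 - f3                     # empirical survival at 3
```

Results should be compared with a small tolerance, since floating-point arithmetic may give a variance a hair away from $3.2$.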
  6. Kaplan-Meier & Nelson-Aalen
    Define the **risk set** $r_i$ and the **number of observed deaths** $s_i$ at an event time $t_i$ in survival data.
    At each distinct observed event time $t_i$: $r_i$ = the **risk set**, the number of individuals known to be alive and under observation *just before* $t_i$ (i.e. not yet dead and not yet censored/withdrawn). $s_i$ = the number of **observed deaths (events)** exactly at $t_i$. Withdrawals (right-censored observations) at $t_i$ leave the risk set after $t_i$ but are not counted as deaths.
  7. Kaplan-Meier & Nelson-Aalen
    State the **Kaplan–Meier (product-limit)** estimator of the survival function.
    $\hat S(t)=\displaystyle\prod_{t_i \le t}\left(1-\dfrac{s_i}{r_i}\right)$, the product over all observed death times $t_i$ at or before $t$. Each factor $\left(1-\frac{s_i}{r_i}\right)$ is the estimated conditional probability of surviving past $t_i$ given survival up to $t_i$. $\hat S(t)=1$ for $t<t_1$, and the estimate is a right-continuous step function that only drops at death times (not at censoring times).
  8. Kaplan-Meier & Nelson-Aalen
    State the **Nelson–Aalen** estimator of the cumulative hazard $\hat H(t)$ and how it gives a survival estimate.
    $\hat H(t)=\displaystyle\sum_{t_i \le t}\dfrac{s_i}{r_i}$, summing the hazard increments $\frac{s_i}{r_i}$ over death times. The corresponding survival estimate is $\hat S(t)=e^{-\hat H(t)}$. Because $e^{-x}\ge 1-x$, the Nelson–Aalen survival estimate is always at least as large as the Kaplan–Meier estimate at the same $t$.
  9. Kaplan-Meier & Nelson-Aalen
    Ten lives are observed. Deaths occur at times $t=2$ ($1$ death), $t=5$ ($2$ deaths), and $t=9$ ($1$ death), with one **censored** observation at $t=7$ and the rest surviving past $t=9$. Build the risk sets and find $\hat S(9)$ by Kaplan–Meier.
    Order the event times and track the risk set $r_i$ (lives just before each death): $t_1=2$: $r_1=10$, $s_1=1$ → factor $1-\frac{1}{10}=0.9$. $t_2=5$: $r_2=9$, $s_2=2$ → factor $1-\frac{2}{9}=\frac{7}{9}\approx 0.77778$. (No one was censored before $t=5$, so only the death at $t=2$ has left the risk set.) The censored life at $t=7$ removes one from the risk set *after* $t=5$ but causes no drop. $t_3=9$: just before $t=9$ we had $10-1-2-1=6$ lives, so $r_3=6$, $s_3=1$ → factor $1-\frac{1}{6}=\frac{5}{6}\approx 0.83333$. $\hat S(9)=0.9\times 0.77778\times 0.83333\approx 0.58333$.
  10. Kaplan-Meier & Nelson-Aalen
    For the same data (deaths $1$ at $t=2$, $2$ at $t=5$, $1$ at $t=9$; risk sets $10$, $9$, $6$), find the **Nelson–Aalen** estimate $\hat H(9)$ and the implied $\hat S(9)$.
    $\hat H(9)=\dfrac{1}{10}+\dfrac{2}{9}+\dfrac{1}{6}=0.1+0.22222+0.16667=0.48889$. $\hat S(9)=e^{-\hat H(9)}=e^{-0.48889}\approx 0.61331$. This exceeds the Kaplan–Meier $\hat S(9)\approx 0.58333$, as Nelson–Aalen always does.
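The ten-life example can be reproduced with a short Python sketch that builds the risk sets from raw records and then applies both estimators. The helper names are illustrative. The observation time `10` for the five survivors is an assumed placeholder for "survived past $t=9$"; any value above $9$ gives the same result.

```python
import math


def risk_table(records):
    """records: (time, died) pairs. Returns [(t_i, r_i, s_i)] per distinct death time."""
    death_times = sorted({t for t, died in records if died})
    table = []
    for t in death_times:
        r = sum(1 for u, _ in records if u >= t)              # under observation just before t
        s = sum(1 for u, died in records if died and u == t)  # deaths exactly at t
        table.append((t, r, s))
    return table


def kaplan_meier(table, t):
    """Product-limit estimate of S(t)."""
    surv = 1.0
    for t_i, r, s in table:
        if t_i <= t:
            surv *= 1 - s / r
    return surv


def nelson_aalen(table, t):
    """Cumulative hazard estimate H(t)."""
    return sum(s / r for t_i, r, s in table if t_i <= t)


# Deaths at 2, 5, 5, 9; one life censored at 7; five lives still alive after 9
# (recorded here at the assumed time 10).
records = [(2, True), (5, True), (5, True), (7, False), (9, True)] + [(10, False)] * 5
table = risk_table(records)
s_km = kaplan_meier(table, 9)
h_na = nelson_aalen(table, 9)
s_na = math.exp(-h_na)
```

Here `table` should come out as the risk sets $10$, $9$, $6$ used in the cards, and `s_na` should exceed `s_km`.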
  11. Censoring & truncation
    Why does a **censored (withdrawn)** observation not cause a drop in the Kaplan–Meier $\hat S(t)$, and how does it still affect the estimate?
    A right-censored life is known only to have survived past its censoring time, so it provides **no death** — $\hat S$ steps down only at observed deaths. However, the withdrawal **reduces the risk set** $r_i$ for all later death times. A smaller $r_i$ makes each later factor $1-\frac{s_i}{r_i}$ smaller (a larger conditional drop), so censoring still influences subsequent estimates through the denominators.
  12. Censoring & truncation
    Distinguish **right-censoring** from **left-truncation** in loss/survival data, naming the actuarial cause of each.
    **Right-censoring (policy limit $u$):** the exact value is unknown above a threshold — you only know $X>u$. Caused by a **policy limit** (loss capped at $u$) or a study ending before death. The observation contributes survival information $S(u)$. **Left-truncation (deductible $d$):** observations below a threshold are never seen at all; you observe $X$ only conditional on $X>d$. Caused by a **deductible** $d$ (small losses not reported) or late entry. Such an observation contributes the conditional density $\frac{f(x)}{S(d)}$.
  13. Method of moments
    State the **method of moments** procedure for fitting a distribution with $k$ parameters.
    Match the first $k$ theoretical (model) moments to the corresponding empirical (sample) moments and solve for the parameters. For one parameter, set $E[X]=\bar x$. For two parameters, set $E[X]=\frac{1}{n}\sum x_i$ and $E[X^2]=\frac{1}{n}\sum x_i^2$ (raw second moment), then solve the system. The model moments are expressed as functions of the parameters; the sample moments are fixed numbers from the data.
  14. Method of moments
    Fit an **exponential** distribution to the sample $\{4,\,8,\,10,\,18\}$ by the method of moments.
    The exponential has mean $E[X]=\theta$, so the method of moments sets $\hat\theta=\bar x$. $\bar x=\dfrac{4+8+10+18}{4}=\dfrac{40}{4}=10$. Thus $\hat\theta=10$. (For the exponential this also equals the MLE, since the single parameter is the mean.)
  15. Method of moments
    A **gamma** distribution has $E[X]=\alpha\theta$ and $\operatorname{Var}(X)=\alpha\theta^2$. A sample has $\bar x = 200$ and empirical variance $80{,}000$. Fit $\alpha$ and $\theta$ by the method of moments.
    Match moments: $\alpha\theta=200$ and $\alpha\theta^2 = 80{,}000$ (variance form). Divide: $\dfrac{\alpha\theta^2}{\alpha\theta}=\theta=\dfrac{80{,}000}{200}=400$. Then $\hat\alpha=\dfrac{200}{\hat\theta}=\dfrac{200}{400}=0.5$. So $\hat\alpha=0.5$, $\hat\theta=400$.
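As a quick check, the two gamma moment equations solve to $\hat\theta=\text{variance}/\text{mean}$ and $\hat\alpha=\text{mean}/\hat\theta$. A minimal sketch with an illustrative function name:

```python
def gamma_moment_fit(mean, variance):
    """Method-of-moments fit for a gamma with E[X] = alpha*theta, Var(X) = alpha*theta^2."""
    theta = variance / mean   # divide the variance equation by the mean equation
    alpha = mean / theta
    return alpha, theta


alpha_hat, theta_hat = gamma_moment_fit(200, 80_000)
```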
  16. Method of moments
    A two-parameter **Pareto** has $E[X]=\dfrac{\theta}{\alpha-1}$ and $E[X^2]=\dfrac{2\theta^2}{(\alpha-1)(\alpha-2)}$. A sample gives $\bar x = 100$ and $\frac{1}{n}\sum x_i^2 = 60{,}000$. Fit $\alpha$ and $\theta$ by the method of moments.
    Set $\dfrac{\theta}{\alpha-1}=100$ and $\dfrac{2\theta^2}{(\alpha-1)(\alpha-2)}=60{,}000$. From the first, $\theta = 100(\alpha-1)$. Substitute: $\dfrac{2[100(\alpha-1)]^2}{(\alpha-1)(\alpha-2)}=\dfrac{2(10{,}000)(\alpha-1)}{\alpha-2}=60{,}000$. So $\dfrac{20{,}000(\alpha-1)}{\alpha-2}=60{,}000 \Rightarrow \alpha-1 = 3(\alpha-2)=3\alpha-6$, giving $2\alpha=5$, $\hat\alpha=2.5$. Then $\hat\theta = 100(2.5-1)=150$.
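The substitution above generalizes: with $r=\frac{m_2}{m_1^2}$, where $m_1$ and $m_2$ are the first two raw sample moments, the same algebra gives $\hat\alpha=\frac{2(r-1)}{r-2}$ and $\hat\theta=m_1(\hat\alpha-1)$. This closed form is derived here from the card's two equations and requires $r>2$. A minimal sketch with an illustrative function name:

```python
def pareto_moment_fit(m1, m2):
    """Method-of-moments fit for a two-parameter Pareto from raw moments m1, m2."""
    ratio = m2 / m1 ** 2
    if ratio <= 2:
        # The model ratio 2(alpha-1)/(alpha-2) exceeds 2 for every alpha > 2.
        raise ValueError("need m2 / m1^2 > 2 for a Pareto moment fit")
    alpha = 2 * (ratio - 1) / (ratio - 2)
    theta = m1 * (alpha - 1)
    return alpha, theta


alpha_hat, theta_hat = pareto_moment_fit(100, 60_000)
```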
  17. Method of moments
    A **lognormal** has $E[X]=e^{\mu+\sigma^2/2}$ and $E[X^2]=e^{2\mu+2\sigma^2}$. A sample gives $\bar x = 1000$ and $\frac{1}{n}\sum x_i^2 = 2{,}000{,}000$. Fit $\mu$ and $\sigma^2$ by the method of moments.
    Take logs of the moment equations. $\ln(1000)=\mu+\tfrac{1}{2}\sigma^2$ and $\ln(2{,}000{,}000)=2\mu+2\sigma^2$. $\ln 1000 \approx 6.907755$, $\ln 2{,}000{,}000 \approx 14.508658$. From the equations: $2(6.907755)=2\mu+\sigma^2 = 13.815510$. Subtract from the second: $(2\mu+2\sigma^2)-(2\mu+\sigma^2)=\sigma^2 = 14.508658-13.815510=0.693148$. Then $\hat\mu = 6.907755 - \tfrac{1}{2}(0.693148)=6.907755-0.346574=6.561181$. So $\hat\mu\approx 6.56118$, $\hat\sigma^2\approx 0.69315$ ($\hat\sigma\approx 0.83256$).
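Solving the two logged equations in general gives $\hat\sigma^2=\ln m_2-2\ln m_1$ and $\hat\mu=\ln m_1-\tfrac12\hat\sigma^2$. A minimal sketch with an illustrative function name, which also shows that $\hat\sigma^2$ in this card equals $\ln 2$:

```python
import math


def lognormal_moment_fit(m1, m2):
    """Method-of-moments fit for a lognormal from raw moments m1 = E[X], m2 = E[X^2]."""
    sigma2 = math.log(m2) - 2 * math.log(m1)
    if sigma2 <= 0:
        raise ValueError("need m2 > m1^2 for a positive sigma^2")
    mu = math.log(m1) - sigma2 / 2
    return mu, sigma2


mu_hat, sigma2_hat = lognormal_moment_fit(1000, 2_000_000)
sigma_hat = math.sqrt(sigma2_hat)
```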
  18. Percentile matching
    State the **smoothed empirical percentile** definition used for percentile matching, and how the rank of the target order statistic is found.
    For a sample of size $n$ sorted as $x_{(1)}\le\dots\le x_{(n)}$, the smoothed empirical estimate of the $100p$-th percentile is found by giving order statistic $x_{(k)}$ the percentile $\frac{k}{n+1}$. To estimate the $p$-th quantile, compute the rank $k=p(n+1)$. If $k$ is an integer the percentile is $x_{(k)}$; otherwise **linearly interpolate** between $x_{(\lfloor k\rfloor)}$ and $x_{(\lceil k\rceil)}$.
  19. Percentile matching
    For the ordered sample $\{3,\,6,\,9,\,14,\,20,\,28\}$ ($n=6$), find the **smoothed empirical** estimate of the $40$th percentile.
    Rank $k=p(n+1)=0.40(7)=2.8$. This lies between order statistics $x_{(2)}=6$ and $x_{(3)}=9$; interpolate the fractional part $0.8$: $\hat\pi_{0.40}=x_{(2)} + 0.8\,(x_{(3)}-x_{(2)})=6 + 0.8(9-6)=6+2.4=8.4$. So the smoothed empirical $40$th percentile is $8.4$.
  20. Percentile matching
    Fit an **exponential** distribution by **percentile matching** on the median, using the sample $\{3,6,9,14,20,28\}$ ($n=6$).
    Smoothed median: rank $k=0.5(7)=3.5$, between $x_{(3)}=9$ and $x_{(4)}=14$: $\hat\pi_{0.5}=9+0.5(14-9)=11.5$. Match the exponential median: $S(\pi)=e^{-\pi/\theta}=0.5 \Rightarrow \pi = \theta\ln 2$. So $\hat\theta=\dfrac{\hat\pi_{0.5}}{\ln 2}=\dfrac{11.5}{0.693147}\approx 16.5910$.
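The smoothed-percentile rule and the median match can be sketched in Python as follows. The function name is illustrative, and the sketch assumes the rank $p(n+1)$ lies between $1$ and $n$.

```python
import math


def smoothed_percentile(sample, p):
    """Smoothed empirical 100p-th percentile using rank k = p(n + 1)."""
    xs = sorted(sample)
    n = len(xs)
    k = p * (n + 1)
    if k < 1 or k > n:
        raise ValueError("rank p(n+1) falls outside the order statistics")
    lo = math.floor(k)
    frac = k - lo
    if frac == 0:
        return xs[lo - 1]                      # integer rank: take x_(k) directly
    # Interpolate between x_(lo) and x_(lo+1); lists are 0-indexed.
    return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])


sample = [3, 6, 9, 14, 20, 28]
p40 = smoothed_percentile(sample, 0.40)
median = smoothed_percentile(sample, 0.50)
theta_hat = median / math.log(2)   # exponential median is theta * ln 2
```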
  21. Percentile matching
    A sample has smoothed empirical median (50th percentile) $100$ and $75$th percentile $300$. Fit a **two-parameter Pareto** $S(x)=\left(\frac{\theta}{x+\theta}\right)^{\alpha}$ by percentile matching.
    Match $S$ at each percentile: $S(100)=1-0.50=0.50$ and $S(300)=1-0.75=0.25$. Take logs: $\alpha\ln\!\frac{\theta}{100+\theta}=\ln 0.50$ and $\alpha\ln\!\frac{\theta}{300+\theta}=\ln 0.25$. Divide to eliminate $\alpha$: $\dfrac{\ln[\theta/(\theta+100)]}{\ln[\theta/(\theta+300)]}=\dfrac{\ln 0.50}{\ln 0.25}=\dfrac{\ln 0.50}{2\ln 0.50}=\dfrac{1}{2}$. So $2\ln\!\frac{\theta}{\theta+100}=\ln\!\frac{\theta}{\theta+300}$, i.e. $\left(\frac{\theta}{\theta+100}\right)^2=\frac{\theta}{\theta+300}$. This gives $\theta(\theta+300)=(\theta+100)^2 \Rightarrow \theta^2+300\theta=\theta^2+200\theta+10{,}000$, so $100\theta=10{,}000$, $\hat\theta=100$. Then $\hat\alpha=\dfrac{\ln 0.50}{\ln[100/(100+100)]}=\dfrac{\ln 0.50}{\ln 0.50}=1$.
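When the percentiles do not lead to a tidy quadratic, the same ratio equation can be solved numerically. The sketch below uses bisection on $\theta$. It assumes $x_1<x_2$ and $p_1<p_2$, in which case the log ratio falls from near $1$ toward $x_1/x_2$ as $\theta$ grows. The function name and the search bracket are illustrative choices.

```python
import math


def pareto_percentile_match(x1, p1, x2, p2, lo=1e-6, hi=1e9):
    """Fit S(x) = (theta / (x + theta))^alpha to two percentiles by bisection on theta."""
    target = math.log(1 - p1) / math.log(1 - p2)

    def g(theta):
        return math.log(theta / (theta + x1)) / math.log(theta / (theta + x2)) - target

    if not (g(lo) > 0 > g(hi)):
        raise ValueError("no root in the bracket; check the percentile inputs")
    for _ in range(200):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid       # ratio still too high: root lies to the right
        else:
            hi = mid
    theta = (lo + hi) / 2
    alpha = math.log(1 - p1) / math.log(theta / (theta + x1))
    return alpha, theta


alpha_hat, theta_hat = pareto_percentile_match(100, 0.50, 300, 0.75)
```

For this card's inputs the bisection should land on the algebraic answer $\hat\theta=100$, $\hat\alpha=1$.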
  22. Percentile matching
    Contrast **method of moments** and **percentile matching** as estimation methods, noting one practical drawback of each.
    **Method of moments** matches model moments to sample moments; it is simple but can be unstable for **heavy-tailed** data because high sample moments are dominated by a few large losses (and may not exist if the model's moment doesn't). **Percentile matching** matches model quantiles to smoothed empirical percentiles; it is robust to tail outliers but the answer depends on **which percentiles** you choose, and different choices give different fits. Neither is generally as efficient as maximum likelihood, which uses the full data.
  23. Maximum likelihood
    State the **likelihood** and **log-likelihood** for a complete (uncensored, untruncated) individual data sample.
    Each observed value $x_i$ contributes its density $f(x_i\mid\theta)$. Likelihood: $L(\theta)=\displaystyle\prod_{i=1}^{n} f(x_i\mid\theta)$. Log-likelihood: $\ell(\theta)=\displaystyle\sum_{i=1}^{n}\ln f(x_i\mid\theta)$. The MLE $\hat\theta$ maximizes $\ell$; usually found by solving $\frac{d\ell}{d\theta}=0$.
  24. Maximum likelihood
    Summarize how each type of observation contributes to the **likelihood**: complete, right-censored (limit $u$), left-truncated (deductible $d$), and grouped.
    **Complete** (exact $x$): contributes the density $f(x)$. **Right-censored** at $u$ (known only $X>u$, e.g. a policy limit): contributes the survival $S(u)$. **Left-truncated** at $d$ (observed only because $X>d$, e.g. a deductible): contributes the **conditional** density $\dfrac{f(x)}{S(d)}$; a censored-and-truncated payment-at-limit gives $\dfrac{S(u)}{S(d)}$. **Grouped** in interval $(c_{j-1},c_j]$: contributes $F(c_j)-F(c_{j-1})$ raised to the count in that group.
  25. Maximum likelihood
    Derive the **MLE of $\theta$ for an exponential** distribution from a complete sample $x_1,\dots,x_n$.
    $f(x)=\frac{1}{\theta}e^{-x/\theta}$, so $\ell(\theta)=\sum\left(-\ln\theta - \frac{x_i}{\theta}\right)=-n\ln\theta - \frac{1}{\theta}\sum x_i$. Set $\frac{d\ell}{d\theta}=-\frac{n}{\theta}+\frac{\sum x_i}{\theta^2}=0$. Multiply by $\theta^2$: $-n\theta+\sum x_i=0 \Rightarrow \hat\theta=\dfrac{\sum x_i}{n}=\bar x$. The exponential MLE is the sample mean — same as the method of moments here.
  26. Maximum likelihood
    Compute the **exponential MLE** for the sample $\{5,\,12,\,12,\,21,\,30\}$ (complete data).
    For the exponential, $\hat\theta=\bar x$. $\sum x_i = 5+12+12+21+30 = 80$, $n=5$. $\hat\theta = \dfrac{80}{5}=16$. So the maximum-likelihood mean is $\hat\theta = 16$.
  27. Maximum likelihood
    Five exponential losses are observed: exact values $8$, $10$, and $15$, plus two policies that hit a **policy limit of $20$** (right-censored at $20$). Find the **MLE of $\theta$**.
    Exact values contribute $f(x_i)=\frac{1}{\theta}e^{-x_i/\theta}$; observations censored at $u$ contribute $S(u)=e^{-u/\theta}$. $L=\left(\frac{1}{\theta}\right)^3 e^{-(8+10+15)/\theta}\cdot \left(e^{-20/\theta}\right)^2$. $\ell=-3\ln\theta -\frac{33}{\theta} - \frac{40}{\theta}=-3\ln\theta -\frac{73}{\theta}$. Setting $\frac{d\ell}{d\theta}=-\frac{3}{\theta}+\frac{73}{\theta^2}=0$ gives the closed form $\hat\theta=\dfrac{\text{total amount observed}}{\text{number of exact (uncensored) losses}}=\dfrac{8+10+15+20+20}{3}=\dfrac{73}{3}\approx 24.3333$.
  28. Maximum likelihood
    State the general closed-form **exponential MLE** that handles censoring and truncation in one formula.
    For exponential data with deductibles and policy limits, $\hat\theta = \dfrac{\sum (\text{exposure amounts})}{\text{number of uncensored (exact) observations}}$. Each observation contributes its observed amount **above any deductible $d$** to the numerator: an exact loss contributes $x_i-d_i$, a censored-at-$u$ loss contributes $u_i-d_i$. The denominator counts only the deaths/exact losses (not censored ones). This is the "total time on test over number of events" rule.
  29. Maximum likelihood
    Losses follow an exponential. Three claims with a **deductible of $5$** are observed as payments $7$, $13$, and $25$ (so ground-up losses $12$, $18$, $30$); all are below the policy limit. Find the **MLE of $\theta$**.
    Left-truncation at $d=5$: each exact observation contributes $\dfrac{f(x)}{S(5)}$. For the exponential, $\dfrac{f(x)}{S(d)}=\dfrac{\theta^{-1}e^{-x/\theta}}{e^{-d/\theta}}=\frac{1}{\theta}e^{-(x-d)/\theta}$. So the likelihood depends only on the amounts **above the deductible**: $7,13,25$. $\hat\theta=\dfrac{\text{total above deductible}}{\text{number of exact losses}}=\dfrac{7+13+25}{3}=\dfrac{45}{3}=15$.
  30. Maximum likelihood
    Combine censoring and truncation: exponential losses with a **deductible of $10$** and **policy limit (maximum covered) of $50$**. Observed: two exact losses at ground-up $30$ and $45$, and one claim that reached the $50$ limit. Find $\hat\theta$.
    Work with amounts above the deductible $d=10$. Exact losses contribute $\frac{1}{\theta}e^{-(x-d)/\theta}$; the limit claim contributes $\frac{S(u)}{S(d)}=e^{-(u-d)/\theta}$. Amounts above deductible: $30-10=20$, $45-10=35$, and the censored $50-10=40$. $\hat\theta=\dfrac{\sum(\text{amount above }d)}{\text{number of exact losses}}=\dfrac{20+35+40}{2}=\dfrac{95}{2}=47.5$. (Only the two exact losses count in the denominator; the limit claim adds to the numerator only.)
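The "exposure over exact losses" rule covers the censored, truncated, and combined examples with one function. A minimal sketch follows; the function name and the `(ground_up_amount, deductible, censored)` record layout are illustrative assumptions.

```python
def exponential_mle(observations):
    """observations: (ground_up_amount, deductible, censored) triples.

    For a censored record the amount is the policy limit u.
    """
    exposure = sum(x - d for x, d, _ in observations)          # amounts above the deductible
    exact = sum(1 for _, _, censored in observations if not censored)
    if exact == 0:
        raise ValueError("no uncensored observations: the MLE is not finite")
    return exposure / exact


# Policy limit 20, no deductible: exact 8, 10, 15 and two claims at the limit.
theta_limit = exponential_mle(
    [(8, 0, False), (10, 0, False), (15, 0, False), (20, 0, True), (20, 0, True)]
)
# Deductible 5: ground-up losses 12, 18, 30, all exact.
theta_deductible = exponential_mle([(12, 5, False), (18, 5, False), (30, 5, False)])
# Deductible 10 and limit 50: exact 30 and 45, one claim at the limit.
theta_both = exponential_mle([(30, 10, False), (45, 10, False), (50, 10, True)])
```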
  31. Maximum likelihood
    Derive the **MLE of the Poisson** parameter $\lambda$ from claim-count data $n_1,\dots,n_m$.
    $\Pr(N=k)=\frac{e^{-\lambda}\lambda^{k}}{k!}$, so $\ell(\lambda)=\sum_{j}\left(-\lambda + n_j\ln\lambda - \ln n_j!\right)=-m\lambda + \ln\lambda\sum n_j - \sum \ln n_j!$. $\frac{d\ell}{d\lambda}=-m + \frac{\sum n_j}{\lambda}=0 \Rightarrow \hat\lambda=\dfrac{\sum n_j}{m}=\bar n$. The Poisson MLE is the sample mean number of claims.
  32. Maximum likelihood
    A portfolio reports claim counts over $200$ policies: $120$ had $0$ claims, $60$ had $1$, $15$ had $2$, and $5$ had $3$. Find the **MLE of the Poisson** $\lambda$.
    Poisson MLE $=\bar n=\dfrac{\text{total claims}}{\text{number of policies}}$. Total claims $=120(0)+60(1)+15(2)+5(3)=0+60+30+15=105$. $\hat\lambda=\dfrac{105}{200}=0.525$ claims per policy.
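For a frequency table, the sample mean is the count-weighted average of the claim numbers. A minimal sketch with an illustrative function name:

```python
def poisson_mle(frequency_table):
    """frequency_table maps number of claims -> number of policies."""
    total_claims = sum(k * policies for k, policies in frequency_table.items())
    total_policies = sum(frequency_table.values())
    return total_claims / total_policies


lambda_hat = poisson_mle({0: 120, 1: 60, 2: 15, 3: 5})
```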
  33. Maximum likelihood
    Derive the **MLE of $\mu$ for a lognormal** (with $\sigma$ known) from a complete sample, and state the MLE of $\sigma^2$.
    If $X$ is lognormal then $Y=\ln X$ is normal $(\mu,\sigma^2)$. The lognormal MLEs are just the normal MLEs applied to the **logged** data $y_i=\ln x_i$: $\hat\mu = \dfrac{1}{n}\sum \ln x_i$ (mean of the logs), and $\hat\sigma^2 = \dfrac{1}{n}\sum (\ln x_i - \hat\mu)^2$ (variance of the logs, dividing by $n$).
  34. Maximum likelihood
    Fit a **lognormal** by maximum likelihood to the complete sample $\{e^{1},\,e^{2},\,e^{3},\,e^{4}\}$ (so the logs are $1,2,3,4$).
    The logs are $y_i = 1,2,3,4$. $\hat\mu = \dfrac{1+2+3+4}{4}=\dfrac{10}{4}=2.5$. Deviations: $(1-2.5)^2+(2-2.5)^2+(3-2.5)^2+(4-2.5)^2 = 2.25+0.25+0.25+2.25 = 5$. $\hat\sigma^2 = \dfrac{5}{4}=1.25$, so $\hat\sigma=\sqrt{1.25}\approx 1.1180$.
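The same fit as a Python sketch: log the data, then take the mean and the divide-by-$n$ variance of the logs. The function name is illustrative.

```python
import math


def lognormal_mle(sample):
    """Return (mu_hat, sigma2_hat): mean and divide-by-n variance of the logged data."""
    logs = [math.log(x) for x in sample]
    n = len(logs)
    mu = sum(logs) / n
    sigma2 = sum((y - mu) ** 2 for y in logs) / n
    return mu, sigma2


sample = [math.exp(k) for k in (1, 2, 3, 4)]
mu_hat, sigma2_hat = lognormal_mle(sample)
sigma_hat = math.sqrt(sigma2_hat)
```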
  35. Maximum likelihood
    Write the **grouped-data likelihood** and use it: $100$ losses fall as $40$ in $(0,100]$, $35$ in $(100,300]$, $25$ in $(300,\infty)$ under an exponential. Set up the log-likelihood.
    Each group $(c_{j-1},c_j]$ with count $n_j$ contributes $[F(c_j)-F(c_{j-1})]^{n_j}$, so $L=\prod_j [F(c_j)-F(c_{j-1})]^{n_j}$. For the exponential $F(x)=1-e^{-x/\theta}$: Group probabilities are $p_1=1-e^{-100/\theta}$, $p_2=e^{-100/\theta}-e^{-300/\theta}$, $p_3=e^{-300/\theta}$. $\ell(\theta)=40\ln p_1 + 35\ln p_2 + 25\ln p_3$. Maximizing over $\theta$ gives $\hat\theta\approx 210.62$. (Grouped exponential data generally have no closed-form MLE, so $\frac{d\ell}{d\theta}=0$ is solved numerically. Here the boundaries are multiples of $100$, so substituting $a=e^{-100/\theta}$ gives $\ell=75\ln(1-a)+110\ln a+35\ln(1+a)$, and the score equation reduces to $22a^2+4a-11=0$, so $a\approx 0.62202$ and $\hat\theta=-100/\ln a\approx 210.62$.)
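A numerical maximization of this grouped log-likelihood can be sketched with a plain ternary search. The sketch assumes the log-likelihood is unimodal in $\theta$ on the search interval; the interval $[10, 2000]$ and the function names are illustrative choices.

```python
import math


def grouped_exp_loglik(theta, bounds, counts):
    """bounds: interval endpoints c_0 < c_1 < ...; the last may be math.inf."""
    def cdf(x):
        return 1.0 if math.isinf(x) else 1 - math.exp(-x / theta)

    return sum(
        n * math.log(cdf(hi) - cdf(lo))
        for lo, hi, n in zip(bounds[:-1], bounds[1:], counts)
    )


def maximize_unimodal(f, lo, hi, iters=200):
    """Ternary search for the maximizer of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1        # maximum lies to the right of m1
        else:
            hi = m2        # maximum lies to the left of m2
    return (lo + hi) / 2


bounds = [0, 100, 300, math.inf]
counts = [40, 35, 25]
theta_hat = maximize_unimodal(
    lambda th: grouped_exp_loglik(th, bounds, counts), 10, 2000
)
```

A library optimizer would serve equally well; the hand-rolled search keeps the example free of dependencies.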
  36. Censoring & truncation
    For **grouped data** where the last interval is open, why must you use $S(c_{j-1})=F(\infty)-F(c_{j-1})$ for that group, and what does each interior group use?
    An open final interval $(c_{j-1},\infty)$ records only that the loss exceeded $c_{j-1}$, so its probability is the **survival** $S(c_{j-1})=1-F(c_{j-1})$ — formally $F(\infty)-F(c_{j-1})$. Each **interior** group $(c_{j-1},c_j]$ uses the difference $F(c_j)-F(c_{j-1})$. The likelihood is the product of these group probabilities raised to their observed counts. Grouped data is a form of interval censoring.
  37. Censoring & truncation
    How does **left-truncation at $d$** change the likelihood contribution, and why is the denominator $S(d)$?
    Truncated data is observed only conditional on $X>d$, so each observation's density must be the **conditional** density given survival past $d$: $\dfrac{f(x)}{S(d)}$. Dividing by $S(d)$ renormalizes the density to integrate to $1$ over the observable region $(d,\infty)$. Without it the likelihood would not account for the unobserved mass below $d$, biasing the estimate. For a deductible $d$ on a policy, every reported loss carries this $\frac{1}{S(d)}$ factor.
  38. Censoring & truncation
    Why does a **policy limit** create a likelihood factor $S(u)$ rather than $f(u)$?
    When a loss reaches the policy limit $u$, the insurer pays $u$ and the **true loss is unknown — only that it was at least $u$** ($X\ge u$). The probability of that event is the survival $S(u)=\Pr(X>u)$, so a limit observation contributes $S(u)$ to the likelihood. Using $f(u)$ would wrongly assert the loss equaled exactly $u$; the correct censored contribution integrates the density over all values $\ge u$, which is $S(u)$.
  39. Maximum likelihood
    What does it mean that the MLE is **consistent**, and why is the MLE generally preferred over method of moments?
    **Consistency:** as the sample size $n\to\infty$, the MLE $\hat\theta$ converges (in probability) to the true parameter $\theta$. Under the usual regularity conditions the MLE is also **asymptotically unbiased**, **asymptotically normal**, and **asymptotically efficient**: its asymptotic variance attains the Cramér–Rao lower bound, the smallest possible. Method of moments and percentile matching are consistent too but are generally **less efficient** (larger variance) because they use only selected moments/quantiles rather than the full data, so MLE is preferred when feasible.
  40. Maximum likelihood
    State the **asymptotic variance** of the MLE in terms of Fisher information, and give it for the exponential.
    For a single parameter, $\widehat{\operatorname{Var}}(\hat\theta)\approx \dfrac{1}{I(\theta)}$ where the Fisher information is $I(\theta)=-E\!\left[\dfrac{d^2\ell}{d\theta^2}\right]$ (for $n$ observations, $I(\theta)=n\,i(\theta)$ with $i$ the per-observation information). For the exponential, $\ell=-n\ln\theta-\frac{\sum x_i}{\theta}$, $\frac{d^2\ell}{d\theta^2}=\frac{n}{\theta^2}-\frac{2\sum x_i}{\theta^3}$; taking $-E[\cdot]$ with $E[\sum x_i]=n\theta$ gives $I(\theta)=\frac{n}{\theta^2}$, so $\widehat{\operatorname{Var}}(\hat\theta)\approx \dfrac{\theta^2}{n}$.
  41. Maximum likelihood
    Using the exponential MLE $\hat\theta=16$ from a sample of $n=5$ (with asymptotic variance $\theta^2/n$), give an approximate variance and standard error for $\hat\theta$.
    Plug the MLE into the asymptotic variance $\widehat{\operatorname{Var}}(\hat\theta)\approx \dfrac{\hat\theta^2}{n}=\dfrac{16^2}{5}=\dfrac{256}{5}=51.2$. Standard error $=\sqrt{51.2}\approx 7.1554$. An approximate $95\%$ confidence interval is $\hat\theta \pm 1.96\,(7.1554)$, i.e. roughly $16 \pm 14.0$, or $(1.98,\,30.02)$ — wide, as expected for $n=5$.
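The plug-in variance, standard error, and normal-approximation interval can be sketched as below. The function name is illustrative, and $z=1.96$ is the usual $95\%$ normal quantile.

```python
import math


def exponential_mle_interval(theta_hat, n, z=1.96):
    """Plug-in asymptotic variance theta^2 / n, standard error, and normal CI."""
    variance = theta_hat ** 2 / n
    se = math.sqrt(variance)
    return variance, se, (theta_hat - z * se, theta_hat + z * se)


variance, se, (lower, upper) = exponential_mle_interval(16, 5)
```

With so few observations the normal approximation is rough, which is why the interval is so wide.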
  42. Kaplan-Meier & Nelson-Aalen
    Eight lives are observed. Deaths occur at $t=1,3,3,6,8$ (two deaths at $t=3$, single deaths at $t=1$, $t=6$, and $t=8$), with no censoring before $t=8$. Find the Kaplan–Meier $\hat S(6)$.
    Order distinct death times with risk sets (all $8$ start, no early censoring): $t=1$: $r=8$, $s=1$ → $1-\frac{1}{8}=0.875$. $t=3$: $r=7$, $s=2$ → $1-\frac{2}{7}=\frac{5}{7}\approx 0.714286$. $t=6$: just before $t=6$ we have $8-1-2=5$ lives, $r=5$, $s=1$ → $1-\frac{1}{5}=0.8$. $\hat S(6)=0.875\times 0.714286\times 0.8 = 0.5$. Exactly $0.5$, consistent with $4$ of the $8$ lives having died by $t=6$.
  43. Kaplan-Meier & Nelson-Aalen
    Explain why $\hat S(t)=e^{-\hat H(t)}$ converts a **Nelson–Aalen** cumulative hazard into a survival estimate, and when the two estimators agree most closely.
    The exact relationship between survival and cumulative hazard is $S(t)=e^{-H(t)}$ where $H(t)=-\ln S(t)=\int_0^t \mu(s)\,ds$. Nelson–Aalen estimates $H$ directly by summing the empirical hazard increments $\frac{s_i}{r_i}$, then exponentiates the negative to recover $\hat S$. Since each Kaplan–Meier factor $1-\frac{s_i}{r_i}\approx e^{-s_i/r_i}$ when $\frac{s_i}{r_i}$ is small, the two estimators agree closely when the **risk sets are large** relative to the number of deaths (small hazard increments). They diverge most when $\frac{s_i}{r_i}$ is large (small risk sets late in the data).
  44. Empirical estimation
    Distinguish **empirical** estimation from **parametric** estimation, and say when each is appropriate.
    **Empirical (nonparametric)** estimation makes no assumption about the form of $F$ — it reads the distribution, survival, moments, or (via Kaplan–Meier/Nelson–Aalen) the survival curve directly off the data. Best when you have **ample data** and want to avoid model misspecification. **Parametric** estimation assumes a family (exponential, Pareto, lognormal, gamma) and fits its parameters by method of moments, percentile matching, or maximum likelihood. Best for **smoothing, extrapolating into the tail**, or working with sparse/censored data where the empirical curve is unstable.