{
  "deckName": "Exam FAM — Parametric & Empirical Estimation",
  "examCode": "Exam FAM",
  "cards": [
    {
      "front": "Define the **empirical distribution function** $F_n(x)$ for a complete individual data sample of size $n$.",
      "back": "$F_n(x)=\\dfrac{\\text{number of observations} \\le x}{n}$.\nIt is a step function that jumps by $\\frac{1}{n}$ at each distinct data point (or by $\\frac{k}{n}$ where $k$ observations tie). It places probability mass $\\frac{1}{n}$ on each sampled value and is the nonparametric estimate of the true $F(x)$ when there is no censoring or truncation.",
      "tag": "Empirical estimation"
    },
    {
      "front": "Define the **empirical survival function** $S_n(x)$ and how it relates to $F_n(x)$.",
      "back": "$S_n(x)=1-F_n(x)=\\dfrac{\\text{number of observations} > x}{n}$.\nIt is a right-continuous step function starting at $1$ and dropping by $\\frac{1}{n}$ at each observed value, reaching $0$ at the largest data point. It estimates $S(x)=\\Pr(X>x)$ from a complete sample.",
      "tag": "Empirical estimation"
    },
    {
      "front": "Give the **empirical estimators of the mean and variance** of $X$ from a complete sample $x_1,\\dots,x_n$.",
      "back": "Empirical (sample) mean: $\\hat\\mu=\\bar x=\\dfrac{1}{n}\\sum_{i=1}^{n}x_i$, which is $E[X]$ under the empirical distribution.\nEmpirical variance: $\\widehat{\\operatorname{Var}}(X)=\\dfrac{1}{n}\\sum_{i=1}^{n}(x_i-\\bar x)^2 = \\dfrac{1}{n}\\sum x_i^2 - \\bar x^2$.\nNote this divides by $n$ (the empirical/MLE form), not $n-1$; the unbiased sample variance uses $n-1$.",
      "tag": "Empirical estimation"
    },
    {
      "front": "For the complete loss sample $\\{2,\\,3,\\,3,\\,5,\\,7\\}$ (in thousands), compute the empirical mean and empirical variance.",
      "back": "$n=5$. Mean: $\\bar x=\\dfrac{2+3+3+5+7}{5}=\\dfrac{20}{5}=4$ (thousand).\n$\\sum x_i^2 = 4+9+9+25+49 = 96$, so $\\dfrac{1}{n}\\sum x_i^2 = \\dfrac{96}{5}=19.2$.\nEmpirical variance $=19.2 - 4^2 = 19.2 - 16 = 3.2$ (thousand$^2$).\nStandard deviation $=\\sqrt{3.2}\\approx 1.7889$.",
      "tag": "Empirical estimation"
    },
    {
      "front": "Using the empirical distribution of $\\{2,3,3,5,7\\}$, evaluate $F_5(3)$, $F_5(4)$, and $S_5(3)$.",
      "back": "Probability mass $\\frac{1}{5}$ on each point, with $3$ occurring twice.\n$F_5(3)=\\Pr(X\\le 3)=\\dfrac{1+2}{5}=\\dfrac{3}{5}=0.6$ (the $2$ and both $3$'s).\n$F_5(4)=\\Pr(X\\le 4)=\\dfrac{3}{5}=0.6$ as well (no data between $3$ and $5$).\n$S_5(3)=1-F_5(3)=0.4$.",
      "tag": "Empirical estimation"
    },
    {
      "front": "Define the **risk set** $r_i$ and the **number of observed deaths** $s_i$ at an event time $t_i$ in survival data.",
      "back": "At each distinct observed event time $t_i$:\n$r_i$ = the **risk set**, the number of individuals known to be alive and under observation *just before* $t_i$ (i.e. not yet dead and not yet censored/withdrawn).\n$s_i$ = the number of **observed deaths (events)** exactly at $t_i$.\nWithdrawals (right-censored observations) at $t_i$ leave the risk set after $t_i$ but are not counted as deaths.",
      "tag": "Kaplan-Meier & Nelson-Aalen"
    },
    {
      "front": "State the **Kaplan–Meier (product-limit)** estimator of the survival function.",
      "back": "$\\hat S(t)=\\displaystyle\\prod_{t_i \\le t}\\left(1-\\dfrac{s_i}{r_i}\\right)$, the product over all observed death times $t_i$ at or before $t$.\nEach factor $\\left(1-\\frac{s_i}{r_i}\\right)$ is the estimated conditional probability of surviving past $t_i$ given survival up to $t_i$. $\\hat S(t)=1$ for $t<t_1$, and the estimate is a right-continuous step function that only drops at death times (not at censoring times).",
      "tag": "Kaplan-Meier & Nelson-Aalen"
    },
    {
      "front": "State the **Nelson–Aalen** estimator of the cumulative hazard $\\hat H(t)$ and how it gives a survival estimate.",
      "back": "$\\hat H(t)=\\displaystyle\\sum_{t_i \\le t}\\dfrac{s_i}{r_i}$, summing the hazard increments $\\frac{s_i}{r_i}$ over death times.\nThe corresponding survival estimate is $\\hat S(t)=e^{-\\hat H(t)}$.\nBecause $e^{-x}\\ge 1-x$, the Nelson–Aalen survival estimate is always at least as large as the Kaplan–Meier estimate at the same $t$.",
      "tag": "Kaplan-Meier & Nelson-Aalen"
    },
    {
      "front": "Ten lives are observed. Deaths occur at times $t=2$ ($1$ death), $t=5$ ($2$ deaths), and $t=9$ ($1$ death), with one **censored** observation at $t=7$ and the rest surviving past $t=9$. Build the risk sets and find $\\hat S(9)$ by Kaplan–Meier.",
      "back": "Order the event times and track the risk set $r_i$ (lives just before each death):\n$t_1=2$: $r_1=10$, $s_1=1$ → factor $1-\\frac{1}{10}=0.9$.\n$t_2=5$: $r_2=9$, $s_2=2$ → factor $1-\\frac{2}{9}=\\frac{7}{9}\\approx 0.77778$. (No one left before $t=5$.)\nThe censored life at $t=7$ removes one from the risk set *after* $t=5$ but causes no drop.\n$t_3=9$: just before $t=9$ we had $10-1-2-1=6$ lives, so $r_3=6$, $s_3=1$ → factor $1-\\frac{1}{6}=\\frac{5}{6}\\approx 0.83333$.\n$\\hat S(9)=0.9\\times 0.77778\\times 0.83333\\approx 0.58333$.",
      "tag": "Kaplan-Meier & Nelson-Aalen"
    },
    {
      "front": "For the same data (deaths $1$ at $t=2$, $2$ at $t=5$, $1$ at $t=9$; risk sets $10$, $9$, $6$), find the **Nelson–Aalen** estimate $\\hat H(9)$ and the implied $\\hat S(9)$.",
      "back": "$\\hat H(9)=\\dfrac{1}{10}+\\dfrac{2}{9}+\\dfrac{1}{6}=0.1+0.22222+0.16667=0.48889$.\n$\\hat S(9)=e^{-\\hat H(9)}=e^{-0.48889}\\approx 0.61331$.\nThis exceeds the Kaplan–Meier $\\hat S(9)\\approx 0.58333$, as Nelson–Aalen always does.",
      "tag": "Kaplan-Meier & Nelson-Aalen"
    },
    {
      "front": "Why does a **censored (withdrawn)** observation not cause a drop in the Kaplan–Meier $\\hat S(t)$, and how does it still affect the estimate?",
      "back": "A right-censored life is known only to have survived past its censoring time, so it provides **no death** — $\\hat S$ steps down only at observed deaths. However, the withdrawal **reduces the risk set** $r_i$ for all later death times. A smaller $r_i$ makes each later factor $1-\\frac{s_i}{r_i}$ smaller (a larger conditional drop), so censoring still influences subsequent estimates through the denominators.",
      "tag": "Censoring & truncation"
    },
    {
      "front": "Distinguish **right-censoring** from **left-truncation** in loss/survival data, naming the actuarial cause of each.",
      "back": "**Right-censoring (policy limit $u$):** the exact value is unknown above a threshold — you only know $X>u$. Caused by a **policy limit** (loss capped at $u$) or a study ending before death. The observation contributes survival information $S(u)$.\n**Left-truncation (deductible $d$):** observations below a threshold are never seen at all; you observe $X$ only conditional on $X>d$. Caused by a **deductible** $d$ (small losses not reported) or late entry. Such an observation contributes the conditional density $\\frac{f(x)}{S(d)}$.",
      "tag": "Censoring & truncation"
    },
    {
      "front": "State the **method of moments** procedure for fitting a distribution with $k$ parameters.",
      "back": "Match the first $k$ theoretical (model) moments to the corresponding empirical (sample) moments and solve for the parameters.\nFor one parameter, set $E[X]=\\bar x$. For two parameters, set $E[X]=\\frac{1}{n}\\sum x_i$ and $E[X^2]=\\frac{1}{n}\\sum x_i^2$ (raw second moment), then solve the system. The model moments are expressed as functions of the parameters; the sample moments are fixed numbers from the data.",
      "tag": "Method of moments"
    },
    {
      "front": "Fit an **exponential** distribution to the sample $\\{4,\\,8,\\,10,\\,18\\}$ by the method of moments.",
      "back": "The exponential has mean $E[X]=\\theta$, so the method of moments sets $\\hat\\theta=\\bar x$.\n$\\bar x=\\dfrac{4+8+10+18}{4}=\\dfrac{40}{4}=10$.\nThus $\\hat\\theta=10$. (For the exponential this also equals the MLE, since the single parameter is the mean.)",
      "tag": "Method of moments"
    },
    {
      "front": "A **gamma** distribution has $E[X]=\\alpha\\theta$ and $\\operatorname{Var}(X)=\\alpha\\theta^2$. A sample has $\\bar x = 200$ and empirical variance $80{,}000$. Fit $\\alpha$ and $\\theta$ by the method of moments.",
      "back": "Match moments: $\\alpha\\theta=200$ and $\\alpha\\theta^2 = 80{,}000$ (variance form).\nDivide: $\\dfrac{\\alpha\\theta^2}{\\alpha\\theta}=\\theta=\\dfrac{80{,}000}{200}=400$.\nThen $\\hat\\alpha=\\dfrac{200}{\\hat\\theta}=\\dfrac{200}{400}=0.5$.\nSo $\\hat\\alpha=0.5$, $\\hat\\theta=400$.",
      "tag": "Method of moments"
    },
    {
      "front": "A two-parameter **Pareto** has $E[X]=\\dfrac{\\theta}{\\alpha-1}$ and $E[X^2]=\\dfrac{2\\theta^2}{(\\alpha-1)(\\alpha-2)}$. A sample gives $\\bar x = 100$ and $\\frac{1}{n}\\sum x_i^2 = 60{,}000$. Fit $\\alpha$ and $\\theta$ by the method of moments.",
      "back": "Set $\\dfrac{\\theta}{\\alpha-1}=100$ and $\\dfrac{2\\theta^2}{(\\alpha-1)(\\alpha-2)}=60{,}000$.\nFrom the first, $\\theta = 100(\\alpha-1)$. Substitute:\n$\\dfrac{2[100(\\alpha-1)]^2}{(\\alpha-1)(\\alpha-2)}=\\dfrac{2(10{,}000)(\\alpha-1)}{\\alpha-2}=60{,}000$.\nSo $\\dfrac{20{,}000(\\alpha-1)}{\\alpha-2}=60{,}000 \\Rightarrow \\alpha-1 = 3(\\alpha-2)=3\\alpha-6$, giving $2\\alpha=5$, $\\hat\\alpha=2.5$.\nThen $\\hat\\theta = 100(2.5-1)=150$.",
      "tag": "Method of moments"
    },
    {
      "front": "A **lognormal** has $E[X]=e^{\\mu+\\sigma^2/2}$ and $E[X^2]=e^{2\\mu+2\\sigma^2}$. A sample gives $\\bar x = 1000$ and $\\frac{1}{n}\\sum x_i^2 = 2{,}000{,}000$. Fit $\\mu$ and $\\sigma^2$ by the method of moments.",
      "back": "Take logs of the moment equations.\n$\\ln(1000)=\\mu+\\tfrac{1}{2}\\sigma^2$ and $\\ln(2{,}000{,}000)=2\\mu+2\\sigma^2$.\n$\\ln 1000 \\approx 6.907755$, $\\ln 2{,}000{,}000 \\approx 14.508658$.\nFrom the equations: $2(6.907755)=2\\mu+\\sigma^2 = 13.815510$. Subtract from the second: $(2\\mu+2\\sigma^2)-(2\\mu+\\sigma^2)=\\sigma^2 = 14.508658-13.815510=0.693148$.\nThen $\\hat\\mu = 6.907755 - \\tfrac{1}{2}(0.693148)=6.907755-0.346574=6.561181$.\nSo $\\hat\\mu\\approx 6.56118$, $\\hat\\sigma^2\\approx 0.69315$ ($\\hat\\sigma\\approx 0.83256$).",
      "tag": "Method of moments"
    },
    {
      "front": "State the **smoothed empirical percentile** definition used for percentile matching, and how the rank of the target order statistic is found.",
      "back": "For a sample of size $n$ sorted as $x_{(1)}\\le\\dots\\le x_{(n)}$, the smoothed empirical estimate of the $100p$-th percentile is found by giving order statistic $x_{(k)}$ the percentile $\\frac{k}{n+1}$.\nTo estimate the $p$-th quantile, compute the rank $k=p(n+1)$. If $k$ is an integer the percentile is $x_{(k)}$; otherwise **linearly interpolate** between $x_{(\\lfloor k\\rfloor)}$ and $x_{(\\lceil k\\rceil)}$.",
      "tag": "Percentile matching"
    },
    {
      "front": "For the ordered sample $\\{3,\\,6,\\,9,\\,14,\\,20,\\,28\\}$ ($n=6$), find the **smoothed empirical** estimate of the $40$th percentile.",
      "back": "Rank $k=p(n+1)=0.40(7)=2.8$.\nThis lies between order statistics $x_{(2)}=6$ and $x_{(3)}=9$; interpolate the fractional part $0.8$:\n$\\hat\\pi_{0.40}=x_{(2)} + 0.8\\,(x_{(3)}-x_{(2)})=6 + 0.8(9-6)=6+2.4=8.4$.\nSo the smoothed empirical $40$th percentile is $8.4$.",
      "tag": "Percentile matching"
    },
    {
      "front": "Fit an **exponential** distribution by **percentile matching** on the median, using the sample $\\{3,6,9,14,20,28\\}$ ($n=6$).",
      "back": "Smoothed median: rank $k=0.5(7)=3.5$, between $x_{(3)}=9$ and $x_{(4)}=14$: $\\hat\\pi_{0.5}=9+0.5(14-9)=11.5$.\nMatch the exponential median: $S(\\pi)=e^{-\\pi/\\theta}=0.5 \\Rightarrow \\pi = \\theta\\ln 2$.\nSo $\\hat\\theta=\\dfrac{\\hat\\pi_{0.5}}{\\ln 2}=\\dfrac{11.5}{0.693147}\\approx 16.5910$.",
      "tag": "Percentile matching"
    },
    {
      "front": "A sample has smoothed empirical median (50th percentile) $100$ and $75$th percentile $300$. Fit a **two-parameter Pareto** $S(x)=\\left(\\frac{\\theta}{x+\\theta}\\right)^{\\alpha}$ by percentile matching.",
      "back": "Match $S$ at each percentile: $S(100)=1-0.50=0.50$ and $S(300)=1-0.75=0.25$.\nTake logs: $\\alpha\\ln\\!\\frac{\\theta}{100+\\theta}=\\ln 0.50$ and $\\alpha\\ln\\!\\frac{\\theta}{300+\\theta}=\\ln 0.25$.\nDivide to eliminate $\\alpha$: $\\dfrac{\\ln[\\theta/(\\theta+100)]}{\\ln[\\theta/(\\theta+300)]}=\\dfrac{\\ln 0.50}{\\ln 0.25}=\\dfrac{\\ln 0.50}{2\\ln 0.50}=\\dfrac{1}{2}$.\nSo $2\\ln\\!\\frac{\\theta}{\\theta+100}=\\ln\\!\\frac{\\theta}{\\theta+300}$, i.e. $\\left(\\frac{\\theta}{\\theta+100}\\right)^2=\\frac{\\theta}{\\theta+300}$. This gives $\\theta(\\theta+300)=(\\theta+100)^2 \\Rightarrow \\theta^2+300\\theta=\\theta^2+200\\theta+10{,}000$, so $100\\theta=10{,}000$, $\\hat\\theta=100$.\nThen $\\hat\\alpha=\\dfrac{\\ln 0.50}{\\ln[100/(100+100)]}=\\dfrac{\\ln 0.50}{\\ln 0.50}=1$.",
      "tag": "Percentile matching"
    },
    {
      "front": "Contrast **method of moments** and **percentile matching** as estimation methods, noting one practical drawback of each.",
      "back": "**Method of moments** matches model moments to sample moments; it is simple but can be unstable for **heavy-tailed** data because high sample moments are dominated by a few large losses (and may not exist if the model's moment doesn't).\n**Percentile matching** matches model quantiles to smoothed empirical percentiles; it is robust to tail outliers but the answer depends on **which percentiles** you choose, and different choices give different fits.\nNeither is generally as efficient as maximum likelihood, which uses the full data.",
      "tag": "Percentile matching"
    },
    {
      "front": "State the **likelihood** and **log-likelihood** for a complete (uncensored, untruncated) individual data sample.",
      "back": "Each observed value $x_i$ contributes its density $f(x_i\\mid\\theta)$.\nLikelihood: $L(\\theta)=\\displaystyle\\prod_{i=1}^{n} f(x_i\\mid\\theta)$.\nLog-likelihood: $\\ell(\\theta)=\\displaystyle\\sum_{i=1}^{n}\\ln f(x_i\\mid\\theta)$.\nThe MLE $\\hat\\theta$ maximizes $\\ell$; usually found by solving $\\frac{d\\ell}{d\\theta}=0$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Summarize how each type of observation contributes to the **likelihood**: complete, right-censored (limit $u$), left-truncated (deductible $d$), and grouped.",
      "back": "**Complete** (exact $x$): contributes the density $f(x)$.\n**Right-censored** at $u$ (known only $X>u$, e.g. a policy limit): contributes the survival $S(u)$.\n**Left-truncated** at $d$ (observed only because $X>d$, e.g. a deductible): contributes the **conditional** density $\\dfrac{f(x)}{S(d)}$; a censored-and-truncated payment-at-limit gives $\\dfrac{S(u)}{S(d)}$.\n**Grouped** in interval $(c_{j-1},c_j]$: contributes $F(c_j)-F(c_{j-1})$ raised to the count in that group.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Derive the **MLE of $\\theta$ for an exponential** distribution from a complete sample $x_1,\\dots,x_n$.",
      "back": "$f(x)=\\frac{1}{\\theta}e^{-x/\\theta}$, so $\\ell(\\theta)=\\sum\\left(-\\ln\\theta - \\frac{x_i}{\\theta}\\right)=-n\\ln\\theta - \\frac{1}{\\theta}\\sum x_i$.\nSet $\\frac{d\\ell}{d\\theta}=-\\frac{n}{\\theta}+\\frac{\\sum x_i}{\\theta^2}=0$.\nMultiply by $\\theta^2$: $-n\\theta+\\sum x_i=0 \\Rightarrow \\hat\\theta=\\dfrac{\\sum x_i}{n}=\\bar x$.\nThe exponential MLE is the sample mean — same as the method of moments here.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Compute the **exponential MLE** for the sample $\\{5,\\,12,\\,12,\\,21,\\,30\\}$ (complete data).",
      "back": "For the exponential, $\\hat\\theta=\\bar x$.\n$\\sum x_i = 5+12+12+21+30 = 80$, $n=5$.\n$\\hat\\theta = \\dfrac{80}{5}=16$.\nSo the maximum-likelihood mean is $\\hat\\theta = 16$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Five exponential losses are observed: exact values $8$, $10$, and $15$, plus two policies that hit a **policy limit of $20$** (right-censored at $20$). Find the **MLE of $\\theta$**.",
      "back": "Exact values contribute $f(x_i)=\\frac{1}{\\theta}e^{-x_i/\\theta}$; censored-at-$u$ contribute $S(u)=e^{-u/\\theta}$.\n$L=\\left(\\frac{1}{\\theta}\\right)^3 e^{-(8+10+15)/\\theta}\\cdot \\left(e^{-20/\\theta}\\right)^2$.\n$\\ell=-3\\ln\\theta -\\frac{33}{\\theta} - \\frac{40}{\\theta}=-3\\ln\\theta -\\frac{73}{\\theta}$.\nThe MLE has the closed form $\\hat\\theta=\\dfrac{\\text{total amount observed}}{\\text{number of exact deaths}}=\\dfrac{8+10+15+20+20}{3}=\\dfrac{73}{3}\\approx 24.3333$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "State the general closed-form **exponential MLE** that handles censoring and truncation in one formula.",
      "back": "For exponential data with deductibles and policy limits, $\\hat\\theta = \\dfrac{\\sum (\\text{exposure amounts})}{\\text{number of uncensored (exact) observations}}$.\nEach observation contributes its observed amount **above any deductible $d$** to the numerator: an exact loss contributes $x_i-d_i$, a censored-at-$u$ loss contributes $u_i-d_i$. The denominator counts only the deaths/exact losses (not censored ones). This is the \"total time on test over number of events\" rule.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Losses follow an exponential. Three claims with a **deductible of $5$** are observed as payments $7$, $13$, and $25$ (so ground-up losses $12$, $18$, $30$); all are below the policy limit. Find the **MLE of $\\theta$**.",
      "back": "Left-truncation at $d=5$: each exact observation contributes $\\dfrac{f(x)}{S(5)}$. For the exponential, $\\dfrac{f(x)}{S(d)}=\\dfrac{\\theta^{-1}e^{-x/\\theta}}{e^{-d/\\theta}}=\\frac{1}{\\theta}e^{-(x-d)/\\theta}$.\nSo the likelihood depends only on the amounts **above the deductible**: $7,13,25$.\n$\\hat\\theta=\\dfrac{\\text{total above deductible}}{\\text{number of exact losses}}=\\dfrac{7+13+25}{3}=\\dfrac{45}{3}=15$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Combine censoring and truncation: exponential losses with a **deductible of $10$** and **policy limit (maximum covered) of $50$**. Observed: two exact losses at ground-up $30$ and $45$, and one claim that reached the $50$ limit. Find $\\hat\\theta$.",
      "back": "Work with amounts above the deductible $d=10$. Exact losses contribute $\\frac{1}{\\theta}e^{-(x-d)/\\theta}$; the limit claim contributes $\\frac{S(u)}{S(d)}=e^{-(u-d)/\\theta}$.\nAmounts above deductible: $30-10=20$, $45-10=35$, and the censored $50-10=40$.\n$\\hat\\theta=\\dfrac{\\sum(\\text{amount above }d)}{\\text{number of exact losses}}=\\dfrac{20+35+40}{2}=\\dfrac{95}{2}=47.5$.\n(Only the two exact losses count in the denominator; the limit claim adds to the numerator only.)",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Derive the **MLE of the Poisson** parameter $\\lambda$ from claim-count data $n_1,\\dots,n_m$.",
      "back": "$\\Pr(N=k)=\\frac{e^{-\\lambda}\\lambda^{k}}{k!}$, so $\\ell(\\lambda)=\\sum_{j}\\left(-\\lambda + n_j\\ln\\lambda - \\ln n_j!\\right)=-m\\lambda + \\ln\\lambda\\sum n_j - \\sum \\ln n_j!$.\n$\\frac{d\\ell}{d\\lambda}=-m + \\frac{\\sum n_j}{\\lambda}=0 \\Rightarrow \\hat\\lambda=\\dfrac{\\sum n_j}{m}=\\bar n$.\nThe Poisson MLE is the sample mean number of claims.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "A portfolio reports claim counts over $200$ policies: $120$ had $0$ claims, $60$ had $1$, $15$ had $2$, and $5$ had $3$. Find the **MLE of the Poisson** $\\lambda$.",
      "back": "Poisson MLE $=\\bar n=\\dfrac{\\text{total claims}}{\\text{number of policies}}$.\nTotal claims $=120(0)+60(1)+15(2)+5(3)=0+60+30+15=105$.\n$\\hat\\lambda=\\dfrac{105}{200}=0.525$ claims per policy.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Derive the **MLE of $\\mu$ for a lognormal** (with $\\sigma$ known) from a complete sample, and state the MLE of $\\sigma^2$.",
      "back": "If $X$ is lognormal then $Y=\\ln X$ is normal $(\\mu,\\sigma^2)$. The lognormal MLEs are just the normal MLEs applied to the **logged** data $y_i=\\ln x_i$:\n$\\hat\\mu = \\dfrac{1}{n}\\sum \\ln x_i$ (mean of the logs), and\n$\\hat\\sigma^2 = \\dfrac{1}{n}\\sum (\\ln x_i - \\hat\\mu)^2$ (variance of the logs, dividing by $n$).",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Fit a **lognormal** by maximum likelihood to the complete sample $\\{e^{1},\\,e^{2},\\,e^{3},\\,e^{4}\\}$ (so the logs are $1,2,3,4$).",
      "back": "The logs are $y_i = 1,2,3,4$.\n$\\hat\\mu = \\dfrac{1+2+3+4}{4}=\\dfrac{10}{4}=2.5$.\nDeviations: $(1-2.5)^2+(2-2.5)^2+(3-2.5)^2+(4-2.5)^2 = 2.25+0.25+0.25+2.25 = 5$.\n$\\hat\\sigma^2 = \\dfrac{5}{4}=1.25$, so $\\hat\\sigma=\\sqrt{1.25}\\approx 1.1180$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Write the **grouped-data likelihood** and use it: $100$ losses fall as $40$ in $(0,100]$, $35$ in $(100,300]$, $25$ in $(300,\\infty)$ under an exponential. Set up the log-likelihood.",
      "back": "Each group $(c_{j-1},c_j]$ with count $n_j$ contributes $[F(c_j)-F(c_{j-1})]^{n_j}$, so $L=\\prod_j [F(c_j)-F(c_{j-1})]^{n_j}$.\nFor the exponential $F(x)=1-e^{-x/\\theta}$:\nGroup probabilities are $p_1=1-e^{-100/\\theta}$, $p_2=e^{-100/\\theta}-e^{-300/\\theta}$, $p_3=e^{-300/\\theta}$.\n$\\ell(\\theta)=40\\ln p_1 + 35\\ln p_2 + 25\\ln p_3$.\nMaximizing numerically over $\\theta$ gives $\\hat\\theta\\approx 210.62$. (There is no closed form for grouped exponential data; solve $\\frac{d\\ell}{d\\theta}=0$ numerically.)",
      "tag": "Maximum likelihood"
    },
    {
      "front": "For **grouped data** where the last interval is open, why must you use $S(c_{j-1})=F(\\infty)-F(c_{j-1})$ for that group, and what does each interior group use?",
      "back": "An open final interval $(c_{j-1},\\infty)$ records only that the loss exceeded $c_{j-1}$, so its probability is the **survival** $S(c_{j-1})=1-F(c_{j-1})$ — formally $F(\\infty)-F(c_{j-1})$.\nEach **interior** group $(c_{j-1},c_j]$ uses the difference $F(c_j)-F(c_{j-1})$.\nThe likelihood is the product of these group probabilities raised to their observed counts. Grouped data is a form of interval censoring.",
      "tag": "Censoring & truncation"
    },
    {
      "front": "How does **left-truncation at $d$** change the likelihood contribution, and why is the denominator $S(d)$?",
      "back": "Truncated data is observed only conditional on $X>d$, so each observation's density must be the **conditional** density given survival past $d$: $\\dfrac{f(x)}{S(d)}$.\nDividing by $S(d)$ renormalizes the density to integrate to $1$ over the observable region $(d,\\infty)$. Without it the likelihood would not account for the unobserved mass below $d$, biasing the estimate. For a deductible $d$ on a policy, every reported loss carries this $\\frac{1}{S(d)}$ factor.",
      "tag": "Censoring & truncation"
    },
    {
      "front": "Two lognormal-type observations have logs above a truncation point — but here, simply explain: why does a **policy limit** create a likelihood factor $S(u)$ rather than $f(u)$?",
      "back": "When a loss reaches the policy limit $u$, the insurer pays $u$ and the **true loss is unknown — only that it was at least $u$** ($X\\ge u$). The probability of that event is the survival $S(u)=\\Pr(X>u)$, so a limit observation contributes $S(u)$ to the likelihood.\nUsing $f(u)$ would wrongly assert the loss equaled exactly $u$; the correct censored contribution integrates the density over all values $\\ge u$, which is $S(u)$.",
      "tag": "Censoring & truncation"
    },
    {
      "front": "What does it mean that the MLE is **consistent**, and why is the MLE generally preferred over method of moments?",
      "back": "**Consistency:** as the sample size $n\\to\\infty$, the MLE $\\hat\\theta$ converges (in probability) to the true parameter $\\theta$. The MLE is also **asymptotically unbiased** and **asymptotically normal**, and achieves the smallest possible asymptotic variance (efficiency) — it attains the Cramér–Rao lower bound.\nMethod of moments and percentile matching are consistent too but are generally **less efficient** (larger variance) because they use only selected moments/quantiles rather than the full data, so MLE is preferred when feasible.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "State the **asymptotic variance** of the MLE in terms of Fisher information, and give it for the exponential.",
      "back": "For a single parameter, $\\widehat{\\operatorname{Var}}(\\hat\\theta)\\approx \\dfrac{1}{I(\\theta)}$ where the Fisher information is $I(\\theta)=-E\\!\\left[\\dfrac{d^2\\ell}{d\\theta^2}\\right]$ (for $n$ observations, $I(\\theta)=n\\,i(\\theta)$ with $i$ the per-observation information).\nFor the exponential, $\\ell=-n\\ln\\theta-\\frac{\\sum x_i}{\\theta}$, $\\frac{d^2\\ell}{d\\theta^2}=\\frac{n}{\\theta^2}-\\frac{2\\sum x_i}{\\theta^3}$; taking $-E[\\cdot]$ with $E[\\sum x_i]=n\\theta$ gives $I(\\theta)=\\frac{n}{\\theta^2}$, so $\\widehat{\\operatorname{Var}}(\\hat\\theta)\\approx \\dfrac{\\theta^2}{n}$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Using the exponential MLE $\\hat\\theta=16$ from a sample of $n=5$ (with asymptotic variance $\\theta^2/n$), give an approximate variance and standard error for $\\hat\\theta$.",
      "back": "Plug the MLE into the asymptotic variance $\\widehat{\\operatorname{Var}}(\\hat\\theta)\\approx \\dfrac{\\hat\\theta^2}{n}=\\dfrac{16^2}{5}=\\dfrac{256}{5}=51.2$.\nStandard error $=\\sqrt{51.2}\\approx 7.1554$.\nAn approximate $95\\%$ confidence interval is $\\hat\\theta \\pm 1.96\\,(7.1554)$, i.e. roughly $16 \\pm 14.0$, or $(1.98,\\,30.02)$ — wide, as expected for $n=5$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Eight lives: deaths at $t=1,3,3,6,8$ (note two deaths at $t=3$), with deaths at $t=1$, $t=6$, $t=8$ single and no censoring before $t=8$. Find the Kaplan–Meier $\\hat S(6)$.",
      "back": "Order distinct death times with risk sets (all $8$ start, no early censoring):\n$t=1$: $r=8$, $s=1$ → $1-\\frac{1}{8}=0.875$.\n$t=3$: $r=7$, $s=2$ → $1-\\frac{2}{7}=\\frac{5}{7}\\approx 0.714286$.\n$t=6$: just before $t=6$ we have $8-1-2=5$ lives, $r=5$, $s=1$ → $1-\\frac{1}{5}=0.8$.\n$\\hat S(6)=0.875\\times 0.714286\\times 0.8 = 0.5$.\nExactly $0.5$, consistent with $4$ of the $8$ lives having died by $t=6$.",
      "tag": "Kaplan-Meier & Nelson-Aalen"
    },
    {
      "front": "Explain why $\\hat S(t)=e^{-\\hat H(t)}$ converts a **Nelson–Aalen** cumulative hazard into a survival estimate, and when the two estimators agree most closely.",
      "back": "The exact relationship between survival and cumulative hazard is $S(t)=e^{-H(t)}$ where $H(t)=-\\ln S(t)=\\int_0^t \\mu(s)\\,ds$. Nelson–Aalen estimates $H$ directly by summing the empirical hazard increments $\\frac{s_i}{r_i}$, then exponentiates the negative to recover $\\hat S$.\nSince each Kaplan–Meier factor $1-\\frac{s_i}{r_i}\\approx e^{-s_i/r_i}$ when $\\frac{s_i}{r_i}$ is small, the two estimators agree closely when the **risk sets are large** relative to the number of deaths (small hazard increments). They diverge most when $\\frac{s_i}{r_i}$ is large (small risk sets late in the data).",
      "tag": "Kaplan-Meier & Nelson-Aalen"
    },
    {
      "front": "Distinguish **empirical** estimation from **parametric** estimation, and say when each is appropriate.",
      "back": "**Empirical (nonparametric)** estimation makes no assumption about the form of $F$ — it reads the distribution, survival, moments, or (via Kaplan–Meier/Nelson–Aalen) the survival curve directly off the data. Best when you have **ample data** and want to avoid model misspecification.\n**Parametric** estimation assumes a family (exponential, Pareto, lognormal, gamma) and fits its parameters by method of moments, percentile matching, or maximum likelihood. Best for **smoothing, extrapolating into the tail**, or working with sparse/censored data where the empirical curve is unstable.",
      "tag": "Empirical estimation"
    }
  ]
}