{
  "deckName": "Exam MAS-I — Statistics & Simulation",
  "examCode": "Exam MAS-I",
  "cards": [
    {
      "front": "Define the **bias** of an estimator $\\hat\\theta$ of a parameter $\\theta$, and state what **unbiased** means.",
      "back": "Bias is the systematic error: $\\text{bias}(\\hat\\theta)=E[\\hat\\theta]-\\theta$.\nAn estimator is **unbiased** when $\\text{bias}(\\hat\\theta)=0$, i.e. $E[\\hat\\theta]=\\theta$ for every value of $\\theta$.\nBias measures whether the estimator is centered on the target on average; it says nothing about spread.",
      "tag": "Point estimation"
    },
    {
      "front": "State the relationship between **mean squared error**, variance, and bias.",
      "back": "$\\text{MSE}(\\hat\\theta)=E\\!\\left[(\\hat\\theta-\\theta)^{2}\\right]=\\text{Var}(\\hat\\theta)+\\big(\\text{bias}(\\hat\\theta)\\big)^{2}$.\nMSE bundles both error sources: how spread out the estimator is (variance) and how off-center it is (bias squared). For an **unbiased** estimator the bias term vanishes, so $\\text{MSE}=\\text{Var}(\\hat\\theta)$.",
      "tag": "Bias, variance & MSE"
    },
    {
      "front": "Define a **consistent** estimator.",
      "back": "$\\hat\\theta_{n}$ is **consistent** for $\\theta$ if it converges in probability to $\\theta$ as the sample size grows: for every $\\varepsilon>0$, $P\\!\\left(|\\hat\\theta_{n}-\\theta|>\\varepsilon\\right)\\to 0$ as $n\\to\\infty$.\nA convenient sufficient condition: if $\\text{bias}(\\hat\\theta_{n})\\to 0$ and $\\text{Var}(\\hat\\theta_{n})\\to 0$, then $\\text{MSE}\\to 0$ and $\\hat\\theta_{n}$ is consistent (mean-square convergence implies convergence in probability).",
      "tag": "Point estimation"
    },
    {
      "front": "What does it mean for one unbiased estimator to be more **efficient** than another, and define **relative efficiency**?",
      "back": "Among unbiased estimators, the more **efficient** one has the smaller variance. The **relative efficiency** of $\\hat\\theta_{1}$ to $\\hat\\theta_{2}$ is $\\frac{\\text{Var}(\\hat\\theta_{2})}{\\text{Var}(\\hat\\theta_{1})}$.\nAn estimator whose variance attains the **Cramér–Rao lower bound** $\\frac{1}{I(\\theta)}$ is called **efficient** (a UMVUE). Smaller variance $\\Rightarrow$ tighter, more reliable estimates.",
      "tag": "Point estimation"
    },
    {
      "front": "For an i.i.d. sample, why is the sample mean $\\bar X$ unbiased for $\\mu$, and what is its variance?",
      "back": "$E[\\bar X]=\\frac{1}{n}\\sum E[X_{i}]=\\frac{1}{n}(n\\mu)=\\mu$, so $\\bar X$ is **unbiased**.\n$\\text{Var}(\\bar X)=\\frac{1}{n^{2}}\\sum \\text{Var}(X_{i})=\\frac{n\\sigma^{2}}{n^{2}}=\\frac{\\sigma^{2}}{n}$.\nThe variance $\\frac{\\sigma^{2}}{n}\\to 0$ as $n\\to\\infty$, so $\\bar X$ is also **consistent** for $\\mu$.",
      "tag": "Point estimation"
    },
    {
      "front": "Why is the divisor $n-1$ used in the sample variance $S^{2}=\\frac{1}{n-1}\\sum(X_{i}-\\bar X)^{2}$?",
      "back": "Dividing by $n-1$ makes $S^{2}$ **unbiased**: $E[S^{2}]=\\sigma^{2}$. Using $n$ instead gives $\\frac{n-1}{n}\\sigma^{2}$, which underestimates $\\sigma^{2}$ because the deviations are taken about the estimated mean $\\bar X$ rather than the true $\\mu$.\nThe loss of one degree of freedom (the constraint $\\sum(X_{i}-\\bar X)=0$) is exactly corrected by the $n-1$ divisor.",
      "tag": "Bias, variance & MSE"
    },
    {
      "front": "An estimator $\\hat\\theta$ has $E[\\hat\\theta]=0.9\\theta$ and $\\text{Var}(\\hat\\theta)=0.04\\theta^{2}$. Find its bias and MSE at $\\theta=10$.",
      "back": "Bias $=E[\\hat\\theta]-\\theta = 0.9\\theta-\\theta=-0.1\\theta$. At $\\theta=10$: bias $=-1$.\nVariance at $\\theta=10$: $0.04(10)^{2}=4$.\n$\\text{MSE}=\\text{Var}+\\text{bias}^{2}=4+(-1)^{2}=4+1=5$.",
      "tag": "Bias, variance & MSE"
    },
    {
      "front": "Two unbiased estimators of $\\mu$ have variances $\\text{Var}(\\hat\\theta_{1})=\\frac{\\sigma^{2}}{n}$ and $\\text{Var}(\\hat\\theta_{2})=\\frac{2\\sigma^{2}}{n+1}$. Which is more efficient for $n=9$, and by how much?",
      "back": "Compare variances (drop the common $\\sigma^{2}$): $\\text{Var}(\\hat\\theta_{1})=\\frac{1}{9}\\approx 0.1111\\,\\sigma^{2}$ and $\\text{Var}(\\hat\\theta_{2})=\\frac{2}{10}=0.2\\,\\sigma^{2}$.\nSince $0.1111<0.2$, $\\hat\\theta_{1}$ is more efficient.\nRelative efficiency of $\\hat\\theta_{1}$ to $\\hat\\theta_{2}$ $=\\frac{\\text{Var}(\\hat\\theta_{2})}{\\text{Var}(\\hat\\theta_{1})}=\\frac{0.2}{0.1111}\\approx 1.80$, i.e. $\\hat\\theta_{2}$ has $1.80$ times the variance of $\\hat\\theta_{1}$ (equivalently $\\hat\\theta_{1}$'s variance is about $44\\%$ smaller).",
      "tag": "Point estimation"
    },
    {
      "front": "Consider estimating $\\sigma^{2}$ by $T=\\frac{1}{n}\\sum(X_{i}-\\bar X)^{2}$ (divisor $n$). Find its bias for normal data.",
      "back": "We know $E\\!\\left[\\sum(X_{i}-\\bar X)^{2}\\right]=(n-1)\\sigma^{2}$.\nSo $E[T]=\\frac{(n-1)\\sigma^{2}}{n}=\\left(1-\\tfrac{1}{n}\\right)\\sigma^{2}$.\nBias $=E[T]-\\sigma^{2}=-\\frac{\\sigma^{2}}{n}<0$ (it underestimates). The bias $\\to 0$ as $n\\to\\infty$, so $T$ is **asymptotically unbiased** and consistent.",
      "tag": "Bias, variance & MSE"
    },
    {
      "front": "Write the general **likelihood** $L(\\theta)$ and **log-likelihood** $\\ell(\\theta)$ for an i.i.d. sample, and state the MLE recipe.",
      "back": "$L(\\theta)=\\prod_{i=1}^{n} f(x_{i};\\theta)$ and $\\ell(\\theta)=\\ln L(\\theta)=\\sum_{i=1}^{n}\\ln f(x_{i};\\theta)$.\nThe **MLE** $\\hat\\theta$ maximizes $L$ (equivalently $\\ell$). Solve the score equation $\\frac{d}{d\\theta}\\ell(\\theta)=0$ and verify it is a maximum (e.g. $\\ell''<0$). Maximizing $\\ell$ is easier than $L$ because the product becomes a sum.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Derive the **MLE of the mean $\\theta$** of an exponential distribution $f(x)=\\frac{1}{\\theta}e^{-x/\\theta}$ from an i.i.d. sample.",
      "back": "$\\ell(\\theta)=\\sum\\left(-\\ln\\theta-\\frac{x_{i}}{\\theta}\\right)=-n\\ln\\theta-\\frac{1}{\\theta}\\sum x_{i}$.\n$\\ell'(\\theta)=-\\frac{n}{\\theta}+\\frac{1}{\\theta^{2}}\\sum x_{i}=0 \\Rightarrow \\frac{\\sum x_{i}}{\\theta^{2}}=\\frac{n}{\\theta}\\Rightarrow \\theta=\\frac{\\sum x_{i}}{n}$.\nSo $\\hat\\theta=\\bar X$, the sample mean.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Derive the **MLE of the Poisson rate $\\lambda$** from an i.i.d. sample $x_{1},\\dots,x_{n}$.",
      "back": "$f(x;\\lambda)=\\frac{e^{-\\lambda}\\lambda^{x}}{x!}$, so $\\ell(\\lambda)=\\sum\\big(-\\lambda + x_{i}\\ln\\lambda - \\ln x_{i}!\\big)=-n\\lambda+\\ln\\lambda\\sum x_{i}-\\sum\\ln x_{i}!$.\n$\\ell'(\\lambda)=-n+\\frac{\\sum x_{i}}{\\lambda}=0\\Rightarrow \\lambda=\\frac{\\sum x_{i}}{n}$.\nSo $\\hat\\lambda=\\bar X$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Derive the **MLE of the Bernoulli/binomial success probability $p$** from $n$ i.i.d. trials with $\\sum x_{i}$ successes.",
      "back": "For Bernoulli, $f(x;p)=p^{x}(1-p)^{1-x}$, so $\\ell(p)=\\left(\\sum x_{i}\\right)\\ln p + \\left(n-\\sum x_{i}\\right)\\ln(1-p)$.\n$\\ell'(p)=\\frac{\\sum x_{i}}{p}-\\frac{n-\\sum x_{i}}{1-p}=0$.\nCross-multiplying: $(1-p)\\sum x_{i}=p\\,(n-\\sum x_{i})\\Rightarrow \\sum x_{i}=np\\Rightarrow \\hat p=\\frac{\\sum x_{i}}{n}=\\bar X$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "A sample of claim counts is $2,0,3,1,4,2$ from a Poisson distribution. Compute the MLE $\\hat\\lambda$.",
      "back": "The Poisson MLE is the sample mean $\\hat\\lambda=\\bar X$.\n$\\sum x_{i}=2+0+3+1+4+2=12$, with $n=6$.\n$\\hat\\lambda=\\frac{12}{6}=2.0$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Exponential lifetimes (mean $\\theta$) are observed as $4,7,11,2,6$. Compute the MLE $\\hat\\theta$.",
      "back": "The exponential MLE for the mean is $\\hat\\theta=\\bar X$.\n$\\sum x_{i}=4+7+11+2+6=30$, with $n=5$.\n$\\hat\\theta=\\frac{30}{5}=6.0$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Define the **Fisher information** $I(\\theta)$ and state how it gives the asymptotic variance of the MLE.",
      "back": "For one observation, $I(\\theta)=-E\\!\\left[\\frac{d^{2}}{d\\theta^{2}}\\ln f(X;\\theta)\\right]=E\\!\\left[\\left(\\frac{d}{d\\theta}\\ln f(X;\\theta)\\right)^{2}\\right]$.\nFor an i.i.d. sample of size $n$ the total information is $nI(\\theta)$, and the MLE is asymptotically normal:\n$\\hat\\theta \\;\\dot\\sim\\; N\\!\\left(\\theta,\\;\\frac{1}{nI(\\theta)}\\right)$.\nLarger information $\\Rightarrow$ smaller asymptotic variance.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "Find the **Fisher information** for the Poisson rate $\\lambda$ (per observation) and the asymptotic variance of $\\hat\\lambda$ for a sample of size $n$.",
      "back": "$\\ln f = -\\lambda + x\\ln\\lambda - \\ln x!$, so $\\frac{d}{d\\lambda}\\ln f = -1+\\frac{x}{\\lambda}$ and $\\frac{d^{2}}{d\\lambda^{2}}\\ln f = -\\frac{x}{\\lambda^{2}}$.\n$I(\\lambda)=-E\\!\\left[-\\frac{X}{\\lambda^{2}}\\right]=\\frac{E[X]}{\\lambda^{2}}=\\frac{\\lambda}{\\lambda^{2}}=\\frac{1}{\\lambda}$.\nSo $\\text{Var}(\\hat\\lambda)\\approx\\frac{1}{nI(\\lambda)}=\\frac{\\lambda}{n}$ — matching $\\text{Var}(\\bar X)$ exactly since $\\hat\\lambda=\\bar X$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "State the **method of moments** and contrast it with maximum likelihood.",
      "back": "Equate the lowest sample moments to the corresponding population moments and solve for the parameters. For one parameter, set $\\bar X = E[X]=g(\\theta)$ and solve for $\\hat\\theta$; for two parameters use $\\bar X$ and $\\frac{1}{n}\\sum X_{i}^{2}$.\nMethod-of-moments estimators are simple and need no optimization, but are generally **less efficient** than MLEs and need not be unbiased. MLEs are typically preferred for their asymptotic efficiency.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "A gamma distribution has mean $\\alpha\\theta$ and variance $\\alpha\\theta^{2}$. A sample has $\\bar X=10$ and sample second moment $\\frac{1}{n}\\sum X_{i}^{2}=140$. Find the method-of-moments $\\hat\\alpha,\\hat\\theta$.",
      "back": "Sample variance (about the mean) $=\\frac{1}{n}\\sum X_{i}^{2}-\\bar X^{2}=140-10^{2}=40$.\nMatch moments: $\\alpha\\theta=10$ and $\\alpha\\theta^{2}=40$.\nDivide: $\\frac{\\alpha\\theta^{2}}{\\alpha\\theta}=\\theta=\\frac{40}{10}=4$, so $\\hat\\theta=4$.\nThen $\\hat\\alpha=\\frac{10}{\\hat\\theta}=\\frac{10}{4}=2.5$.",
      "tag": "Maximum likelihood"
    },
    {
      "front": "State the **large-sample (z) confidence interval** for a population mean $\\mu$ when $\\sigma$ is known, and for unknown $\\sigma$.",
      "back": "Known $\\sigma$: $\\bar x \\pm z_{\\alpha/2}\\,\\frac{\\sigma}{\\sqrt n}$, where $z_{\\alpha/2}$ is the standard-normal critical value ($z_{0.025}=1.96$ for $95\\%$).\nUnknown $\\sigma$ (use the sample $s$): $\\bar x \\pm t_{\\alpha/2,\\,n-1}\\,\\frac{s}{\\sqrt n}$, using the $t$ distribution with $n-1$ degrees of freedom. For large $n$ the $t$ critical value $\\to z$.",
      "tag": "Confidence intervals"
    },
    {
      "front": "Interpret a **$95\\%$ confidence interval** correctly.",
      "back": "It means the *procedure* captures the true parameter $95\\%$ of the time: if we repeated the sampling and interval construction many times, about $95\\%$ of the resulting intervals would contain the fixed true $\\theta$.\nIt does **not** mean there is a $95\\%$ probability that $\\theta$ lies in this particular computed interval — once computed, the interval either contains $\\theta$ or it doesn't. The randomness is in the interval, not in $\\theta$.",
      "tag": "Confidence intervals"
    },
    {
      "front": "A sample of $n=64$ losses has $\\bar x=520$ and known $\\sigma=160$. Build a $95\\%$ confidence interval for the mean loss.",
      "back": "Standard error $=\\frac{\\sigma}{\\sqrt n}=\\frac{160}{\\sqrt{64}}=\\frac{160}{8}=20$.\nMargin $=z_{0.025}\\cdot 20 = 1.96(20)=39.2$.\nCI $=520\\pm 39.2 = (480.8,\\;559.2)$.",
      "tag": "Confidence intervals"
    },
    {
      "front": "A sample of $n=25$ claims has $\\bar x=4{,}200$ and sample sd $s=900$. Build a $95\\%$ confidence interval for the mean (use $t_{0.025,24}=2.064$).",
      "back": "Standard error $=\\frac{s}{\\sqrt n}=\\frac{900}{\\sqrt{25}}=\\frac{900}{5}=180$.\nMargin $=t_{0.025,24}\\cdot 180 = 2.064(180)=371.52$.\nCI $=4200\\pm 371.52 = (3{,}828.48,\\;4{,}571.52)$.",
      "tag": "Confidence intervals"
    },
    {
      "front": "State the large-sample (Wald) **confidence interval for a proportion** $p$.",
      "back": "With $\\hat p=\\frac{x}{n}$, the standard error is $\\sqrt{\\frac{\\hat p(1-\\hat p)}{n}}$, and the interval is\n$\\hat p \\pm z_{\\alpha/2}\\sqrt{\\frac{\\hat p(1-\\hat p)}{n}}$.\nThis normal approximation is reliable when both $n\\hat p$ and $n(1-\\hat p)$ are at least about $5$ to $10$.",
      "tag": "Confidence intervals"
    },
    {
      "front": "Out of $400$ policyholders, $60$ filed a claim. Build a $95\\%$ confidence interval for the claim probability $p$.",
      "back": "$\\hat p=\\frac{60}{400}=0.15$.\nStandard error $=\\sqrt{\\frac{0.15(0.85)}{400}}=\\sqrt{\\frac{0.1275}{400}}=\\sqrt{0.00031875}\\approx 0.017854$.\nMargin $=1.96(0.017854)\\approx 0.0350$.\nCI $=0.15\\pm 0.0350 = (0.1150,\\;0.1850)$.",
      "tag": "Confidence intervals"
    },
    {
      "front": "How many observations $n$ are needed so a $95\\%$ confidence interval for a mean has margin of error at most $5$, given $\\sigma=40$?",
      "back": "Require $z_{0.025}\\frac{\\sigma}{\\sqrt n}\\le 5$, i.e. $1.96\\cdot\\frac{40}{\\sqrt n}\\le 5$.\nSolve: $\\sqrt n \\ge \\frac{1.96(40)}{5}=\\frac{78.4}{5}=15.68$, so $n\\ge 15.68^{2}=245.86$.\nRound up: $n=246$.",
      "tag": "Confidence intervals"
    },
    {
      "front": "Define the **null hypothesis** $H_0$, the **alternative** $H_1$, and the two error types in hypothesis testing.",
      "back": "$H_0$ is the default claim to be tested; $H_1$ is what we conclude if we reject $H_0$.\n**Type I error** (probability $\\alpha$): rejecting $H_0$ when it is true — a false positive. $\\alpha$ is the chosen significance level.\n**Type II error** (probability $\\beta$): failing to reject $H_0$ when $H_1$ is true — a false negative.",
      "tag": "Hypothesis testing"
    },
    {
      "front": "Define the **power** of a test and how it relates to $\\beta$.",
      "back": "Power $=1-\\beta=P(\\text{reject }H_0 \\mid H_1 \\text{ true})$ — the probability of correctly detecting a true effect.\nPower rises with a larger sample size, a larger true effect (further from $H_0$), a larger significance level $\\alpha$, and smaller variance. There is a tradeoff: lowering $\\alpha$ to reduce Type I error raises $\\beta$ and lowers power, all else equal.",
      "tag": "Hypothesis testing"
    },
    {
      "front": "Define the **$p$-value** and state the decision rule.",
      "back": "The $p$-value is the probability, assuming $H_0$ is true, of observing a test statistic at least as extreme as the one actually observed (in the direction of $H_1$).\nDecision rule: **reject $H_0$ if $p\\le\\alpha$**; otherwise fail to reject. A small $p$-value means the data would be unlikely under $H_0$, i.e. evidence against it. The $p$-value is *not* the probability that $H_0$ is true.",
      "tag": "Hypothesis testing"
    },
    {
      "front": "Distinguish a **one-tailed** from a **two-tailed** test and how the critical value changes.",
      "back": "A **two-tailed** test ($H_1:\\mu\\neq\\mu_0$) splits $\\alpha$ across both tails, using $z_{\\alpha/2}$ ($1.96$ for $\\alpha=0.05$).\nA **one-tailed** test ($H_1:\\mu>\\mu_0$ or $\\mu<\\mu_0$) puts all of $\\alpha$ in one tail, using $z_{\\alpha}$ ($1.645$ for $\\alpha=0.05$).\nThe one-tailed test has a less extreme critical value, so it has more power against the specified direction but cannot detect effects in the other direction.",
      "tag": "Hypothesis testing"
    },
    {
      "front": "Test $H_0:\\mu=100$ vs $H_1:\\mu\\neq 100$ at $\\alpha=0.05$ given $n=36$, $\\bar x=106$, known $\\sigma=18$. State the conclusion.",
      "back": "Test statistic $z=\\frac{\\bar x-\\mu_0}{\\sigma/\\sqrt n}=\\frac{106-100}{18/\\sqrt{36}}=\\frac{6}{18/6}=\\frac{6}{3}=2.0$.\nTwo-tailed critical value is $z_{0.025}=1.96$. Since $|2.0|>1.96$, **reject $H_0$**.\nThe two-sided $p$-value $=2\\,P(Z>2.0)=2(0.0228)=0.0455<0.05$, consistent with rejection.",
      "tag": "Hypothesis testing"
    },
    {
      "front": "Test $H_0:\\mu=50$ vs $H_1:\\mu>50$ at $\\alpha=0.05$ with $n=16$, $\\bar x=53.5$, sample $s=8$ (use $t_{0.05,15}=1.753$).",
      "back": "Test statistic $t=\\frac{\\bar x-\\mu_0}{s/\\sqrt n}=\\frac{53.5-50}{8/\\sqrt{16}}=\\frac{3.5}{8/4}=\\frac{3.5}{2}=1.75$.\nOne-tailed critical value $t_{0.05,15}=1.753$. Since $1.75<1.753$, **fail to reject $H_0$** (just barely).\nThe evidence for $\\mu>50$ is not quite significant at the $5\\%$ level.",
      "tag": "Hypothesis testing"
    },
    {
      "front": "Observed counts in 4 categories are $O=(22,18,20,40)$; the model predicts $E=(25,25,25,25)$. Compute the chi-square goodness-of-fit statistic and decide at $\\alpha=0.05$ (critical $\\chi^{2}_{0.05,3}=7.815$).",
      "back": "Statistic $\\chi^{2}=\\sum_{j}\\frac{(O_{j}-E_{j})^{2}}{E_{j}}$:\n$\\chi^{2}=\\frac{(22-25)^{2}}{25}+\\frac{(18-25)^{2}}{25}+\\frac{(20-25)^{2}}{25}+\\frac{(40-25)^{2}}{25}$\n$=\\frac{9}{25}+\\frac{49}{25}+\\frac{25}{25}+\\frac{225}{25}=\\frac{308}{25}=12.32$.\nDegrees of freedom $=(\\text{cells})-1=4-1=3$. Since $12.32>7.815$, **reject $H_0$** — the model fits poorly (the last category is far over-observed).",
      "tag": "Hypothesis testing"
    },
    {
      "front": "Compute the **power** of the test $H_0:\\mu=0$ vs $H_1:\\mu>0$, $\\alpha=0.05$, $n=25$, known $\\sigma=10$, when the true mean is $\\mu=4$.",
      "back": "Reject when $\\bar X > 0 + z_{0.05}\\frac{\\sigma}{\\sqrt n}=1.645\\cdot\\frac{10}{5}=3.29$.\nUnder the true $\\mu=4$, $\\bar X\\sim N\\!\\left(4,\\,\\left(\\tfrac{10}{5}\\right)^{2}=4\\right)$, so $\\text{sd}(\\bar X)=2$.\nPower $=P(\\bar X>3.29\\mid\\mu=4)=P\\!\\left(Z>\\frac{3.29-4}{2}\\right)=P(Z>-0.355)\\approx 0.639$.\nSo the power is about $64\\%$.",
      "tag": "Hypothesis testing"
    },
    {
      "front": "State the **inverse-transform method** for generating a random variate with cdf $F$.",
      "back": "If $U\\sim\\text{Unif}(0,1)$, then $X=F^{-1}(U)$ has cdf $F$.\nProcedure: draw $u$ from $\\text{Unif}(0,1)$, then solve $F(x)=u$ for $x$. This works because $P(X\\le x)=P(F^{-1}(U)\\le x)=P(U\\le F(x))=F(x)$.\nIt requires an invertible (or at least solvable) cdf and uses exactly one uniform per variate.",
      "tag": "Monte Carlo & bootstrap"
    },
    {
      "front": "Derive the **inverse-transform formula for an exponential** with rate $\\lambda$ (mean $\\frac{1}{\\lambda}$).",
      "back": "The cdf is $F(x)=1-e^{-\\lambda x}$. Set $F(x)=u$:\n$1-e^{-\\lambda x}=u \\Rightarrow e^{-\\lambda x}=1-u \\Rightarrow x=-\\frac{1}{\\lambda}\\ln(1-u)$.\nSo $X=-\\frac{1}{\\lambda}\\ln(1-U)$. Since $1-U$ is also $\\text{Unif}(0,1)$, the equivalent simplification $X=-\\frac{1}{\\lambda}\\ln U$ is often used.",
      "tag": "Monte Carlo & bootstrap"
    },
    {
      "front": "Using $X=-\\frac{1}{\\lambda}\\ln(1-U)$ with $\\lambda=0.5$ and the uniform draw $U=0.8$, generate an exponential variate.",
      "back": "$X=-\\frac{1}{0.5}\\ln(1-0.8)=-2\\ln(0.2)$.\n$\\ln(0.2)\\approx -1.609438$, so $X=-2(-1.609438)=3.218876$.\n$X\\approx 3.219$.",
      "tag": "Monte Carlo & bootstrap"
    },
    {
      "front": "Explain how to generate a **discrete** random variate (e.g. a claim count) by inversion using a uniform $U$.",
      "back": "Build the cumulative probabilities $F(0)\\le F(1)\\le\\cdots$ and partition $[0,1)$ into intervals $\\big[F(k-1),F(k)\\big)$. Draw $U\\sim\\text{Unif}(0,1)$ and return the smallest $x$ with $F(x)\\ge U$.\nEquivalently, return $x=k$ when $F(k-1)\\le U < F(k)$. Each value's interval width equals its probability, so the output has the target pmf.",
      "tag": "Monte Carlo & bootstrap"
    },
    {
      "front": "A discrete loss has $P(X=0)=0.5$, $P(X=1)=0.3$, $P(X=2)=0.2$. Using $U=0.65$ and the inversion rule, what value is generated?",
      "back": "Cumulative cutoffs: $F(0)=0.5$, $F(1)=0.8$, $F(2)=1.0$, giving intervals $[0,0.5)\\to 0$, $[0.5,0.8)\\to 1$, $[0.8,1.0)\\to 2$.\n$U=0.65$ falls in $[0.5,0.8)$, so the generated value is $X=1$.",
      "tag": "Monte Carlo & bootstrap"
    },
    {
      "front": "In **Monte Carlo estimation**, how is a quantity $\\mu=E[g(X)]$ estimated and what is the standard error?",
      "back": "Simulate $N$ independent draws and average: $\\hat\\mu=\\frac{1}{N}\\sum_{i=1}^{N} g(X_{i})$.\nBy the LLN $\\hat\\mu\\to\\mu$, and the standard error is $\\text{SE}=\\frac{s}{\\sqrt N}$ where $s$ is the sample sd of the $g(X_{i})$ values.\nAccuracy improves like $\\frac{1}{\\sqrt N}$, so cutting the error in half requires $4\\times$ the runs.",
      "tag": "Monte Carlo & bootstrap"
    },
    {
      "front": "A Monte Carlo simulation of a portfolio loss has per-run standard deviation $s=2{,}000$. How many runs $N$ are needed for a standard error of at most $25$?",
      "back": "Require $\\frac{s}{\\sqrt N}\\le 25$, i.e. $\\frac{2000}{\\sqrt N}\\le 25$.\nSolve: $\\sqrt N\\ge\\frac{2000}{25}=80$, so $N\\ge 80^{2}=6400$.\nThus at least $N=6{,}400$ runs.",
      "tag": "Monte Carlo & bootstrap"
    },
    {
      "front": "Describe the **bootstrap** and what it estimates.",
      "back": "The (nonparametric) bootstrap resamples the observed data **with replacement** to approximate the sampling distribution of a statistic. Draw $B$ resamples each of size $n$ from the original sample, recompute the statistic $\\hat\\theta^{*}_{b}$ on each, and use the spread of the $\\hat\\theta^{*}$ values.\nThe **bootstrap standard error** is the sample sd of the $\\hat\\theta^{*}_{b}$, and a percentile interval uses the empirical quantiles of those replicates. It is useful when an analytic standard error is hard to derive.",
      "tag": "Monte Carlo & bootstrap"
    },
    {
      "front": "Three bootstrap resamples of a sample give statistic values $\\hat\\theta^{*}=12,\\,16,\\,14$. Estimate the bootstrap standard error.",
      "back": "Mean of replicates $=\\frac{12+16+14}{3}=\\frac{42}{3}=14$.\nSquared deviations: $(12-14)^{2}=4$, $(16-14)^{2}=4$, $(14-14)^{2}=0$; sum $=8$.\nBootstrap SE (using divisor $B-1=2$) $=\\sqrt{\\frac{8}{2}}=\\sqrt{4}=2$.",
      "tag": "Monte Carlo & bootstrap"
    },
    {
      "front": "Explain how the bootstrap can estimate the **bias** of an estimator $\\hat\\theta$.",
      "back": "Compute $\\hat\\theta$ on the original data, then on each bootstrap resample obtain $\\hat\\theta^{*}_{b}$. The bootstrap bias estimate is\n$\\widehat{\\text{bias}}=\\bar{\\theta}^{*}-\\hat\\theta$, where $\\bar{\\theta}^{*}=\\frac{1}{B}\\sum_{b}\\hat\\theta^{*}_{b}$.\nThe resampling treats $\\hat\\theta$ as the \"true\" parameter and the resamples as new samples, so the average over-/under-shoot mimics the estimator's bias. A bias-corrected estimate is $\\hat\\theta-\\widehat{\\text{bias}}=2\\hat\\theta-\\bar\\theta^{*}$.",
      "tag": "Monte Carlo & bootstrap"
    }
  ]
}