{
  "deckName": "Exam P — Common Traps",
  "examCode": "Exam P",
  "cards": [
    {
      "front": "A problem says 'a die is rolled until the first six.' You're handed $E[X]=q/p$. Trap?",
      "back": "**Trap:** two geometric conventions. 'Number of *trials* until the first success' has support $\\{1,2,\\dots\\}$ and mean $E[X]=\\dfrac{1}{p}$. 'Number of *failures* before the first success' has support $\\{0,1,\\dots\\}$ and mean $\\dfrac{q}{p}$. Number-of-rolls means $E[X]=\\dfrac{1}{p}=6$, not $\\dfrac{5/6}{1/6}=5$. Read which one is counted.",
      "tag": "distributions"
    },
    {
      "front": "Both geometric forms have the *same* variance — so does the convention even matter for $\\mathrm{Var}$?",
      "back": "**Trap:** since $Y=X-1$ is just a shift, $\\mathrm{Var}(Y)=\\mathrm{Var}(X)=\\dfrac{q}{p^{2}}$ — identical for both conventions. The convention only changes the *mean* (by exactly $1$) and the *pmf support*. So don't second-guess the variance, but always re-check the mean and which $\\Pr(X=k)$ the problem wants.",
      "tag": "distributions"
    },
    {
      "front": "You compute $\\mathrm{Cov}(X,Y)=0$. Can you now treat $X$ and $Y$ as independent?",
      "back": "**Trap:** No. Independence $\\Rightarrow$ uncorrelated, but **not** the converse. Zero covariance only rules out a *linear* relationship; $X$ and $Y$ can be strongly dependent nonlinearly (e.g. $Y=X^{2}$ with $X$ symmetric about $0$ gives $\\mathrm{Cov}=0$). The lone exception: for a **bivariate normal**, $\\rho=0$ *does* imply independence.",
      "tag": "independence"
    },
    {
      "front": "An exam answer for $\\mathrm{Var}(X+Y)$ reads $\\mathrm{Var}(X)+\\mathrm{Var}(Y)$. When is that wrong?",
      "back": "**Trap:** dropping the covariance term. In general $\\mathrm{Var}(X+Y)=\\mathrm{Var}(X)+\\mathrm{Var}(Y)+2\\,\\mathrm{Cov}(X,Y)$. The cross term vanishes **only** if $X$ and $Y$ are uncorrelated (e.g. independent). If the problem gives a covariance or correlation, it almost certainly wants the $2\\,\\mathrm{Cov}(X,Y)$ term.",
      "tag": "variance-rules"
    },
    {
      "front": "$X-Y$: do you subtract the variances?",
      "back": "**Trap:** variances **add**, the covariance sign flips: $\\mathrm{Var}(X-Y)=\\mathrm{Var}(X)+\\mathrm{Var}(Y)-2\\,\\mathrm{Cov}(X,Y)$. For *independent* $X,Y$ this is $\\mathrm{Var}(X)+\\mathrm{Var}(Y)$ — never $\\mathrm{Var}(X)-\\mathrm{Var}(Y)$. The diff of two independent normals $N(\\mu_1,\\sigma_1^{2})-N(\\mu_2,\\sigma_2^{2})$ is $N(\\mu_1-\\mu_2,\\ \\sigma_1^{2}+\\sigma_2^{2})$.",
      "tag": "variance-rules"
    },
    {
      "front": "You scale a loss by $3$: is $\\mathrm{Var}(3X)=3\\,\\mathrm{Var}(X)$?",
      "back": "**Trap:** No — the constant comes out **squared**: $\\mathrm{Var}(aX)=a^{2}\\,\\mathrm{Var}(X)$, so $\\mathrm{Var}(3X)=9\\,\\mathrm{Var}(X)$. Additive shifts drop out entirely: $\\mathrm{Var}(aX+b)=a^{2}\\,\\mathrm{Var}(X)$. For standard deviation, $\\mathrm{SD}(aX+b)=|a|\\,\\sigma$.",
      "tag": "variance-rules"
    },
    {
      "front": "For the sample mean $\\bar X$ of $n$ iid draws, you write $\\mathrm{Var}(\\bar X)=\\sigma^{2}$. Spot the error.",
      "back": "**Trap:** forgetting the $1/n$. With $\\bar X=\\dfrac{1}{n}\\sum_{i=1}^{n}X_i$, the $1/n$ comes out squared: $\\mathrm{Var}(\\bar X)=\\dfrac{1}{n^{2}}\\cdot n\\sigma^{2}=\\dfrac{\\sigma^{2}}{n}$. The standard error is $\\dfrac{\\sigma}{\\sqrt{n}}$, not $\\sigma$. (The *sum* has variance $n\\sigma^{2}$; don't confuse sum with mean.)",
      "tag": "variance-rules"
    },
    {
      "front": "Approximating a discrete count with the normal: which way does the $\\pm 0.5$ go for $\\Pr(X\\leq k)$ vs $\\Pr(X\\geq k)$?",
      "back": "**Trap:** wrong continuity-correction direction. To *include* the integer $k$, widen the interval toward it: $\\Pr(X\\leq k)\\approx\\Pr(Z\\leq\\frac{k+0.5-\\mu}{\\sigma})$ and $\\Pr(X\\geq k)\\approx\\Pr(Z\\geq\\frac{k-0.5-\\mu}{\\sigma})$. For a strict $\\Pr(X<k)$ use $k-0.5$; for $\\Pr(X>k)$ use $k+0.5$. Only apply this for **integer-valued** variables.",
      "tag": "clt-normal"
    },
    {
      "front": "A 'given it has already lasted $s$' lifetime problem tempts you to use memorylessness. Always valid?",
      "back": "**Trap:** memorylessness holds **only** for the exponential (continuous) and the geometric (discrete). For those, $\\Pr(X>s+t\\mid X>s)=\\Pr(X>t)$. For a Weibull, gamma, Pareto, uniform, etc., you must compute the conditional honestly: $\\Pr(X>s+t\\mid X>s)=\\dfrac{S(s+t)}{S(s)}$.",
      "tag": "distributions"
    },
    {
      "front": "Under deductible $d$, a problem asks for the average payment 'on claims that are paid.' Is that $E[(X-d)_{+}]$?",
      "back": "**Trap:** confusing per-loss with per-payment. **Per loss** averages over *all* losses (including the $0$ payments below $d$): $E[(X-d)_{+}]$. **Per payment** conditions on $X>d$: $E[X-d\\mid X>d]=\\dfrac{E[(X-d)_{+}]}{S(d)}$. 'Paid claims' / 'per payment' is the *larger* conditional figure — divide by $S(d)$.",
      "tag": "loss-models"
    },
    {
      "front": "Given $E[X]$, is $E[g(X)]=g(E[X])$ a safe shortcut (e.g. $E[X^{2}]=(E[X])^{2}$)?",
      "back": "**Trap:** Jensen's inequality. In general $E[g(X)]\\neq g(E[X])$. For a *convex* $g$, $E[g(X)]\\geq g(E[X])$ (e.g. $E[X^{2}]\\geq(E[X])^{2}$, the gap being $\\mathrm{Var}(X)\\geq 0$); for *concave* $g$, the inequality reverses. Equality holds only if $g$ is linear or $X$ is degenerate.",
      "tag": "expectation"
    },
    {
      "front": "A test has $99\\%$ sensitivity and the disease prevalence is $0.1\\%$. A positive test — is the patient probably sick?",
      "back": "**Trap:** base-rate neglect. A high true-positive rate does **not** mean a high $\\Pr(\\text{disease}\\mid +)$ when the prior is tiny. Use Bayes: $\\Pr(D\\mid +)=\\dfrac{\\Pr(+\\mid D)\\,\\Pr(D)}{\\Pr(+\\mid D)\\Pr(D)+\\Pr(+\\mid D^{c})\\Pr(D^{c})}$. With a rare disease the false positives swamp the true ones, so $\\Pr(D\\mid +)$ can stay small.",
      "tag": "probability-rules"
    },
    {
      "front": "$\\Pr(A\\cup B\\cup C)$ — you add the three single probabilities and the three pairwise ones. Done?",
      "back": "**Trap:** double-counting in inclusion–exclusion. You must *subtract* the pairwise overlaps and *add back* the triple: $\\Pr(A\\cup B\\cup C)=\\sum\\Pr(A)-\\sum\\Pr(A\\cap B)+\\Pr(A\\cap B\\cap C)$. Forgetting to subtract overlaps (or omitting the $+\\Pr(A\\cap B\\cap C)$) is the classic Venn-diagram error.",
      "tag": "probability-rules"
    },
    {
      "front": "A pdf is given as $f(x)=c\\,x^{2}$ on $[0,2]$. You start integrating with $c=1$. First step?",
      "back": "**Trap:** assuming the density is already normalized. A pdf must integrate to $1$: solve $\\int_{0}^{2}c\\,x^{2}\\,dx=1\\Rightarrow c\\cdot\\frac{8}{3}=1\\Rightarrow c=\\frac{3}{8}$. **Always find the normalizing constant first** (and confirm $f\\geq 0$ on the support) before computing any probability or moment.",
      "tag": "densities"
    },
    {
      "front": "$X\\sim U(0,1)$, $Y=X^{2}$. You write $f_Y(y)=f_X(\\sqrt{y})$. What's missing?",
      "back": "**Trap:** forgetting the Jacobian. For a monotone $Y=g(X)$, $f_Y(y)=f_X\\!\\big(g^{-1}(y)\\big)\\left|\\dfrac{dx}{dy}\\right|$ — the **absolute value** of the derivative of the inverse. Here $x=\\sqrt{y}$, $\\frac{dx}{dy}=\\frac{1}{2\\sqrt{y}}$, so $f_Y(y)=1\\cdot\\frac{1}{2\\sqrt{y}}$ on $(0,1)$. Omitting $\\left|\\frac{dx}{dy}\\right|$ is the top transformation error.",
      "tag": "transformations"
    },
    {
      "front": "$X_1,\\dots,X_n$ are independent $\\mathrm{Exp}(\\lambda)$. The minimum — is its rate $\\lambda/n$?",
      "back": "**Trap:** the minimum has the **summed** rate, not the averaged one. $\\Pr(\\min>x)=\\prod e^{-\\lambda x}=e^{-n\\lambda x}$, so $\\min\\sim\\mathrm{Exp}(n\\lambda)$ with mean $\\dfrac{1}{n\\lambda}$. The **maximum** is *not* exponential at all: its cdf is $F(x)^{n}=(1-e^{-\\lambda x})^{n}$, and $E[\\max]=\\dfrac{1}{\\lambda}\\sum_{k=1}^{n}\\dfrac{1}{k}$.",
      "tag": "transformations"
    },
    {
      "front": "An estimator question: is $\\frac{1}{n}\\sum (X_i-\\bar X)^2$ an unbiased estimator of $\\sigma^2$?",
      "back": "**Trap:** dividing by $n$ gives the **biased** (MLE) variance estimator. The **unbiased** sample variance divides by $n-1$: $S^{2}=\\dfrac{1}{n-1}\\sum_{i=1}^{n}(X_i-\\bar X)^{2}$, with $E[S^{2}]=\\sigma^{2}$. Dividing by $n$ understates $\\sigma^{2}$ because $\\bar X$ is fitted from the same data (one lost degree of freedom).",
      "tag": "estimation"
    },
    {
      "front": "$E[X]=\\mu$ for the geometric — but which $\\mu$? You see $E[X]=\\frac{1-p}{p}$ in a table and $\\frac{1}{p}$ in your notes.",
      "back": "**Trap:** the same 'geometric' label hides two means. *Trials* form: $E[X]=\\dfrac{1}{p}$, support $\\{1,2,\\dots\\}$. *Failures* form: $E[X]=\\dfrac{1-p}{p}=\\dfrac{q}{p}$, support $\\{0,1,\\dots\\}$. Tables and software differ — anchor to the *support* the problem describes, then pick the matching mean and pmf $q^{x-1}p$ vs $q^{x}p$.",
      "tag": "distributions"
    },
    {
      "front": "$\\mathrm{Cov}(X,Y)=0$ for a bivariate normal pair — independent or not?",
      "back": "**Trap (the exception to the exception):** for the **bivariate normal**, $\\rho=0\\Rightarrow$ independence — this is the one named family where uncorrelated *does* mean independent. But beware: two *marginally* normal variables that aren't *jointly* normal can be uncorrelated yet dependent. The implication needs the full joint normality, not just normal margins.",
      "tag": "independence"
    },
    {
      "front": "Linear combination $aX+bY$ of dependent variables: is $\\mathrm{Var}(aX+bY)=a^{2}\\mathrm{Var}(X)+b^{2}\\mathrm{Var}(Y)$?",
      "back": "**Trap:** missing the cross term *and* its $ab$ factor. The full rule is $\\mathrm{Var}(aX+bY)=a^{2}\\mathrm{Var}(X)+b^{2}\\mathrm{Var}(Y)+2ab\\,\\mathrm{Cov}(X,Y)$. Note the covariance term carries $2ab$ — and if $b$ is negative (a difference), that term is *subtracted*. Drop it only when $\\mathrm{Cov}(X,Y)=0$ — independence is sufficient but not necessary, since uncorrelated dependent variables also have zero covariance.",
      "tag": "variance-rules"
    },
    {
      "front": "A binomial $\\Pr(X\\leq 45)$ with $n=100$, $p=0.5$ via the normal. You plug in $45$ directly. Fix it.",
      "back": "**Trap:** no continuity correction on a discrete count. Here $\\mu=50$, $\\sigma=\\sqrt{100(0.5)(0.5)}=5$. To include $45$, use $45+0.5$: $\\Pr(X\\leq 45)\\approx\\Pr\\!\\big(Z\\leq\\frac{45.5-50}{5}\\big)=\\Pr(Z\\leq-0.9)\\approx 0.1841$. Using $45$ gives the wrong tail.",
      "tag": "clt-normal"
    },
    {
      "front": "Memoryless 'expected additional wait': a bus is exponential with mean $10$, you've waited $7$. Expected remaining wait?",
      "back": "**Trap:** subtracting elapsed time. For an **exponential**, by memorylessness the expected *remaining* wait is still the full mean $E[X-7\\mid X>7]=10$, **not** $10-7=3$. Geometric behaves the same in discrete time. For any *non-memoryless* lifetime this shortcut is invalid.",
      "tag": "distributions"
    },
    {
      "front": "Per-payment expected cost: you compute $E[(X-d)_{+}]$ and report it as 'average payment per claim paid.' Right?",
      "back": "**Trap:** you skipped dividing by $S(d)$. Per-loss $E[(X-d)_{+}]$ already averages in the zeros from losses below $d$. To get **per payment**, condition on a payment: $\\dfrac{E[(X-d)_{+}]}{S(d)}$. For exponential losses this conditional mean is just $\\theta$ (memorylessness), regardless of $d$ — a fast sanity check.",
      "tag": "loss-models"
    },
    {
      "front": "You need $E[1/X]$ and reach for $1/E[X]$. Safe?",
      "back": "**Trap:** Jensen again — $g(x)=1/x$ is convex on $(0,\\infty)$, so $E\\!\\big[\\tfrac{1}{X}\\big]\\geq\\dfrac{1}{E[X]}$, with strict inequality unless $X$ is constant. You must integrate $\\int \\tfrac{1}{x}f(x)\\,dx$. Same warning for $E[\\sqrt{X}]\\neq\\sqrt{E[X]}$ (concave $\\Rightarrow E[\\sqrt X]\\leq\\sqrt{E[X]}$).",
      "tag": "expectation"
    },
    {
      "front": "Bayes: $\\Pr(B\\mid A)$ — can you just reuse $\\Pr(A\\mid B)$ as if it were the same number?",
      "back": "**Trap:** confusing the conditioning direction (the 'prosecutor's fallacy'). $\\Pr(A\\mid B)\\neq\\Pr(B\\mid A)$ in general; they're linked by $\\Pr(B\\mid A)=\\dfrac{\\Pr(A\\mid B)\\,\\Pr(B)}{\\Pr(A)}$. You must weight by the prior $\\Pr(B)$ and normalize by $\\Pr(A)$ (often via total probability). Swapping them ignores the base rate.",
      "tag": "probability-rules"
    },
    {
      "front": "$60\\%$ speak French, $50\\%$ Spanish. What's $\\Pr(\\text{at least one})$ — is it $0.6+0.5=1.1$?",
      "back": "**Trap:** a probability above $1$ flags double counting. $\\Pr(F\\cup S)=\\Pr(F)+\\Pr(S)-\\Pr(F\\cap S)$ — you must subtract the bilingual overlap. The fact that $0.6+0.5>1$ *forces* $\\Pr(F\\cap S)\\geq 0.1$. Use the Bonferroni-style sanity check: $\\Pr(\\text{union})\\leq 1$ always.",
      "tag": "probability-rules"
    },
    {
      "front": "A density $f(x)=k(1-x)$ on $[0,1]$, and you're asked for $\\Pr(X>0.5)$. You integrate with $k=1$. Problem?",
      "back": "**Trap:** any probability you compute is wrong until $f$ is normalized. Solve $\\int_{0}^{1}k(1-x)\\,dx=k\\cdot\\tfrac{1}{2}=1\\Rightarrow k=2$. Then $\\Pr(X>0.5)=\\int_{0.5}^{1}2(1-x)\\,dx=0.25$. Skipping the constant scales every answer by the wrong factor (here, by $2$).",
      "tag": "densities"
    },
    {
      "front": "Bivariate transform $(U,V)=g(X,Y)$: you multiply by the Jacobian of $g$ itself. Correct factor?",
      "back": "**Trap:** wrong Jacobian *and* sign. You need the absolute value of the determinant of the **inverse** map $(x,y)$ in terms of $(u,v)$: $f_{U,V}(u,v)=f_{X,Y}(x,y)\\,|J|$, where $J=\\dfrac{\\partial x}{\\partial u}\\dfrac{\\partial y}{\\partial v}-\\dfrac{\\partial x}{\\partial v}\\dfrac{\\partial y}{\\partial u}$. Take $|J|$, and re-express the support in the new variables.",
      "tag": "transformations"
    },
    {
      "front": "Two independent $\\mathrm{Exp}(\\lambda)$ machines; the system fails when **both** fail. Is the time-to-fail exponential?",
      "back": "**Trap:** 'both fail' is the **maximum**, which is *not* exponential and *not* memoryless. Its cdf is $(1-e^{-\\lambda t})^{2}$ and $E[\\max]=\\dfrac{1}{\\lambda}\\big(1+\\tfrac{1}{2}\\big)=\\dfrac{3}{2\\lambda}$. Only 'system fails when the **first** fails' (a series / minimum) gives an exponential, $\\mathrm{Exp}(2\\lambda)$.",
      "tag": "transformations"
    },
    {
      "front": "You report the population variance formula $\\frac{1}{n}\\sum(X_i-\\bar X)^2$ as the standard 'sample variance.' Bias?",
      "back": "**Trap:** that divisor understates $\\sigma^{2}$: $E\\!\\big[\\tfrac{1}{n}\\sum(X_i-\\bar X)^{2}\\big]=\\dfrac{n-1}{n}\\sigma^{2}<\\sigma^{2}$. Multiply by $\\dfrac{n}{n-1}$ (Bessel's correction) to debias, giving $S^{2}=\\dfrac{1}{n-1}\\sum(X_i-\\bar X)^{2}$. Use $n$ only when the true mean $\\mu$ is *known* (not estimated by $\\bar X$).",
      "tag": "estimation"
    },
    {
      "front": "Standardizing a sum $S_n=\\sum X_i$ for the CLT: do you divide by $\\sigma$ or $\\sigma\\sqrt{n}$?",
      "back": "**Trap:** using the wrong scale. The **sum** has SD $\\sigma\\sqrt{n}$, so $\\dfrac{S_n-n\\mu}{\\sigma\\sqrt{n}}\\approx N(0,1)$. The **mean** has SD $\\sigma/\\sqrt{n}$, so $\\dfrac{\\bar X-\\mu}{\\sigma/\\sqrt{n}}\\approx N(0,1)$. Mixing up sum-scale and mean-scale (e.g. dividing the sum by $\\sigma/\\sqrt n$) is a frequent CLT slip.",
      "tag": "variance-rules"
    },
    {
      "front": "$\\Pr(X=k)$ for a continuous random variable — you compute a positive number. Possible?",
      "back": "**Trap:** for any *continuous* variable, $\\Pr(X=k)=0$ — point masses contribute nothing. Hence $\\Pr(X\\leq k)=\\Pr(X<k)$ and $\\leq$ vs $<$ doesn't matter. This *fails* for **mixed** distributions (e.g. a censored loss $X\\wedge u$ has an atom $\\Pr(X\\geq u)$ at $u$), where the boundary mass is real.",
      "tag": "densities"
    },
    {
      "front": "$\\mathrm{Var}(X+Y+Z)$ for three *pairwise dependent* variables — just sum the three variances?",
      "back": "**Trap:** you owe **three** covariance terms, one per pair: $\\mathrm{Var}(X+Y+Z)=\\mathrm{Var}(X)+\\mathrm{Var}(Y)+\\mathrm{Var}(Z)+2[\\mathrm{Cov}(X,Y)+\\mathrm{Cov}(X,Z)+\\mathrm{Cov}(Y,Z)]$. In general $\\mathrm{Var}\\!\\big(\\sum_i X_i\\big)=\\sum_i\\mathrm{Var}(X_i)+2\\sum_{i<j}\\mathrm{Cov}(X_i,X_j)$.",
      "tag": "variance-rules"
    },
    {
      "front": "A 'memoryless' wording trap: a *used* part's lifetime is uniform on $[0,10]$, $5$ years elapsed. Expected remaining life $=5$?",
      "back": "**Trap:** the uniform is **not** memoryless. Given survival to $5$, the remaining life is uniform on $[0,5]$, so the expected residual is $\\dfrac{5}{2}=2.5$, not $5$. Only exponential (and discrete geometric) lifetimes keep a constant expected residual; everything else needs $E[X-s\\mid X>s]$ computed directly.",
      "tag": "distributions"
    },
    {
      "front": "Order-statistic median of $n=4$ values — you reach for the average of the two middle order statistics' *expectations* as the median's distribution. Trap?",
      "back": "**Trap:** the $k$-th order statistic has its **own** density, not a shortcut from the mean. For iid continuous $X_i$ with cdf $F$, pdf $f$: $f_{X_{(k)}}(x)=\\dfrac{n!}{(k-1)!(n-k)!}F(x)^{k-1}[1-F(x)]^{n-k}f(x)$. The max is the $k=n$ case, the min the $k=1$ case — don't conflate $E[\\text{stat}]$ with the stat itself.",
      "tag": "transformations"
    }
  ]
}