Exam P — Expectation, Variance & MGFs Flashcards
Expectation and LOTUS, variance and standard deviation, linear-transform properties, moments and shape, moment and probability generating functions, the Markov and Chebyshev inequalities, and the coefficient of variation — with worked Exam P-style calculations.
Browse all 42 cards as a list
- **Expectation** Q: Define the expected value $E[X]$ for a discrete and for a continuous random variable. A: For a discrete $X$ with pmf $p(x)$: $E[X]=\sum_{x} x\,p(x)$. For a continuous $X$ with pdf $f(x)$: $E[X]=\int_{-\infty}^{\infty} x\,f(x)\,dx$. $E[X]$ exists only if the sum/integral converges absolutely.
- **Expectation** Q: State the Law of the Unconscious Statistician (LOTUS) for $E[g(X)]$. A: You do **not** need the distribution of $Y=g(X)$ to find its mean. Use the distribution of $X$ directly. Discrete: $E[g(X)]=\sum_{x} g(x)\,p(x)$. Continuous: $E[g(X)]=\int_{-\infty}^{\infty} g(x)\,f(x)\,dx$.
- **Expectation** Q: Is $E[g(X)]=g(E[X])$ in general? Give the rule. A: No. Equality holds when $g$ is affine, $g(x)=ax+b$; more generally, when $g$ is affine on the support of $X$, or when $X$ is degenerate (constant). For a convex $g$, **Jensen's inequality** gives $E[g(X)]\geq g(E[X])$ (e.g. $E[X^{2}]\geq (E[X])^{2}$), strict when $g$ is strictly convex and $X$ is nondegenerate; for concave $g$ the inequality reverses.
- **Expectation** Q: For a nonnegative continuous random variable, express $E[X]$ using the survival function $S(x)=1-F(x)$. A: $E[X]=\int_{0}^{\infty} S(x)\,dx=\int_{0}^{\infty}\big(1-F(x)\big)\,dx$. For a nonnegative integer-valued $X$: $E[X]=\sum_{k=1}^{\infty}\Pr(X\geq k)$. This 'tail formula' is very efficient for deductible/limit problems.
- **Expectation** Q: $X$ takes values $0,1,2,3$ with probabilities $0.4,0.3,0.2,0.1$. Find $E[X]$ and $E[X^{2}]$. A: $E[X]=0(0.4)+1(0.3)+2(0.2)+3(0.1)=0+0.3+0.4+0.3=1.0$. $E[X^{2}]=0(0.4)+1(0.3)+4(0.2)+9(0.1)=0+0.3+0.8+0.9=2.0$. So $E[X]=1.0$ and $E[X^{2}]=2.0$.
- **Expectation** Q: $X$ has pdf $f(x)=2x$ on $0<x<1$. Find $E[X]$ and $E[1/X]$ (a LOTUS application). A: $E[X]=\int_{0}^{1} x\cdot 2x\,dx=\int_{0}^{1}2x^{2}\,dx=\frac{2}{3}$. $E[1/X]=\int_{0}^{1}\frac{1}{x}\cdot 2x\,dx=\int_{0}^{1}2\,dx=2$. Note $E[1/X]=2\neq 1/E[X]=3/2$: averaging a nonlinear $g$ is not $g$ of the average.
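
As a quick check of the discrete LOTUS sum and the tail formula from the cards above, here is a minimal Python sketch using exact fractions. The helper name `expect` is hypothetical and chosen only for illustration; the pmf is the four-point one from the worked card.

```python
from fractions import Fraction

# pmf from the discrete card above: values 0..3 with probabilities 0.4, 0.3, 0.2, 0.1
pmf = {0: Fraction(4, 10), 1: Fraction(3, 10), 2: Fraction(2, 10), 3: Fraction(1, 10)}

def expect(g, pmf):
    """LOTUS for a discrete variable: E[g(X)] = sum of g(x) * p(x)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda x: x, pmf)               # E[X]
second_moment = expect(lambda x: x * x, pmf)  # E[X^2]

# Tail formula for a nonnegative integer X: E[X] = sum over k >= 1 of Pr(X >= k)
tail_mean = sum(
    sum(p for x, p in pmf.items() if x >= k)
    for k in range(1, max(pmf) + 1)
)

print(mean, second_moment, tail_mean)  # 1 2 1
```

The tail sum reproduces the mean without ever multiplying a value by its probability, which is the reason the formula is convenient for deductible and limit problems.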
- **Variance** Q: Give the computational ('shortcut') formula for $\operatorname{Var}(X)$ and explain why it beats the definition. A: $\operatorname{Var}(X)=E[X^{2}]-(E[X])^{2}$. It avoids re-centering each value about the mean: you only need the first two raw moments. The definitional form is $\operatorname{Var}(X)=E\big[(X-\mu)^{2}\big]$ with $\mu=E[X]$.
- **Variance** Q: A loss $X$ is uniform on $[0,100]$. Compute $E[X]$, $E[X^{2}]$, and $\operatorname{Var}(X)$ from the definitions. A: The pdf is $f(x)=\frac{1}{100}$ on $[0,100]$. $E[X]=\int_{0}^{100}\frac{x}{100}\,dx=\frac{1}{100}\cdot\frac{100^{2}}{2}=50$. $E[X^{2}]=\int_{0}^{100}\frac{x^{2}}{100}\,dx=\frac{1}{100}\cdot\frac{100^{3}}{3}=\frac{10000}{3}$. $\operatorname{Var}(X)=\frac{10000}{3}-50^{2}=\frac{10000}{3}-2500=\frac{2500}{3}\approx 833.3$ $\big(=\frac{(b-a)^{2}}{12}\big)$.
- **Variance** Q: $X\sim\operatorname{Bernoulli}(p)$ with $p=0.25$. Derive $E[X]$ and $\operatorname{Var}(X)$ from first principles. A: $E[X]=1\cdot p+0\cdot(1-p)=p=0.25$. $E[X^{2}]=1^{2}\cdot p+0=p=0.25$ (since $X^{2}=X$ for a $0/1$ variable). $\operatorname{Var}(X)=E[X^{2}]-(E[X])^{2}=p-p^{2}=p(1-p)=0.25\times0.75=0.1875$.
- **Variance** Q: $X\sim\operatorname{Binomial}(n,p)$ with $n=20,p=0.3$. State and apply the mean and variance formulas. A: Mean $E[X]=np=20(0.3)=6$. Variance $\operatorname{Var}(X)=np(1-p)=20(0.3)(0.7)=4.2$. $\operatorname{SD}(X)=\sqrt{4.2}\approx 2.049$.
- **Variance** Q: A discrete $X$ has pmf $p(1)=0.5,\,p(2)=0.3,\,p(4)=0.2$. Find $\operatorname{Var}(X)$. A: $E[X]=1(0.5)+2(0.3)+4(0.2)=0.5+0.6+0.8=1.9$. $E[X^{2}]=1(0.5)+4(0.3)+16(0.2)=0.5+1.2+3.2=4.9$. $\operatorname{Var}(X)=4.9-(1.9)^{2}=4.9-3.61=1.29$.
- **Variance** Q: Prove that $E[(X-c)^{2}]$ is minimized over constants $c$ at $c=E[X]$. A: Expand: $E[(X-c)^{2}]=E[X^{2}]-2cE[X]+c^{2}$. Differentiate in $c$: $-2E[X]+2c=0\Rightarrow c=E[X]$ (second derivative $2>0$, a minimum). The minimum value is $E[X^{2}]-(E[X])^{2}=\operatorname{Var}(X)$. So the mean is the least-squares center, and the variance is that minimal mean-squared deviation.
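
The shortcut formula and the least-squares property can be checked together on the three-point pmf from the card above. This is a small sketch in exact arithmetic; the function name `mean_sq_dev` and the trial centers are illustrative assumptions, not part of the deck.

```python
from fractions import Fraction

# pmf from the card above: p(1)=0.5, p(2)=0.3, p(4)=0.2
pmf = {1: Fraction(5, 10), 2: Fraction(3, 10), 4: Fraction(2, 10)}

def mean_sq_dev(c, pmf):
    """E[(X - c)^2] for a constant c."""
    return sum((x - c) ** 2 * p for x, p in pmf.items())

mu = sum(x * p for x, p in pmf.items())
ex2 = sum(x * x * p for x, p in pmf.items())

var_shortcut = ex2 - mu ** 2            # E[X^2] - (E[X])^2
var_definition = mean_sq_dev(mu, pmf)   # E[(X - mu)^2]

# Any center other than mu gives a strictly larger mean-squared deviation,
# because E[(X - c)^2] = Var(X) + (mu - c)^2.
for c in (Fraction(1), Fraction(3, 2), Fraction(5, 2)):
    assert mean_sq_dev(c, pmf) > var_definition

print(mu, ex2, var_shortcut)  # 19/10 49/10 129/100
```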
- **Linear transforms** Q: How do $E$ and $\operatorname{Var}$ behave under the linear transform $aX+b$? A: $E[aX+b]=a\,E[X]+b$. $\operatorname{Var}(aX+b)=a^{2}\operatorname{Var}(X)$, so $\operatorname{SD}(aX+b)=|a|\,\sigma$. A shift $b$ moves the mean but never changes the spread; only the scale $a$ does, and it enters squared.
- **Linear transforms** Q: $X$ has $E[X]=20$ and $\operatorname{Var}(X)=9$. Let $Y=3X-5$. Find $E[Y]$, $\operatorname{Var}(Y)$, and $\operatorname{SD}(Y)$. A: $E[Y]=3E[X]-5=3(20)-5=55$. $\operatorname{Var}(Y)=3^{2}\operatorname{Var}(X)=9(9)=81$. $\operatorname{SD}(Y)=\sqrt{81}=9=|3|\cdot\sqrt{9}$.
- **Linear transforms** Q: Convert a Fahrenheit temperature $F$ to Celsius via $C=\frac{5}{9}(F-32)$. If the standard deviation of $F$ is $\operatorname{SD}(F)=18$, what is $\operatorname{SD}(C)$? A: Only the multiplicative factor $\frac{5}{9}$ affects spread; the shift $-32\cdot\frac{5}{9}$ does not. $\operatorname{SD}(C)=\big|\frac{5}{9}\big|\cdot\operatorname{SD}(F)=\frac{5}{9}\times 18=10$. So the standard deviation of the Celsius reading is $10$ degrees.
- **Linear transforms** Q: Independent $X,Y$ have $\operatorname{Var}(X)=4,\operatorname{Var}(Y)=9$. Find $\operatorname{Var}(2X-Y+5)$. A: The constant $+5$ does not affect variance. $\operatorname{Var}(2X-Y)=2^{2}\operatorname{Var}(X)+(-1)^{2}\operatorname{Var}(Y)=4(4)+1(9)=16+9=25$. So $\operatorname{Var}(2X-Y+5)=25$. Under independence, variances add for both sums and differences.
- **Linear transforms** Q: $X_{1},\dots,X_{25}$ are i.i.d. with mean $\mu=8$ and variance $\sigma^{2}=4$. Find $E[\bar{X}]$ and $\operatorname{Var}(\bar{X})$ for the sample mean $\bar{X}$. A: $E[\bar{X}]=\mu=8$ (the sample mean is unbiased for $\mu$). $\operatorname{Var}(\bar{X})=\frac{\sigma^{2}}{n}=\frac{4}{25}=0.16$, so $\operatorname{SD}(\bar{X})=\frac{\sigma}{\sqrt{n}}=\frac{2}{5}=0.4$. Averaging shrinks the variance by a factor of $n$ and the standard deviation by a factor of $\sqrt{n}$.
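
The rule $\operatorname{Var}(aX+bY+c)=a^{2}\operatorname{Var}(X)+b^{2}\operatorname{Var}(Y)$ for independent $X,Y$ can be verified by brute-force enumeration. The sketch below assumes two hypothetical two-point pmfs, chosen only because they have variances $4$ and $9$ as in the card above; the card itself does not specify the distributions.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)
# Hypothetical two-point pmfs chosen only to match the card: Var(X)=4, Var(Y)=9
pmf_x = {-2: half, 2: half}
pmf_y = {-3: half, 3: half}

def variance(pmf):
    """Var = E[V^2] - (E[V])^2 for a discrete pmf {value: probability}."""
    mu = sum(v * p for v, p in pmf.items())
    return sum(v * v * p for v, p in pmf.items()) - mu ** 2

# Build the pmf of W = 2X - Y + 5; independence means joint probability = product
pmf_w = {}
for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
    w = 2 * x - y + 5
    pmf_w[w] = pmf_w.get(w, 0) + px * py

print(variance(pmf_x), variance(pmf_y), variance(pmf_w))  # 4 9 25
```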
- **Moments** Q: Define the $n$-th raw moment and the $n$-th central moment of $X$. A: Raw (about the origin): $\mu_{n}'=E[X^{n}]$. Central (about the mean $\mu=E[X]$): $\mu_{n}=E\big[(X-\mu)^{n}\big]$. Note $\mu_{1}'=\mu$, $\mu_{1}=0$, and $\mu_{2}=\operatorname{Var}(X)$.
- **Moments** Q: $X$ has $E[X]=5$ and $\operatorname{Var}(X)=9$. Find $E[X^{2}]$ and $E[(2X-3)^{2}]$. A: $E[X^{2}]=\operatorname{Var}(X)+(E[X])^{2}=9+25=34$. Let $Y=2X-3$: $E[Y]=2(5)-3=7$, $\operatorname{Var}(Y)=2^{2}(9)=36$. $E[Y^{2}]=\operatorname{Var}(Y)+(E[Y])^{2}=36+49=85$. So $E[(2X-3)^{2}]=85$.
- **Moments** Q: $X$ has pdf $f(x)=3x^{2}$ on $0<x<1$. Let $Y=X^{2}$. Use LOTUS to find $E[Y]$ and $\operatorname{Var}(Y)$. A: $E[Y]=E[X^{2}]=\int_{0}^{1}x^{2}\cdot 3x^{2}\,dx=\int_{0}^{1}3x^{4}\,dx=\frac{3}{5}$. $E[Y^{2}]=E[X^{4}]=\int_{0}^{1}x^{4}\cdot 3x^{2}\,dx=\int_{0}^{1}3x^{6}\,dx=\frac{3}{7}$. $\operatorname{Var}(Y)=\frac{3}{7}-\big(\frac{3}{5}\big)^{2}=\frac{3}{7}-\frac{9}{25}=\frac{75-63}{175}=\frac{12}{175}\approx 0.0686$.
- **Moments** Q: $X$ has pdf $f(x)=\frac{2}{x^{3}}$ for $x>1$ (a Pareto-type tail). Find $E[X]$ and explain whether $\operatorname{Var}(X)$ is finite. A: $E[X]=\int_{1}^{\infty}x\cdot\frac{2}{x^{3}}\,dx=\int_{1}^{\infty}2x^{-2}\,dx=\big[-2x^{-1}\big]_{1}^{\infty}=2$. $E[X^{2}]=\int_{1}^{\infty}x^{2}\cdot\frac{2}{x^{3}}\,dx=\int_{1}^{\infty}2x^{-1}\,dx=\big[2\ln x\big]_{1}^{\infty}=\infty$. Since $E[X^{2}]$ diverges, $\operatorname{Var}(X)$ is **infinite**, a hallmark of heavy tails. The MGF does not exist either, because $E[e^{tX}]=\infty$ for every $t>0$.
- **Moments** Q: A risk pays benefit $0$ with probability $0.9$ and a $\operatorname{Uniform}(0,1000)$ amount with probability $0.1$. Find $E[X]$ and $\operatorname{Var}(X)$ of this mixed payment. A: Let $U\sim\operatorname{Uniform}(0,1000)$: $E[U]=500$, $E[U^{2}]=\frac{1000^{2}}{3}=333{,}333.33$. $E[X]=0.9(0)+0.1(500)=50$. $E[X^{2}]=0.9(0)+0.1(333{,}333.33)=33{,}333.33$. $\operatorname{Var}(X)=33{,}333.33-50^{2}=33{,}333.33-2500=30{,}833.33$.
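
The mixed-payment calculation above avoids rounding if the raw moments are kept as fractions. The sketch below is one way to do that; `uniform_raw_moment` is a hypothetical helper name, and it uses the standard result $E[U^{k}]=b^{k}/(k+1)$ for $U\sim\operatorname{Uniform}(0,b)$.

```python
from fractions import Fraction

def uniform_raw_moment(k, b):
    """E[U^k] for U ~ Uniform(0, b): integral of u^k / b over [0, b] = b^k / (k + 1)."""
    return Fraction(b ** k, k + 1)

q = Fraction(1, 10)   # probability of a nonzero payment
b = 1000              # upper end of the uniform payment

# Mixture: X = 0 with prob 0.9 and X = U with prob 0.1, so E[X^k] = q * E[U^k] for k >= 1
ex = q * uniform_raw_moment(1, b)
ex2 = q * uniform_raw_moment(2, b)
var = ex2 - ex ** 2

print(ex, ex2, var)   # 50 100000/3 92500/3
```

The exact variance $92500/3\approx 30{,}833.33$ agrees with the card. Note that the moments of the mixture are mixed, not the variances: weighting $\operatorname{Var}(U)$ by $0.1$ would give the wrong answer.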
- **Shape** Q: Define skewness and state what its sign tells you about a distribution. A: Skewness $\gamma_{1}=\frac{E[(X-\mu)^{3}]}{\sigma^{3}}=\frac{\mu_{3}}{\sigma^{3}}$. $\gamma_{1}>0$: right-skewed (long right tail, mean typically above the median, e.g. the exponential). $\gamma_{1}<0$: left-skewed. $\gamma_{1}=0$ for any symmetric distribution with a finite third moment.
- **Shape** Q: Define kurtosis and excess kurtosis, and give the normal benchmark. A: Kurtosis $\frac{E[(X-\mu)^{4}]}{\sigma^{4}}=\frac{\mu_{4}}{\sigma^{4}}$. Excess kurtosis subtracts $3$. A normal distribution has kurtosis $3$ (excess $0$). Excess $>0$ means heavier tails / sharper peak (leptokurtic); excess $<0$ means lighter tails (platykurtic).
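
To make the two shape definitions concrete, the following sketch computes skewness and excess kurtosis directly from central moments of a discrete pmf. The function name `shape` and the symmetric three-point pmf are assumptions for illustration; the right-skewed pmf is the one from the expectation cards.

```python
import math

def shape(pmf):
    """Return (skewness, excess kurtosis) of a discrete pmf {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())

    def central(n):
        # n-th central moment E[(X - mu)^n]
        return sum((x - mu) ** n * p for x, p in pmf.items())

    sigma = math.sqrt(central(2))
    return central(3) / sigma ** 3, central(4) / sigma ** 4 - 3

right_skewed = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}   # pmf from the expectation card
symmetric = {-1: 0.25, 0: 0.5, 1: 0.25}           # hypothetical symmetric pmf

skew_r, exk_r = shape(right_skewed)
skew_s, exk_s = shape(symmetric)
print(round(skew_r, 4), round(exk_r, 4))  # 0.6 -0.8
print(round(skew_s, 4), round(exk_s, 4))  # 0.0 -1.0
```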
- **Coefficient of variation** Q: Define the coefficient of variation (CV) and state when it is most useful. A: $\operatorname{CV}=\frac{\operatorname{SD}(X)}{E[X]}=\frac{\sigma}{\mu}$ (often as a percentage), defined for $\mu\neq 0$, usually for positive $X$. It is a **scale-free** measure of relative dispersion, letting you compare variability across variables with different units or magnitudes.
- **Coefficient of variation** Q: A loss has $E[X]=2{,}000$ and $\operatorname{Var}(X)=2{,}560{,}000$. Find its coefficient of variation. A: $\operatorname{SD}(X)=\sqrt{2{,}560{,}000}=1{,}600$. $\operatorname{CV}=\frac{\operatorname{SD}(X)}{E[X]}=\frac{1{,}600}{2{,}000}=0.8$ (i.e. $80\%$).
- **Coefficient of variation** Q: What is the coefficient of variation of any exponential distribution, and why is it constant? A: For an exponential with mean $\theta$: $\operatorname{SD}=\theta$ and $E[X]=\theta$, so $\operatorname{CV}=\theta/\theta=1$. The CV is scale-free, and changing $\theta$ is just a rescaling $X\to cX$, which leaves $\operatorname{SD}/\operatorname{mean}$ unchanged. So every exponential has $\operatorname{CV}=1$.
- **Coefficient of variation** Q: An aggregate is $S=X_{1}+X_{2}+X_{3}$ with the $X_{i}$ independent, $E[X_{i}]=100$, $\operatorname{Var}(X_{i})=400$. Find $E[S]$, $\operatorname{Var}(S)$, and $\operatorname{CV}(S)$. A: $E[S]=3\times100=300$. $\operatorname{Var}(S)=3\times400=1200$ (independence: variances add). $\operatorname{SD}(S)=\sqrt{1200}\approx 34.64$, so $\operatorname{CV}(S)=\frac{34.64}{300}\approx 0.115$. The CV of the sum ($\approx0.115$) is smaller than each component's CV ($20/100=0.2$): pooling diversifies.
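
The pooling effect in the aggregate card generalizes: for $n$ independent risks with a common mean and variance, the CV of the sum is the single-risk CV divided by $\sqrt{n}$. A minimal sketch, with the hypothetical helper `cv_of_sum`:

```python
import math

def cv_of_sum(n, mean, var):
    """CV of S = X_1 + ... + X_n for independent X_i sharing the given mean and variance."""
    return math.sqrt(n * var) / (n * mean)

single = cv_of_sum(1, 100, 400)   # one risk: 20 / 100
pooled = cv_of_sum(3, 100, 400)   # three pooled risks, as in the card above
print(single, round(pooled, 4))   # 0.2 0.1155
```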
- **MGF** Q: Define the moment generating function $M_{X}(t)$ and state where it must exist. A: $M_{X}(t)=E\big[e^{tX}\big]$. Discrete: $\sum_{x} e^{tx}p(x)$; continuous: $\int_{-\infty}^{\infty} e^{tx}f(x)\,dx$. It must be finite for $t$ in some open interval containing $0$ for the MGF to be useful. Always $M_{X}(0)=1$.
- **MGF** Q: How do you recover the moments $E[X^{n}]$ from $M_{X}(t)$? A: Differentiate $n$ times and evaluate at $0$: $E[X^{n}]=M_{X}^{(n)}(0)=\frac{d^{n}}{dt^{n}}M_{X}(t)\Big|_{t=0}$. So $E[X]=M_{X}'(0)$ and $E[X^{2}]=M_{X}''(0)$; this is why it 'generates' moments.
- **MGF** Q: Given $M_{X}(t)=e^{3t+8t^{2}}$, identify $E[X]$ and $\operatorname{Var}(X)$. A: This is the normal MGF $\exp(\mu t+\frac{1}{2}\sigma^{2}t^{2})$. Matching exponents: $\mu=3$ and $\frac{1}{2}\sigma^{2}=8\Rightarrow\sigma^{2}=16$. So $E[X]=3$ and $\operatorname{Var}(X)=16$ (with $\operatorname{SD}=4$).
- **MGF** Q: From $M_{X}(t)=(1-2t)^{-3}$ for $t<\frac{1}{2}$, compute $E[X]$ and $\operatorname{Var}(X)$ by differentiation. A: $M_{X}'(t)=6(1-2t)^{-4}\Rightarrow E[X]=M_{X}'(0)=6$. $M_{X}''(t)=48(1-2t)^{-5}\Rightarrow E[X^{2}]=M_{X}''(0)=48$. $\operatorname{Var}(X)=48-6^{2}=48-36=12$. (This is a gamma with $\alpha=3,\theta=2$: mean $\alpha\theta=6$, variance $\alpha\theta^{2}=12$.)
- **MGF** Q: State the MGF transformation rule for $Y=aX+b$ and apply it: if $X$ is exponential with $M_{X}(t)=\frac{1}{1-2t}$, find $M_{Y}(t)$ for $Y=3X+1$. A: Rule: $M_{aX+b}(t)=e^{bt}\,M_{X}(at)$ (the shift gives $e^{bt}$, the scale rescales the argument). With $a=3,b=1$: $M_{Y}(t)=e^{t}\,M_{X}(3t)=e^{t}\cdot\frac{1}{1-2(3t)}=\frac{e^{t}}{1-6t}$, valid for $t<\frac{1}{6}$.
- **MGF** Q: State the uniqueness property of MGFs and why it matters on Exam P. A: If two random variables have MGFs that are equal (and finite) on an open interval around $0$, they have the **same distribution**. So an MGF determines the distribution uniquely. This lets you 'recognize the distribution from its MGF' and read off its mean/variance from the standard form.
- **MGF** Q: Let $X\sim\operatorname{Poisson}(2)$ and $Y\sim\operatorname{Poisson}(3)$ be independent. Use MGFs to identify the distribution of $X+Y$. A: For independent sums MGFs multiply, $M_{X+Y}(t)=M_{X}(t)M_{Y}(t)$. Poisson MGF: $\exp[\lambda(e^{t}-1)]$. $M_{X+Y}(t)=e^{2(e^{t}-1)}\,e^{3(e^{t}-1)}=e^{5(e^{t}-1)}$. By uniqueness this is $\operatorname{Poisson}(5)$. Independent Poissons add: $\lambda=2+3=5$.
- **MGF** Q: Given the MGF $M_{X}(t)=\frac{0.4e^{t}}{1-0.6e^{t}}$ for $e^{t}<\frac{1}{0.6}$, identify the distribution and its mean and variance. A: This matches the geometric MGF $\frac{pe^{t}}{1-qe^{t}}$ with $p=0.4,\,q=0.6$ on support $\{1,2,\dots\}$. By uniqueness $X$ is geometric (trials to first success): $E[X]=\frac{1}{p}=\frac{1}{0.4}=2.5$ and $\operatorname{Var}(X)=\frac{q}{p^{2}}=\frac{0.6}{0.16}=3.75$.
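
The "differentiate at zero" recipe can be sanity-checked numerically when symbolic differentiation is error-prone. The sketch below applies central finite differences to the gamma MGF $(1-2t)^{-3}$ from the card above. The step size `h` is an assumption picked for illustration, and the results are approximations, not exact derivatives.

```python
def mgf(t):
    """MGF from the gamma card above: M(t) = (1 - 2t)^(-3), valid for t < 1/2."""
    return (1 - 2 * t) ** -3

h = 1e-4  # small step (an assumption); trades truncation error against round-off

# Central differences approximate M'(0) and M''(0)
first = (mgf(h) - mgf(-h)) / (2 * h)
second = (mgf(h) - 2 * mgf(0) + mgf(-h)) / h ** 2

mean = first                       # E[X] = M'(0)
variance = second - first ** 2     # Var(X) = M''(0) - (M'(0))^2
print(round(mean, 3), round(variance, 3))  # 6.0 12.0
```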
- **PGF** Q: Define the probability generating function (PGF) $G_{X}(s)$ and say which variables it applies to. A: For a **nonnegative integer-valued** $X$: $G_{X}(s)=E\big[s^{X}\big]=\sum_{k=0}^{\infty} s^{k}\Pr(X=k)$. It converges at least for $|s|\leq 1$, with $G_{X}(1)=1$. For $s>0$ where both are finite, it relates to the MGF by $G_{X}(s)=M_{X}(\ln s)$.
- **PGF** Q: How do you recover $\Pr(X=k)$ and the factorial moments from the PGF $G_{X}(s)$? A: $\Pr(X=k)=\frac{G_{X}^{(k)}(0)}{k!}$ (the $k$-th Taylor coefficient). Factorial moments at $s=1$: $G_{X}'(1)=E[X]$ and $G_{X}''(1)=E[X(X-1)]$, so $\operatorname{Var}(X)=G''_{X}(1)+G'_{X}(1)-[G'_{X}(1)]^{2}$.
- **PGF** Q: A count has PGF $G_{X}(s)=e^{4(s-1)}$. Find $\Pr(X=0)$, $E[X]$, and $\operatorname{Var}(X)$. A: $\Pr(X=0)=G_{X}(0)=e^{4(0-1)}=e^{-4}$. $G_{X}'(s)=4e^{4(s-1)}\Rightarrow E[X]=G_{X}'(1)=4$. $G_{X}''(s)=16e^{4(s-1)}\Rightarrow E[X(X-1)]=G_{X}''(1)=16$. $\operatorname{Var}(X)=16+4-4^{2}=4$. (This is $\operatorname{Poisson}(4)$: mean $=$ variance $=4$.)
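
The PGF identities above can be checked against a Poisson(4) pmf built term by term. This is a rough numerical sketch: the truncation point `K` is an assumption (the neglected tail mass is negligible at this size), and the derivatives at $s=1$ are evaluated as the sums $\sum k\,p_k$ and $\sum k(k-1)\,p_k$ rather than by differentiating.

```python
import math

lam = 4.0
K = 60  # truncation point (an assumption); Poisson(4) mass beyond 60 is negligible

pmf = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(K + 1)]

def pgf(s):
    """Truncated series G(s) = sum of s^k * Pr(X = k)."""
    return sum(p * s ** k for k, p in enumerate(pmf))

mean = sum(k * p for k, p in enumerate(pmf))               # G'(1) = E[X]
fact2 = sum(k * (k - 1) * p for k, p in enumerate(pmf))    # G''(1) = E[X(X-1)]
variance = fact2 + mean - mean ** 2

# Series value versus the closed form e^{4(s-1)} at s = 0.5
print(round(pgf(0.5), 6), round(math.exp(lam * (0.5 - 1)), 6))
print(round(mean, 6), round(fact2, 6), round(variance, 6))  # 4.0 16.0 4.0
```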
- **Inequalities** Q: State Markov's inequality, its key requirement, and apply it: claim sizes are nonnegative with mean $E[X]=400$; bound $\Pr(X\geq 2000)$. A: Markov (requires $X\geq 0$): $\Pr(X\geq a)\leq \frac{E[X]}{a}$ for $a>0$. With $a=2000$: $\Pr(X\geq 2000)\leq \frac{400}{2000}=0.2$. At most $20\%$ of claims can be $2000$ or larger. The bound uses only the mean and is usually loose.
- **Inequalities** Q: State Chebyshev's inequality, then apply it: a variable has $\mu=50$ and $\sigma=5$; bound $\Pr(40<X<60)$. A: Chebyshev: $\Pr\big(|X-\mu|\geq k\sigma\big)\leq \frac{1}{k^{2}}$ (no shape assumption). The interval is $\mu\pm 10=\mu\pm 2\sigma$, so $k=2$: $\Pr(|X-50|\geq 10)\leq \frac{1}{2^{2}}=0.25$. Thus $\Pr(40<X<60)=1-\Pr(|X-50|\geq 10)\geq 0.75$.
- **Inequalities** Q: Contrast what Markov, Chebyshev, and the exact distribution give for $\Pr(X\geq 1200)$ when $X$ is exponential with mean $400$. A: Markov: $\Pr(X\geq1200)\leq 400/1200\approx 0.333$. Chebyshev ($\sigma=400$, $1200=\mu+2\sigma$): the two-sided bound $\Pr(|X-400|\geq 800)\leq 1/4=0.25$, and since $\{X\geq 1200\}\subseteq\{|X-400|\geq 800\}$ this gives $\Pr(X\geq1200)\leq 0.25$. Exact: $\Pr(X\geq1200)=e^{-1200/400}=e^{-3}\approx 0.0498$. The inequalities are valid but loose; exact distributional knowledge is far tighter.
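
The three numbers in the comparison card above can be reproduced in a few lines. The function names `markov_bound` and `chebyshev_bound` are hypothetical; the Chebyshev helper applies the two-sided inequality to an upper tail, so it assumes the threshold lies above the mean.

```python
import math

def markov_bound(mean, a):
    """Markov: Pr(X >= a) <= mean / a, for X >= 0 and a > 0."""
    return min(1.0, mean / a)

def chebyshev_bound(mean, sd, a):
    """Upper-tail bound via Chebyshev: Pr(X >= a) <= (sd / (a - mean))^2, for a > mean."""
    return min(1.0, (sd / (a - mean)) ** 2)

theta, a = 400.0, 1200.0        # exponential mean and threshold from the card above
exact = math.exp(-a / theta)    # exponential survival function Pr(X >= a)

print(round(markov_bound(theta, a), 4),
      round(chebyshev_bound(theta, theta, a), 4),
      round(exact, 4))  # 0.3333 0.25 0.0498
```

Both bounds hold for every distribution with the stated mean (and variance), which is why they sit so far above the exact exponential tail.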