Exam P — Formula Sheet Flashcards

A terse cram/recall deck of every must-know Exam P formula — named distributions (pmf/pdf, support, mean, variance, MGF) plus core probability, moment, conditioning, CLT, order-statistic, transformation, and insurance-layer results.

52 cards · 13 topics · Free · fact-checked · LaTeX math

Browse all 52 cards as a list

Core probability
Probability axioms (Kolmogorov) — three rules.
$\Pr(A)\geq 0$ $\cdot$ $\Pr(S)=1$ $\cdot$ countable additivity: disjoint $A_i\Rightarrow\Pr\!\left(\bigcup_i A_i\right)=\sum_i\Pr(A_i)$.
Core probability
Complement and addition (inclusion–exclusion, 2 events).
$\Pr(A^{c})=1-\Pr(A)$ $\cdot$ $\Pr(A\cup B)=\Pr(A)+\Pr(B)-\Pr(A\cap B)$.
Core probability
Inclusion–exclusion for 3 events.
$\Pr(A\cup B\cup C)=\Pr(A)+\Pr(B)+\Pr(C)-\Pr(A\cap B)-\Pr(A\cap C)-\Pr(B\cap C)+\Pr(A\cap B\cap C)$.
Core probability
Conditional probability, multiplication rule, independence.
$\Pr(A\mid B)=\dfrac{\Pr(A\cap B)}{\Pr(B)}$ $\cdot$ $\Pr(A\cap B)=\Pr(A\mid B)\Pr(B)$ $\cdot$ independent $\Leftrightarrow\Pr(A\cap B)=\Pr(A)\Pr(B)$.
Core probability
Law of total probability and Bayes' theorem (partition $\{B_i\}$).
$\Pr(A)=\sum_i\Pr(A\mid B_i)\Pr(B_i)$ $\cdot$ $\Pr(B_k\mid A)=\dfrac{\Pr(A\mid B_k)\Pr(B_k)}{\sum_i\Pr(A\mid B_i)\Pr(B_i)}$.
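A worked instance of the two formulas on this card, as a minimal sketch. The three-set partition, the priors, and the likelihoods are made-up illustration values, and the helper names `total_probability` and `posterior` are hypothetical.

```python
from fractions import Fraction

# Hypothetical partition {B_1, B_2, B_3}: priors Pr(B_i) must sum to 1.
prior = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
# Hypothetical likelihoods Pr(A | B_i).
likelihood = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 2)]

def total_probability(prior, likelihood):
    """Pr(A) = sum_i Pr(A | B_i) Pr(B_i)."""
    return sum(l * p for l, p in zip(likelihood, prior))

def posterior(k, prior, likelihood):
    """Pr(B_k | A) by Bayes' theorem (k is a 0-based index here)."""
    return likelihood[k] * prior[k] / total_probability(prior, likelihood)

pr_a = total_probability(prior, likelihood)
posteriors = [posterior(k, prior, likelihood) for k in range(len(prior))]
print(pr_a, posteriors)
```

With these numbers $\Pr(A)=21/100$ and the posteriors are $5/21$, $2/7$, $10/21$, which sum to $1$ as they must.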
Discrete dist
Bernoulli$(p)$: pmf, support, mean, variance, MGF.
$p(x)=p^{x}(1-p)^{1-x}$, $x\in\{0,1\}$ $\cdot$ $E[X]=p$ $\cdot$ $\operatorname{Var}(X)=p(1-p)$ $\cdot$ $M_X(t)=q+pe^{t}$, $q=1-p$.
Discrete dist
Binomial$(n,p)$: pmf, support, mean, variance, MGF.
$p(x)=\binom{n}{x}p^{x}q^{\,n-x}$, $x=0,\dots,n$ $\cdot$ $E[X]=np$ $\cdot$ $\operatorname{Var}(X)=npq$ $\cdot$ $M_X(t)=(q+pe^{t})^{n}$, $q=1-p$.
Discrete dist
Geometric$(p)$, trials-to-first-success form: pmf, support, mean, variance, MGF.
$p(x)=q^{x-1}p$, $x=1,2,\dots$ $\cdot$ $E[X]=\dfrac{1}{p}$ $\cdot$ $\operatorname{Var}(X)=\dfrac{q}{p^{2}}$ $\cdot$ $M_X(t)=\dfrac{pe^{t}}{1-qe^{t}}$, $t<-\ln q$, $q=1-p$.
Discrete dist
Geometric — failures-before-first-success form ($Y=X-1$): pmf, support, mean, variance.
$p(y)=q^{y}p$, $y=0,1,2,\dots$ $\cdot$ $E[Y]=\dfrac{q}{p}$ $\cdot$ $\operatorname{Var}(Y)=\dfrac{q}{p^{2}}$ (same variance; mean drops by $1$).
Discrete dist
Negative Binomial$(r,p)$, trials-to-$r$th-success form: pmf, support, mean, variance, MGF.
$p(x)=\binom{x-1}{r-1}p^{r}q^{\,x-r}$, $x=r,r+1,\dots$ $\cdot$ $E[X]=\dfrac{r}{p}$ $\cdot$ $\operatorname{Var}(X)=\dfrac{rq}{p^{2}}$ $\cdot$ $M_X(t)=\left(\dfrac{pe^{t}}{1-qe^{t}}\right)^{r}$, $t<-\ln q$.
Discrete dist
Negative Binomial — failures-before-$r$th-success form ($W=X-r$): pmf, support, mean, variance.
$p(w)=\binom{w+r-1}{w}p^{r}q^{w}$, $w=0,1,\dots$ $\cdot$ $E[W]=\dfrac{rq}{p}$ $\cdot$ $\operatorname{Var}(W)=\dfrac{rq}{p^{2}}$.
Discrete dist
Hypergeometric$(N,K,n)$: pmf, support, mean, variance.
$p(x)=\dfrac{\binom{K}{x}\binom{N-K}{\,n-x}}{\binom{N}{n}}$, $\max(0,\,n-(N-K))\leq x\leq\min(n,K)$ $\cdot$ $E[X]=n\dfrac{K}{N}$ $\cdot$ $\operatorname{Var}(X)=n\dfrac{K}{N}\!\left(1-\dfrac{K}{N}\right)\!\dfrac{N-n}{N-1}$ (last factor = finite-population correction).
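The hypergeometric mean and variance can be checked against the pmf by direct summation. This is a sketch with made-up parameters ($N=20$, $K=7$, $n=5$); the support used is $\max(0,\,n-(N-K))\leq x\leq\min(n,K)$, and `hypergeom_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, K, n):
    """Pr(X = x): x successes in n draws without replacement from N items, K of them successes."""
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))

N, K, n = 20, 7, 5  # hypothetical population size, success count, sample size
support = range(max(0, n - (N - K)), min(n, K) + 1)

total = sum(hypergeom_pmf(x, N, K, n) for x in support)
mean = sum(x * hypergeom_pmf(x, N, K, n) for x in support)
second_moment = sum(x * x * hypergeom_pmf(x, N, K, n) for x in support)
variance = second_moment - mean ** 2

# Closed forms from the card, including the finite-population correction.
closed_mean = Fraction(n * K, N)
closed_variance = n * Fraction(K, N) * (1 - Fraction(K, N)) * Fraction(N - n, N - 1)
print(mean == closed_mean, variance == closed_variance)
```

Exact rational arithmetic is used so the comparison is an equality rather than a floating-point tolerance check.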
Discrete dist
Poisson$(\lambda)$: pmf, support, mean, variance, MGF.
$p(x)=\dfrac{e^{-\lambda}\lambda^{x}}{x!}$, $x=0,1,2,\dots$ $\cdot$ $E[X]=\lambda$ $\cdot$ $\operatorname{Var}(X)=\lambda$ $\cdot$ $M_X(t)=\exp[\lambda(e^{t}-1)]$.
Discrete dist
Poisson facts: factorial moment, Poisson limit of binomial, thinning.
$E[X(X-1)]=\lambda^{2}$ $\cdot$ $\text{Bin}(n,p)\to\text{Poisson}(\lambda)$ as $n\to\infty,\,p\to0$ with $np\to\lambda$ $\cdot$ split with prob $p$: independent $\text{Poisson}(\lambda p)$ and $\text{Poisson}(\lambda(1-p))$.
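The Poisson limit of the binomial can be seen numerically by holding $np$ fixed and letting $n$ grow. The sketch below assumes $np=\lambda=2$ (an illustration value) and compares the two pmfs on $x=0,\dots,10$; `max_gap` is a hypothetical helper.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    return exp(-lam) * lam ** x / factorial(x)

def max_gap(n, lam, x_max=10):
    """Largest pointwise difference between Bin(n, lam/n) and Poisson(lam) on 0..x_max."""
    p = lam / n
    return max(abs(binom_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(x_max + 1))

lam = 2.0  # hypothetical value of n*p, held fixed
gaps = {n: max_gap(n, lam) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    print(n, gap)
```

The gap shrinks as $n$ increases, which is why the approximation is usually quoted for large $n$ and small $p$.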
Discrete dist
Discrete Uniform on $\{1,\dots,n\}$: pmf, mean, variance.
$p(x)=\dfrac{1}{n}$ $\cdot$ $E[X]=\dfrac{n+1}{2}$ $\cdot$ $\operatorname{Var}(X)=\dfrac{n^{2}-1}{12}$.
Continuous dist
Continuous Uniform$(a,b)$: pdf, cdf, mean, variance, MGF.
$f(x)=\dfrac{1}{b-a}$ on $[a,b]$ $\cdot$ $F(x)=\dfrac{x-a}{b-a}$ $\cdot$ $E[X]=\dfrac{a+b}{2}$ $\cdot$ $\operatorname{Var}(X)=\dfrac{(b-a)^{2}}{12}$ $\cdot$ $M_X(t)=\dfrac{e^{tb}-e^{ta}}{t(b-a)}$.
Continuous dist
Exponential (rate $\lambda$, mean $\theta=1/\lambda$): pdf, cdf, survival, mean, variance, MGF.
$f(x)=\lambda e^{-\lambda x}$, $x\geq0$ $\cdot$ $F(x)=1-e^{-\lambda x}$ $\cdot$ $S(x)=e^{-\lambda x}$ $\cdot$ $E[X]=\dfrac{1}{\lambda}=\theta$ $\cdot$ $\operatorname{Var}(X)=\dfrac{1}{\lambda^{2}}=\theta^{2}$ $\cdot$ $M_X(t)=\dfrac{\lambda}{\lambda-t}$, $t<\lambda$.
Continuous dist
Exponential memorylessness and second moment.
$\Pr(X>s+t\mid X>s)=\Pr(X>t)$ $\cdot$ $E[X^{2}]=2\theta^{2}$ $\cdot$ only continuous memoryless distribution.
Continuous dist
Gamma (shape $\alpha$, rate $\lambda$): pdf, mean, variance, MGF.
$f(x)=\dfrac{\lambda^{\alpha}}{\Gamma(\alpha)}x^{\alpha-1}e^{-\lambda x}$, $x>0$ $\cdot$ $E[X]=\dfrac{\alpha}{\lambda}$ $\cdot$ $\operatorname{Var}(X)=\dfrac{\alpha}{\lambda^{2}}$ $\cdot$ $M_X(t)=\left(\dfrac{\lambda}{\lambda-t}\right)^{\alpha}$, $t<\lambda$.
Continuous dist
Gamma facts: gamma function, Erlang, exponential and chi-square links.
$\Gamma(\alpha+1)=\alpha\Gamma(\alpha)$, $\Gamma(n)=(n-1)!$, $\Gamma\!\left(\tfrac12\right)=\sqrt{\pi}$ $\cdot$ $\alpha=1$: exponential $\cdot$ integer $\alpha=n$: sum of $n$ iid $\text{Exp}(\lambda)$ (Erlang) $\cdot$ $\chi^{2}_{k}=\text{Gamma}(\alpha=k/2,\lambda=\tfrac12)$.
Continuous dist
Normal$(\mu,\sigma^{2})$: pdf, mean, variance, MGF, standardization.
$f(x)=\dfrac{1}{\sigma\sqrt{2\pi}}\exp\!\left(-\dfrac{(x-\mu)^{2}}{2\sigma^{2}}\right)$ $\cdot$ $E[X]=\mu$, $\operatorname{Var}(X)=\sigma^{2}$ $\cdot$ $M_X(t)=\exp\!\left(\mu t+\tfrac12\sigma^{2}t^{2}\right)$ $\cdot$ $Z=\dfrac{X-\mu}{\sigma}\sim N(0,1)$.
Continuous dist
Beta$(\alpha,\beta)$ on $[0,1]$: pdf, mean, variance.
$f(x)=\dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1}$, $0<x<1$ $\cdot$ $E[X]=\dfrac{\alpha}{\alpha+\beta}$ $\cdot$ $\operatorname{Var}(X)=\dfrac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}$ $\cdot$ Beta$(1,1)=U(0,1)$.
Continuous dist
Beta function and its gamma identity.
$B(\alpha,\beta)=\int_{0}^{1}x^{\alpha-1}(1-x)^{\beta-1}\,dx=\dfrac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$.
Expectation
Survival, hazard rate, and tail-integral mean (continuous).
$S(x)=1-F(x)$ $\cdot$ $h(x)=\dfrac{f(x)}{S(x)}=-\dfrac{d}{dx}\ln S(x)$, so $S(x)=\exp\!\left(-\int_{0}^{x}h(t)\,dt\right)$ $\cdot$ $E[X]=\int_{0}^{\infty}S(x)\,dx$ ($X\geq0$).
Expectation
Expectation: definition, LOTUS, linearity.
$E[X]=\int x f(x)\,dx$ or $\sum x\,p(x)$ $\cdot$ LOTUS: $E[g(X)]=\int g(x)f(x)\,dx$ $\cdot$ $E[aX+b]=aE[X]+b$ $\cdot$ $E[X+Y]=E[X]+E[Y]$ (always).
Expectation
Variance: shortcut, linear transform, and Jensen.
$\operatorname{Var}(X)=E[X^{2}]-(E[X])^{2}$ $\cdot$ $\operatorname{Var}(aX+b)=a^{2}\operatorname{Var}(X)$, $\operatorname{SD}=|a|\sigma$ $\cdot$ convex $g$: $E[g(X)]\geq g(E[X])$ (e.g. $E[X^{2}]\geq(E[X])^{2}$).
Expectation
Coefficient of variation, skewness, kurtosis.
$\operatorname{CV}=\dfrac{\sigma}{\mu}$ ($=1$ for exponential) $\cdot$ skewness $\gamma_1=\dfrac{E[(X-\mu)^{3}]}{\sigma^{3}}$ $\cdot$ kurtosis $\dfrac{E[(X-\mu)^{4}]}{\sigma^{4}}$ (normal $=3$; excess subtracts $3$).
MGF/PGF
MGF: definition, moment recovery, value at 0, uniqueness, $aX+b$ rule.
$M_X(t)=E[e^{tX}]$ $\cdot$ $E[X^{n}]=M_X^{(n)}(0)$ $\cdot$ $M_X(0)=1$ $\cdot$ MGF (on an interval around $0$) determines the distribution $\cdot$ $M_{aX+b}(t)=e^{bt}M_X(at)$.
MGF/PGF
MGF of independent sums; PGF basics.
Independent: $M_{\sum X_i}(t)=\prod_i M_{X_i}(t)$ $\cdot$ PGF $G_X(s)=E[s^{X}]$ (integer $X\geq0$) $\cdot$ $G_X'(1)=E[X]$, $G_X''(1)=E[X(X-1)]$, $\Pr(X=k)=\dfrac{G_X^{(k)}(0)}{k!}$.
Covariance/correlation
Covariance: definition, shortcut, with itself, under independence.
$\operatorname{Cov}(X,Y)=E[(X-\mu_X)(Y-\mu_Y)]=E[XY]-E[X]E[Y]$ $\cdot$ $\operatorname{Cov}(X,X)=\operatorname{Var}(X)$ $\cdot$ independent $\Rightarrow\operatorname{Cov}=0$ (converse false).
Covariance/correlation
Covariance bilinearity and correlation.
$\operatorname{Cov}(aX+b,\,cY+d)=ac\,\operatorname{Cov}(X,Y)$ $\cdot$ $\operatorname{Cov}\!\left(\sum_i a_iX_i,\sum_j b_jY_j\right)=\sum_{i,j}a_ib_j\operatorname{Cov}(X_i,Y_j)$ $\cdot$ $\rho=\dfrac{\operatorname{Cov}(X,Y)}{\sigma_X\sigma_Y}\in[-1,1]$.
Covariance/correlation
Variance of a sum / difference / linear combination.
$\operatorname{Var}(X\pm Y)=\operatorname{Var}(X)+\operatorname{Var}(Y)\pm2\operatorname{Cov}(X,Y)$ $\cdot$ $\operatorname{Var}\!\left(\sum_i a_iX_i\right)=\sum_i a_i^{2}\operatorname{Var}(X_i)+2\sum_{i<j}a_ia_j\operatorname{Cov}(X_i,X_j)$.
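The linear-combination formula is the same quantity as the full double sum $\sum_{i,j}a_ia_j\operatorname{Cov}(X_i,X_j)$. A small sketch, using a made-up covariance matrix and made-up coefficients, computes it both ways; the function names are hypothetical.

```python
# Hypothetical covariance matrix: cov[i][j] = Cov(X_i, X_j), diagonal = variances.
cov = [
    [4.0, 1.0, -0.5],
    [1.0, 9.0, 2.0],
    [-0.5, 2.0, 1.0],
]
a = [2.0, -1.0, 3.0]  # hypothetical coefficients a_i

def var_linear_combination(a, cov):
    """sum_i a_i^2 Var(X_i) + 2 sum_{i<j} a_i a_j Cov(X_i, X_j)."""
    n = len(a)
    diagonal = sum(a[i] ** 2 * cov[i][i] for i in range(n))
    cross = sum(a[i] * a[j] * cov[i][j] for i in range(n) for j in range(i + 1, n))
    return diagonal + 2 * cross

def var_double_sum(a, cov):
    """The same variance as the double sum over every ordered pair (i, j)."""
    n = len(a)
    return sum(a[i] * a[j] * cov[i][j] for i in range(n) for j in range(n))

print(var_linear_combination(a, cov))                          # 34 - 22 = 12.0
print(var_linear_combination([1.0, -1.0], [[4.0, 1.0], [1.0, 9.0]]))  # Var(X - Y) = 4 + 9 - 2 = 11.0
```

The second call is the difference case: the covariance term enters with a minus sign because $a_1a_2=-1$.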
Joint dist
Joint, marginal, conditional, and independence (continuous).
$f_X(x)=\int f(x,y)\,dy$ $\cdot$ $f_{Y\mid X}(y\mid x)=\dfrac{f(x,y)}{f_X(x)}$ $\cdot$ independent $\Leftrightarrow f(x,y)=f_X(x)f_Y(y)$ on a rectangular support.
Conditioning
Law of total expectation (tower rule) and conditional-mean fact.
$E[X]=E\big[E[X\mid Y]\big]$ $\cdot$ $E[g(Y)X\mid Y]=g(Y)E[X\mid Y]$ $\cdot$ $E[X\mid Y]$ is a random variable (function of $Y$).
Conditioning
Law of total variance (EVVE decomposition).
$\operatorname{Var}(Y)=E\big[\operatorname{Var}(Y\mid X)\big]+\operatorname{Var}\big(E[Y\mid X]\big)$ = mean of conditional variance + variance of conditional mean.
Conditioning
Compound / conditional-Poisson mixing moments.
$S=\sum_{i=1}^{N}X_i$ ($X_i$ iid, independent of $N$): $E[S]=E[N]E[X]$, $\operatorname{Var}(S)=E[N]\operatorname{Var}(X)+(E[X])^{2}\operatorname{Var}(N)$ $\cdot$ $N\mid\Lambda\sim\text{Poisson}(\Lambda)$: $E[N]=E[\Lambda]$, $\operatorname{Var}(N)=E[\Lambda]+\operatorname{Var}(\Lambda)$.
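The compound-sum moments can be confirmed by brute force on a tiny example. The sketch below assumes iid severities independent of the claim count, with made-up pmfs for $N$ and $X$; it builds the exact pmf of $S$ by enumeration and compares its moments with the formulas.

```python
from fractions import Fraction
from itertools import product

# Hypothetical claim-count pmf and severity pmf (X_i iid, independent of N).
n_pmf = {0: Fraction(1, 2), 1: Fraction(3, 10), 2: Fraction(1, 5)}
x_pmf = {1: Fraction(3, 5), 2: Fraction(2, 5)}

def moments(pmf):
    """Return (mean, variance) of a finite pmf given as {value: probability}."""
    mean = sum(v * p for v, p in pmf.items())
    var = sum(v * v * p for v, p in pmf.items()) - mean ** 2
    return mean, var

# Exact pmf of S: condition on N = n, then enumerate every severity combination.
s_pmf = {}
for n, pn in n_pmf.items():
    for combo in product(x_pmf, repeat=n):
        prob = pn
        for x in combo:
            prob *= x_pmf[x]
        s = sum(combo)
        s_pmf[s] = s_pmf.get(s, Fraction(0)) + prob

e_n, var_n = moments(n_pmf)
e_x, var_x = moments(x_pmf)
e_s, var_s = moments(s_pmf)
print(e_s == e_n * e_x, var_s == e_n * var_x + e_x ** 2 * var_n)
```

The variance formula is the law of total variance applied with $N$ as the conditioning variable.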
Sums/CLT
Sum of $n$ iid: $S=\sum X_i$ and sample mean $\bar{X}$.
$E[S]=n\mu$, $\operatorname{Var}(S)=n\sigma^{2}$, $\operatorname{SD}(S)=\sigma\sqrt{n}$ $\cdot$ $E[\bar{X}]=\mu$, $\operatorname{Var}(\bar{X})=\dfrac{\sigma^{2}}{n}$, $\operatorname{SD}(\bar{X})=\dfrac{\sigma}{\sqrt{n}}$.
Sums/CLT
Closure under independent sums (named distributions).
$\sum\text{Bin}(n_i,p)=\text{Bin}(\sum n_i,p)$ $\cdot$ $\sum\text{Poisson}(\lambda_i)=\text{Poisson}(\sum\lambda_i)$ $\cdot$ $\sum N(\mu_i,\sigma_i^{2})=N(\sum\mu_i,\sum\sigma_i^{2})$ $\cdot$ $n$ iid $\text{Exp}(\lambda)=\text{Gamma}(n,\lambda)$.
Sums/CLT
Central Limit Theorem (sum and mean forms).
$S_n\approx N(n\mu,\,n\sigma^{2})$, i.e. $\dfrac{S_n-n\mu}{\sigma\sqrt{n}}\to N(0,1)$ $\cdot$ $\bar{X}\approx N\!\left(\mu,\dfrac{\sigma^{2}}{n}\right)$, i.e. $\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}\to N(0,1)$.
Sums/CLT
Continuity correction (normal approx to integer-valued $S$).
Normal approximant $Y\sim N(E[S],\operatorname{Var}(S))$: $\Pr(S\leq k)\approx\Pr(Y\leq k+0.5)$ $\cdot$ $\Pr(S\geq k)\approx\Pr(Y\geq k-0.5)$ $\cdot$ $\Pr(S=k)\approx\Pr(k-0.5\leq Y\leq k+0.5)$.
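To see what the half-unit shift buys, the sketch below compares an exact binomial cdf with the normal approximation with and without the correction. The parameters ($n=50$, $p=0.4$, $k=22$) are made-up illustration values, and the normal cdf is written with `math.erf`.

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """Exact Pr(S <= k) for S ~ Bin(n, p)."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

def normal_cdf(y, mean, sd):
    """Pr(Y <= y) for Y ~ N(mean, sd^2)."""
    return 0.5 * (1 + erf((y - mean) / (sd * sqrt(2))))

n, p, k = 50, 0.4, 22  # hypothetical values
mean, sd = n * p, sqrt(n * p * (1 - p))

exact = binom_cdf(k, n, p)
corrected = normal_cdf(k + 0.5, mean, sd)   # Pr(S <= k) ~ Pr(Y <= k + 0.5)
uncorrected = normal_cdf(k, mean, sd)       # same approximation without the shift
print(exact, corrected, uncorrected)
```

For these values the corrected approximation lands much closer to the exact probability than the uncorrected one.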
Inequalities
Markov and Chebyshev inequalities.
Markov ($X\geq0$, $a>0$): $\Pr(X\geq a)\leq\dfrac{E[X]}{a}$ $\cdot$ Chebyshev ($k>0$): $\Pr(|X-\mu|\geq k\sigma)\leq\dfrac{1}{k^{2}}$.
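Both inequalities are upper bounds and are often loose. As a sketch, the code below compares them with exact probabilities for an exponential with mean $\theta$ (so $\mu=\sigma=\theta$); the value $\theta=1$ and the function names are assumptions for illustration.

```python
from math import exp

theta = 1.0  # hypothetical exponential mean; mu = sigma = theta

def exact_tail(a):
    """Pr(X >= a) for the exponential, a >= 0."""
    return exp(-a / theta)

def markov_bound(a):
    return theta / a

def exact_deviation(k):
    """Pr(|X - mu| >= k*sigma) = Pr(X >= theta(1+k)) + Pr(X <= theta(1-k))."""
    upper = exp(-(1 + k))
    lower = 1 - exp(-(1 - k)) if k < 1 else 0.0
    return upper + lower

def chebyshev_bound(k):
    return 1 / k ** 2

for a in (0.5, 1.0, 2.0, 5.0):
    print("Markov", a, exact_tail(a), markov_bound(a))
for k in (0.5, 1.0, 2.0, 3.0):
    print("Chebyshev", k, exact_deviation(k), chebyshev_bound(k))
```

A bound above $1$ (for example Markov with $a<E[X]$) is valid but carries no information.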
Inequalities
Standard normal: table use, symmetry, empirical rule.
$\Phi(z)=\Pr(Z\leq z)$, $\Phi(-z)=1-\Phi(z)$ $\cdot$ percentile: $x_p=\mu+z_p\sigma$ $\cdot$ $68\%$ within $\mu\pm\sigma$, $95\%$ within $\mu\pm2\sigma$, $99.7\%$ within $\mu\pm3\sigma$.
Joint dist
Bivariate normal: linear combinations and conditional law.
$aX+bY\sim N\!\left(a\mu_X+b\mu_Y,\;a^{2}\sigma_X^{2}+b^{2}\sigma_Y^{2}+2ab\,\rho\sigma_X\sigma_Y\right)$ $\cdot$ $E[Y\mid X{=}x]=\mu_Y+\rho\dfrac{\sigma_Y}{\sigma_X}(x-\mu_X)$, $\operatorname{Var}(Y\mid X)=\sigma_Y^{2}(1-\rho^{2})$ $\cdot$ $\rho=0\Leftrightarrow$ independent.
Joint dist
Multinomial (trinomial) counts: marginal and cross-covariance.
$X_i\sim\text{Bin}(n,p_i)$: $E[X_i]=np_i$, $\operatorname{Var}(X_i)=np_i(1-p_i)$ $\cdot$ $\operatorname{Cov}(X_i,X_j)=-np_ip_j$, $i\neq j$ (negative).
Transformations
Transformation by Jacobian (1-D, monotone $Y=g(X)$).
$f_Y(y)=f_X\!\big(g^{-1}(y)\big)\left|\dfrac{dx}{dy}\right|$ $\cdot$ non-monotone: sum over all branches $x_i=g^{-1}(y)$ $\cdot$ probability integral transform: $F_X(X)\sim U(0,1)$.
Transformations
Bivariate Jacobian and convolution for $Z=X+Y$.
$f_{U,V}(u,v)=f_{X,Y}\big(x(u,v),y(u,v)\big)|J|$, $J=\dfrac{\partial x}{\partial u}\dfrac{\partial y}{\partial v}-\dfrac{\partial x}{\partial v}\dfrac{\partial y}{\partial u}$ $\cdot$ independent: $f_Z(z)=\int f_X(x)f_Y(z-x)\,dx$.
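The convolution integral can be evaluated numerically and compared with a known answer: for two independent $\text{Exp}(\lambda)$ variables, $Z=X+Y$ is $\text{Gamma}(2,\lambda)$ with pdf $\lambda^{2}ze^{-\lambda z}$. The sketch below uses a midpoint rule; the rate $\lambda=1.5$ and the helper names are assumptions.

```python
from math import exp, isclose

lam = 1.5  # hypothetical rate

def exp_pdf(x):
    return lam * exp(-lam * x) if x >= 0 else 0.0

def convolve_at(z, steps=2000):
    """Midpoint-rule value of f_Z(z) = integral over [0, z] of f_X(x) f_Y(z - x) dx."""
    if z <= 0:
        return 0.0
    h = z / steps
    return h * sum(exp_pdf((i + 0.5) * h) * exp_pdf(z - (i + 0.5) * h) for i in range(steps))

def gamma2_pdf(z):
    """Gamma(shape 2, rate lam) pdf."""
    return lam ** 2 * z * exp(-lam * z)

for z in (0.5, 1.0, 3.0):
    print(z, convolve_at(z), gamma2_pdf(z))
```

The integration range is $[0,z]$ rather than the whole line because both densities vanish for negative arguments.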
Order statistics
Order statistics: pdf of max, min, and $k$th (iid, cdf $F$, pdf $f$).
Max: $f_{(n)}(x)=n[F(x)]^{n-1}f(x)$ $\cdot$ Min: $f_{(1)}(x)=n[1-F(x)]^{n-1}f(x)$ $\cdot$ $k$th: $f_{(k)}(x)=\dfrac{n!}{(k-1)!(n-k)!}[F(x)]^{k-1}[1-F(x)]^{n-k}f(x)$.
Order statistics
Order-statistic cdfs and $U(0,1)$ results.
$F_{(n)}(x)=[F(x)]^{n}$, $F_{(1)}(x)=1-[1-F(x)]^{n}$ $\cdot$ $U(0,1)$: $X_{(k)}\sim\text{Beta}(k,n-k+1)$, $E[X_{(k)}]=\dfrac{k}{n+1}$.
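The result $E[X_{(k)}]=k/(n+1)$ for $U(0,1)$ can be checked by integrating $x$ against the $k$th order-statistic pdf with $F(x)=x$ and $f(x)=1$. This is a numerical sketch with $n=5$ chosen arbitrarily and a plain midpoint rule.

```python
from math import factorial, isclose

def kth_pdf_uniform(x, k, n):
    """pdf of the k-th order statistic of n iid U(0,1): F(x) = x, f(x) = 1."""
    coef = factorial(n) // (factorial(k - 1) * factorial(n - k))
    return coef * x ** (k - 1) * (1 - x) ** (n - k)

def mean_kth_uniform(k, n, steps=20000):
    """Midpoint-rule approximation of the integral of x * f_(k)(x) over [0, 1]."""
    h = 1.0 / steps
    return h * sum(((i + 0.5) * h) * kth_pdf_uniform((i + 0.5) * h, k, n) for i in range(steps))

n = 5  # hypothetical sample size
means = [mean_kth_uniform(k, n) for k in range(1, n + 1)]
for k, m in enumerate(means, start=1):
    print(k, m, k / (n + 1))
```

The expected order statistics split $[0,1]$ into $n+1$ equal parts, which is a handy way to remember the formula.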
Order statistics
Minimum of independent exponentials and which is smallest.
$X_i\sim\text{Exp}(\lambda_i)$ independent: $\min_i X_i\sim\text{Exp}\!\left(\sum_i\lambda_i\right)$ $\cdot$ $\Pr(X_1=\min)=\dfrac{\lambda_1}{\sum_i\lambda_i}$.
Insurance layers
Insurance per-loss payment under deductible $d$: $E[(X-d)_{+}]$.
$E[(X-d)_{+}]=\int_{d}^{\infty}S(x)\,dx=E[X]-E[X\wedge d]$ $\cdot$ exponential mean $\theta$: $E[(X-d)_{+}]=\theta e^{-d/\theta}$.
Insurance layers
Limited expected value $E[X\wedge u]$ and the layer $d$-to-$u$.
$X\wedge u=\min(X,u)$, $E[X\wedge u]=\int_{0}^{u}S(x)\,dx$ $\cdot$ layer cost $=E[X\wedge u]-E[X\wedge d]=\int_{d}^{u}S(x)\,dx$ $\cdot$ coinsurance $\alpha$ scales by $\alpha$.
Insurance layers
Per-loss vs per-payment under deductible $d$.
Per loss: $E[(X-d)_{+}]$ (averages over all losses, including $0$-pay) $\cdot$ per payment: $E[X-d\mid X>d]=\dfrac{E[(X-d)_{+}]}{S(d)}$ (larger).
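The insurance-layer identities can be tied together numerically for an exponential loss. The sketch below assumes mean $\theta=1000$, deductible $d=250$, and limit $u=2000$ (all made-up values) and integrates the survival function with a midpoint rule; `integrate` is a hypothetical helper.

```python
from math import exp, isclose

theta = 1000.0        # hypothetical mean loss
d, u = 250.0, 2000.0  # hypothetical deductible and limit

def survival(x):
    """S(x) for an exponential loss with mean theta."""
    return exp(-x / theta)

def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

limited_d = integrate(survival, 0.0, d)  # E[X ∧ d]
limited_u = integrate(survival, 0.0, u)  # E[X ∧ u]
layer = integrate(survival, d, u)        # cost of the layer from d to u

per_loss = theta * exp(-d / theta)       # E[(X-d)+], closed form for the exponential
per_payment = per_loss / survival(d)     # E[X - d | X > d]
print(limited_d, limited_u, layer, per_loss, per_payment)
```

For the exponential the per-payment value equals $\theta$ regardless of $d$, a consequence of memorylessness; other loss distributions do not share this property.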