{
  "deckName": "Exam SRM — Linear Regression",
  "examCode": "Exam SRM",
  "cards": [
    {
      "front": "State the **simple linear regression** model and name its four standard assumptions.",
      "back": "Model: $Y = \\beta_0 + \\beta_1 x + \\epsilon$, where $\\beta_0$ is the intercept, $\\beta_1$ the slope, and $\\epsilon$ a random error.\nAssumptions:\n1. **Linearity** — $E[Y\\mid x]=\\beta_0+\\beta_1 x$ is linear in the parameters.\n2. **Independent errors** — the $\\epsilon_i$ are uncorrelated across observations.\n3. **Constant variance (homoscedasticity)** — $\\text{Var}(\\epsilon_i)=\\sigma^{2}$ for all $i$.\n4. **Normal errors** — $\\epsilon_i\\sim N(0,\\sigma^{2})$, needed for the exact $t$- and $F$-inference.",
      "tag": "Simple linear regression"
    },
    {
      "front": "What does each assumption violation look like, and what does it break?",
      "back": "**Non-linearity:** a curved residual-vs-fitted plot — the linear $E[Y\\mid x]$ is wrong, so coefficients are biased.\n**Correlated errors:** patterns/runs in residuals (e.g. time-series autocorrelation) — $SE$s are understated, so $t$/$p$-values are too optimistic.\n**Heteroscedasticity:** a funnel/fan shape in residuals — least-squares stays unbiased but $SE$s and intervals are wrong.\n**Non-normal errors:** mainly hurts small-sample $t$/$F$ exactness; large $n$ relies on the CLT.",
      "tag": "Simple linear regression"
    },
    {
      "front": "Define the sums $S_{xx}$, $S_{yy}$, and $S_{xy}$ used in simple linear regression.",
      "back": "$S_{xx}=\\sum_{i=1}^{n}(x_i-\\bar x)^2 = \\sum x_i^{2} - n\\bar x^{2}$\n$S_{yy}=\\sum_{i=1}^{n}(y_i-\\bar y)^2 = \\sum y_i^{2} - n\\bar y^{2}$\n$S_{xy}=\\sum_{i=1}^{n}(x_i-\\bar x)(y_i-\\bar y) = \\sum x_i y_i - n\\bar x\\,\\bar y$\nThese centered/corrected sums are the building blocks for $\\hat\\beta_1$, $r$, and the ANOVA table.",
      "tag": "Least-squares estimates"
    },
    {
      "front": "State the **least-squares estimators** $\\hat\\beta_1$ and $\\hat\\beta_0$, and explain what \"least squares\" minimizes.",
      "back": "Least squares chooses the line minimizing the residual sum of squares $\\text{RSS}=\\sum_{i=1}^{n}(y_i-\\hat\\beta_0-\\hat\\beta_1 x_i)^2$.\nSetting the partial derivatives to zero gives\n$\\hat\\beta_1 = \\frac{S_{xy}}{S_{xx}}$ and $\\hat\\beta_0 = \\bar y - \\hat\\beta_1\\bar x$.\nThe fitted line always passes through the centroid $(\\bar x,\\bar y)$, and the residuals sum to zero.",
      "tag": "Least-squares estimates"
    },
    {
      "front": "Given $n=10$, $\\sum x=50$, $\\sum y=120$, $\\sum x^{2}=300$, $\\sum xy=700$, find the least-squares line.",
      "back": "$\\bar x = 50/10 = 5$, $\\bar y = 120/10 = 12$.\n$S_{xx}=\\sum x^{2} - n\\bar x^{2}=300 - 10(25)=50$.\n$S_{xy}=\\sum xy - n\\bar x\\bar y = 700 - 10(5)(12)=700-600=100$.\n$\\hat\\beta_1 = \\frac{S_{xy}}{S_{xx}}=\\frac{100}{50}=2$.\n$\\hat\\beta_0 = \\bar y - \\hat\\beta_1\\bar x = 12 - 2(5)=2$.\nFitted line: $\\hat y = 2 + 2x$.",
      "tag": "Least-squares estimates"
    },
    {
      "front": "From the five points $(1,2),(2,4),(3,5),(4,4),(5,5)$, compute the least-squares slope and intercept.",
      "back": "$n=5$, $\\sum x = 15$, $\\sum y = 20$, so $\\bar x = 3$, $\\bar y = 4$.\n$\\sum x^{2}=1+4+9+16+25=55\\Rightarrow S_{xx}=55-5(9)=10$.\n$\\sum xy = 2+8+15+16+25 = 66 \\Rightarrow S_{xy}=66-5(3)(4)=66-60=6$.\n$\\hat\\beta_1 = 6/10 = 0.6$; $\\hat\\beta_0 = 4 - 0.6(3)=4-1.8=2.2$.\nFitted line: $\\hat y = 2.2 + 0.6x$.",
      "tag": "Least-squares estimates"
    },
    {
      "front": "How is the **sample correlation** $r$ related to the slope and to $R^{2}$ in simple linear regression?",
      "back": "$r = \\frac{S_{xy}}{\\sqrt{S_{xx}\\,S_{yy}}}$, which lies in $[-1,1]$ and shares the sign of $\\hat\\beta_1$.\nThe slope and correlation satisfy $\\hat\\beta_1 = r\\,\\frac{s_y}{s_x}=r\\sqrt{\\frac{S_{yy}}{S_{xx}}}$.\nIn **simple** linear regression $R^{2}=r^{2}$ — the coefficient of determination is literally the squared correlation between $x$ and $y$.",
      "tag": "Least-squares estimates"
    },
    {
      "front": "With $S_{xx}=50$, $S_{yy}=240$, $S_{xy}=100$, find $r$ and $R^{2}$ for the simple regression.",
      "back": "$r = \\frac{S_{xy}}{\\sqrt{S_{xx}S_{yy}}}=\\frac{100}{\\sqrt{50\\cdot 240}}=\\frac{100}{\\sqrt{12000}}=\\frac{100}{109.545}\\approx 0.9129$.\n$R^{2}=r^{2}\\approx 0.8333$.\nSo about $83.3\\%$ of the variation in $y$ is explained by the regression on $x$.",
      "tag": "ANOVA & R-squared"
    },
    {
      "front": "State the **ANOVA sum-of-squares decomposition** and define each term.",
      "back": "$\\text{SST}=\\text{SSR}+\\text{SSE}$.\n**SST** (total) $=\\sum(y_i-\\bar y)^2 = S_{yy}$ — total variation in $y$.\n**SSR** (regression/explained) $=\\sum(\\hat y_i-\\bar y)^2$ — variation captured by the model.\n**SSE** (error/residual) $=\\sum(y_i-\\hat y_i)^2 = \\text{RSS}$ — unexplained variation.\nIn SLR, $\\text{SSR}=\\hat\\beta_1 S_{xy}=\\frac{S_{xy}^{2}}{S_{xx}}$ and $\\text{SSE}=S_{yy}-\\frac{S_{xy}^{2}}{S_{xx}}$.",
      "tag": "ANOVA & R-squared"
    },
    {
      "front": "Define the **coefficient of determination** $R^{2}$ and state its range and meaning.",
      "back": "$R^{2}=\\frac{\\text{SSR}}{\\text{SST}}=1-\\frac{\\text{SSE}}{\\text{SST}}$.\nIt is the proportion of total variation in $y$ explained by the model, with $0\\le R^{2}\\le 1$ for a least-squares fit that includes an intercept. $R^{2}=1$ means a perfect fit (SSE $=0$); $R^{2}=0$ means the predictors explain nothing beyond $\\bar y$.\nAdding any predictor never decreases $R^{2}$, so it cannot be used alone to compare models of different sizes; use adjusted $\\bar R^{2}$ for that.",
      "tag": "ANOVA & R-squared"
    },
    {
      "front": "Given $S_{yy}=240$, $S_{xx}=50$, $S_{xy}=100$, build the ANOVA pieces SSR, SSE, and $R^{2}$.",
      "back": "$\\text{SST}=S_{yy}=240$.\n$\\text{SSR}=\\frac{S_{xy}^{2}}{S_{xx}}=\\frac{100^{2}}{50}=\\frac{10000}{50}=200$.\n$\\text{SSE}=\\text{SST}-\\text{SSR}=240-200=40$.\n$R^{2}=\\frac{\\text{SSR}}{\\text{SST}}=\\frac{200}{240}\\approx 0.8333$, matching $r^{2}$.",
      "tag": "ANOVA & R-squared"
    },
    {
      "front": "Define the **residual standard error** $s$ in simple linear regression and state its degrees of freedom.",
      "back": "$s = \\sqrt{\\dfrac{\\text{SSE}}{n-2}}$, the estimate of the error standard deviation $\\sigma$. The $n-2$ degrees of freedom come from estimating two parameters ($\\hat\\beta_0,\\hat\\beta_1$).\nEquivalently $s^{2}=\\frac{\\text{SSE}}{n-2}$ is the unbiased estimator of $\\sigma^{2}$ (also written $\\hat\\sigma^{2}$ or MSE).",
      "tag": "Inference & CIs"
    },
    {
      "front": "With $\\text{SSE}=40$ from $n=10$ observations, find the residual standard error $s$ and the MSE.",
      "back": "Degrees of freedom $=n-2=8$.\nMSE $= s^{2}=\\frac{\\text{SSE}}{n-2}=\\frac{40}{8}=5$.\n$s = \\sqrt{5}\\approx 2.236$.\nThis $s$ is the typical size of a residual and feeds every standard error in the model.",
      "tag": "Inference & CIs"
    },
    {
      "front": "State the standard error of the slope $SE(\\hat\\beta_1)$ and the slope $t$-statistic.",
      "back": "$SE(\\hat\\beta_1)=\\frac{s}{\\sqrt{S_{xx}}}$, where $s=\\sqrt{\\text{SSE}/(n-2)}$.\nTo test $H_0:\\beta_1=0$ vs $H_1:\\beta_1\\neq 0$, use\n$t = \\frac{\\hat\\beta_1}{SE(\\hat\\beta_1)}$, compared to a $t$ distribution with $n-2$ degrees of freedom.\nA large $|t|$ (small $p$-value) is evidence of a nonzero linear association between $x$ and $y$; on its own it does not establish a causal effect.",
      "tag": "Inference & CIs"
    },
    {
      "front": "With $\\hat\\beta_1=2$, $s=\\sqrt{5}\\approx 2.236$, $S_{xx}=50$, $n=10$, test $H_0:\\beta_1=0$.",
      "back": "$SE(\\hat\\beta_1)=\\frac{s}{\\sqrt{S_{xx}}}=\\frac{2.236}{\\sqrt{50}}=\\frac{2.236}{7.0711}\\approx 0.3162$.\n$t = \\frac{\\hat\\beta_1}{SE(\\hat\\beta_1)}=\\frac{2}{0.3162}\\approx 6.32$ on $n-2=8$ df.\nSince $|t|=6.32$ far exceeds $t_{0.025,8}\\approx 2.306$, reject $H_0$: the slope is significantly different from zero.",
      "tag": "Inference & CIs"
    },
    {
      "front": "Construct a $95\\%$ confidence interval for the slope $\\beta_1$ given $\\hat\\beta_1=2$, $SE(\\hat\\beta_1)=0.3162$, $n=10$.",
      "back": "CI: $\\hat\\beta_1 \\pm t_{0.025,\\,n-2}\\,SE(\\hat\\beta_1)$ with $n-2=8$ df, so $t_{0.025,8}\\approx 2.306$.\nMargin $=2.306(0.3162)\\approx 0.729$.\nInterval: $2 \\pm 0.729 = (1.271,\\ 2.729)$.\nBecause the interval excludes $0$, the slope is significant at the $5\\%$ level — consistent with the large $t$-statistic.",
      "tag": "Inference & CIs"
    },
    {
      "front": "State the standard error of the intercept $SE(\\hat\\beta_0)$ in simple linear regression.",
      "back": "$SE(\\hat\\beta_0)=s\\sqrt{\\dfrac{1}{n}+\\dfrac{\\bar x^{2}}{S_{xx}}}$.\nThe intercept is the predicted mean of $y$ at $x=0$; its standard error grows as $\\bar x$ moves away from $0$, since extrapolating the line back to $x=0$ is less certain when the data are centered far from the origin. Test/interval use the same $t_{n-2}$ form: $t=\\frac{\\hat\\beta_0}{SE(\\hat\\beta_0)}$.",
      "tag": "Inference & CIs"
    },
    {
      "front": "State the **overall $F$-test** in regression: hypotheses, statistic, and distribution.",
      "back": "Tests $H_0:\\beta_1=\\dots=\\beta_p=0$ (no predictor helps) versus the alternative that at least one slope is nonzero.\n$F = \\frac{\\text{SSR}/p}{\\text{SSE}/(n-p-1)} = \\frac{\\text{MSR}}{\\text{MSE}}$, with $p$ and $n-p-1$ degrees of freedom.\nIn **simple** regression $p=1$, so $F=t^{2}$ for the slope, and the $F$- and $t$-tests are equivalent.",
      "tag": "ANOVA & R-squared"
    },
    {
      "front": "A simple regression on $n=10$ points has $\\text{SSR}=200$ and $\\text{SSE}=40$. Compute the overall $F$-statistic and relate it to the slope $t$.",
      "back": "$p=1$, error df $=n-p-1=8$.\n$\\text{MSR}=\\frac{\\text{SSR}}{1}=200$; $\\text{MSE}=\\frac{\\text{SSE}}{8}=5$.\n$F = \\frac{\\text{MSR}}{\\text{MSE}}=\\frac{200}{5}=40$ on $(1,8)$ df.\nCheck: the slope $t\\approx 6.32$, and $t^{2}=6.32^{2}\\approx 40$ $=F$, as required in simple regression.",
      "tag": "ANOVA & R-squared"
    },
    {
      "front": "An ANOVA table reports $\\text{SST}=500$ with regression df $=3$, error df $=26$, and $\\text{SSE}=125$. Find $R^{2}$ and the $F$-statistic.",
      "back": "$\\text{SSR}=\\text{SST}-\\text{SSE}=500-125=375$.\n$R^{2}=\\frac{\\text{SSR}}{\\text{SST}}=\\frac{375}{500}=0.75$.\n$\\text{MSR}=\\frac{375}{3}=125$; $\\text{MSE}=\\frac{125}{26}\\approx 4.8077$.\n$F=\\frac{\\text{MSR}}{\\text{MSE}}=\\frac{125}{4.8077}\\approx 26.0$ on $(3,26)$ df — strong evidence the model as a whole is useful.",
      "tag": "ANOVA & R-squared"
    },
    {
      "front": "State the **multiple linear regression** model in scalar and matrix form.",
      "back": "Scalar: $Y_i = \\beta_0 + \\beta_1 x_{i1} + \\dots + \\beta_p x_{ip} + \\epsilon_i$ for $i=1,\\dots,n$.\nMatrix: $\\mathbf{y} = X\\boldsymbol\\beta + \\boldsymbol\\epsilon$, where $X$ is the $n\\times(p+1)$ **design matrix** (a leading column of $1$s for the intercept plus the $p$ predictor columns), $\\boldsymbol\\beta$ is $(p+1)\\times 1$, and $\\boldsymbol\\epsilon\\sim N(\\mathbf 0,\\sigma^{2}I)$.",
      "tag": "Multiple regression"
    },
    {
      "front": "State the **normal equations** and the least-squares solution for multiple regression.",
      "back": "Minimizing $\\text{RSS}=(\\mathbf y - X\\boldsymbol\\beta)^{\\top}(\\mathbf y - X\\boldsymbol\\beta)$ gives the normal equations $X^{\\top}X\\,\\hat{\\boldsymbol\\beta}=X^{\\top}\\mathbf y$.\nIf $X^{\\top}X$ is invertible,\n$\\hat{\\boldsymbol\\beta}=(X^{\\top}X)^{-1}X^{\\top}\\mathbf y$.\nThe fitted values are $\\hat{\\mathbf y}=X\\hat{\\boldsymbol\\beta}=H\\mathbf y$ with hat matrix $H=X(X^{\\top}X)^{-1}X^{\\top}$.",
      "tag": "Multiple regression"
    },
    {
      "front": "How do you interpret a **partial slope** $\\hat\\beta_j$ in multiple regression, and how does it differ from a simple-regression slope?",
      "back": "$\\hat\\beta_j$ is the expected change in $Y$ for a **one-unit increase in $x_j$ holding all other predictors fixed** — a *partial* (ceteris paribus) effect.\nIn simple regression the slope absorbs the effects of any omitted correlated variables, so the same predictor's coefficient can change sign or magnitude once other variables enter the model. This is why correlated predictors (multicollinearity) make individual $\\hat\\beta_j$ unstable.",
      "tag": "Multiple regression"
    },
    {
      "front": "For a two-parameter model (intercept and one slope) the inverse is $(X^{\\top}X)^{-1}=\\begin{pmatrix}0.5&-0.1\\\\-0.1&0.04\\end{pmatrix}$ and $X^{\\top}\\mathbf y=\\begin{pmatrix}40\\\\220\\end{pmatrix}$. Find $\\hat{\\boldsymbol\\beta}$.",
      "back": "$\\hat{\\boldsymbol\\beta}=(X^{\\top}X)^{-1}X^{\\top}\\mathbf y$.\nFirst component: $0.5(40)+(-0.1)(220)=20-22=-2$.\nSecond component: $(-0.1)(40)+0.04(220)=-4+8.8=4.8$.\nSo $\\hat{\\boldsymbol\\beta}=\\begin{pmatrix}-2\\\\4.8\\end{pmatrix}$, i.e. $\\hat\\beta_0=-2$ and $\\hat\\beta_1=4.8$.",
      "tag": "Multiple regression"
    },
    {
      "front": "Define the **adjusted $R^{2}$** ($\\bar R^{2}$) and explain why it is preferred for comparing models.",
      "back": "$\\bar R^{2}=1-\\dfrac{\\text{SSE}/(n-p-1)}{\\text{SST}/(n-1)}$, where $p$ is the number of predictors (excluding intercept).\nUnlike $R^{2}$, it penalizes extra parameters: adding a predictor raises $\\bar R^{2}$ only if it cuts SSE enough to offset the lost degree of freedom. It can decrease — and can even be negative — so it is a fairer basis for comparing models of different sizes.",
      "tag": "Multiple regression"
    },
    {
      "front": "A model with $n=30$, $p=4$ predictors has $\\text{SST}=500$ and $\\text{SSE}=125$. Find $R^{2}$ and adjusted $\\bar R^{2}$.",
      "back": "$R^{2}=1-\\frac{\\text{SSE}}{\\text{SST}}=1-\\frac{125}{500}=0.75$.\n$\\bar R^{2}=1-\\frac{\\text{SSE}/(n-p-1)}{\\text{SST}/(n-1)}=1-\\frac{125/25}{500/29}$.\nNumerator $=125/25=5$; denominator $=500/29\\approx 17.241$.\n$\\bar R^{2}=1-\\frac{5}{17.241}=1-0.290=0.710$.\nThe adjusted value $0.710 < 0.75$, reflecting the cost of the four predictors.",
      "tag": "Multiple regression"
    },
    {
      "front": "Define **multicollinearity** and the **variance inflation factor** $\\text{VIF}_j$.",
      "back": "Multicollinearity is strong linear association among predictors, which inflates the standard errors of the affected coefficients and makes individual $\\hat\\beta_j$ unstable.\n$\\text{VIF}_j=\\frac{1}{1-R_j^{2}}$, where $R_j^{2}$ is the $R^{2}$ from regressing predictor $x_j$ on all the **other** predictors.\nVIF $=1$ means no collinearity; a common rule of thumb flags $\\text{VIF}_j>5$ (or $10$) as problematic.",
      "tag": "Multiple regression"
    },
    {
      "front": "Predictor $x_2$ regressed on the other predictors gives $R_2^{2}=0.90$. Compute $\\text{VIF}_2$ and interpret it.",
      "back": "$\\text{VIF}_2=\\frac{1}{1-R_2^{2}}=\\frac{1}{1-0.90}=\\frac{1}{0.10}=10$.\nThe variance of $\\hat\\beta_2$ is inflated $10\\times$ relative to an uncorrelated predictor, so $SE(\\hat\\beta_2)$ is $\\sqrt{10}\\approx 3.16$ times larger. A VIF of $10$ is at the usual problematic threshold — $x_2$ is highly collinear with the other predictors.",
      "tag": "Multiple regression"
    },
    {
      "front": "Define an **F-test for a subset of coefficients** (partial / nested-model $F$-test) and its formula.",
      "back": "To test whether $q$ of the predictors can be dropped, compare the full model ($\\text{SSE}_F$, df $n-p-1$) with the reduced model ($\\text{SSE}_R$):\n$F=\\dfrac{(\\text{SSE}_R-\\text{SSE}_F)/q}{\\text{SSE}_F/(n-p-1)}$, with $q$ and $n-p-1$ degrees of freedom.\nA large $F$ means the dropped predictors together explain a significant amount of variation, so they should be kept.",
      "tag": "Multiple regression"
    },
    {
      "front": "A full model ($p=5$, $n=40$) has $\\text{SSE}_F=200$; dropping $2$ predictors gives $\\text{SSE}_R=260$. Test whether the two can be removed.",
      "back": "$q=2$, error df $=n-p-1=40-5-1=34$.\n$F=\\frac{(\\text{SSE}_R-\\text{SSE}_F)/q}{\\text{SSE}_F/(n-p-1)}=\\frac{(260-200)/2}{200/34}=\\frac{30}{5.882}\\approx 5.10$ on $(2,34)$ df.\nSince $5.10$ exceeds $F_{0.05;2,34}\\approx 3.28$, reject the reduced model: the two predictors jointly contribute significantly and should be retained.",
      "tag": "Multiple regression"
    },
    {
      "front": "How do you encode a categorical predictor with $k$ levels using **dummy (indicator) variables**?",
      "back": "Use $k-1$ indicator variables, each $0/1$, and leave one level out as the **baseline (reference)**.\nThe intercept then represents the baseline level's mean; each dummy's coefficient is the **difference** in mean response between that level and the baseline. Including all $k$ dummies plus an intercept would make $X^{\\top}X$ singular (the \"dummy-variable trap\").",
      "tag": "Prediction & dummies"
    },
    {
      "front": "A fitted model is $\\hat y = 30 + 5x + 8 D_B + (-4) D_C$, where $D_B,D_C$ are dummies for groups B and C (A is baseline). Predict $y$ for group C with $x=10$.",
      "back": "For group C: $D_B=0$, $D_C=1$.\n$\\hat y = 30 + 5(10) + 8(0) + (-4)(1) = 30 + 50 + 0 - 4 = 76$.\nInterpretation: group C's mean is $4$ units **below** baseline group A at the same $x$, while group B's would be $8$ units above. The slope $5$ on $x$ is common to all three groups (parallel lines).",
      "tag": "Prediction & dummies"
    },
    {
      "front": "What does an **interaction term** $x\\cdot D$ do in a regression, and how does it change the interpretation?",
      "back": "An interaction lets the **slope** on $x$ depend on the group. With $E[Y\\mid x,D] = \\beta_0 + \\beta_1 x + \\beta_2 D + \\beta_3 (x D)$, the line for $D=0$ has slope $\\beta_1$, while for $D=1$ it has slope $\\beta_1+\\beta_3$.\nSo $\\beta_3$ is the **difference in slopes** between groups; without it, the model forces parallel lines (equal slopes, only different intercepts).",
      "tag": "Prediction & dummies"
    },
    {
      "front": "With $\\hat y = 12 + 3x + 6D + 2(xD)$ ($D$ a $0/1$ group dummy), give the fitted line for each group and predict $y$ at $x=5$, $D=1$.",
      "back": "Group $D=0$: $\\hat y = 12 + 3x$ (intercept $12$, slope $3$).\nGroup $D=1$: $\\hat y = (12+6) + (3+2)x = 18 + 5x$ (intercept $18$, slope $5$).\nAt $x=5$, $D=1$: $\\hat y = 18 + 5(5) = 18 + 25 = 43$.\nThe interaction makes the $D=1$ group both start higher and rise faster.",
      "tag": "Prediction & dummies"
    },
    {
      "front": "Distinguish a **confidence interval for the mean response** from a **prediction interval for a new observation**.",
      "back": "Both center on $\\hat y_0 = \\hat\\beta_0 + \\hat\\beta_1 x_0$, but answer different questions:\n**CI for the mean** $E[Y\\mid x_0]$ — captures uncertainty in the fitted line only.\n**Prediction interval** for a single new $Y$ at $x_0$ — adds the irreducible error variance $\\sigma^{2}$ on top, so it is always **wider**.\nThe prediction-interval standard error has an extra \"$1+$\" term inside the radical.",
      "tag": "Prediction & dummies"
    },
    {
      "front": "State the standard error formulas for the mean response and for a new prediction at $x_0$ in simple regression.",
      "back": "Mean response: $SE(\\hat y_0)=s\\sqrt{\\dfrac{1}{n}+\\dfrac{(x_0-\\bar x)^2}{S_{xx}}}$.\nNew prediction: $SE_{\\text{pred}}=s\\sqrt{1+\\dfrac{1}{n}+\\dfrac{(x_0-\\bar x)^2}{S_{xx}}}$.\nBoth intervals use $\\hat y_0\\pm t_{\\alpha/2,\\,n-2}\\cdot SE$. The width is smallest at $x_0=\\bar x$ and widens as $x_0$ moves away from the data center.",
      "tag": "Prediction & dummies"
    },
    {
      "front": "Using $\\hat y = 2 + 2x$, $s=\\sqrt 5\\approx 2.236$, $n=10$, $\\bar x=5$, $S_{xx}=50$, build a $95\\%$ CI for the mean response at $x_0=6$.",
      "back": "$\\hat y_0 = 2 + 2(6)=14$. With $x_0-\\bar x = 1$:\n$SE(\\hat y_0)=s\\sqrt{\\frac{1}{n}+\\frac{(x_0-\\bar x)^2}{S_{xx}}}=2.236\\sqrt{\\frac{1}{10}+\\frac{1}{50}}=2.236\\sqrt{0.10+0.02}=2.236\\sqrt{0.12}$.\n$\\sqrt{0.12}\\approx 0.3464$, so $SE\\approx 2.236(0.3464)\\approx 0.7746$.\nWith $t_{0.025,8}\\approx 2.306$: margin $=2.306(0.7746)\\approx 1.786$.\nCI: $14\\pm 1.786 = (12.21,\\ 15.79)$.",
      "tag": "Prediction & dummies"
    },
    {
      "front": "For the same model, build a $95\\%$ **prediction interval** for a new $Y$ at $x_0=6$ and compare its width to the mean-response CI.",
      "back": "$SE_{\\text{pred}}=s\\sqrt{1+\\frac{1}{n}+\\frac{(x_0-\\bar x)^2}{S_{xx}}}=2.236\\sqrt{1+0.10+0.02}=2.236\\sqrt{1.12}$.\n$\\sqrt{1.12}\\approx 1.0583$, so $SE_{\\text{pred}}\\approx 2.236(1.0583)\\approx 2.366$.\nMargin $=2.306(2.366)\\approx 5.46$, giving the interval $14\\pm 5.46 = (8.54,\\ 19.46)$.\nThis is far wider than the mean CI $(12.21,15.79)$ — the extra \"$1+$\" from the single-observation error $\\sigma^{2}$ dominates.",
      "tag": "Prediction & dummies"
    },
    {
      "front": "Why does the interval width grow as $x_0$ moves away from $\\bar x$, and where is it narrowest?",
      "back": "The variance term $\\frac{(x_0-\\bar x)^2}{S_{xx}}$ is $0$ at $x_0=\\bar x$ and increases quadratically as $x_0$ departs from the data center, so both the mean CI and the prediction interval are **narrowest at $x_0=\\bar x$** and flare outward.\nThis is why **extrapolation** beyond the observed range is risky: far from $\\bar x$ the intervals balloon and the linearity assumption is untested.",
      "tag": "Prediction & dummies"
    },
    {
      "front": "Define a **residual** $e_i$ and its standardized form, and explain how residuals diagnose model assumptions.",
      "back": "Raw residual: $e_i = y_i - \\hat y_i$. Standardized residual $\\approx \\frac{e_i}{s\\sqrt{1-h_{ii}}}$, where $h_{ii}$ is the $i$-th **leverage** (diagonal of the hat matrix $H$), with $0\\le h_{ii}\\le 1$.\nPlots of residuals vs fitted values check **linearity and constant variance** (look for curvature or a funnel); a Q-Q plot of residuals checks **normality**; a large $h_{ii}$ flags a high-leverage point that can unduly pull the fit.",
      "tag": "Simple linear regression"
    },
    {
      "front": "Given the data $(2,3),(4,7),(6,8)$ and the fitted line $\\hat y = 1.5 + 1.25x$, compute the residuals and SSE.",
      "back": "Fitted: at $x=2$, $\\hat y=1.5+2.5=4.0$; at $x=4$, $\\hat y=1.5+5=6.5$; at $x=6$, $\\hat y=1.5+7.5=9.0$.\nResiduals: $e_1=3-4.0=-1.0$; $e_2=7-6.5=0.5$; $e_3=8-9.0=-1.0$.\n$\\text{SSE}=(-1.0)^2+(0.5)^2+(-1.0)^2=1.0+0.25+1.0=2.25$.\n(The residuals sum to $-1.5$ here because the given line is not the exact least-squares fit.)",
      "tag": "ANOVA & R-squared"
    },
    {
      "front": "What is the **Gauss–Markov theorem** and what does it say about least-squares estimators?",
      "back": "Under linearity, zero-mean uncorrelated errors with constant variance $\\sigma^{2}$ (normality **not** required), the ordinary least-squares estimators $\\hat{\\boldsymbol\\beta}$ are **BLUE** — the Best Linear Unbiased Estimators: among all estimators that are linear in $\\mathbf y$ and unbiased, OLS has the smallest variance.\nNormality is only needed for the *exact* $t$- and $F$-distributions used in inference, not for BLUE. Under those standard assumptions $\\text{Var}(\\hat\\beta_1)=\\frac{\\sigma^{2}}{S_{xx}}$, estimated by $\\frac{s^{2}}{S_{xx}}$ — larger spread in $x$ gives a more precise slope.",
      "tag": "Inference & CIs"
    },
    {
      "front": "Given $n=12$, $\\sum x = 60$, $\\sum y = 96$, $\\sum x^{2}=400$, $\\sum y^{2}=900$, $\\sum xy=560$, find $\\hat\\beta_1$, $\\hat\\beta_0$, and $R^{2}$.",
      "back": "$\\bar x = 5$, $\\bar y = 8$.\n$S_{xx}=400-12(25)=100$; $S_{yy}=900-12(64)=900-768=132$; $S_{xy}=560-12(5)(8)=560-480=80$.\n$\\hat\\beta_1=\\frac{80}{100}=0.8$; $\\hat\\beta_0=8-0.8(5)=8-4=4$.\n$\\text{SSR}=\\frac{S_{xy}^{2}}{S_{xx}}=\\frac{6400}{100}=64$; $R^{2}=\\frac{\\text{SSR}}{S_{yy}}=\\frac{64}{132}\\approx 0.485$.",
      "tag": "Least-squares estimates"
    },
    {
      "front": "For the same data ($n=12$, $S_{xx}=100$, $S_{yy}=132$, $S_{xy}=80$), compute $s$ and test the slope at the $5\\%$ level.",
      "back": "$\\text{SSE}=S_{yy}-\\frac{S_{xy}^{2}}{S_{xx}}=132-64=68$.\n$s^{2}=\\frac{\\text{SSE}}{n-2}=\\frac{68}{10}=6.8$, so $s\\approx 2.608$.\n$SE(\\hat\\beta_1)=\\frac{s}{\\sqrt{S_{xx}}}=\\frac{2.608}{10}=0.2608$.\n$t=\\frac{0.8}{0.2608}\\approx 3.07$ on $10$ df. Since $|t|>t_{0.025,10}\\approx 2.228$, the slope is significant at $5\\%$.",
      "tag": "Inference & CIs"
    }
  ]
}