Exam SRM · ~10-14%

Exam SRM — Time Series Flashcards

Time-series modeling for SOA Exam SRM: the trend/seasonal/cyclical/irregular decomposition, white noise and weak stationarity, the random walk and differencing, the autocorrelation function, the AR(1) model and its mean/variance/autocorrelation, moving averages and single/double exponential smoothing for forecasting, and forecast-accuracy measures (MSE, MAE, MAPE) — with fully worked numeric examples.

43 cards · 6 topics · Free · fact-checked · LaTeX math

Browse all 43 cards as a list
  1. Trend & seasonality
    Name the four components into which a time series $\{Y_t\}$ is classically decomposed.
    **Trend** $(T_t)$ — the long-run upward or downward movement in the level. **Seasonal** $(S_t)$ — a pattern that repeats over a fixed, known period (e.g. 12 months, 4 quarters). **Cyclical** $(C_t)$ — longer wave-like swings of *variable* length not tied to a fixed calendar period (e.g. the business cycle). **Irregular / random** $(I_t)$ — the leftover unpredictable noise. An additive model writes $Y_t=T_t+S_t+C_t+I_t$; a multiplicative model writes $Y_t=T_t\cdot S_t\cdot C_t\cdot I_t$.
  2. Stationarity & white noise
    Define a **white noise** process $\{\epsilon_t\}$ and state its three defining properties.
    White noise is a sequence of uncorrelated random shocks with: 1. constant mean $E[\epsilon_t]=0$, 2. constant variance $\text{Var}(\epsilon_t)=\sigma^2$, 3. zero autocorrelation: $\text{Cov}(\epsilon_t,\epsilon_s)=0$ for $t\neq s$. White noise is the simplest stationary series and is the building block (the error term) of every other model; a fitted model is adequate when its residuals look like white noise.
  3. Stationarity & white noise
    State the conditions for **weak (covariance) stationarity** of a time series $\{Y_t\}$.
    A series is weakly stationary if its first two moments do not change over time: 1. **constant mean**: $E[Y_t]=\mu$ for all $t$, 2. **constant variance**: $\text{Var}(Y_t)=\sigma_Y^2$ for all $t$, 3. **autocovariance depends only on the lag** $k$, not on $t$: $\text{Cov}(Y_t,Y_{t+k})=\gamma_k$. White noise and an AR(1) with $|\phi_1|<1$ are stationary; a random walk and any series with a trend or seasonal pattern are **not**.
  4. Stationarity & white noise
    Why is **stationarity** required before fitting most time-series models, and how is non-stationarity usually removed?
    Estimation of autocorrelations and model parameters assumes the statistical properties are stable over time; a moving mean or variance makes those estimates meaningless and forecasts unreliable. Non-stationarity from a **trend** is removed by **differencing**: $\nabla Y_t = Y_t - Y_{t-1}$. Non-stationarity from changing **variance** is often tamed by a **log or power transform**. Seasonality is removed by **seasonal differencing** $Y_t - Y_{t-s}$. Difference repeatedly only as needed — over-differencing introduces spurious autocorrelation.
  5. Stationarity & white noise
    Define the **autocorrelation function (ACF)** $\rho_k$ at lag $k$ for a stationary series.
    The lag-$k$ autocorrelation is the autocovariance scaled by the variance: $\rho_k=\frac{\text{Cov}(Y_t,Y_{t+k})}{\text{Var}(Y_t)}=\frac{\gamma_k}{\gamma_0}$. Properties: $\rho_0=1$, $-1\le\rho_k\le 1$, and $\rho_{-k}=\rho_k$. The sample ACF $\hat\rho_k$ is plotted in a **correlogram**; for white noise all $\hat\rho_k$ for $k\ge 1$ should be near $0$ (inside the $\pm 2/\sqrt{n}$ bands).
  6. Stationarity & white noise
    Compute the lag-1 sample autocorrelation $\hat\rho_1$ for the series $4,\,6,\,5,\,7,\,8$ (mean $\bar Y=6$).
    Deviations $d_t=Y_t-\bar Y$: $-2,\,0,\,-1,\,1,\,2$. **Numerator** $\sum_{t=2}^{5} d_t d_{t-1}=(0)(-2)+(-1)(0)+(1)(-1)+(2)(1)=0+0-1+2=1$. **Denominator** $\sum_{t=1}^{5} d_t^2=4+0+1+1+4=10$. $\hat\rho_1=\frac{1}{10}=0.10$. The small positive value indicates almost no lag-1 dependence in this short series.
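The hand calculation in card 6 can be checked with a short Python sketch. The function name `sample_autocorr` is an illustrative choice, not part of the deck; it uses the full-sample mean and the full sum of squared deviations in the denominator, matching the card.

```python
def sample_autocorr(series, lag=1):
    """Sample autocorrelation at `lag`, using the full-sample mean."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    # Numerator: products of deviations that are `lag` periods apart.
    numerator = sum(dev[t] * dev[t - lag] for t in range(lag, n))
    # Denominator: sum of squared deviations over the whole series.
    denominator = sum(d * d for d in dev)
    return numerator / denominator


rho1 = sample_autocorr([4, 6, 5, 7, 8], lag=1)
print(round(rho1, 2))  # 0.1
```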
  7. Trend & seasonality
    How is a **linear trend** model fitted to a time series, and what does each coefficient mean?
    Regress the level on time: $Y_t=\beta_0+\beta_1 t+\epsilon_t$, estimated by ordinary least squares with $t=1,2,\dots,n$. $\hat\beta_0$ is the fitted level at $t=0$; $\hat\beta_1$ is the average change in $Y$ per period. The forecast for a future period $t^{*}$ is $\hat Y_{t^{*}}=\hat\beta_0+\hat\beta_1 t^{*}$. This **deterministic-trend** model assumes the trend continues unchanged, in contrast to a stochastic trend (random walk) removed by differencing.
  8. Trend & seasonality
    A linear-trend fit on $n=10$ periods gives $\hat\beta_0=20.0$ and $\hat\beta_1=1.5$. Forecast $Y_{12}$ and state the implied change per period.
    The fitted trend is $\hat Y_t=20.0+1.5\,t$. Forecast for $t=12$: $\hat Y_{12}=20.0+1.5(12)=20.0+18.0=38.0$. The series rises by $\hat\beta_1=1.5$ units each period on average. Two periods beyond the last observation ($t=10$, fitted $35.0$) the forecast adds $2(1.5)=3.0$, giving $38.0$.
  9. Trend & seasonality
    How are **seasonal indices** used to adjust a trend forecast in an additive (or multiplicative) model?
    After estimating the trend, compute the average departure of each season from the trend. **Additive:** seasonal index $S_q$ is an amount added; the forecast is $\hat Y=\text{trend}+S_q$, with the indices summing to $0$ across the season. **Multiplicative:** seasonal index is a ratio averaging $1$ (summing to the number of seasons); the forecast is $\hat Y=\text{trend}\times S_q$. To **deseasonalize** an observation, subtract (additive) or divide by (multiplicative) its seasonal index.
  10. Trend & seasonality
    Quarterly sales have trend forecast $200$ for next quarter and a multiplicative seasonal index of $1.15$ for that quarter. Give the forecast that includes the seasonal effect, and deseasonalize an observed value of $253$.
    **Seasonal forecast** (multiplicative): $\hat Y=\text{trend}\times S_q=200\times 1.15=230$. **Deseasonalize** an actual of $253$: divide by the index, $\frac{253}{1.15}=220$. The $220$ is the trend-level value with the seasonal effect stripped out, comparable across quarters.
  11. Trend & seasonality
    Compute the four additive seasonal indices given average quarterly deviations from trend of $+12,\,-8,\,-10,\,+4$.
    Additive indices must sum to $0$. The raw deviations sum to $12-8-10+4=-2$, so subtract the mean $\frac{-2}{4}=-0.5$ from each (equivalently add $0.5$). Adjusted indices: $12+0.5=12.5$, $-8+0.5=-7.5$, $-10+0.5=-9.5$, $4+0.5=4.5$. Check: $12.5-7.5-9.5+4.5=0$. A Q1 trend forecast of $150$ becomes $150+12.5=162.5$.
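The centering step in card 11 is simple enough to sketch in Python. The helper name `normalize_additive` is hypothetical; it subtracts the mean of the raw deviations so that the adjusted indices sum to zero.

```python
def normalize_additive(deviations):
    """Center raw seasonal deviations so the additive indices sum to 0."""
    mean = sum(deviations) / len(deviations)
    return [d - mean for d in deviations]


indices = normalize_additive([12, -8, -10, 4])
print(indices)           # [12.5, -7.5, -9.5, 4.5]
print(150 + indices[0])  # 162.5, the Q1 forecast from a trend value of 150
```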
  12. Random walk
    Define the **random walk** $Y_t=Y_{t-1}+\epsilon_t$ and explain why it is non-stationary.
    Each value equals the previous value plus an independent white-noise shock $\epsilon_t$ (mean $0$, variance $\sigma^2$). Writing it from the start, $Y_t=Y_0+\sum_{i=1}^{t}\epsilon_i$, so it is the running sum of shocks. It is **non-stationary** because its variance grows without bound: $\text{Var}(Y_t)=t\sigma^2$ depends on $t$. The mean is constant ($E[Y_t]=Y_0$) but the variance is not, so weak stationarity fails. It is the AR(1) boundary case $\phi_1=1$ (a unit root).
  13. Random walk
    For a random walk $Y_t=Y_{t-1}+\epsilon_t$ with $\sigma^2=9$ and $Y_0=50$, find $\text{Var}(Y_6)$, the standard deviation of $Y_6$, and the one-step forecast of $Y_7$ given $Y_6$.
    Since $Y_t=Y_0+\sum_{i=1}^{t}\epsilon_i$ with independent shocks, $\text{Var}(Y_t)=t\sigma^2$. $\text{Var}(Y_6)=6(9)=54$, so $\text{SD}(Y_6)=\sqrt{54}\approx 7.35$. The optimal one-step forecast of a random walk is the **last observed value** (since $E[\epsilon_{7}]=0$): $\hat Y_7=Y_6$. The random walk is a "no-change" forecaster — tomorrow's best guess is today's value.
  14. Random walk
    How does **first differencing** turn a random walk into a stationary series?
    Define the first difference $\nabla Y_t=Y_t-Y_{t-1}$. For the random walk $Y_t=Y_{t-1}+\epsilon_t$ this gives $\nabla Y_t=\epsilon_t$, which is exactly **white noise** — constant mean $0$, constant variance $\sigma^2$, zero autocorrelation — hence stationary. This is why a series with a stochastic (unit-root) trend is made stationary by differencing rather than by subtracting a fitted line; the latter only works for a deterministic trend.
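To see the differencing argument in card 14 concretely, the sketch below builds a random walk from a fixed list of shocks and then differences it. The shock values and the function names are made up for illustration; a real series would have random shocks.

```python
from itertools import accumulate


def random_walk(y0, shocks):
    """Levels Y_1..Y_t, each equal to Y_0 plus the running sum of shocks."""
    return [y0 + s for s in accumulate(shocks)]


def first_difference(levels, y0):
    """Y_t - Y_{t-1} for t = 1..n, using Y_0 as the starting level."""
    previous = [y0] + levels[:-1]
    return [y - p for y, p in zip(levels, previous)]


shocks = [1, -2, 3, 0, -1]  # hypothetical shock values
levels = random_walk(50, shocks)
print(levels)                        # [51, 49, 52, 52, 51]
print(first_difference(levels, 50))  # [1, -2, 3, 0, -1]
```

Differencing returns exactly the shocks that were fed in, which is the sense in which the differenced random walk is white noise.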
  15. Random walk
    A **random walk with drift** is $Y_t=\delta+Y_{t-1}+\epsilon_t$. Give its mean function and explain the role of $\delta$.
    Iterating from $Y_0$, $Y_t=Y_0+\delta t+\sum_{i=1}^{t}\epsilon_i$. So: $E[Y_t]=Y_0+\delta t$ — a linear trend with slope $\delta$ (the **drift**), $\text{Var}(Y_t)=t\sigma^2$ — still growing, so still non-stationary. First differencing gives $\nabla Y_t=\delta+\epsilon_t$, white noise plus a constant mean $\delta$. The drift adds a steady $\delta$ per period on top of the pure random-walk behavior.
  16. Random walk
    For a random walk with drift $\delta=2$, $Y_0=10$, and $\sigma^2=4$, find $E[Y_5]$ and $\text{Var}(Y_5)$.
    $E[Y_t]=Y_0+\delta t=10+2(5)=20$. $\text{Var}(Y_t)=t\sigma^2=5(4)=20$, so $\text{SD}(Y_5)=\sqrt{20}\approx 4.47$. The expected level drifts up by $2$ per step while the uncertainty band widens as $\sqrt{t}$.
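The mean and variance formulas for a random walk with drift (cards 15 and 16) can be wrapped in one small function. This is a sketch; `random_walk_moments` is an assumed name, and setting `drift=0` recovers the pure random walk of card 13.

```python
import math


def random_walk_moments(y0, drift, sigma2, t):
    """Return (E[Y_t], Var(Y_t)) for Y_t = drift + Y_{t-1} + eps_t."""
    mean = y0 + drift * t
    variance = t * sigma2
    return mean, variance


mean, var = random_walk_moments(y0=10, drift=2, sigma2=4, t=5)
print(mean, var)                 # 20 20
print(round(math.sqrt(var), 2))  # 4.47

# Pure random walk from card 13: no drift, Y_0 = 50, sigma^2 = 9, t = 6.
print(random_walk_moments(50, 0, 9, 6))  # (50, 54)
```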
  17. Autoregressive models
    Write the **AR(1)** model in mean-adjusted and intercept forms, and state the stationarity condition.
    Intercept form: $Y_t=\beta_0+\phi_1 Y_{t-1}+\epsilon_t$, with $\epsilon_t$ white noise (mean $0$, variance $\sigma^2$). Mean-adjusted form: $Y_t-\mu=\phi_1(Y_{t-1}-\mu)+\epsilon_t$, where $\mu=\frac{\beta_0}{1-\phi_1}$. **Stationarity** requires $|\phi_1|<1$. At $\phi_1=1$ the process is a random walk (non-stationary); $|\phi_1|>1$ explodes.
  18. Autoregressive models
    For a stationary AR(1) $Y_t=\beta_0+\phi_1 Y_{t-1}+\epsilon_t$, state the formulas for the mean, variance, and lag-$k$ autocorrelation.
    **Mean:** $E[Y_t]=\mu=\frac{\beta_0}{1-\phi_1}$. **Variance:** $\text{Var}(Y_t)=\frac{\sigma^2}{1-\phi_1^{2}}$. **Autocorrelation:** $\rho_k=\phi_1^{k}$ for $k\ge 0$. So the ACF decays geometrically; it stays positive when $\phi_1>0$ and alternates sign when $\phi_1<0$. The lag-1 autocorrelation $\rho_1=\phi_1$ identifies the parameter directly.
  19. Autoregressive models
    An AR(1) has $\beta_0=6$, $\phi_1=0.4$, and $\sigma^2=5$. Find the process mean $\mu$ and variance.
    **Mean:** $\mu=\frac{\beta_0}{1-\phi_1}=\frac{6}{1-0.4}=\frac{6}{0.6}=10$. **Variance:** $\text{Var}(Y_t)=\frac{\sigma^2}{1-\phi_1^{2}}=\frac{5}{1-0.16}=\frac{5}{0.84}\approx 5.952$. Because $|\phi_1|=0.4<1$ the process is stationary, so these constant moments are well defined.
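As a quick check on card 19, the stationary AR(1) moment formulas can be coded directly. The name `ar1_moments` is illustrative; the function refuses to return moments when $|\phi_1|\ge 1$, since the formulas only hold for a stationary process.

```python
def ar1_moments(beta0, phi1, sigma2):
    """Return (mean, variance) of a stationary AR(1) in intercept form."""
    if abs(phi1) >= 1:
        raise ValueError("AR(1) is stationary only when |phi1| < 1")
    mean = beta0 / (1 - phi1)
    variance = sigma2 / (1 - phi1 ** 2)
    return mean, variance


mean, var = ar1_moments(beta0=6, phi1=0.4, sigma2=5)
print(round(mean, 3), round(var, 3))  # 10.0 5.952
```

The same function reproduces card 21 with `ar1_moments(8, -0.6, 10)`, giving a mean of about $5$ and a variance of about $15.625$.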
  20. Autoregressive models
    For an AR(1) with $\phi_1=0.7$, compute the autocorrelations $\rho_1,\rho_2,\rho_3$ and describe the ACF shape.
    Using $\rho_k=\phi_1^{k}$: $\rho_1=0.7$, $\rho_2=0.7^{2}=0.49$, $\rho_3=0.7^{3}=0.343$. The ACF decays **geometrically toward zero**, all positive because $\phi_1>0$. A correlogram showing this smooth exponential decay (with the partial ACF cutting off after lag 1) is the signature of an AR(1) process.
  21. Autoregressive models
    An AR(1) with $\phi_1=-0.6$, $\beta_0=8$, and $\sigma^2=10$: find $\mu$, $\rho_1$, $\rho_2$, and describe the ACF.
    $\mu=\frac{\beta_0}{1-\phi_1}=\frac{8}{1-(-0.6)}=\frac{8}{1.6}=5$. $\rho_1=\phi_1=-0.6$, $\rho_2=\phi_1^{2}=0.36$. With $\phi_1<0$ the autocorrelations **alternate in sign** while shrinking in magnitude ($-0.6,\;0.36,\;-0.216,\dots$), giving an oscillating, damped correlogram. Variance $=\frac{10}{1-0.36}=\frac{10}{0.64}=15.625$.
  22. Autoregressive models
    An AR(1) has $\mu=20$, $\phi_1=0.5$. The last observation is $Y_t=26$. Give the one-step and two-step forecasts $\hat Y_{t+1}$ and $\hat Y_{t+2}$.
    Forecasts use the mean-adjusted recursion $\hat Y_{t+h}-\mu=\phi_1^{h}(Y_t-\mu)$. **One step:** $\hat Y_{t+1}=\mu+\phi_1(Y_t-\mu)=20+0.5(26-20)=20+3=23$. **Two step:** $\hat Y_{t+2}=\mu+\phi_1^{2}(Y_t-\mu)=20+0.25(6)=20+1.5=21.5$. As the horizon grows the forecast decays geometrically back to the long-run mean $\mu=20$.
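The $h$-step AR(1) forecast in card 22 is a one-line formula; a minimal Python sketch follows, with `ar1_forecast` as an assumed helper name.

```python
def ar1_forecast(mu, phi1, last_value, h):
    """h-step-ahead AR(1) forecast: mu + phi1**h * (Y_t - mu)."""
    return mu + phi1 ** h * (last_value - mu)


print(ar1_forecast(mu=20, phi1=0.5, last_value=26, h=1))  # 23.0
print(ar1_forecast(mu=20, phi1=0.5, last_value=26, h=2))  # 21.5
```

Calling it with a larger `h` shows the forecast shrinking toward the long-run mean of $20$.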
  23. Autoregressive models
    Show why the AR(1) variance is $\frac{\sigma^2}{1-\phi_1^{2}}$ starting from $\text{Var}(Y_t)=\phi_1^{2}\text{Var}(Y_{t-1})+\sigma^2$.
    Take variances of $Y_t-\mu=\phi_1(Y_{t-1}-\mu)+\epsilon_t$. Since $\epsilon_t$ is independent of $Y_{t-1}$: $\text{Var}(Y_t)=\phi_1^{2}\,\text{Var}(Y_{t-1})+\sigma^2$. Stationarity means $\text{Var}(Y_t)=\text{Var}(Y_{t-1})=\gamma_0$, so $\gamma_0=\phi_1^{2}\gamma_0+\sigma^2$. Solve: $\gamma_0(1-\phi_1^{2})=\sigma^2\Rightarrow \gamma_0=\frac{\sigma^2}{1-\phi_1^{2}}$, which requires $|\phi_1|<1$ for a positive, finite variance.
  24. Autoregressive models
    Given the lag-1 autocorrelation $\hat\rho_1=0.8$ for a series modeled as AR(1), estimate $\phi_1$ and the lag-3 autocorrelation.
    For an AR(1), $\rho_1=\phi_1$, so $\hat\phi_1=\hat\rho_1=0.8$. Then $\rho_k=\phi_1^{k}$ gives $\hat\rho_3=0.8^{3}=0.512$. The single lag-1 autocorrelation pins down the whole geometric ACF of an AR(1).
  25. Smoothing
    How does a **simple moving average (SMA)** of order $k$ smooth a series, and how is it used to forecast?
    The $k$-period moving average at time $t$ is the mean of the most recent $k$ observations: $M_t=\frac{Y_t+Y_{t-1}+\dots+Y_{t-k+1}}{k}$. It damps the irregular component; a larger $k$ smooths more but lags more. The **forecast** of the next value is the current moving average: $\hat Y_{t+1}=M_t$. SMA weights the $k$ terms equally and ignores everything older.
  26. Smoothing
    Given the last four observations $Y_{t-3},\dots,Y_t = 22,\,26,\,24,\,28$, compute the 3-period and 4-period moving-average forecasts of $Y_{t+1}$.
    **3-period** (most recent three: $26,24,28$): $M_t=\frac{26+24+28}{3}=\frac{78}{3}=26.0$. **4-period** (all four: $22,26,24,28$): $M_t=\frac{22+26+24+28}{4}=\frac{100}{4}=25.0$. The forecasts are $\hat Y_{t+1}=26.0$ (3-period) and $25.0$ (4-period); the shorter window reacts more to the recent uptick.
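The two moving-average forecasts in card 26 can be reproduced with a short sketch. The function name `sma_forecast` is hypothetical; it averages the last `k` values of the series.

```python
def sma_forecast(series, k):
    """Forecast the next value as the mean of the most recent k observations."""
    if not 1 <= k <= len(series):
        raise ValueError("k must be between 1 and the series length")
    window = series[-k:]
    return sum(window) / k


history = [22, 26, 24, 28]
print(sma_forecast(history, 3))  # 26.0
print(sma_forecast(history, 4))  # 25.0
```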
  27. Smoothing
    State the **single (simple) exponential smoothing** updating formula and explain the role of $\alpha$.
    $\hat Y_{t+1}=\alpha Y_t+(1-\alpha)\hat Y_t$, with smoothing constant $0<\alpha<1$. Each new forecast blends the most recent actual $Y_t$ and the previous forecast $\hat Y_t$. Expanding shows geometrically declining weights on past data: $\alpha,\;\alpha(1-\alpha),\;\alpha(1-\alpha)^2,\dots$. Large $\alpha$ → responsive, tracks recent changes; small $\alpha$ → heavy smoothing, slow to react. It suits a series with **no trend or seasonality** (a locally constant level).
  28. Smoothing
    Single exponential smoothing with $\alpha=0.3$: the prior forecast is $\hat Y_t=50$ and the actual is $Y_t=60$. Find $\hat Y_{t+1}$.
    $\hat Y_{t+1}=\alpha Y_t+(1-\alpha)\hat Y_t=0.3(60)+0.7(50)=18+35=53$. Equivalently, error-correction form: $\hat Y_{t+1}=\hat Y_t+\alpha(Y_t-\hat Y_t)=50+0.3(60-50)=50+3=53$. The forecast moves $30\%$ of the way toward the latest observation.
  29. Smoothing
    Run single exponential smoothing ($\alpha=0.5$) on $Y_1,\dots,Y_4 = 10,\,14,\,12,\,16$, initializing $\hat Y_2=Y_1=10$. Forecast $\hat Y_5$.
    Update $\hat Y_{t+1}=0.5Y_t+0.5\hat Y_t$: $\hat Y_3=0.5(14)+0.5(10)=12.0$. $\hat Y_4=0.5(12)+0.5(12.0)=12.0$. $\hat Y_5=0.5(16)+0.5(12.0)=8+6=14.0$. The one-step-ahead forecast of $Y_5$ is $\mathbf{14.0}$.
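The recursion in card 29 can be run in a loop. The sketch below assumes the same initialization as the card (the forecast of $Y_2$ is set to $Y_1$); the function name `ses_forecasts` is illustrative.

```python
def ses_forecasts(series, alpha):
    """One-step-ahead forecasts for Y_2, ..., Y_{n+1} by single exponential smoothing.

    The first forecast (of Y_2) is initialized to Y_1.
    """
    forecast = float(series[0])
    forecasts = [forecast]
    for y in series[1:]:
        # Blend the latest actual with the previous forecast.
        forecast = alpha * y + (1 - alpha) * forecast
        forecasts.append(forecast)
    return forecasts


print(ses_forecasts([10, 14, 12, 16], alpha=0.5))  # [10.0, 12.0, 12.0, 14.0]
```

The last element of the returned list is the forecast of $Y_5$.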
  30. Smoothing
    Compare a large versus a small smoothing constant $\alpha$ in exponential smoothing.
    $\alpha$ controls how fast old data is forgotten (weights decay as $(1-\alpha)^{j}$): **Large $\alpha$ (near 1):** puts most weight on the latest actual — fast, responsive, but noisy; tracks real level shifts quickly. **Small $\alpha$ (near 0):** weights are spread over a long history — heavy smoothing, stable, but slow to adapt to genuine changes. $\alpha$ is often chosen to minimize in-sample one-step squared forecast error (SSE/MSE).
  31. Smoothing
    Why does single exponential smoothing fail on a trending series, and what does **double exponential (Holt) smoothing** add?
    Single smoothing forecasts a flat level, so on an upward trend it **lags behind systematically** (consistently under-forecasts). **Holt / double exponential smoothing** adds a second equation for the trend: it tracks a smoothed **level** $L_t=\alpha Y_t+(1-\alpha)(L_{t-1}+b_{t-1})$ and a smoothed **slope** $b_t=\beta(L_t-L_{t-1})+(1-\beta)b_{t-1}$. The $h$-step forecast is $\hat Y_{t+h}=L_t+h\,b_t$, projecting the current slope forward. (Holt-Winters extends this with a third seasonal equation.)
  32. Smoothing
    Holt's method gives a current level $L_t=200$ and trend $b_t=5$. Forecast 1 and 3 periods ahead.
    The Holt $h$-step forecast is $\hat Y_{t+h}=L_t+h\,b_t$. **One step** ($h=1$): $\hat Y_{t+1}=200+1(5)=205$. **Three steps** ($h=3$): $\hat Y_{t+3}=200+3(5)=215$. Unlike single smoothing's flat line, Holt extrapolates the slope $b_t=5$ per period into the future.
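The Holt level and slope updates from card 31 and the forecast from card 32 can be sketched together. The function names and the inputs to the update step (`y=110`, previous level `100`, previous slope `4`, $\alpha=\beta=0.5$) are hypothetical values chosen for easy arithmetic, not taken from the deck.

```python
def holt_update(y, level_prev, slope_prev, alpha, beta):
    """One Holt step: return the new (level, slope) after observing y."""
    level = alpha * y + (1 - alpha) * (level_prev + slope_prev)
    slope = beta * (level - level_prev) + (1 - beta) * slope_prev
    return level, slope


def holt_forecast(level, slope, h):
    """h-step-ahead Holt forecast: current level plus h times current slope."""
    return level + h * slope


# Card 32: level 200 and slope 5.
print(holt_forecast(200, 5, 1))  # 205
print(holt_forecast(200, 5, 3))  # 215

# Hypothetical update step with made-up inputs.
level, slope = holt_update(y=110, level_prev=100, slope_prev=4, alpha=0.5, beta=0.5)
print(level, slope)  # 107.0 5.5
```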
  33. Forecast accuracy
    Define **mean squared error (MSE)** and **root mean squared error (RMSE)** as forecast-accuracy measures.
    Over $n$ forecasts with errors $e_t=Y_t-\hat Y_t$: $\text{MSE}=\frac{1}{n}\sum_{t=1}^{n} e_t^{2}=\frac{1}{n}\sum (Y_t-\hat Y_t)^2$, $\text{RMSE}=\sqrt{\text{MSE}}$. Squaring penalizes **large** errors heavily, so MSE/RMSE are sensitive to outliers. RMSE is in the original units of $Y$, making it easier to interpret than MSE.
  34. Forecast accuracy
    Define **mean absolute error (MAE)** and **mean absolute percentage error (MAPE)**, and note when each is preferred.
    $\text{MAE}=\frac{1}{n}\sum_{t=1}^{n}|Y_t-\hat Y_t|$ — average size of the errors in the data's units; less sensitive to outliers than MSE. $\text{MAPE}=\frac{1}{n}\sum_{t=1}^{n}\left|\frac{Y_t-\hat Y_t}{Y_t}\right|\times 100\%$ — average error as a percent of the actual, which is **unit-free** and good for comparing across series of different scale. MAPE is undefined/unstable when any $Y_t$ is near $0$.
  35. Forecast accuracy
    Forecast errors over four periods are $e_t = 3,\,-2,\,4,\,-1$. Compute the MSE, RMSE, and MAE.
    **MSE:** $\frac{1}{4}(3^2+(-2)^2+4^2+(-1)^2)=\frac{1}{4}(9+4+16+1)=\frac{30}{4}=7.5$. **RMSE:** $\sqrt{7.5}\approx 2.74$. **MAE:** $\frac{1}{4}(|3|+|-2|+|4|+|-1|)=\frac{1}{4}(3+2+4+1)=\frac{10}{4}=2.5$. MSE ($7.5$) exceeds MAE$^2$ ($6.25$), as it always does unless all absolute errors are equal, because squaring amplifies the larger errors such as $|4|$.
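The three error measures in card 35 can be verified with a few lines of Python. The helper names `mse`, `rmse`, and `mae` are illustrative and take a list of forecast errors $e_t=Y_t-\hat Y_t$.

```python
import math


def mse(errors):
    """Mean squared error of a list of forecast errors."""
    return sum(e * e for e in errors) / len(errors)


def rmse(errors):
    """Root mean squared error, in the original units of the series."""
    return math.sqrt(mse(errors))


def mae(errors):
    """Mean absolute error of a list of forecast errors."""
    return sum(abs(e) for e in errors) / len(errors)


errors = [3, -2, 4, -1]
print(mse(errors))             # 7.5
print(round(rmse(errors), 2))  # 2.74
print(mae(errors))             # 2.5
```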
  36. Forecast accuracy
    Actuals $Y_t$ and forecasts $\hat Y_t$ over three periods: $(100,95),\,(120,126),\,(80,76)$. Compute the MAPE.
    Absolute percentage errors $\left|\frac{Y_t-\hat Y_t}{Y_t}\right|$: Period 1: $\left|\frac{100-95}{100}\right|=0.05$. Period 2: $\left|\frac{120-126}{120}\right|=\frac{6}{120}=0.05$. Period 3: $\left|\frac{80-76}{80}\right|=\frac{4}{80}=0.05$. $\text{MAPE}=\frac{0.05+0.05+0.05}{3}\times 100\%=5.0\%$.
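A sketch of the MAPE calculation in card 36 follows. The function name `mape` is an assumption; it raises an error when an actual value is zero, because the percentage error is undefined there.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, returned in percent."""
    if any(y == 0 for y in actuals):
        raise ValueError("MAPE is undefined when an actual value is 0")
    ratios = [abs((y - f) / y) for y, f in zip(actuals, forecasts)]
    return 100 * sum(ratios) / len(ratios)


print(round(mape([100, 120, 80], [95, 126, 76]), 2))  # 5.0
```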
  37. Forecast accuracy
    Two models are compared on a hold-out set: Model A has $\text{MAE}=4.2,\ \text{MAPE}=6.1\%$; Model B has $\text{MAE}=3.8,\ \text{MAPE}=7.5\%$. Which is better?
    It depends on the criterion — the measures disagree. Model B has the lower **absolute** error ($\text{MAE}=3.8<4.2$), so it makes smaller errors in raw units. Model A has the lower **percentage** error ($\text{MAPE}=6.1\%<7.5\%$), so it is more accurate relative to the size of the actuals. If the costliest errors are the large-magnitude ones, prefer B; if proportional accuracy across scales matters, prefer A. Pick the metric that matches the decision being supported, and ideally also check MSE/RMSE.
  38. Forecast accuracy
    Compute the MSE of a 3-period moving-average forecaster on the series $Y_4,Y_5,Y_6=30,\,33,\,31$, given moving-average forecasts $\hat Y_4,\hat Y_5,\hat Y_6 = 28,\,31,\,34$.
    Errors $e_t=Y_t-\hat Y_t$: $e_4=30-28=2$, $e_5=33-31=2$, $e_6=31-34=-3$. $\text{MSE}=\frac{1}{3}(2^2+2^2+(-3)^2)=\frac{1}{3}(4+4+9)=\frac{17}{3}\approx 5.67$. RMSE $=\sqrt{5.67}\approx 2.38$; the model would be compared against alternatives on the same hold-out periods.
  39. Random walk
    Distinguish a **deterministic trend** from a **stochastic trend**, and the correct way to make each stationary.
    **Deterministic trend:** $Y_t=\beta_0+\beta_1 t+\epsilon_t$ — a fixed line plus stationary noise. Made stationary by **detrending** (subtracting the fitted line / regressing on $t$). Shocks have only temporary effects. **Stochastic trend:** a unit-root process like the random walk $Y_t=Y_{t-1}+\epsilon_t$. Made stationary by **differencing** $\nabla Y_t$. Shocks are **permanent** — they accumulate into the level. Applying the wrong remedy (differencing a deterministic trend, or detrending a random walk) leaves the residuals non-white.
  40. Autoregressive models
    For a stationary AR(1) with $\phi_1=0.6$ and $\sigma^2=8$, find the lag-1 autocovariance $\gamma_1$.
    First the variance: $\gamma_0=\frac{\sigma^2}{1-\phi_1^{2}}=\frac{8}{1-0.36}=\frac{8}{0.64}=12.5$. Then $\gamma_1=\rho_1\gamma_0=\phi_1\gamma_0=0.6(12.5)=7.5$. (Equivalently $\gamma_k=\phi_1^{k}\gamma_0$, so $\gamma_2=0.36(12.5)=4.5$, etc.)
  41. Smoothing
    A series has a clear upward straight-line trend with no seasonality. Which forecasting method is appropriate, and why not single exponential smoothing or a plain moving average?
    Use a **linear-trend regression** $\hat Y_t=\hat\beta_0+\hat\beta_1 t$ or **double (Holt) exponential smoothing**, both of which project the slope forward. A **simple moving average** and **single exponential smoothing** assume a locally flat level, so on a trend they **lag and systematically under-forecast** (always behind the rising actuals). Matching the method to the data's structure — trend, seasonality, or neither — is the key model-selection step.
  42. Smoothing
    Single exponential smoothing with $\alpha=0.2$, prior forecast $\hat Y_t=40$, actual $Y_t=50$. Find the next forecast, then the forecast two steps ahead $\hat Y_{t+2}$ (no new data after $t$).
    **One step:** $\hat Y_{t+1}=0.2(50)+0.8(40)=10+32=42$. With no observation at $t+1$, single exponential smoothing produces a **flat** multi-step forecast: $\hat Y_{t+2}=\hat Y_{t+1}=42$. All horizons share the same value because the method has no trend term — a limitation that motivates Holt's method.
  43. Autoregressive models
    Given sample autocovariances $\gamma_0=20$, $\gamma_1=12$, $\gamma_2=6$, compute the autocorrelations $\rho_1,\rho_2$ and comment on whether an AR(1) is plausible.
    $\rho_1=\frac{\gamma_1}{\gamma_0}=\frac{12}{20}=0.6$ and $\rho_2=\frac{\gamma_2}{\gamma_0}=\frac{6}{20}=0.3$. For an AR(1) we would expect $\rho_2=\rho_1^{2}=0.6^{2}=0.36$. The observed $0.3$ is reasonably close to $0.36$, so an AR(1) with $\phi_1\approx 0.6$ is plausible; a large discrepancy would point to a higher-order model.
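The comparison in card 43 can be sketched as a small check: convert autocovariances to autocorrelations, then compare the observed lag-2 value with the $\rho_1^2$ that an AR(1) would imply. The function name `acf_from_autocov` is hypothetical, and how large a gap counts as "too large" is left to judgment, as in the card.

```python
def acf_from_autocov(gammas):
    """Convert autocovariances [gamma_0, gamma_1, ...] to autocorrelations."""
    return [g / gammas[0] for g in gammas]


rho = acf_from_autocov([20, 12, 6])
implied_rho2 = rho[1] ** 2  # lag-2 autocorrelation an AR(1) would predict

print(rho[1], rho[2])                        # 0.6 0.3
print(round(implied_rho2, 2))                # 0.36
print(round(abs(rho[2] - implied_rho2), 2))  # 0.06
```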