Exam MAS-II · ~12-16%

Exam MAS-II — Time Series (ARIMA) Flashcards

Box–Jenkins time-series modeling for CAS Exam MAS-II: white noise and weak stationarity, the autocovariance and autocorrelation functions, AR(p) models with Yule–Walker equations and stationarity conditions, MA(q) models with invertibility and ACF cutoff, ARMA and ARIMA differencing, ACF/PACF order identification, and one-step forecasting with error variance and Ljung–Box/AIC diagnostics — with fully worked autocorrelation, identification, and forecast calculations.

44 cards · 6 topics

Browse all 44 cards as a list
  1. Stationarity & white noise
    Define a **white noise** process $\{\epsilon_t\}$ and state its mean, variance, and autocovariance.
    White noise is a sequence of uncorrelated, mean-zero, constant-variance shocks: $E[\epsilon_t]=0$, $\operatorname{Var}(\epsilon_t)=\sigma^2$ for all $t$, and $\operatorname{Cov}(\epsilon_t,\epsilon_s)=0$ for $t\neq s$. Equivalently $\gamma_0=\sigma^2$ and $\gamma_k=0$ for $k\neq 0$, so $\rho_0=1$ and $\rho_k=0$ for $k\neq 0$. It is the irreducible random input that drives AR/MA/ARMA models.
  2. Stationarity & white noise
    State the conditions for **weak (covariance) stationarity** of a series $\{Y_t\}$.
    A process is weakly stationary if its first two moments do not depend on time: 1. Constant mean: $E[Y_t]=\mu$ for all $t$. 2. Constant variance: $\operatorname{Var}(Y_t)=\gamma_0$ for all $t$. 3. Autocovariance depends only on the lag, not the position: $\operatorname{Cov}(Y_t,Y_{t+k})=\gamma_k$ for all $t$. Weak stationarity is what Box–Jenkins modeling requires; a deterministic trend or changing variance violates it.
  3. Stationarity & white noise
    Define the **autocovariance** $\gamma_k$ and **autocorrelation** $\rho_k$ of a stationary series.
    Autocovariance at lag $k$: $\gamma_k=\operatorname{Cov}(Y_t,Y_{t+k})=E[(Y_t-\mu)(Y_{t+k}-\mu)]$. Autocorrelation at lag $k$: $\rho_k=\frac{\gamma_k}{\gamma_0}$, the autocovariance normalized by the variance $\gamma_0$. Properties: $\rho_0=1$, $\rho_k=\rho_{-k}$ (symmetry), and $|\rho_k|\le 1$. The plot of $\rho_k$ against $k$ is the **ACF**.
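The definitions above describe population quantities; in practice they are estimated from data. A minimal sketch of the usual sample estimator follows, assuming the common divisor-$n$ convention (the card does not specify one). The function name `sample_acf` and the data are illustrative only.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations rho_hat_0 .. rho_hat_max_lag of a sequence y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    # Lag-0 sample autocovariance (divisor n, an assumed convention).
    c0 = sum(d * d for d in dev) / n
    acf = []
    for k in range(max_lag + 1):
        # Lag-k sample autocovariance: average product of deviations k apart.
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


series = [2.0, 4.0, 6.0, 4.0, 2.0, 4.0, 6.0, 4.0]  # made-up illustrative data
rho_hat = sample_acf(series, 3)
```

Plotting `rho_hat` against the lag gives the sample ACF; `rho_hat[0]` is 1 by construction.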
  4. Stationarity & white noise
    What is the **backshift (lag) operator** $B$, and how is it used to write AR, MA, and differencing?
    The backshift operator shifts a series one step back: $B Y_t = Y_{t-1}$, and in general $B^k Y_t = Y_{t-k}$. AR(p): $\phi(B)Y_t=\epsilon_t$ with $\phi(B)=1-\phi_1 B-\cdots-\phi_p B^p$. MA(q): $Y_t=\theta(B)\epsilon_t$ with $\theta(B)=1+\theta_1 B+\cdots+\theta_q B^q$. Differencing: $\nabla Y_t=(1-B)Y_t=Y_t-Y_{t-1}$.
  5. AR models
    Write the **AR(1)** model and state its stationarity condition.
    Mean-zero AR(1): $Y_t=\phi Y_{t-1}+\epsilon_t$, where $\{\epsilon_t\}$ is white noise with variance $\sigma^2$. Stationarity requires $|\phi|<1$ (the root of the characteristic equation $1-\phi z=0$, namely $z=1/\phi$, lies outside the unit circle). If $\phi=1$ the process is a non-stationary random walk; if $|\phi|>1$ it explodes.
  6. AR models
    For a stationary mean-zero AR(1), give the **variance** $\gamma_0$ and the **autocorrelation** $\rho_k$.
    Variance: $\gamma_0=\operatorname{Var}(Y_t)=\dfrac{\sigma^2}{1-\phi^2}$. Autocorrelation: $\rho_k=\phi^{\,k}$ for $k\ge 0$, so the ACF decays geometrically. When $0<\phi<1$ it decays monotonically; when $-1<\phi<0$ it alternates in sign while shrinking in magnitude.
  7. AR models
    An AR(1) has $\phi=0.6$ and shock variance $\sigma^2=4$. Find $\gamma_0$ and the autocorrelations $\rho_1,\rho_2,\rho_3$.
    Variance: $\gamma_0=\dfrac{\sigma^2}{1-\phi^2}=\dfrac{4}{1-0.36}=\dfrac{4}{0.64}=6.25$. Autocorrelations $\rho_k=\phi^k$: $\rho_1=0.6$, $\rho_2=0.6^2=0.36$, $\rho_3=0.6^3=0.216$. The corresponding autocovariances are $\gamma_k=\rho_k\gamma_0$: $\gamma_1=0.6(6.25)=3.75$, $\gamma_2=0.36(6.25)=2.25$.
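The arithmetic on this card can be checked with a few lines of Python. This is only a sketch of the AR(1) formulas $\gamma_0=\sigma^2/(1-\phi^2)$ and $\rho_k=\phi^k$; the helper name `ar1_moments` is hypothetical.

```python
def ar1_moments(phi, sigma2, max_lag):
    """Variance, autocorrelations and autocovariances of a stationary AR(1)."""
    if abs(phi) >= 1:
        raise ValueError("AR(1) is stationary only for |phi| < 1")
    gamma0 = sigma2 / (1 - phi ** 2)              # process variance
    rho = [phi ** k for k in range(max_lag + 1)]  # rho_k = phi^k
    gamma = [r * gamma0 for r in rho]             # gamma_k = rho_k * gamma_0
    return gamma0, rho, gamma


gamma0, rho, gamma = ar1_moments(0.6, 4.0, 3)
```

With $\phi=0.6$ and $\sigma^2=4$ this reproduces $\gamma_0=6.25$, $\rho_3=0.216$, and $\gamma_1=3.75$ up to floating-point rounding.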
  8. AR models
    An AR(1) is $Y_t=-0.5\,Y_{t-1}+\epsilon_t$ with $\sigma^2=3$. Compute $\gamma_0$, $\rho_1$, $\rho_2$ and describe the ACF pattern.
    Variance: $\gamma_0=\dfrac{3}{1-(-0.5)^2}=\dfrac{3}{1-0.25}=\dfrac{3}{0.75}=4$. $\rho_1=\phi=-0.5$, $\rho_2=\phi^2=0.25$, $\rho_3=\phi^3=-0.125$. Because $\phi<0$ the ACF **alternates in sign** ($-,+,-,\dots$) while decaying geometrically in magnitude.
  9. AR models
    Include a nonzero mean in AR(1): write $Y_t=c+\phi Y_{t-1}+\epsilon_t$ and give the process mean $\mu$.
    Taking expectations of a stationary AR(1): $\mu=c+\phi\mu$, so $\mu=\dfrac{c}{1-\phi}$ (valid for $|\phi|<1$). The intercept $c$ is **not** the mean; the mean depends on $\phi$ too. The autocorrelation structure is unchanged from the mean-zero case: $\rho_k=\phi^k$.
  10. AR models
    An AR(1) is $Y_t=10+0.4\,Y_{t-1}+\epsilon_t$ with $\sigma^2=5$. Find the process mean and variance.
    Mean: $\mu=\dfrac{c}{1-\phi}=\dfrac{10}{1-0.4}=\dfrac{10}{0.6}\approx 16.667$. Variance: $\gamma_0=\dfrac{\sigma^2}{1-\phi^2}=\dfrac{5}{1-0.16}=\dfrac{5}{0.84}\approx 5.952$. Note the mean uses $1-\phi$ while the variance uses $1-\phi^2$ — different denominators.
  11. AR models
    Write the **AR(2)** model and state the **Yule–Walker equations** for its autocorrelations.
    AR(2): $Y_t=\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\epsilon_t$. Multiplying by $Y_{t-k}$ and taking expectations gives the Yule–Walker recursion $\rho_k=\phi_1\rho_{k-1}+\phi_2\rho_{k-2}$ for $k\ge 1$, with $\rho_0=1$ and $\rho_{-1}=\rho_1$. In particular $\rho_1=\dfrac{\phi_1}{1-\phi_2}$ and $\rho_2=\phi_1\rho_1+\phi_2$.
  12. AR models
    State the **stationarity conditions** for an AR(2) model in terms of $\phi_1$ and $\phi_2$.
    An AR(2) $Y_t=\phi_1 Y_{t-1}+\phi_2 Y_{t-2}+\epsilon_t$ is stationary iff the roots of $1-\phi_1 z-\phi_2 z^2=0$ lie outside the unit circle. Equivalently the three inequalities hold: $\phi_1+\phi_2<1$, $\quad \phi_2-\phi_1<1$, $\quad |\phi_2|<1$ (i.e. $-1<\phi_2<1$). Together these define the stationarity triangle in the $(\phi_1,\phi_2)$ plane.
  13. AR models
    An AR(2) has $\phi_1=0.5$ and $\phi_2=0.2$. Verify stationarity and find $\rho_1$, $\rho_2$, $\rho_3$.
    Stationarity: $\phi_1+\phi_2=0.7<1$, $\phi_2-\phi_1=-0.3<1$, $|\phi_2|=0.2<1$ — all satisfied, so stationary. $\rho_1=\dfrac{\phi_1}{1-\phi_2}=\dfrac{0.5}{1-0.2}=\dfrac{0.5}{0.8}=0.625$. $\rho_2=\phi_1\rho_1+\phi_2=0.5(0.625)+0.2=0.3125+0.2=0.5125$. $\rho_3=\phi_1\rho_2+\phi_2\rho_1=0.5(0.5125)+0.2(0.625)=0.25625+0.125=0.38125$.
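The Yule–Walker recursion used above is easy to iterate to any lag. A minimal sketch, assuming a stationary AR(2) (the function does not check stationarity); the name `ar2_acf` is illustrative.

```python
def ar2_acf(phi1, phi2, max_lag):
    """Autocorrelations rho_0..rho_max_lag of an AR(2) via Yule-Walker."""
    rho = [1.0, phi1 / (1 - phi2)]  # rho_0 and rho_1
    for k in range(2, max_lag + 1):
        # rho_k = phi1 * rho_{k-1} + phi2 * rho_{k-2}
        rho.append(phi1 * rho[k - 1] + phi2 * rho[k - 2])
    return rho[: max_lag + 1]


rho = ar2_acf(0.5, 0.2, 3)
```

For $\phi_1=0.5$, $\phi_2=0.2$ the list matches the card: $0.625$, $0.5125$, $0.38125$.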
  14. AR models
    An AR(2) has $\phi_1=1.0$ and $\phi_2=-0.25$. Is it stationary, and what are $\rho_1$ and $\rho_2$?
    Check: $\phi_1+\phi_2=0.75<1$, $\phi_2-\phi_1=-1.25<1$, $|\phi_2|=0.25<1$ — stationary. $\rho_1=\dfrac{\phi_1}{1-\phi_2}=\dfrac{1.0}{1-(-0.25)}=\dfrac{1}{1.25}=0.8$. $\rho_2=\phi_1\rho_1+\phi_2=1.0(0.8)+(-0.25)=0.8-0.25=0.55$.
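The triangle inequalities and the root condition should always agree. The sketch below checks both for an AR(2), solving $1-\phi_1 z-\phi_2 z^2=0$ with the quadratic formula. It assumes $\phi_2\neq 0$ (otherwise the model is an AR(1)); the function name is hypothetical.

```python
import cmath


def ar2_is_stationary(phi1, phi2):
    """Return (triangle_ok, roots_ok, roots) for an AR(2) with phi2 != 0."""
    if phi2 == 0:
        raise ValueError("phi2 = 0 reduces to an AR(1); check |phi1| < 1 instead")
    triangle_ok = (phi1 + phi2 < 1) and (phi2 - phi1 < 1) and (abs(phi2) < 1)
    # 1 - phi1*z - phi2*z^2 = 0  is the same as  phi2*z^2 + phi1*z - 1 = 0.
    disc = cmath.sqrt(phi1 ** 2 + 4 * phi2)
    roots = [(-phi1 + disc) / (2 * phi2), (-phi1 - disc) / (2 * phi2)]
    # Stationary iff every root lies strictly outside the unit circle.
    roots_ok = all(abs(z) > 1 for z in roots)
    return triangle_ok, roots_ok, roots


tri, rts, roots = ar2_is_stationary(1.0, -0.25)
```

For $\phi_1=1.0$, $\phi_2=-0.25$ the characteristic equation has a repeated root at $z=2$, outside the unit circle, which agrees with the three inequalities.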
  15. AR models
    Given sample autocorrelations $\hat\rho_1=0.7$ and $\hat\rho_2=0.5$ from an AR(2) fit, solve the **Yule–Walker equations** for $\hat\phi_1$ and $\hat\phi_2$.
    The lag-1 and lag-2 Yule–Walker equations are: $\rho_1=\phi_1+\phi_2\rho_1$ and $\rho_2=\phi_1\rho_1+\phi_2$. Closed form: $\phi_1=\dfrac{\rho_1(1-\rho_2)}{1-\rho_1^2}$, $\phi_2=\dfrac{\rho_2-\rho_1^2}{1-\rho_1^2}$. $\phi_1=\dfrac{0.7(1-0.5)}{1-0.49}=\dfrac{0.35}{0.51}\approx 0.686$. $\phi_2=\dfrac{0.5-0.49}{0.51}=\dfrac{0.01}{0.51}\approx 0.020$.
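The closed-form solution above translates directly into code. A minimal sketch (the helper name `yule_walker_ar2` is illustrative):

```python
def yule_walker_ar2(rho1, rho2):
    """Solve the lag-1 and lag-2 Yule-Walker equations for (phi1, phi2)."""
    denom = 1 - rho1 ** 2
    phi1 = rho1 * (1 - rho2) / denom
    phi2 = (rho2 - rho1 ** 2) / denom
    return phi1, phi2


phi1, phi2 = yule_walker_ar2(0.7, 0.5)
# Substituting back should reproduce the inputs:
check_rho1 = phi1 + phi2 * 0.7   # lag-1 equation
check_rho2 = phi1 * 0.7 + phi2   # lag-2 equation
```

Substituting the estimates back into both equations is a quick way to catch sign or denominator slips.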
  16. MA models
    Write the **MA(1)** model and give its mean, variance, and autocorrelations.
    MA(1): $Y_t=\epsilon_t+\theta\,\epsilon_{t-1}$ (some texts use $-\theta$; check the sign convention). Mean $E[Y_t]=0$. Variance $\gamma_0=\sigma^2(1+\theta^2)$. Autocorrelations: $\rho_1=\dfrac{\theta}{1+\theta^2}$ and $\rho_k=0$ for $k\ge 2$. An MA(1) is **always stationary** (finite sum of white noise), so no stationarity restriction on $\theta$.
  17. MA models
    What does it mean that the ACF of an MA(q) process **cuts off**, and at which lag?
    For MA(q), $Y_t=\epsilon_t+\theta_1\epsilon_{t-1}+\cdots+\theta_q\epsilon_{t-q}$ with $\theta_q\neq 0$, the autocorrelation **can be nonzero only out to lag $q$**: $\rho_q\neq 0$, but $\rho_k=0$ for all $k>q$ (an individual lag below $q$ may happen to be zero). This abrupt drop to zero after lag $q$ is the ACF "cutting off." It is the signature used to identify the MA order $q$ from a sample ACF plot.
  18. MA models
    An MA(1) has $\theta=0.8$ and $\sigma^2=2$. Find $\gamma_0$, $\gamma_1$, $\rho_1$, and $\rho_2$.
    Variance: $\gamma_0=\sigma^2(1+\theta^2)=2(1+0.64)=2(1.64)=3.28$. Lag-1 autocovariance: $\gamma_1=\theta\sigma^2=0.8(2)=1.6$. $\rho_1=\dfrac{\gamma_1}{\gamma_0}=\dfrac{1.6}{3.28}\approx 0.488$, or equivalently $\dfrac{\theta}{1+\theta^2}=\dfrac{0.8}{1.64}\approx 0.488$. $\rho_2=0$ (and all higher lags), the MA(1) cutoff.
  19. MA models
    An MA(1) has $\theta=-0.5$ and $\sigma^2=9$. Compute $\gamma_0$, $\rho_1$, and the maximum possible $|\rho_1|$ for any MA(1).
    Variance: $\gamma_0=\sigma^2(1+\theta^2)=9(1+0.25)=9(1.25)=11.25$. $\rho_1=\dfrac{\theta}{1+\theta^2}=\dfrac{-0.5}{1.25}=-0.4$, and $\rho_k=0$ for $k\ge 2$. The function $\dfrac{\theta}{1+\theta^2}$ reaches its maximum $0.5$ at $\theta=1$ and its minimum $-0.5$ at $\theta=-1$, so $|\rho_1|\le 0.5$: an MA(1) can never have lag-1 autocorrelation beyond $\pm 0.5$.
  20. MA models
    What is the **invertibility** condition for an MA(1), and why does it matter?
    An MA(1) $Y_t=\epsilon_t+\theta\epsilon_{t-1}$ is invertible iff $|\theta|<1$. Invertibility lets you rewrite the MA as a convergent infinite-AR representation $\epsilon_t=\sum_{j\ge 0}(-\theta)^j Y_{t-j}$, so the shocks are recoverable from past observations. It matters because two MA(1)s with reciprocal parameters $\theta$ and $1/\theta$ produce the **same** ACF; imposing $|\theta|<1$ makes the model identifiable (a unique parameter for a given autocorrelation).
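Both claims on this card can be illustrated numerically. The sketch below first shows that $\theta$ and $1/\theta$ give the same $\rho_1$, then recovers the shocks of an invertible MA(1) with the recursion $\epsilon_t=Y_t-\theta\epsilon_{t-1}$, which is the truncated form of the infinite-AR sum. It assumes the pre-sample shock is zero; the shock values are made up.

```python
def ma1_rho1(theta):
    """Lag-1 autocorrelation of Y_t = e_t + theta * e_{t-1}."""
    return theta / (1 + theta ** 2)


def ma1_invert(y, theta):
    """Recover shocks via e_t = y_t - theta * e_{t-1}, taking e before the sample as 0."""
    eps, prev = [], 0.0
    for v in y:
        prev = v - theta * prev
        eps.append(prev)
    return eps


theta = 0.5
same_acf = abs(ma1_rho1(theta) - ma1_rho1(1 / theta)) < 1e-12  # 0.5 and 2.0 agree

shocks = [1.0, -0.5, 0.25, 2.0]  # made-up shocks
y = [shocks[0]] + [shocks[t] + theta * shocks[t - 1] for t in range(1, len(shocks))]
recovered = ma1_invert(y, theta)
```

With $|\theta|<1$ any error in the assumed starting shock is damped by a factor $\theta$ at each step; with $|\theta|>1$ the same recursion would amplify it.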
  21. MA models
    Write a general **MA(2)** model and give its variance and lag-1, lag-2 autocorrelations.
    MA(2): $Y_t=\epsilon_t+\theta_1\epsilon_{t-1}+\theta_2\epsilon_{t-2}$. Variance: $\gamma_0=\sigma^2(1+\theta_1^2+\theta_2^2)$. Autocovariances: $\gamma_1=\sigma^2(\theta_1+\theta_1\theta_2)$, $\gamma_2=\sigma^2\theta_2$, and $\gamma_k=0$ for $k\ge 3$. Thus $\rho_1=\dfrac{\theta_1(1+\theta_2)}{1+\theta_1^2+\theta_2^2}$, $\rho_2=\dfrac{\theta_2}{1+\theta_1^2+\theta_2^2}$, and the ACF cuts off after lag $2$.
  22. MA models
    An MA(2) has $\theta_1=0.5$, $\theta_2=0.3$, $\sigma^2=1$. Compute $\gamma_0$, $\rho_1$, $\rho_2$, $\rho_3$.
    $\gamma_0=\sigma^2(1+\theta_1^2+\theta_2^2)=1(1+0.25+0.09)=1.34$. $\gamma_1=\sigma^2(\theta_1+\theta_1\theta_2)=1(0.5+0.5\cdot0.3)=0.5+0.15=0.65$, so $\rho_1=\dfrac{0.65}{1.34}\approx 0.485$. $\gamma_2=\sigma^2\theta_2=0.3$, so $\rho_2=\dfrac{0.3}{1.34}\approx 0.224$. $\rho_3=0$ (MA(2) cutoff after lag 2).
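The MA(1) and MA(2) formulas are special cases of $\gamma_k=\sigma^2\sum_{j=0}^{q-k}\theta_j\theta_{j+k}$ with $\theta_0=1$. A short sketch of that general sum (the function name `ma_acf` is illustrative):

```python
def ma_acf(thetas, sigma2, max_lag):
    """Autocovariances and autocorrelations of an MA(q) with coefficients thetas."""
    psi = [1.0] + list(thetas)  # theta_0 = 1, then theta_1 .. theta_q
    q = len(thetas)
    gamma = []
    for k in range(max_lag + 1):
        if k > q:
            gamma.append(0.0)  # ACF cuts off after lag q
        else:
            gamma.append(sigma2 * sum(psi[j] * psi[j + k] for j in range(q - k + 1)))
    rho = [g / gamma[0] for g in gamma]
    return gamma, rho


gamma, rho = ma_acf([0.5, 0.3], 1.0, 3)
```

Calling `ma_acf([0.8], 2.0, 2)` likewise reproduces the MA(1) values $\gamma_0=3.28$ and $\gamma_1=1.6$.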
  23. ARMA & ARIMA
    Write the general **ARMA(p,q)** model and state when it is stationary and invertible.
    ARMA(p,q): $Y_t=\phi_1 Y_{t-1}+\cdots+\phi_p Y_{t-p}+\epsilon_t+\theta_1\epsilon_{t-1}+\cdots+\theta_q\epsilon_{t-q}$, or compactly $\phi(B)Y_t=\theta(B)\epsilon_t$. **Stationarity** is governed entirely by the AR part: roots of $\phi(z)=0$ outside the unit circle. **Invertibility** is governed by the MA part: roots of $\theta(z)=0$ outside the unit circle. The two conditions are separate.
  24. ARMA & ARIMA
    For an **ARMA(1,1)** process $Y_t=\phi Y_{t-1}+\epsilon_t+\theta\epsilon_{t-1}$, describe the shape of the ACF beyond lag 1.
    The lag-1 autocorrelation reflects both the AR and MA terms, but for $k\ge 2$ the ACF satisfies the pure-AR recursion $\rho_k=\phi\,\rho_{k-1}$. So after the first lag the ACF **tails off geometrically** like an AR(1) at rate $\phi$ — it never cuts off. Because both ACF and PACF tail off (neither cuts off), a mixed ARMA is identified by the failure of either to truncate cleanly.
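To see the tail-off numerically, the sketch below uses the standard ARMA(1,1) result $\rho_1=\dfrac{(1+\phi\theta)(\phi+\theta)}{1+2\phi\theta+\theta^2}$, which is not stated on the card and is assumed here, followed by the recursion $\rho_k=\phi\rho_{k-1}$. The parameter values are illustrative.

```python
def arma11_acf(phi, theta, max_lag):
    """ACF of Y_t = phi*Y_{t-1} + e_t + theta*e_{t-1}, assuming |phi| < 1."""
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    rho = [1.0]
    if max_lag >= 1:
        # Lag 1 mixes the AR and MA parameters (standard result, assumed here).
        rho.append((1 + phi * theta) * (phi + theta) / (1 + 2 * phi * theta + theta ** 2))
    for k in range(2, max_lag + 1):
        rho.append(phi * rho[k - 1])  # pure-AR decay from lag 2 onward
    return rho


rho = arma11_acf(0.5, 0.4, 4)
```

Setting `theta=0` collapses the output to the AR(1) pattern $\phi^k$, and setting `phi=0` gives the MA(1) value $\theta/(1+\theta^2)$ at lag 1 with zeros afterward.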
  25. ARMA & ARIMA
    Define an **ARIMA(p,d,q)** model and the role of the differencing order $d$.
    ARIMA(p,d,q) applies an ARMA(p,q) to the $d$-th difference of the series: $\phi(B)(1-B)^d Y_t=\theta(B)\epsilon_t$. The integer $d$ is the number of times you difference $Y_t$ to remove a trend and reach stationarity. With $W_t=(1-B)^d Y_t$, $\{W_t\}$ is a stationary ARMA(p,q). In practice $d=1$ is usually enough for economic/loss series (it removes a linear trend or a single unit root); $d=2$ removes a quadratic trend.
  26. ARMA & ARIMA
    Why difference a series, and how does **first differencing** $\nabla Y_t=Y_t-Y_{t-1}$ remove a linear trend?
    Stationary modeling needs a constant mean; a deterministic linear trend $Y_t=a+bt+\epsilon_t$ has a mean that grows with $t$, violating stationarity. First differencing gives $\nabla Y_t=Y_t-Y_{t-1}=b+(\epsilon_t-\epsilon_{t-1})$, whose mean is the constant $b$ — the trend slope becomes a constant level, restoring stationarity. A quadratic trend requires differencing twice ($d=2$).
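A quick numerical check of the differencing argument, using noise-free trends so that only the deterministic part is visible (an assumption made for clarity; real data would also carry the $\epsilon_t-\epsilon_{t-1}$ term). The helper name `difference` is illustrative.

```python
def difference(y, d=1):
    """Apply the difference operator (1 - B) to y, d times."""
    for _ in range(d):
        y = [y[t] - y[t - 1] for t in range(1, len(y))]
    return y


linear = [3 + 2 * t for t in range(6)]   # a + b*t with a=3, b=2
quadratic = [t * t for t in range(6)]    # t^2

d1_linear = difference(linear)           # constant slope b
d1_quadratic = difference(quadratic)     # still trending
d2_quadratic = difference(quadratic, 2)  # constant after two differences
```

Each difference shortens the series by one observation, so an ARIMA with $d=2$ fits the ARMA part to $n-2$ values.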
  27. ARMA & ARIMA
    Show that a **random walk** is an ARIMA(0,1,0), and explain why it is non-stationary.
    A random walk is $Y_t=Y_{t-1}+\epsilon_t$, i.e. $(1-B)Y_t=\epsilon_t$ — differencing once ($d=1$) yields white noise, so it is ARIMA(0,1,0). It is non-stationary because its variance grows without bound: $\operatorname{Var}(Y_t)=t\sigma^2$ (starting from $Y_0=0$), which depends on $t$. It is the boundary case $\phi=1$ of an AR(1), a unit root.
  28. ARMA & ARIMA
    A random walk with drift is $Y_t=Y_{t-1}+\delta+\epsilon_t$ with $\delta=2$ and $\sigma^2=9$, starting at $Y_0=50$. Find $E[Y_3]$ and $\operatorname{Var}(Y_3)$.
    Unrolling: $Y_t=Y_0+\delta t+\sum_{j=1}^{t}\epsilon_j$. Mean: $E[Y_3]=Y_0+\delta\cdot 3=50+2(3)=56$. Variance: $\operatorname{Var}(Y_3)=3\sigma^2=3(9)=27$ (the three independent shocks accumulate; the drift is deterministic). The growing variance confirms non-stationarity — differencing gives the stationary $\nabla Y_t=2+\epsilon_t$.
  29. ACF/PACF identification
    Define the **partial autocorrelation function (PACF)** $\phi_{kk}$ and what it measures.
    The lag-$k$ partial autocorrelation $\phi_{kk}$ is the correlation between $Y_t$ and $Y_{t-k}$ **after removing** the linear effect of the intervening values $Y_{t-1},\dots,Y_{t-k+1}$. It equals the last coefficient $\phi_{kk}$ in the best linear AR($k$) fit. $\phi_{11}=\rho_1$. For an AR(p), $\phi_{kk}=0$ for $k>p$ — the PACF cuts off at lag $p$.
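One standard way to obtain $\phi_{kk}$ from the autocorrelations is the Durbin–Levinson recursion, which the card does not name; the sketch below assumes it. Inputs are theoretical or sample autocorrelations `rho[0..max_lag]`; the function name is hypothetical.

```python
def pacf_from_acf(rho, max_lag):
    """Return [phi_11, ..., phi_mm] from autocorrelations rho[0..m] (Durbin-Levinson)."""
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1} of the AR(k-1) fit
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            cur = [phi_kk]
        else:
            num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            # Update the earlier coefficients, then append the new last one.
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
        prev = cur
    return pacf


ar1_pacf = pacf_from_acf([0.6 ** k for k in range(4)], 3)
ar2_pacf = pacf_from_acf([1.0, 0.625, 0.5125, 0.38125], 3)
```

For the AR(1) autocorrelations $0.6^k$ the PACF is $0.6$ at lag 1 and zero (up to rounding) afterward; for the AR(2) with $\phi_1=0.5$, $\phi_2=0.2$ it is $0.625$, $0.2$, then zero, showing the cutoff at lag $p$.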
  30. ACF/PACF identification
    State the **Box–Jenkins identification rules** for distinguishing AR(p), MA(q), and ARMA from the ACF and PACF.
    **AR(p):** ACF **tails off** (decays geometrically/sinusoidally); PACF **cuts off** after lag $p$. **MA(q):** ACF **cuts off** after lag $q$; PACF **tails off**. **ARMA(p,q):** **both** ACF and PACF tail off (neither truncates). White noise: both ACF and PACF are ~0 at all lags. Mnemonic: the function that *cuts off* names the model and its order.
  31. ACF/PACF identification
    A sample ACF decays geometrically (lags ~$0.6, 0.36, 0.22,\dots$) and the sample PACF shows a single spike at lag 1 ($\hat\phi_{11}\approx0.6$) then ~0. Identify the model.
    ACF tailing off + PACF cutting off after lag 1 is the textbook signature of an **AR(1)**. The geometric ACF $0.6,0.36,0.22$ matches $\rho_k=\phi^k$ with $\phi\approx 0.6$, and the lone PACF spike at lag 1 confirms $p=1$. Estimated model: $Y_t\approx 0.6\,Y_{t-1}+\epsilon_t$.
  32. ACF/PACF identification
    A sample ACF has a significant spike only at lag 1 ($\hat\rho_1\approx0.45$) and is ~0 thereafter, while the PACF tails off (decaying, alternating). Identify the model.
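    ACF cutting off after lag 1 + PACF tailing off is the signature of an **MA(1)**. The single ACF spike at lag 1 sets $q=1$. Solving $\rho_1=\dfrac{\theta}{1+\theta^2}=0.45$, i.e. $0.45\theta^2-\theta+0.45=0$, gives $\theta=\dfrac{1\pm\sqrt{1-0.81}}{0.9}\approx 0.627$ or $1.595$; the invertible root is $\theta\approx 0.627$. Estimated model: $Y_t\approx\epsilon_t+0.627\,\epsilon_{t-1}$.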
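Recovering $\theta$ from $\rho_1$ means solving the quadratic $\rho_1\theta^2-\theta+\rho_1=0$ and keeping the root with $|\theta|<1$. A minimal sketch (the function name is illustrative):

```python
import math


def ma1_theta_from_rho1(rho1):
    """Invertible MA(1) parameter implied by a lag-1 autocorrelation."""
    if rho1 == 0:
        return 0.0
    if abs(rho1) > 0.5:
        raise ValueError("no MA(1) has |rho_1| > 0.5")
    # The two roots are reciprocals; the minus sign picks the smaller-magnitude one.
    return (1 - math.sqrt(1 - 4 * rho1 ** 2)) / (2 * rho1)


theta = ma1_theta_from_rho1(0.45)
```

For $\hat\rho_1=0.45$ this returns roughly $0.627$; the discarded reciprocal root is about $1.595$.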
  33. ACF/PACF identification
    A series' ACF decays slowly and nearly linearly (e.g. $0.98, 0.95, 0.92,\dots$), barely dying out. What does this indicate and what is the remedy?
    A very slowly decaying ACF that stays near 1 for many lags signals **non-stationarity** (a trend or near-unit root) — the series is not yet stationary, so AR/MA orders cannot be read off reliably. The remedy is to **difference** the series ($d=1$, or $d=2$ if a single difference still decays slowly) and re-examine the ACF/PACF of the differenced series.
  34. ACF/PACF identification
    Give the approximate **standard error** used to judge whether a sample autocorrelation $\hat\rho_k$ is significant, and the resulting bound.
    Under the white-noise null, $\operatorname{SE}(\hat\rho_k)\approx \dfrac{1}{\sqrt{n}}$ for a series of length $n$. The usual $95\%$ significance band is $\pm \dfrac{1.96}{\sqrt{n}}$ (often drawn as $\pm\dfrac{2}{\sqrt{n}}$). A sample ACF/PACF value outside this band is treated as significantly nonzero; values inside are deemed indistinguishable from zero.
  35. ACF/PACF identification
    With $n=100$ observations, a sample autocorrelation is $\hat\rho_3=0.15$. Is it statistically significant at the $5\%$ level?
    The $95\%$ band is $\pm\dfrac{1.96}{\sqrt{n}}=\pm\dfrac{1.96}{\sqrt{100}}=\pm\dfrac{1.96}{10}=\pm 0.196$. Since $|\hat\rho_3|=0.15<0.196$, it falls **inside** the band, so it is **not** significant — consistent with white noise at lag 3. (A value such as $0.25$ would have been significant.)
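The band check is a one-line comparison. A sketch, with illustrative helper names:

```python
import math


def acf_band(n, z=1.96):
    """Half-width of the approximate white-noise band, z / sqrt(n)."""
    return z / math.sqrt(n)


def is_significant(rho_hat, n, z=1.96):
    """True if the sample autocorrelation falls outside the band."""
    return abs(rho_hat) > acf_band(n, z)


band = acf_band(100)
inside = is_significant(0.15, 100)   # the card's value
outside = is_significant(0.25, 100)  # the card's counterexample
```

Passing `z=2` gives the rounded $\pm 2/\sqrt{n}$ band often drawn on ACF plots.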
  36. Forecasting & diagnostics
    Give the **one-step-ahead forecast** for a stationary AR(1) and its forecast error variance.
    For $Y_t=c+\phi Y_{t-1}+\epsilon_t$, the minimum-MSE forecast made at time $t$ is $\hat Y_{t+1}=c+\phi Y_t$ (the future shock $\epsilon_{t+1}$ has expectation 0). The one-step forecast error is $Y_{t+1}-\hat Y_{t+1}=\epsilon_{t+1}$, so its variance is $\sigma^2$ — the one-step error variance for an AR(1) equals the white-noise variance.
  37. Forecasting & diagnostics
    An AR(1) is $Y_t=4+0.7\,Y_{t-1}+\epsilon_t$ with $\sigma^2=2$, and the latest observation is $Y_{50}=20$. Give the 1-step and 2-step forecasts and their error variances.
    1-step: $\hat Y_{51}=4+0.7(20)=4+14=18$. Error variance $=\sigma^2=2$. 2-step: $\hat Y_{52}=4+0.7\hat Y_{51}=4+0.7(18)=4+12.6=16.6$. The 2-step error is $\epsilon_{52}+\phi\epsilon_{51}$, so its variance $=\sigma^2(1+\phi^2)=2(1+0.49)=2(1.49)=2.98$. Forecasts converge toward the mean $\mu=\dfrac{4}{1-0.7}\approx 13.33$ as the horizon grows.
  38. Forecasting & diagnostics
    Give the general **$h$-step forecast error variance** for a stationary AR(1).
    Writing $Y_{t+h}-\hat Y_{t+h}=\sum_{j=0}^{h-1}\phi^{j}\epsilon_{t+h-j}$, the forecast error variance is $\operatorname{Var}(e_h)=\sigma^2\sum_{j=0}^{h-1}\phi^{2j}=\sigma^2\dfrac{1-\phi^{2h}}{1-\phi^2}$. As $h\to\infty$ this rises to the unconditional variance $\dfrac{\sigma^2}{1-\phi^2}$, so long-horizon forecasts carry the full process uncertainty.
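The forecast recursion and the closed-form error variance can be combined in one helper. This is a sketch under the card's assumptions (stationary AR(1) with intercept $c$ and known parameters); the function name is hypothetical.

```python
def ar1_forecast(c, phi, sigma2, y_last, h):
    """h-step-ahead forecast and forecast error variance for a stationary AR(1)."""
    forecast = y_last
    for _ in range(h):
        forecast = c + phi * forecast  # iterate Y_hat = c + phi * previous value
    # sigma^2 * (1 + phi^2 + ... + phi^(2(h-1))) in closed form
    variance = sigma2 * (1 - phi ** (2 * h)) / (1 - phi ** 2)
    return forecast, variance


f1, v1 = ar1_forecast(4.0, 0.7, 2.0, 20.0, 1)
f2, v2 = ar1_forecast(4.0, 0.7, 2.0, 20.0, 2)
f_far, v_far = ar1_forecast(4.0, 0.7, 2.0, 20.0, 200)
```

With $c=4$, $\phi=0.7$, $\sigma^2=2$, $Y_{50}=20$ this gives forecasts $18$ and $16.6$ with variances $2$ and $2.98$. At a long horizon the forecast approaches $\mu=4/0.3\approx 13.33$ and the variance approaches $\sigma^2/(1-\phi^2)\approx 3.92$.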
  39. Forecasting & diagnostics
    An MA(1) is $Y_t=\epsilon_t+0.6\epsilon_{t-1}$ with $\sigma^2=4$, and the most recent residual is $\hat\epsilon_t=1.5$. Give the 1-step and 2-step forecasts and their error variances.
    1-step: $\hat Y_{t+1}=E[\epsilon_{t+1}+0.6\epsilon_t]=0.6\hat\epsilon_t=0.6(1.5)=0.9$. Error $=\epsilon_{t+1}$, variance $=\sigma^2=4$. 2-step: $\hat Y_{t+2}=E[\epsilon_{t+2}+0.6\epsilon_{t+1}]=0$ (both shocks are future). Error $=\epsilon_{t+2}+0.6\epsilon_{t+1}$, variance $=\sigma^2(1+0.36)=4(1.36)=5.44$. Beyond the MA order ($h>q=1$) the forecast is just the mean (0 here).
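The same logic in code, as a sketch for a mean-$\mu$ MA(1) with known $\theta$ and a given last residual (names are illustrative):

```python
def ma1_forecast(theta, sigma2, last_resid, h, mean=0.0):
    """h-step forecast and error variance for Y_t = mean + e_t + theta * e_{t-1}."""
    if h < 1:
        raise ValueError("horizon h must be at least 1")
    if h == 1:
        # Only the most recent shock is known; the next shock has expectation 0.
        return mean + theta * last_resid, sigma2
    # Beyond the MA order every shock is in the future: forecast the mean.
    return mean, sigma2 * (1 + theta ** 2)


f1, v1 = ma1_forecast(0.6, 4.0, 1.5, 1)
f2, v2 = ma1_forecast(0.6, 4.0, 1.5, 2)
```

For every $h\ge 2$ the error variance stays at $\sigma^2(1+\theta^2)$, the unconditional variance of the process.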
  40. Forecasting & diagnostics
    State the **Ljung–Box** statistic and how it is used to check residuals.
    $Q=n(n+2)\displaystyle\sum_{k=1}^{m}\dfrac{\hat\rho_k^2}{n-k}$, where $\hat\rho_k$ are the residual autocorrelations, $n$ is the sample size, and $m$ is the number of lags tested. Under the null of no residual autocorrelation (an adequate model), $Q\sim\chi^2$ with $m-(p+q)$ degrees of freedom. A large $Q$ (small p-value) rejects adequacy — the residuals still carry structure, so the model needs more AR/MA terms.
  41. Forecasting & diagnostics
    An ARMA(1,1) is fitted to a series, leaving $n=120$ residuals whose autocorrelations at lags 1–4 are $0.05, -0.08, 0.03, 0.06$. Compute the Ljung–Box $Q$ for $m=4$ and state the degrees of freedom.
    $Q=n(n+2)\sum_{k=1}^{4}\dfrac{\hat\rho_k^2}{n-k}=120(122)\Big[\dfrac{0.05^2}{119}+\dfrac{(-0.08)^2}{118}+\dfrac{0.03^2}{117}+\dfrac{0.06^2}{116}\Big]$. Terms: $\dfrac{0.0025}{119}=2.101\times10^{-5}$, $\dfrac{0.0064}{118}=5.424\times10^{-5}$, $\dfrac{0.0009}{117}=7.692\times10^{-6}$, $\dfrac{0.0036}{116}=3.103\times10^{-5}$; sum $=1.140\times10^{-4}$. $Q=14640(1.140\times10^{-4})\approx 1.67$. df $=m-(p+q)=4-2=2$. Since $\chi^2_{0.95,2}=5.99>1.67$, do not reject — residuals look like white noise.
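The sum above is tedious by hand, so a short sketch of the statistic is useful for checking. The function name `ljung_box` is illustrative, and the critical value $5.99$ is taken from the card rather than computed.

```python
def ljung_box(resid_acf, n, n_params):
    """Ljung-Box Q and degrees of freedom from residual autocorrelations at lags 1..m."""
    m = len(resid_acf)
    q = n * (n + 2) * sum(r ** 2 / (n - k) for k, r in enumerate(resid_acf, start=1))
    return q, m - n_params  # df = m - (p + q)


q_stat, df = ljung_box([0.05, -0.08, 0.03, 0.06], n=120, n_params=2)
reject = q_stat > 5.99  # chi-square 95% critical value for 2 df, as quoted on the card
```

Here `q_stat` is about $1.67$ with 2 degrees of freedom, so `reject` is `False`.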
  42. Forecasting & diagnostics
    Define the **AIC** for ARMA order selection and how to use it.
    AIC $=-2\ln(\hat L)+2k$, where $\hat L$ is the maximized likelihood and $k$ is the number of estimated parameters (e.g. $p+q$ plus intercept and $\sigma^2$). It trades goodness of fit against complexity. Among candidate models, choose the one with the **lowest** AIC. The BIC variant uses penalty $k\ln n$ instead of $2k$, penalizing parameters more heavily and thus favoring more parsimonious models in large samples.
  43. Forecasting & diagnostics
    Two candidate models are fitted: ARMA(1,0) with maximized log-likelihood $\ln\hat L=-145.0$ ($k=2$) and ARMA(2,1) with $\ln\hat L=-142.5$ ($k=4$). Which does AIC prefer?
    AIC $=-2\ln\hat L+2k$. ARMA(1,0): $-2(-145.0)+2(2)=290.0+4=294.0$. ARMA(2,1): $-2(-142.5)+2(4)=285.0+8=293.0$. The ARMA(2,1) has the lower AIC ($293.0<294.0$), so AIC prefers it: the $2.5$ gain in log-likelihood lowers $-2\ln\hat L$ by $5.0$, which exceeds the added penalty of $2(4-2)=4$.
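A sketch of the comparison in code, with BIC added for contrast. The sample size `n = 100` is a hypothetical value, since the card gives none; BIC needs it, AIC does not.

```python
import math


def aic(loglik, k):
    """Akaike information criterion: -2 ln L + 2k."""
    return -2 * loglik + 2 * k


def bic(loglik, k, n):
    """Bayesian information criterion: -2 ln L + k ln n."""
    return -2 * loglik + k * math.log(n)


aic_ar1 = aic(-145.0, 2)
aic_arma21 = aic(-142.5, 4)

n = 100  # hypothetical sample size, not given on the card
bic_ar1 = bic(-145.0, 2, n)
bic_arma21 = bic(-142.5, 4, n)
```

AIC gives $294.0$ versus $293.0$ and picks the ARMA(2,1). Under the assumed $n=100$, BIC's heavier $k\ln n$ penalty reverses the ranking and picks the ARMA(1,0).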
  44. Forecasting & diagnostics
    Outline the three stages of the **Box–Jenkins** modeling cycle.
    1. **Identification:** make the series stationary (difference if needed, choose $d$), then read the sample ACF/PACF to propose tentative $p$ and $q$. 2. **Estimation:** fit the candidate ARIMA(p,d,q) by maximum likelihood (or least squares), estimating $\phi$, $\theta$, and $\sigma^2$; compare models with AIC/BIC. 3. **Diagnostic checking:** verify residuals are white noise (residual ACF inside bands, Ljung–Box not rejected). If checks fail, return to step 1 with a revised order; if they pass, forecast.