Exam MAS-II — Tree-Based Machine Learning Practice Flashcards

Thirty exam-realistic multiple-choice problems on CAS Exam MAS-II tree-based machine learning — node impurity (Gini, cross-entropy, classification error) and split gain, cost-complexity pruning, bagging and out-of-bag error, random-forest predictor subsetting, gradient boosting with shrinkage, confusion-matrix metrics, ROC/AUC, cross-validation, and variable importance — each with a fully worked solution.

8 free sample · 30 total in app · Free · fact-checked · LaTeX math

Decision trees
A classification-tree node holds $40$ observations: $28$ of class A and $12$ of class B. Calculate the Gini index of the node. (A) $0.30$ (B) $0.42$ (C) $0.48$ (D) $0.51$ (E) $0.70$
**Answer: (B).** Proportions: $\hat p_A=\dfrac{28}{40}=0.7$ and $\hat p_B=\dfrac{12}{40}=0.3$. Gini $G=\sum_k \hat p_k(1-\hat p_k)=1-\sum_k \hat p_k^{2}=1-(0.7^{2}+0.3^{2})=1-(0.49+0.09)=1-0.58=0.42$. Distractor (E) $0.70$ is just $\hat p_A$ (forgetting the formula); (A) $0.30$ is the classification error $1-\max_k\hat p_k=1-0.7$; (D) $0.51$ matches $1-0.7^{2}$, which drops the $0.3^{2}$ term from the sum.
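The Gini calculation above can be checked with a few lines of Python. This is a minimal sketch; the helper name `gini` is illustrative and not part of any exam material.

```python
def gini(counts):
    """Gini index 1 - sum(p_k^2), computed from a list of class counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

# Node with 28 observations of class A and 12 of class B
print(round(gini([28, 12]), 4))  # 0.42
```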
Decision trees
A classification-tree node holds $50$ observations split across three classes as A$:25$, B$:15$, C$:10$. Calculate the cross-entropy (deviance) of the node using natural logs. (A) $0.62$ (B) $0.95$ (C) $1.03$ (D) $1.10$ (E) $1.50$
**Answer: (C).** Proportions: $\hat p_A=0.5$, $\hat p_B=0.3$, $\hat p_C=0.2$. Cross-entropy $D=-\sum_k \hat p_k\ln\hat p_k = -[0.5\ln 0.5 + 0.3\ln 0.3 + 0.2\ln 0.2]$. $0.5\ln 0.5 = 0.5(-0.693147)=-0.346574$; $0.3\ln 0.3 = 0.3(-1.203973)=-0.361192$; $0.2\ln 0.2 = 0.2(-1.609438)=-0.321888$. $D = -(-0.346574-0.361192-0.321888)\approx 1.0297$, i.e. $\approx 1.03$. Distractor (A) $0.62$ is the Gini index $1-(0.25+0.09+0.04)$ of the same node — a Gini-vs-entropy mix-up.
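As a quick check of the arithmetic above, the following sketch computes the cross-entropy from class counts with natural logs. The function name `cross_entropy` is an illustrative choice; empty classes are skipped, following the usual convention $0\ln 0=0$.

```python
import math

def cross_entropy(counts):
    """Cross-entropy -sum(p_k * ln p_k) from class counts (natural logs)."""
    n = sum(counts)
    # Classes with zero count contribute nothing (0 * ln 0 is taken as 0)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

# Node with counts A: 25, B: 15, C: 10
print(round(cross_entropy([25, 15, 10]), 4))  # 1.0297
```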
Decision trees
A node of $40$ observations ($24$ class A, $16$ class B) is split into a left child of $20$ ($18$ A, $2$ B) and a right child of $20$ ($6$ A, $14$ B). Calculate the Gini gain (reduction in Gini index) from the split. (A) $0.12$ (B) $0.18$ (C) $0.30$ (D) $0.42$ (E) $0.48$
**Answer: (B).** Parent: $\hat p_A=0.6$, $\hat p_B=0.4$, $G_{\text{parent}}=1-(0.36+0.16)=0.48$. Left: $\hat p_A=0.9$, $G_L=1-(0.81+0.01)=0.18$. Right: $\hat p_A=0.3$, $G_R=1-(0.09+0.49)=0.42$. Weight by node size: weighted child Gini $=\dfrac{20}{40}(0.18)+\dfrac{20}{40}(0.42)=0.09+0.21=0.30$. Gini gain $=G_{\text{parent}}-0.30=0.48-0.30=0.18$. Distractor (C) $0.30$ is the post-split weighted Gini itself (forgetting to subtract from the parent); (E) $0.48$ is the parent Gini alone.
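The size-weighted gain above generalizes to any impurity measure. A minimal sketch, with hypothetical helper names `gini` and `split_gain`, takes the parent counts and a list of child counts:

```python
def gini(counts):
    """Gini index from a list of class counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def split_gain(parent, children, impurity=gini):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = sum(parent)
    weighted = sum(sum(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted

# Parent (24 A, 16 B) split into left (18 A, 2 B) and right (6 A, 14 B)
gain = split_gain([24, 16], [[18, 2], [6, 14]])
print(round(gain, 4))  # 0.18
```

Passing a different function as `impurity` reuses the same weighting logic for cross-entropy or classification error.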
Decision trees
A regression-tree region holds the responses $\{3,\,5,\,8,\,12\}$. A candidate split sends $\{3,5\}$ to the left leaf and $\{8,12\}$ to the right. Calculate the total RSS after the split. (A) $4$ (B) $10$ (C) $18$ (D) $42$ (E) $46$
**Answer: (B).** Left leaf mean $=\dfrac{3+5}{2}=4$; $\text{RSS}_L=(3-4)^2+(5-4)^2=1+1=2$. Right leaf mean $=\dfrac{8+12}{2}=10$; $\text{RSS}_R=(8-10)^2+(12-10)^2=4+4=8$. Total RSS $=2+8=10$. For reference the parent mean is $\dfrac{3+5+8+12}{4}=7$ with RSS $=16+4+1+25=46$ (distractor E), so the split cuts RSS from $46$ to $10$.
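The before-and-after RSS figures can be reproduced with a short sketch. The helper name `rss` is illustrative; each leaf predicts its own mean.

```python
def rss(values):
    """Residual sum of squares around the mean of the values."""
    mean = sum(values) / len(values)
    return sum((y - mean) ** 2 for y in values)

left, right = [3, 5], [8, 12]

print(rss(left) + rss(right))  # 10.0  (total RSS after the split)
print(rss(left + right))       # 46.0  (RSS of the unsplit region)
```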
Decision trees
A grown regression tree has training $\text{RSS}=120$ with $|T|=8$ terminal nodes. A candidate pruned subtree has $\text{RSS}=170$ with $|T|=3$. Using cost-complexity score $\text{RSS}+\alpha|T|$ at $\alpha=12$, which tree is selected? (A) The full tree, score $216$ (B) The full tree, score $206$ (C) The pruned subtree, score $206$ (D) The pruned subtree, score $182$ (E) They tie at $206$
**Answer: (C).** Full tree score $=120+12(8)=120+96=216$. Pruned subtree score $=170+12(3)=170+36=206$. Since $206<216$, the **pruned** subtree is selected at $\alpha=12$. Distractor (A) reports the full-tree score but picks the wrong tree; (D) $182=170+12(1)$ uses $|T|=1$ instead of $3$. At $\alpha=0$ the full tree (score $120$) would win — larger $\alpha$ pushes the choice toward the smaller tree.
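The comparison above can be sketched in code, along with the break-even $\alpha$ at which the two scores are equal ($120+8\alpha=170+3\alpha$, so $\alpha=10$). The names `cost_complexity` and `trees` are illustrative only.

```python
def cost_complexity(rss, n_leaves, alpha):
    """Cost-complexity score RSS + alpha * |T|."""
    return rss + alpha * n_leaves

# (training RSS, number of terminal nodes) for each candidate tree
trees = {"full": (120, 8), "pruned": (170, 3)}
alpha = 12

scores = {name: cost_complexity(r, t, alpha) for name, (r, t) in trees.items()}
best = min(scores, key=scores.get)
print(scores, best)  # {'full': 216, 'pruned': 206} pruned

# Break-even alpha: RSS difference divided by the difference in leaf counts
alpha_star = (170 - 120) / (8 - 3)
print(alpha_star)  # 10.0
```

For any $\alpha$ above $10$ the pruned subtree has the lower score; below $10$ the full tree does.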
Decision trees
A node of $100$ observations ($50$ A, $50$ B) is split into a left child of $60$ ($45$ A, $15$ B) and a right child of $40$ ($5$ A, $35$ B). Using classification error as the impurity measure, calculate the reduction in error from the split. (A) $0.05$ (B) $0.20$ (C) $0.25$ (D) $0.30$ (E) $0.50$
**Answer: (D).** Parent error $E=1-\max(0.5,0.5)=0.5$. Left: $\hat p_A=\dfrac{45}{60}=0.75$, $E_L=1-0.75=0.25$. Right: $\hat p_B=\dfrac{35}{40}=0.875$, $E_R=1-0.875=0.125$. Weighted child error $=\dfrac{60}{100}(0.25)+\dfrac{40}{100}(0.125)=0.15+0.05=0.20$. Reduction $=0.50-0.20=0.30$. Distractor (B) $0.20$ is the weighted post-split error itself (no subtraction); (C) $0.25$ is the left child's error alone.
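A minimal sketch of the same calculation, using a hypothetical helper `class_error` that takes class counts for a node:

```python
def class_error(counts):
    """Classification error 1 - max_k p_k from a list of class counts."""
    return 1.0 - max(counts) / sum(counts)

parent = [50, 50]
children = [[45, 15], [5, 35]]  # left and right child counts (A, B)

n = sum(parent)
# Weight each child's error by its share of the parent's observations
weighted = sum(sum(child) / n * class_error(child) for child in children)
reduction = class_error(parent) - weighted

print(round(weighted, 4), round(reduction, 4))  # 0.2 0.3
```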
Bagging & OOB
In bagging, what is the limiting fraction of observations that are out-of-bag (left out of a given bootstrap sample) as the sample size $n\to\infty$? (A) $0.250$ (B) $0.333$ (C) $0.368$ (D) $0.500$ (E) $0.632$
**Answer: (C).** A bootstrap sample consists of $n$ draws with replacement from the $n$ observations. A specific observation is missed on a single draw with probability $1-\dfrac{1}{n}$, and missed on all $n$ independent draws with probability $\left(1-\dfrac{1}{n}\right)^{n}$. As $n\to\infty$, $\left(1-\dfrac{1}{n}\right)^{n}\to e^{-1}\approx 0.368$. So about $36.8\%$ of observations are out-of-bag. Distractor (E) $0.632\approx 1-e^{-1}$ is the **in-bag** fraction; (B) $\tfrac{1}{3}$ is a common rule-of-thumb approximation, but the exact limit is $e^{-1}$.
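One way to see why the limit is $e^{-1}$ is to take logs and use the standard series $\ln(1-x)=-x-\tfrac{x^{2}}{2}-\cdots$ for $|x|<1$, with $x=\tfrac{1}{n}$. This is a sketch of the argument, not a full proof:

$$\ln\left(1-\frac{1}{n}\right)^{n}=n\ln\left(1-\frac{1}{n}\right)=n\left(-\frac{1}{n}-\frac{1}{2n^{2}}-\cdots\right)=-1-\frac{1}{2n}-\cdots\;\longrightarrow\;-1 \quad\text{as } n\to\infty.$$

Exponentiating gives $\left(1-\dfrac{1}{n}\right)^{n}\to e^{-1}$. Because every correction term is negative, the out-of-bag probability for finite $n$ sits slightly below $e^{-1}$.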
Bagging & OOB
For a training set of $n=5$ observations, calculate the exact probability that a given observation is out-of-bag for one bootstrap sample. (A) $0.200$ (B) $0.328$ (C) $0.368$ (D) $0.410$ (E) $0.590$
**Answer: (B).** $P(\text{out-of-bag})=\left(1-\dfrac{1}{n}\right)^{n}=\left(1-\dfrac{1}{5}\right)^{5}=0.8^{5}$. $0.8^{2}=0.64$; $0.8^{3}=0.512$; $0.8^{4}=0.4096$; $0.8^{5}=0.32768$. So $P\approx 0.328$, already close to the limiting $e^{-1}\approx 0.368$ (distractor C) but not equal to it for finite $n$. Distractor (A) $0.2=\tfrac{1}{n}$ is the probability that the observation is selected on a single draw, not the probability that it is missed on all $n$ draws.
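A short sketch evaluates the exact out-of-bag probability for several sample sizes and compares it with $e^{-1}$. The function name `oob_probability` and the chosen values of $n$ are illustrative.

```python
import math

def oob_probability(n):
    """Exact probability that a given observation is missed by all n draws."""
    return (1 - 1 / n) ** n

print(round(oob_probability(5), 4))  # 0.3277

# Larger samples move toward the limit e^{-1}
for n in (50, 500, 5000):
    print(n, oob_probability(n))

print(round(math.exp(-1), 4))  # 0.3679
```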