INTRODUCTION TO MACHINE LEARNING
EL-GY 6143/CS-GY 6923: INTRODUCTION TO MACHINE LEARNING
• Decision tree as constrained space partition
• Regression tree design
• Decision tree pruning
• Classification tree design
• Bagging
• Random Forest
• Feature ranking from random forest
Decision tree as constrained space partitioning
Fig. 9.2 in ESL
• Each region is regressed/classified to the same value.
• The partition can be specified sequentially by splitting the range of one feature at a time.
• The splitting rule can be described by a tree (see the sketch below).
◦ Each leaf node = one region
◦ Size of tree = number of leaf nodes
• The partition is constrained: only rectangles in the 2D case.
• The top-left partition in Fig. 9.2 cannot be realized by a decision tree.
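A minimal sketch, not from the slides, of the leaf/region correspondence using scikit-learn: a small DecisionTreeRegressor is fit on illustrative 2-D data, and each leaf of the printed tree is one axis-aligned rectangle that predicts a constant (the mean of its training targets). The data, feature names, and max_depth are assumptions for illustration.

# Illustrative sketch (assumed data and settings, not from the slides).
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))                      # two features: x1, x2
y = np.where(X[:, 0] < 0.5, 1.0, 3.0) + 0.1 * rng.standard_normal(200)

# Each leaf node corresponds to one rectangular region; its prediction is
# the mean of the training targets that fall inside that region.
tree = DecisionTreeRegressor(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["x1", "x2"]))      # the splits define the rectangles
print("number of regions (leaf nodes):", tree.get_n_leaves())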
How to build a decision tree? (Regression Case)
• Goal: minimize $\mathrm{RSS} = \sum_{m=1}^{M} \sum_{i \in R_m} (y_i - \hat{y}_{R_m})^2$, where $\hat{y}_{R_m}$ is the mean response of the training samples in region $R_m$
• Greedy algorithm:
◦ Start with a single region (the entire space) and iterate:
◦ For each region $R$, select a feature $x_j$ and a splitting threshold $s$ such that splitting $R$ with the criterion $x_j < s$ produces the largest decrease in RSS in $R$
◦ Exhaustive search: for each feature $x_j$, try all possible thresholds $s$ in the current range of $x_j$ in $R$ (sketched in code below)
◦ Stop splitting a region once it contains no more than a minimum number of samples (e.g., $N_{\min}$)
Figure: all possible splitting thresholds $s$ along a feature $x_j$.
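A minimal sketch of this exhaustive split search, assuming the data are a NumPy array X (n_samples x n_features) of numeric features and a target vector y; the function names and the min_samples guard are illustrative, not from the slides.

import numpy as np

def rss(y):
    # Residual sum of squares when a region predicts the mean of its targets.
    return float(np.sum((y - y.mean()) ** 2)) if y.size else 0.0

def best_split(X, y, min_samples=5):
    # Try every feature j and every candidate threshold s; return the split
    # (RSS decrease, feature index, threshold) with the largest RSS decrease.
    best = None
    base = rss(y)
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        # Candidate thresholds: midpoints between consecutive sorted values.
        for s in (values[:-1] + values[1:]) / 2:
            left = X[:, j] < s
            if left.sum() < min_samples or (~left).sum() < min_samples:
                continue        # skip splits that leave a child region too small
            gain = base - rss(y[left]) - rss(y[~left])
            if best is None or gain > best[0]:
                best = (gain, j, s)
    return best  # None if no admissible split exists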
Overfitting
• Decision trees are very prone to overfitting
• A tree can exactly represent any function defined on the training set by using as many regions (leaf nodes) as needed (fully grown tree)
• How to control overfitting?
◦ Find the optimal subtree (under a constraint on the minimum number of samples per leaf node or the maximum depth) by cross validation: too many possibilities
◦ Stop growing once RSS stops decreasing by at least a threshold with any new cut
◦ Not good because we use greedy search: it is possible to find a good cut after a bad one
◦ Better idea: grow a full tree first, then prune the tree (see the pruning sketch below)
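A minimal sketch of the grow-then-prune idea, using one common pruning scheme (scikit-learn's cost-complexity pruning) and choosing the pruning strength ccp_alpha by cross-validation; the data and CV settings are illustrative assumptions, not from the slides.

# Illustrative sketch: grow a full tree, then select a pruned subtree by CV.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 2))                      # assumed data
y = np.sin(4 * X[:, 0]) + 0.2 * rng.standard_normal(300)

# Grow the full tree once to obtain the candidate pruning levels (alphas).
full_tree = DecisionTreeRegressor(random_state=0).fit(X, y)
alphas = full_tree.cost_complexity_pruning_path(X, y).ccp_alphas

# Score each pruned subtree with 5-fold cross-validation and keep the best alpha.
scores = [cross_val_score(DecisionTreeRegressor(ccp_alpha=a, random_state=0),
                          X, y, cv=5).mean() for a in alphas]
best_alpha = alphas[int(np.argmax(scores))]
pruned = DecisionTreeRegressor(ccp_alpha=best_alpha, random_state=0).fit(X, y)
print("best alpha:", best_alpha, "remaining leaves:", pruned.get_n_leaves())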