Revision as of 10:43, 3 April 2008

Note: Most tree-growing methods are greedy: each split is chosen to maximize the immediate drop in impurity, so the greatest impurity reduction tends to occur near the root node.

Ex. [Figure: Lecture22_DecisionTree_OldKiwi.JPG]
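
To make "impurity reduction" concrete, here is a minimal sketch that scores two candidate splits of the same node. It assumes entropy impurity and a binary split; the function names and the toy labels are hypothetical and are not taken from the lecture.

```python
import math
from collections import Counter

def entropy_impurity(labels):
    """Entropy impurity of a list of class labels (assumed impurity measure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def impurity_drop(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its two children."""
    p_left = len(left) / len(parent)
    return (entropy_impurity(parent)
            - p_left * entropy_impurity(left)
            - (1 - p_left) * entropy_impurity(right))

# Toy node with three samples of each class (made-up data).
parent = ["a", "a", "a", "b", "b", "b"]
good = impurity_drop(parent, ["a", "a", "a"], ["b", "b", "b"])  # separates the classes
weak = impurity_drop(parent, ["a", "a", "b"], ["a", "b", "b"])  # leaves both children mixed
```

A greedy grower would pick the split with the larger drop (`good` here) for the current node, which is why the largest reductions tend to be used up early, near the root.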

To assign a category to a leaf node:

Easy!

If the sample data at the leaf is pure (all samples belong to a single class)

-> assign this class to the leaf.

else

-> assign the most frequent class among the samples at the leaf.
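
The leaf rule above can be sketched as follows. The function name `leaf_label` is hypothetical, and breaking ties in favor of the class seen first is an assumption, since the notes do not say how ties are handled.

```python
from collections import Counter

def leaf_label(labels):
    """Return the class assigned to a leaf holding the given training labels."""
    if not labels:
        raise ValueError("a leaf must contain at least one sample")
    counts = Counter(labels)
    if len(counts) == 1:
        # Pure leaf: every sample has the same class.
        return labels[0]
    # Impure leaf: take the most frequent class
    # (assumption: ties go to the class encountered first).
    return counts.most_common(1)[0][0]

pure = leaf_label(["x", "x", "x"])
mixed = leaf_label(["a", "b", "b"])
```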

Note: The problem of building a decision tree is "ill-conditioned",

i.e. small changes in the training data can yield large changes in the decision rules obtained.

Ex. p. 405 (D&H)

Moving a single training sample by a small amount can change the decision rules a lot.
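
As a rough illustration of this sensitivity (not the D&H example itself), the sketch below greedily picks the root split for a tiny two-feature data set, then repeats the search after nudging one sample. The data, the use of Gini impurity, and the function names are all assumptions made for the illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_root_split(points, labels):
    """Greedy search over axis-aligned splits; returns (feature_index, threshold)."""
    best = None  # (weighted impurity, feature index, threshold)
    n = len(points)
    for f in range(len(points[0])):
        values = sorted({p[f] for p in points})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for p, y in zip(points, labels) if p[f] <= thr]
            right = [y for p, y in zip(points, labels) if p[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, thr)
    return best[1], best[2]

labels = ["A", "A", "A", "B", "B", "B"]
before = [(0.10, 0.90), (0.30, 0.70), (0.49, 0.48),
          (0.90, 0.10), (0.70, 0.30), (0.52, 0.49)]
# Same data, with the third sample moved by (0.04, 0.03).
after = [(0.10, 0.90), (0.30, 0.70), (0.53, 0.51),
         (0.90, 0.10), (0.70, 0.30), (0.52, 0.49)]

feature_before, _ = best_root_split(before, labels)
feature_after, _ = best_root_split(after, labels)
```

With this toy data the root test switches from the first feature to the second after the move, so every rule below the root changes as well.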

Reference about clustering

"Data clustering, a review" A. K. Jain, M. N.
