Difference between revisions of "Lecture 22 - Decision Trees and Clustering OldKiwi" - Rhea

Revision as of 10:54, 3 April 2008

Note: Most tree-growing methods favor the greatest impurity reduction near the root node.
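As a rough sketch of what "impurity reduction" means for one candidate split: the drop is the impurity of the parent node minus the size-weighted impurity of its children. The notes do not say which impurity measure is used, so entropy is assumed here, and the names `entropy` and `impurity_drop` are made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy impurity (in bits) of a list of class labels (assumed measure)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def impurity_drop(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted_children


labels = ["a", "a", "a", "b", "b", "b"]

# A split that separates the classes completely removes all impurity.
perfect = impurity_drop(labels, ["a", "a", "a"], ["b", "b", "b"])

# A split that leaves both children mixed removes very little.
poor = impurity_drop(labels, ["a", "a", "b"], ["a", "b", "b"])

print(perfect, poor)
```

A greedy tree grower would evaluate this drop for every candidate split at a node and keep the largest one, which is why the big reductions tend to be used up near the root.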

Ex.

Assigning a category to a leaf node

Easy!

If the sample data at the leaf is pure

-> assign that class to the leaf.

else

-> assign the most frequent class among the samples at the leaf.
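The leaf rule above can be sketched in a few lines. The function name `leaf_label` is hypothetical, and the tie-breaking behavior (the first class encountered wins when counts are equal) is an assumption, since the notes do not say what to do on a tie.

```python
from collections import Counter


def leaf_label(labels):
    """Return the class to assign to a leaf that holds these training labels."""
    if not labels:
        raise ValueError("a leaf needs at least one training sample")
    counts = Counter(labels)
    if len(counts) == 1:
        # Pure leaf: every sample has the same class.
        return labels[0]
    # Impure leaf: fall back to the most frequent class.
    # Assumption: on a tie, Counter keeps the class that was seen first.
    return counts.most_common(1)[0][0]


print(leaf_label(["w1", "w1", "w1"]))        # pure leaf
print(leaf_label(["w1", "w2", "w2", "w2"]))  # impure leaf
```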

Note: The problem of building a decision tree is "ill-conditioned,"

i.e. small variations in the training data can yield large variations in the decision rules obtained.

Ex. p. 405 (D&H)

Moving a single training sample slightly can change the decision rules a lot.
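A toy illustration of this sensitivity (not the textbook example, which is not reproduced here): a one-level tree is fit by exhaustive search over axis-aligned thresholds, one training sample is moved by 0.3 along the first feature, and the chosen split jumps to the other feature. The data, the function name `best_stump`, the 0/1 labels, and the use of training error as the split criterion are all assumptions made for the sketch.

```python
def best_stump(points, labels):
    """Find the axis-aligned split with the fewest training errors.

    Returns (feature_index, threshold, errors). Labels are assumed to be 0 or 1.
    Ties keep the first split found, so feature 0 is preferred over feature 1.
    """
    best = None
    for f in range(len(points[0])):
        values = sorted({p[f] for p in points})
        # Candidate thresholds are midpoints between neighboring feature values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for p, y in zip(points, labels) if p[f] < t]
            right = [y for p, y in zip(points, labels) if p[f] >= t]
            # Each side predicts its majority class, so the minority samples are errors.
            errors = sum(min(side.count(0), side.count(1)) for side in (left, right))
            if best is None or errors < best[2]:
                best = (f, t, errors)
    return best


labels = [0, 0, 0, 1, 1, 1]
points = [(1.0, 1.0), (2.0, 2.0), (2.9, 2.9), (3.1, 3.0), (4.0, 4.0), (5.0, 5.0)]
before = best_stump(points, labels)

# Move the third sample slightly to the right along feature 0 only.
moved = list(points)
moved[2] = (3.2, 2.9)
after = best_stump(moved, labels)

print("split feature before:", before[0])
print("split feature after: ", after[0])
```

With this data the root rule tests feature 0 before the move and feature 1 after it, although only one coordinate of one sample changed.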

References on clustering

"Data clustering: a review," A.K. Jain, M.N. Murty, P.J. Flynn

"Algorithms for clustering data," A.K. Jain, R.C. Dubes [1]

"Support vector clustering," Ben-Hur, Horn, Siegelmann, Vapnik [2]

"Dynamic cluster formation using level set methods," Yip, Ding, Chan [3]

What is clustering?

The task of finding "natural" groupings in a data set.

Synonym: "unsupervised learning"

[1] http://www.cse.msu.edu/~jain/Clustering_Jain_Dubes.pdf

[2] http://jmlr.csail.mit.edu/papers/volume2/horn01a/rev1/horn01ar1.pdf

[3] http://ieeexplore.ieee.org/iel5/34/34099/01624353.pdf?arnumber=1624353