Difference between revisions of "Lecture 22 - Decision Trees and Clustering OldKiwi" - Rhea

Revision as of 10:56, 3 April 2008

Note: Most tree-growing methods favor the greatest impurity reduction near the root node.
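As a rough illustration of what "impurity reduction" measures, the sketch below compares two candidate splits of the same parent node. The notes do not say which impurity measure is meant; Gini impurity is assumed here, and the function names `gini` and `impurity_reduction` and the toy labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_reduction(parent, left, right):
    """Drop in impurity when `parent` is split into `left` and `right`.

    The children are weighted by the fraction of parent samples they receive.
    """
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted


# Toy node with three samples of each class (assumed data).
parent = ["a", "a", "a", "b", "b", "b"]

# A split that separates the classes completely.
perfect = impurity_reduction(parent, ["a", "a", "a"], ["b", "b", "b"])

# A split that leaves both children mixed.
poor = impurity_reduction(parent, ["a", "a", "b"], ["a", "b", "b"])

print(perfect, poor)
```

A greedy tree grower would pick the first split, since it removes all of the parent's impurity, while the second removes only a small part of it.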

Ex.

To assign a category to a leaf node:

Easy!

If the sample data at the leaf is pure

-> assign that class to the leaf.

else

-> assign the most frequent class.
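The rule above can be sketched in a few lines. The function name `leaf_label` is hypothetical, and the tie-breaking behavior (the class seen first wins) is an assumption; the notes do not say how ties should be handled.

```python
from collections import Counter


def leaf_label(labels):
    """Return the class to assign to a leaf holding the given training labels."""
    if not labels:
        raise ValueError("a leaf must contain at least one training sample")
    counts = Counter(labels)
    if len(counts) == 1:
        # Pure leaf: every sample has the same class.
        return labels[0]
    # Impure leaf: fall back to the most frequent class.
    # Ties go to the class that appears first (an assumption of this sketch).
    return counts.most_common(1)[0][0]


print(leaf_label(["x", "x", "x"]))
print(leaf_label(["a", "b", "b"]))
```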

Note: The problem of building a decision tree is "ill-conditioned,"

i.e. small variations in the training data can yield large variations in the decision rules obtained.

Ex. p.405 (D&H)

Moving a single training sample slightly can change the decision rules a lot.
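This sensitivity can be reproduced on a tiny made-up data set. The sketch below is not the example from D&H p.405; the four points, the helper `best_stump`, and the use of Gini impurity are all assumptions chosen for illustration. It fits a single axis-aligned split (a one-level tree) before and after one sample is nudged.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_stump(points, labels):
    """Return (feature index, threshold) of the axis-aligned split
    with the lowest weighted Gini impurity."""
    best = None  # (weighted impurity, feature, threshold)
    n = len(points)
    for f in range(len(points[0])):
        values = sorted({p[f] for p in points})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for p, y in zip(points, labels) if p[f] < t]
            right = [y for p, y in zip(points, labels) if p[f] >= t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best[1], best[2]


labels = ["A", "A", "B", "B"]
before = [(10, 10), (29, 31), (30, 30), (50, 50)]
# Same data, except the second sample moves from (29, 31) to (31, 29).
after = [(10, 10), (31, 29), (30, 30), (50, 50)]

print(best_stump(before, labels))  # splits on feature 0
print(best_stump(after, labels))   # splits on feature 1
```

Shifting one sample by two units on each axis switches the root test from the first feature to the second, so every rule below the root would change as well.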

References on clustering

"Data clustering, a review," A.K. Jain, M.N. Murty, P.J. Flynn[1]

"Algorithms for clustering data," A.K. Jain, R.C. Dubes[2]

"Support vector clustering," Ben-Hur, Horn, Siegelmann, Vapnik [3]

"Dynamic cluster formation using level set methods," Yip, Ding, Chan[4]

What is clustering?

The task of finding "natural" groupings in a data set.

Synonym: "unsupervised learning"

@@ Line 26: / Line 26: @@
 Reference about clustering
-"Data clustering, a review," A.K. Jain, M.N. Murty, P.J. Flynn
+"Data clustering, a review," A.K. Jain, M.N. Murty, P.J. Flynn[http://www.cs.rutgers.edu/~mlittman/courses/lightai03/jain99data.pdf]
 "Algorithms for clustering data," A.K. Jain, R.C. Dibes[http://www.cse.msu.edu/~jain/Clustering_Jain_Dubes.pdf]
@@ Line 34: / Line 34: @@
 "Dynamic cluster formation using level set methods," Yip, Ding, Chan[http://ieeexplore.ieee.org/iel5/34/34099/01624353.pdf?arnumber=1624353]
-What is clustering?
+==What is clustering?==
 The task of finding "natural " groupings in a data set.
 Synonymons="unsupervised learning"