Revision as of 10:56, 3 April 2008
Note: Most tree growing methods favor greatest impurity reduction near the root node.
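As a rough illustration of what "impurity reduction" measures, the sketch below scores a candidate split by how much it lowers impurity relative to the parent node. Gini impurity is used here purely as an assumption (the notes do not say which impurity measure is meant), and the function names `gini` and `impurity_reduction` are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def impurity_reduction(parent, left, right):
    """Drop in impurity when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

parent = ["a", "a", "a", "b", "b", "b"]
clean_split = impurity_reduction(parent, ["a", "a", "a"], ["b", "b", "b"])
weak_split = impurity_reduction(parent, ["a", "a", "b"], ["a", "b", "b"])
print(clean_split, weak_split)  # the clean split gives the larger reduction
```

A greedy tree grower would pick the split with the largest reduction at each node, which is why the biggest gains tend to appear near the root.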
To assign a category to a leaf node:
Easy!
If the sample data at the leaf is pure (all samples belong to one class)
-> assign that class to the leaf.
else
-> assign the most frequent class among the samples at the leaf.
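The rule above can be sketched in a few lines. The function name `leaf_label` is hypothetical, and the tie-breaking behavior (the class seen first wins when two classes are equally frequent) is an assumption, since the notes do not say how ties are handled.

```python
from collections import Counter

def leaf_label(labels):
    """Return the class to assign to a leaf holding these training labels."""
    if not labels:
        raise ValueError("a leaf needs at least one training sample")
    counts = Counter(labels)
    if len(counts) == 1:
        # Pure leaf: every sample has the same class.
        return labels[0]
    # Impure leaf: fall back to the most frequent class.
    # Assumption: on a tie, the class encountered first is returned.
    return counts.most_common(1)[0][0]

print(leaf_label(["cat", "cat", "cat"]))
print(leaf_label(["cat", "dog", "dog"]))
```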
Note: The problem of building a decision tree is "ill-conditioned",
i.e. small variations in the training data can yield large variations in the decision rules obtained.
Ex. p. 405 (D&H)
Moving a single training sample slightly can change the decision rules a lot.
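The toy example below is one way to see this sensitivity. It is not the example from D&H; the data set, the one-split "stump" learner, and the use of Gini impurity are all assumptions made for illustration. Moving a single class-A sample from (2.9, 3.1) to (3.1, 2.9) changes which feature the best split uses.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_stump(points, labels):
    """Return (feature index, threshold) of the single split with the lowest weighted Gini."""
    best = None
    for f in range(len(points[0])):
        values = sorted({p[f] for p in points})
        # Candidate thresholds lie midway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for p, y in zip(points, labels) if p[f] < t]
            right = [y for p, y in zip(points, labels) if p[f] >= t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best[1], best[2]

labels = ["A", "A", "A", "B", "B"]
before = [(1.0, 1.0), (2.0, 2.0), (2.9, 3.1), (3.0, 3.0), (4.0, 4.0)]
after = [(1.0, 1.0), (2.0, 2.0), (3.1, 2.9), (3.0, 3.0), (4.0, 4.0)]  # third sample nudged

print(best_stump(before, labels))  # splits on feature 0
print(best_stump(after, labels))   # splits on feature 1
```

In a full tree the effect compounds, because every split below the changed node is grown on a different partition of the data.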
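References about clustering
"Data clustering: a review," A.K. Jain, M.N. Murty, P.J. Flynn [http://www.cs.rutgers.edu/~mlittman/courses/lightai03/jain99data.pdf]
"Algorithms for clustering data," A.K. Jain, R.C. Dubes [http://www.cse.msu.edu/~jain/Clustering_Jain_Dubes.pdf]
"Support vector clustering," Ben-Hur, Horn, Siegelmann, Vapnik [3]
"Dynamic cluster formation using level set methods," Yip, Ding, Chan [http://ieeexplore.ieee.org/iel5/34/34099/01624353.pdf?arnumber=1624353]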
What is clustering?
The task of finding "natural" groupings in a data set.
Synonym: "unsupervised learning"