(New page: ==Clustering Methods== ==Algorithms for clustering from feature vector== Called "Partitional clustering" in Jain and Dude as opposed to "hierarchical clustering" Clustering feature v...) |
(→Algorithms for clustering from feature vector) |
||
Line 14: | Line 14: | ||
We have a set of points S | We have a set of points S | ||
Want to find subsets <math>S_1 , S_2 , \cdots , S_c</math> such that <math>S=S_1 \cup S_2 \cup \cdots \cup S_c</math> and <math>S_i \cap S_j = \phi</math> | Want to find subsets <math>S_1 , S_2 , \cdots , S_c</math> such that <math>S=S_1 \cup S_2 \cup \cdots \cup S_c</math> and <math>S_i \cap S_j = \phi</math> | ||
+ | |||
+ | need to define "clustering criteria" i.e., a measure of how natural the clustering is. | ||
+ | |||
+ | if c=2, | ||
+ | |||
+ | consider | ||
+ | |||
+ | <math>J=tr(S_m ^{-1} S_w)</math> | ||
+ | |||
+ | where <math>S_w</math> is "within class scatter matrix" | ||
+ | |||
+ | <math>S_w= \sum _{i=1, X_i \in S_1} ^d ||X_i - \mu _1||^2 + \sum _{i=1, X_i \in S_2} ^d ||X_i - \mu _2||^2</math> (2-1) | ||
+ | |||
+ | <math>\mu _1 = \frac{1}{|S_1|} \sun _{X_i \in S_1} X_i</math>, <math>\mu _2 = \frac{1}{|S_2|} \sun _{X_i \in S_2} X_i</math> (2-2) | ||
+ | |||
+ | <math>S_m</math> is "mixture scatter matrix" | ||
+ | |||
+ | <math>S_m= \sum _{i=1} ^d ||X_i - \mu||^2</math>, <math>\mu = \frac{1}{d} \sum _{i=1} ^{d} X_i</math> (2-3) | ||
+ | |||
+ | Try to find <math>S_1</math> and <math>S_2</math> that minimize J. | ||
+ | |||
+ | Exhaustive search procedure | ||
+ | |||
+ | Examnple with 6 pattern <math>X_1 , X_2 , \cdots , X_6</math> | ||
+ | |||
+ | List all partition of 6 points into 2 sets. | ||
+ | |||
+ | <<Someting>> | ||
+ | |||
+ | |||
+ | Evaluate J for each partition | ||
+ | |||
+ | Cansider other J's | ||
+ | |||
+ | <math>J=ln|S_m ^{-1} S_w|</math> (2-4) | ||
+ | |||
+ | <math>J=tr (S_m) - \mu (tr S_w -c)</math> (2-5) | ||
+ | |||
+ | C is fixed constant, <math>\mu</math> : Largrange multiplier | ||
+ | |||
+ | <math>J=\frac{tr(S_m)}{tr(S_w)}</math> (2-6) | ||
+ | |||
+ | To speed up search "use iterative procedure" | ||
+ | |||
+ | Pick a partition at random | ||
+ | |||
+ | <math>S_1={X_1, X_2, X_3}, S_2= {X_4, X_5, X_6}</math> (2-7) | ||
+ | |||
+ | Compute J | ||
+ | |||
+ | Consider effect of moving <math>X_1</math> into <math>S_2 \Rightarrow \Delta J_{12}</math>, <math>X_2</math> into <math>S_2 \Rightarrow \Delta J_{22}</math>, <math>X_3</math> into <math>S_3 \Rightarrow \Delta J_{33}</math>, <math>X_4</math> into <math>S_1 \Rightarrow \Delta J_{41}</math>, <math>X_5</math> into <math>S_1 \Rightarrow \Delta J_{51}</math>, <math>X_6</math> into <math>S_1 \Rightarrow \Delta J_{61}</math> | ||
+ | |||
+ | Apply (Simultaneously) all the moves for which <math>\Delta J</math> is negative, repeat procedure | ||
+ | |||
+ | OR | ||
+ | |||
+ | Apply the move for which <math>\Delta J</math> is the most negative repeat procedure | ||
+ | |||
+ | Convergence? | ||
+ | |||
+ | If convergence, global minimum? No idea | ||
+ | |||
+ | If c>2, can use similar procedure | ||
+ | |||
+ | If c is unknown, try c=2,3,4, etc (hierarchical clustering) | ||
+ | |||
+ | Look at evolution of J for c increase (similarity scale) |
Revision as of 07:10, 16 April 2008
Clustering Methods
Algorithms for clustering from feature vector
Called "Partitional clustering" in Jain and Dude as opposed to "hierarchical clustering"
Clustering feature vectors = finding separation between clusters but these are not known
<<Picture>>
We have a set of points S Want to find subsets $ S_1 , S_2 , \cdots , S_c $ such that $ S=S_1 \cup S_2 \cup \cdots \cup S_c $ and $ S_i \cap S_j = \phi $
need to define "clustering criteria" i.e., a measure of how natural the clustering is.
if c=2,
consider
$ J=tr(S_m ^{-1} S_w) $
where $ S_w $ is "within class scatter matrix"
$ S_w= \sum _{i=1, X_i \in S_1} ^d ||X_i - \mu _1||^2 + \sum _{i=1, X_i \in S_2} ^d ||X_i - \mu _2||^2 $ (2-1)
$ \mu _1 = \frac{1}{|S_1|} \sun _{X_i \in S_1} X_i $, $ \mu _2 = \frac{1}{|S_2|} \sun _{X_i \in S_2} X_i $ (2-2)
$ S_m $ is "mixture scatter matrix"
$ S_m= \sum _{i=1} ^d ||X_i - \mu||^2 $, $ \mu = \frac{1}{d} \sum _{i=1} ^{d} X_i $ (2-3)
Try to find $ S_1 $ and $ S_2 $ that minimize J.
Exhaustive search procedure
Examnple with 6 pattern $ X_1 , X_2 , \cdots , X_6 $
List all partition of 6 points into 2 sets.
<<Someting>>
Evaluate J for each partition
Cansider other J's
$ J=ln|S_m ^{-1} S_w| $ (2-4)
$ J=tr (S_m) - \mu (tr S_w -c) $ (2-5)
C is fixed constant, $ \mu $ : Largrange multiplier
$ J=\frac{tr(S_m)}{tr(S_w)} $ (2-6)
To speed up search "use iterative procedure"
Pick a partition at random
$ S_1={X_1, X_2, X_3}, S_2= {X_4, X_5, X_6} $ (2-7)
Compute J
Consider effect of moving $ X_1 $ into $ S_2 \Rightarrow \Delta J_{12} $, $ X_2 $ into $ S_2 \Rightarrow \Delta J_{22} $, $ X_3 $ into $ S_3 \Rightarrow \Delta J_{33} $, $ X_4 $ into $ S_1 \Rightarrow \Delta J_{41} $, $ X_5 $ into $ S_1 \Rightarrow \Delta J_{51} $, $ X_6 $ into $ S_1 \Rightarrow \Delta J_{61} $
Apply (Simultaneously) all the moves for which $ \Delta J $ is negative, repeat procedure
OR
Apply the move for which $ \Delta J $ is the most negative repeat procedure
Convergence?
If convergence, global minimum? No idea
If c>2, can use similar procedure
If c is unknown, try c=2,3,4, etc (hierarchical clustering)
Look at evolution of J for c increase (similarity scale)