Revision as of 16:20, 10 April 2008 by Mboschru (Talk)


Introduction

Fuzzy c-means is a clustering method that allows one piece of data to belong to two or more clusters.

The algorithm

The fuzzy c-means method (Bezdek, 1981) is frequently used in pattern recognition. It is based on minimizing the following objective function with respect to U, a fuzzy c-partition of the data set, and V, a set of c prototypes:

$ J_m(U,V)=\sum_{j=1}^n \sum_{i=1}^c u_{ij}^m\|X_j-V_i\|^2 $, $ 1<m<\infty $,

where m is any real number greater than 1, $ u_{ij} $ is the degree of membership of $ X_j $ in cluster i, $ X_j $ is the jth d-dimensional measured data point, $ V_i $ is the d-dimensional center of cluster i, and $ \|\cdot\| $ is any norm expressing the similarity between a measured data point and a center. The fuzzy partition is obtained by iteratively optimizing the expression above, updating the memberships $ u_{ij} $ and the cluster centers $ V_i $ by:

$ u_{ij}=\frac{1}{\sum_{k=1}^c \left(\frac{d_{ij}}{d_{kj}}\right)^{\frac{2}{m-1}}} $

$ V_i=\frac{\sum_{j=1}^n u_{ij}^m X_j}{\sum_{j=1}^n u_{ij}^m} $,

where $ d_{ij}=\|X_j-V_i\| $ is the distance from $ X_j $ to the center $ V_i $.
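
As a quick check of the membership update, consider a small worked example. The values c = 2, m = 2 and the distances below are illustrative assumptions, not taken from the source. With m = 2 the exponent 2/(m-1) equals 2, and for a point $ X_j $ at distances $ d_{1j}=1 $ and $ d_{2j}=3 $ from the two centers:

$ u_{1j}=\frac{1}{\left(\frac{1}{1}\right)^2+\left(\frac{1}{3}\right)^2}=\frac{1}{10/9}=0.9 $

$ u_{2j}=\frac{1}{\left(\frac{3}{1}\right)^2+\left(\frac{3}{3}\right)^2}=\frac{1}{10}=0.1 $

The two memberships sum to 1, and the nearer center receives the larger share.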

The iteration stops when $ \max_{ij}|u_{ij}-\hat{u}_{ij}|<\epsilon $, where $ \hat{u}_{ij} $ is the membership value from the previous iteration step and $ \epsilon $ is a termination threshold between 0 and 1. This procedure converges to a local minimum or a saddle point of $ J_m $.
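
The iterative procedure described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `fuzzy_c_means`, the random initialization of U, the Euclidean norm, the `max_iter` cap, and the small floor on distances are all choices made here for the example.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Sketch of fuzzy c-means using the Euclidean norm (an assumption).

    X: (n, d) data array; c: number of clusters; m > 1: fuzzifier.
    Returns memberships U of shape (c, n) and centers V of shape (c, d).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each column (one data point) sums to 1.
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(max_iter):
        # Center update: V_i = sum_j u_ij^m X_j / sum_j u_ij^m
        W = U ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)
        # Distances d_ij = ||X_j - V_i||, floored to avoid division by zero.
        D = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
        D = np.fmax(D, 1e-12)
        # Membership update: u_ij = 1 / sum_k (d_ij / d_kj)^(2/(m-1))
        ratio = D[:, None, :] / D[None, :, :]  # ratio[i, k, j] = d_ij / d_kj
        U_new = 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=1)
        # Stop when the largest change in any membership falls below eps.
        converged = np.max(np.abs(U_new - U)) < eps
        U = U_new
        if converged:
            break
    return U, V
```

Calling `U, V = fuzzy_c_means(X, c=2)` on an (n, d) array returns a membership matrix whose columns each sum to 1, so every data point distributes its membership across all clusters. Because the procedure only reaches a local minimum or saddle point, different seeds may give different partitions.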
