
Introduction

This approach consists of using certain models for the clusters and attempting to optimize the fit between the data and the model. In practice, each cluster can be mathematically represented by a parametric distribution, such as a Gaussian. The entire data set is then modelled as a mixture of these distributions.
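As a worked form of this statement, the overall density of the data can be written as a weighted sum of the component densities, using the component priors $ P(w_i) $ and means $ m_i $ that appear in the next section; the number of components $ k $ and the shared spherical covariance $ \sigma^2 I $ are notational assumptions here:

$ p(x) = \sum_{i=1}^{k} P(w_i)\, N(x \mid m_i, \sigma^2 I) $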

Mixture of Gaussians

The most widely used clustering method of this kind is the one based on learning a mixture of Gaussians. The algorithm works as follows:

1) It chooses a component (a Gaussian) at random with probability $ P(w_i) $.

2) It samples a point from $ N(m_i, \sigma^2 I) $. In effect, the method tries to choose clusters so that the average distance between the data vectors and the chosen means (the model parameters) is a minimum, giving a representation of the way the data is distributed in the space.

3) Using the Expectation-Maximization_OldKiwi algorithm, we maximize the likelihood function. A sketch of the whole procedure is given below.
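The following is a minimal sketch of the procedure described above, written in Python with NumPy. It keeps the spherical-covariance form $ N(m_i, \sigma^2 I) $ used in the text; the synthetic data, variable names, number of components, and iteration counts are illustrative assumptions rather than part of the original page.

<pre>
# Minimal sketch: sampling from a mixture of Gaussians (steps 1-2)
# and fitting it with EM (step 3). Spherical covariances sigma^2 * I.
import numpy as np

rng = np.random.default_rng(0)

# --- Generative model (steps 1 and 2) --------------------------------------
true_weights = np.array([0.5, 0.3, 0.2])                     # P(w_i)
true_means = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 6.0]])  # m_i
true_sigma = 1.0                                             # common std deviation

def sample(n):
    comps = rng.choice(len(true_weights), size=n, p=true_weights)       # step 1
    return true_means[comps] + true_sigma * rng.standard_normal((n, 2)) # step 2

X = sample(500)

# --- EM fitting (step 3) ----------------------------------------------------
def em_gmm(X, k, iters=100):
    n, d = X.shape
    means = X[rng.choice(n, size=k, replace=False)]  # initialize at random data points
    sigma2 = np.full(k, X.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities P(w_i | x) under the current parameters
        sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_p = (np.log(weights) - 0.5 * d * np.log(2 * np.pi * sigma2)
                 - 0.5 * sq_dist / sigma2)
        log_p -= log_p.max(axis=1, keepdims=True)    # for numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances to raise the likelihood
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        sigma2 = (resp * sq_dist).sum(axis=0) / (d * nk)
    return weights, means, sigma2

weights, means, sigma2 = em_gmm(X, k=3)
print("estimated mixing weights:", weights)
print("estimated means:\n", means)
</pre>

The E-step computes, for each data vector, how responsible each Gaussian is for it; the M-step then re-estimates $ P(w_i) $, $ m_i $, and $ \sigma^2 $ from those responsibilities, which is what maximizing the likelihood amounts to for this model.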
