Revision as of 11:30, 7 April 2008 by Tha (Talk | contribs)


In the K Nearest Neighbor (K-NN) technique, all n samples of the training set must be stored. If n is very large, this storage requires a large amount of memory. Moreover, computing the distances from a new sample to all n training samples is very expensive. The editing technique is a way to reduce these memory and computation expenses.
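To make the cost concrete, here is a minimal brute-force K-NN classifier (a sketch; the function name and data are illustrative, not from the original text). Note that every query recomputes the distance to all n stored samples, which is the expense the editing technique targets:

```python
import numpy as np

def knn_classify(X_train, y_train, x, k=3):
    # Distance from the new sample x to every one of the n stored
    # training samples: this O(n) work per query, plus storing all
    # of X_train, is what editing tries to reduce.
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    # Majority vote among the k nearest labels.
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```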

The basic idea of this technique is to remove from the training set any samples that contribute to misclassification.

Figure 1 below shows the overlapping distributions of the training samples of two classes along with their Bayes decision boundary. In each region of the figure, we can observe some samples that are misclassified by the Bayes rule. Removing these misclassified samples generates two homogeneous sets of samples, for which the nearest neighbor decision boundary approximates the Bayes decision boundary (Fig. 2).

Figure 1 (Fig1 OldKiwi.jpg): overlapping class distributions with the Bayes decision boundary.

Figure 2 (Fig2 OldKiwi.jpg): homogeneous sample sets after editing; the nearest neighbor boundary approximates the Bayes boundary.

The following is the editing algorithm for the K-NN rule:

1. Divide the training set into N partitions: $ R_1,R_2,...,R_N $.

2. Classify the samples in partition $ R_i $ using the K-NN rule, with the samples of the next partition $ R_{(i+1) \mod N} $ as the reference set, for $ i=1,...,N $.

3. Discard all samples that were misclassified in Step 2, and pool the remaining samples into a new training set.

4. Repeat from Step 1 until no samples are discarded.
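One pass of the steps above can be sketched as follows (a sketch of the classical multi-edit scheme; the function name, partition count, and parameters are illustrative assumptions, not from the original text):

```python
import numpy as np

def edit_training_set(X, y, n_partitions=3, k=1, seed=0):
    # One editing pass: partition the data, classify each sample
    # against the next partition with the K-NN rule, and retain
    # only the correctly classified samples.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    parts = np.array_split(idx, n_partitions)    # R_1, ..., R_N
    keep = []
    for i in range(n_partitions):
        ref = parts[(i + 1) % n_partitions]      # reference set R_{(i+1) mod N}
        for j in parts[i]:
            # K-NN classification of sample j against the reference set.
            d = np.linalg.norm(X[ref] - X[j], axis=1)
            nearest = ref[np.argsort(d)[:k]]
            labels, counts = np.unique(y[nearest], return_counts=True)
            if labels[np.argmax(counts)] == y[j]:
                keep.append(j)                   # correctly classified: retain
    keep = np.sort(np.array(keep))
    return X[keep], y[keep]
```

In the full procedure this pass would be repeated until no samples are discarded; the edited set is then used as the (smaller) reference set for K-NN classification.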
