(New page: We have seen Nearest Neighbor (NN) error rate as the number of samples approaches infinity is <math>P=\int(1-\sum_{i=1}^c P^2(w_i|\vec{x}))p(\vec{x})d\vec{x}</math> We would like to be ab...)
 
Line 12: Line 12:
  
 
'''Claim 1:''' If <math>P^*</math> is low, then <math>P\approx 2P^*</math> (Assumes <math>\infty</math> number of samples.)
 
'''Claim 1:''' If <math>P^*</math> is low, then <math>P\approx 2P^*</math> (Assumes <math>\infty</math> number of samples.)
 +
 +
Justification: <math>P^*(e|\vec{x})=1-P(w_{max}|\vec{x})</math>, where <math>w_{max}</math> is such that <math>P(w_{max}|\vec{x})\geq P(w_j|\vec{x}),\forall j</math>
 +
 +
So, <math>P^*</math> low => <math>p^*(e|\vec{x})</math> is low for almost every x.
 +
 +
=> <math>P(w_{max}|\vec{x})</math> is close to 1 for almost every x.
 +
 +
We have <math>P=\int(1-\sum_{i=1}^cP^2(w_i|\vec{x}))p(\vec{x})d\vec{x}</math> and for almost every x, <math>1-\sum_{i=1}^cP^2(w_i|\vec{x})\approx 1-P^2(w_{max}|\vec{x})\approx 2(1-P(w_{max}|\vec{x}))</math>, by Taylor expansion
 +
 +
<math>=2(P^*(e|\vec{x})</math>
 +
 +
=> <math>P\approx\int 2P^*(e|\vec{x})p(\vec{x})d\vec{x}=2P^*</math>
 +
 +
'''Claim 2:''' (To be added later.)

Revision as of 17:44, 25 March 2008

We have seen Nearest Neighbor (NN) error rate as the number of samples approaches infinity is $ P=\int(1-\sum_{i=1}^c P^2(w_i|\vec{x}))p(\vec{x})d\vec{x} $

We would like to be able to answer two questions:

1) How good is that in terms of error rate?

2) How does it compare to Bayes, the best error rate we can achieve?

Recall error rate is $ P(e)=\int P(e|\vec{x})p(\vec{x})d\vec{x} $. For all x, Bayes rule yields minimum possible $ P(e|\vec{x})=:P^*(e|\vec{x}) $

Thus, we get the minimum $ P(e)=:P^*=\int P^*(e|\vec{x})p(\vec{x})d\vec{x} $

Claim 1: If $ P^* $ is low, then $ P\approx 2P^* $ (Assumes $ \infty $ number of samples.)

Justification: $ P^*(e|\vec{x})=1-P(w_{max}|\vec{x}) $, where $ w_{max} $ is such that $ P(w_{max}|\vec{x})\geq P(w_j|\vec{x}),\forall j $

So, $ P^* $ low => $ p^*(e|\vec{x}) $ is low for almost every x.

=> $ P(w_{max}|\vec{x}) $ is close to 1 for almost every x.

We have $ P=\int(1-\sum_{i=1}^cP^2(w_i|\vec{x}))p(\vec{x})d\vec{x} $ and for almost every x, $ 1-\sum_{i=1}^cP^2(w_i|\vec{x})\approx 1-P^2(w_{max}|\vec{x})\approx 2(1-P(w_{max}|\vec{x})) $, by Taylor expansion

$ =2(P^*(e|\vec{x}) $

=> $ P\approx\int 2P^*(e|\vec{x})p(\vec{x})d\vec{x}=2P^* $

Claim 2: (To be added later.)

Alumni Liaison

Followed her dream after having raised her family.

Ruth Enoch, PhD Mathematics