Lecture 19 - Nearest Neighbor Error Rates OldKiwi - Rhea

We have seen Nearest Neighbor (NN) error rate as the number of samples approaches infinity is $P=\int(1-\sum_{i=1}^c P^2(w_i|\vec{x}))p(\vec{x})d\vec{x}$

We would like to be able to answer two questions:

1) How good is that in terms of error rate?

2) How does it compare to Bayes, the best error rate we can achieve?

Recall error rate is $P(e)=\int P(e|\vec{x})p(\vec{x})d\vec{x}$ . For all x, Bayes rule yields minimum possible $P(e|\vec{x})=:P^*(e|\vec{x})$

Thus, we get the minimum $P(e)=:P^*=\int P^*(e|\vec{x})p(\vec{x})d\vec{x}$

Claim 1: If $$ P^* $$ is low, then $P\approx 2P^*$ (Assumes $\infty$ number of samples.)

Justification: $P^*(e|\vec{x})=1-P(w_{max}|\vec{x})$ , where $w_{max}$ is such that $P(w_{max}|\vec{x})\geq P(w_j|\vec{x}),\forall j$

So, $$ P^* $$ low => $p^*(e|\vec{x})$ is low for almost every x.

=> $P(w_{max}|\vec{x})$ is close to 1 for almost every x.

We have $P=\int(1-\sum_{i=1}^cP^2(w_i|\vec{x}))p(\vec{x})d\vec{x}$ and for almost every x, $1-\sum_{i=1}^cP^2(w_i|\vec{x})\approx 1-P^2(w_{max}|\vec{x})\approx 2(1-P(w_{max}|\vec{x}))$ , by Taylor expansion

$=2(P^*(e|\vec{x})$

=> $P\approx\int 2P^*(e|\vec{x})p(\vec{x})d\vec{x}=2P^*$

Claim 2: (To be added later.)

Lecture 19 - Nearest Neighbor Error Rates OldKiwi - Rhea

Alumni Liaison