Latest revision as of 08:40, 17 January 2013

Lecture 19, ECE662: Decision Theory

Lecture notes for ECE662 Spring 2008, Prof. Boutin.




We have seen that the Nearest Neighbor (NN) error rate, as the number of samples approaches infinity, is $ P=\int(1-\sum_{i=1}^c P^2(w_i|\vec{x}))p(\vec{x})d\vec{x} $

We would like to be able to answer two questions:

1) How good is that in terms of error rate?

2) How does it compare to Bayes, the best error rate we can achieve?

Recall that the error rate is $ P(e)=\int P(e|\vec{x})p(\vec{x})d\vec{x} $. For every x, the Bayes rule yields the minimum possible conditional error, $ P(e|\vec{x})=:P^*(e|\vec{x}) $

Thus, we get the minimum $ P(e)=:P^*=\int P^*(e|\vec{x})p(\vec{x})d\vec{x} $
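To make the two integrals concrete, here is a minimal numerical sketch that evaluates both $ P^* $ and the asymptotic NN error $ P $ on a grid. The two-class, one-dimensional Gaussian problem (means -1 and +1, unit variance, equal priors), the integration range, and the function names are assumptions chosen purely for illustration; they do not come from the lecture.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def bayes_and_nn_error(means, stds, priors, lo=-20.0, hi=20.0, n=200001):
    """Riemann-sum estimates of the Bayes error P* and the asymptotic NN error P."""
    x = np.linspace(lo, hi, n)
    dx = x[1] - x[0]
    # joint[i, k] = P(w_i) * p(x_k | w_i), so P(w_i | x_k) = joint[i, k] / p(x_k)
    joint = np.array([pr * gaussian_pdf(x, m, s)
                      for m, s, pr in zip(means, stds, priors)])
    p_x = joint.sum(axis=0)
    # P*(e|x) p(x) = (1 - max_i P(w_i|x)) p(x) = p(x) - max_i joint_i
    bayes_integrand = p_x - joint.max(axis=0)
    # (1 - sum_i P^2(w_i|x)) p(x) = p(x) - sum_i joint_i^2 / p(x)
    safe_p_x = np.where(p_x > 0.0, p_x, 1.0)  # avoid 0/0 far in the tails
    nn_integrand = p_x - (joint ** 2).sum(axis=0) / safe_p_x
    return bayes_integrand.sum() * dx, nn_integrand.sum() * dx

# Hypothetical example problem (an assumption, not from the lecture).
p_star, p_nn = bayes_and_nn_error(means=[-1.0, 1.0], stds=[1.0, 1.0], priors=[0.5, 0.5])
print("Bayes error P*:", p_star)
print("Asymptotic NN error P:", p_nn)
```

For this particular example the Bayes error has the closed form $ \Phi(-1) $, where $ \Phi $ is the standard normal CDF, which gives a quick sanity check on the numerical integration.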

Claim 1: If $ P^* $ is low, then $ P\approx 2P^* $. (This assumes an infinite number of samples.)

Justification: $ P^*(e|\vec{x})=1-P(w_{max}|\vec{x}) $, where $ w_{max} $ is such that $ P(w_{max}|\vec{x})\geq P(w_j|\vec{x}),\forall j $

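So, $ P^* $ low => $ P^*(e|\vec{x}) $ is low for almost every x.

=> $ P(w_{max}|\vec{x}) $ is close to 1 for almost every x.

We have $ P=\int(1-\sum_{i=1}^cP^2(w_i|\vec{x}))p(\vec{x})d\vec{x} $ and, for almost every x, $ 1-\sum_{i=1}^cP^2(w_i|\vec{x})\approx 1-P^2(w_{max}|\vec{x}) $, because the remaining posteriors are small and their squares are negligible. Writing $ 1-P^2(w_{max}|\vec{x})=(1-P(w_{max}|\vec{x}))(1+P(w_{max}|\vec{x})) $ and using $ 1+P(w_{max}|\vec{x})\approx 2 $ (the first-order Taylor expansion about $ P(w_{max}|\vec{x})=1 $) gives $ 1-P^2(w_{max}|\vec{x})\approx 2(1-P(w_{max}|\vec{x})) $

$ =2P^*(e|\vec{x}) $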

=> $ P\approx\int 2P^*(e|\vec{x})p(\vec{x})d\vec{x}=2P^* $
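The pointwise approximation behind Claim 1 can be checked numerically. The sketch below compares the NN conditional error $ 1-\sum_i P^2(w_i|\vec{x}) $ with the Bayes conditional error $ 1-P(w_{max}|\vec{x}) $ for a few hand-picked three-class posterior vectors whose largest entry approaches 1. The posterior vectors and the helper name `conditional_errors` are illustrative assumptions.

```python
import numpy as np

def conditional_errors(posteriors):
    """Return (Bayes conditional error, asymptotic NN conditional error) at one point x."""
    posteriors = np.asarray(posteriors, dtype=float)
    bayes = 1.0 - posteriors.max()
    nn = 1.0 - np.sum(posteriors ** 2)
    return bayes, nn

ratios = {}
for eps in (0.1, 0.01, 0.001):
    # Hypothetical posteriors: the winning class has probability 1 - eps.
    post = [1.0 - eps, 0.6 * eps, 0.4 * eps]
    bayes, nn = conditional_errors(post)
    ratios[eps] = nn / bayes
    print(f"eps={eps}: NN/Bayes conditional error ratio = {ratios[eps]:.5f}")
```

The ratio approaches 2 as the Bayes conditional error shrinks, which is the content of the claim.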

Claim 2: $ P^*\leq P\leq (2-\frac{c}{c-1}P^*)P^* $

$ P^*\leq P $ is immediate, since no decision rule can beat Bayes. This lower bound is in fact tight.

For the right-hand inequality, start from $ P=\int(1-\sum_{i=1}^cP^2(w_i|\vec{x}))p(\vec{x})d\vec{x} $

To bound $ P $ from above, find a lower bound for $ \sum_{i=1}^cP^2(w_i|\vec{x}) $

Write $ \sum_{i=1}^cP^2(w_i|\vec{x})=P^2(w_m|\vec{x})+\sum_{i\neq m}P^2(w_i|\vec{x}) $, where $ w_m $ denotes $ w_{max} $

Minimize $ \sum_{i\neq m}P^2(w_i|\vec{x}) $

under the constraint $ \sum_{i\neq m}P(w_i|\vec{x})=1-P(w_m|\vec{x})=P^*(e|\vec{x}) $

The minimum is attained when the remaining posteriors are all equal:

$ P(w_i|\vec{x})=\frac{P^*(e|\vec{x})}{c-1},\forall i\neq m $

So we have

$ \sum_{i=1}^cP^2(w_i|\vec{x})\geq (1-P^*(e|\vec{x}))^2 +(c-1)(\frac {P^*(e|\vec{x})}{c-1})^2 $

$ =(1-P^*(e|\vec{x}))^2 +\frac{(P^*(e|\vec{x}))^2}{c-1} $

$ =1-2P^*(e|\vec{x})+(P^*(e|\vec{x}))^2 +\frac{(P^*(e|\vec{x}))^2}{c-1} $

$ =1-2P^*(e|\vec{x})+\frac{c}{c-1}(P^*(e|\vec{x}))^2 $
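As a sanity check on this lower bound, the following sketch draws random posterior vectors and verifies that $ \sum_i P^2(w_i|\vec{x}) $ never falls below $ 1-2P^*(e|\vec{x})+\frac{c}{c-1}(P^*(e|\vec{x}))^2 $. The number of classes, the Dirichlet sampling, and the function name are assumptions made for illustration only.

```python
import numpy as np

def sum_sq_lower_bound(bayes_cond_error, c):
    """Lower bound on sum_i P^2(w_i|x) given the Bayes conditional error and c classes."""
    e = bayes_cond_error
    return 1.0 - 2.0 * e + (c / (c - 1.0)) * e ** 2

rng = np.random.default_rng(0)
c = 4  # hypothetical number of classes
posteriors = rng.dirichlet(np.ones(c), size=10000)  # each row sums to 1
bayes_cond = 1.0 - posteriors.max(axis=1)           # P*(e|x) for each row
sum_sq = (posteriors ** 2).sum(axis=1)
gap = sum_sq - sum_sq_lower_bound(bayes_cond, c)
print("smallest gap over random posteriors:", gap.min())

# Equality case: the non-maximal posteriors are all equal.
equal_rest = np.array([0.7, 0.1, 0.1, 0.1])
tight_gap = (equal_rest ** 2).sum() - sum_sq_lower_bound(1.0 - equal_rest.max(), c)
print("gap in the equal-posterior case:", tight_gap)
```

The gap is nonnegative for every sampled vector and vanishes (up to rounding) when the non-maximal posteriors are equal, matching the minimizer found above.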

So the integrand satisfies

$ 1-\sum_{i=1}^cP^2(w_i|\vec{x})\leq 1-(1-2P^*(e|\vec{x})+\frac{c}{c-1}(P^*(e|\vec{x}))^2) $

$ =2P^*(e|\vec{x})-\frac{c}{c-1}(P^*(e|\vec{x}))^2 $

To get a better bound, observe

$ Var(P^*(e|\vec{x}))\geq 0 $

$ => \int(P^*(e|\vec x)-P^*)^2p(\vec x)d\vec x\geq 0 $

$ => \int ((P^*(e|\vec x))^2-2P^*P^*(e|\vec x)+(P^*)^2) p(\vec x)d\vec x\geq 0 $

$ => \int (P^*(e|\vec x))^2p(\vec x) d\vec x - 2P^* \int P^*(e| \vec x)p(\vec x)d\vec x + (P^*)^2 \int p(\vec x)d\vec x \geq 0 $

(where $ \int P^*(e|\vec x)p(\vec x)d\vec x = P^* $ and $ \int p(\vec x)d\vec x = 1 $, so the last two terms combine to $ -(P^*)^2 $)

$ => \int (P^*(e|\vec x))^2p(\vec x) d\vec x \geq (P^*)^2 $

(Equality holds only if the variance is 0.)
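The variance step can be illustrated on a discrete stand-in for the integral. In the sketch below, x takes finitely many values with probabilities `weights`, and `cond_err` plays the role of the Bayes conditional error at each value. All numbers are randomly generated assumptions and have no connection to a real classification problem.

```python
import numpy as np

rng = np.random.default_rng(1)
n_points = 50
weights = rng.dirichlet(np.ones(n_points))         # stand-in for p(x), sums to 1
cond_err = rng.uniform(0.0, 0.5, size=n_points)    # stand-in for the Bayes conditional error

p_star = np.sum(weights * cond_err)                # mean of the conditional error
second_moment = np.sum(weights * cond_err ** 2)    # mean of its square
variance = np.sum(weights * (cond_err - p_star) ** 2)

print("second moment:", second_moment)
print("squared mean:", p_star ** 2)
print("variance:", variance)
```

The variance equals the second moment minus the squared mean, so its nonnegativity is exactly the inequality used in the derivation.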

so, $ P\leq \int (2P^*(e| \vec x)- \frac{c}{c-1} (P^*(e|\vec{x}))^2 ) p(\vec x)d\vec x $

$ = 2 \int P^*(e|\vec x)p(\vec x) d\vec x - \frac{c}{c-1} \int (P^*(e|\vec x))^2p(\vec x) d\vec x $

$ \leq 2P^* - \frac {c}{c-1} (P^*)^2 $

$ = (2- \frac{c}{c-1} P^* ) P^* $

For $ c=2 $: $ P^* \leq P \leq P^* (2-2P^*) $
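A small simulation gives a feel for the two-class bound. The sketch below estimates the 1-NN error on a hypothetical one-dimensional problem (two unit-variance Gaussians with means -1 and +1, equal priors) and compares it with the Bayes error and with the upper bound for two classes. The problem, the sample sizes, and the function name are assumptions; with finitely many samples the estimate only approximates the asymptotic rate.

```python
import numpy as np

def one_nn_error_1d(n_train, n_test, rng, mean_gap=2.0):
    """Empirical 1-NN error for two equiprobable unit-variance Gaussians in 1-D."""
    def sample(n):
        labels = rng.integers(0, 2, size=n)                       # class 0 or 1
        x = rng.normal(loc=(labels - 0.5) * mean_gap, scale=1.0)  # means -1 and +1
        return x, labels

    x_tr, y_tr = sample(n_train)
    x_te, y_te = sample(n_test)
    order = np.argsort(x_tr)
    x_sorted, y_sorted = x_tr[order], y_tr[order]
    # In 1-D the nearest neighbor is one of the two sorted neighbors of the query.
    idx = np.searchsorted(x_sorted, x_te)
    left = np.clip(idx - 1, 0, n_train - 1)
    right = np.clip(idx, 0, n_train - 1)
    use_right = np.abs(x_sorted[right] - x_te) < np.abs(x_te - x_sorted[left])
    nearest = np.where(use_right, right, left)
    return np.mean(y_sorted[nearest] != y_te)

rng = np.random.default_rng(0)
nn_err = one_nn_error_1d(n_train=20000, n_test=20000, rng=rng)
p_star = 0.158655                         # Phi(-1), Bayes error of this example
upper = p_star * (2.0 - 2.0 * p_star)     # upper bound for two classes
print("Bayes error:", p_star)
print("empirical 1-NN error:", nn_err)
print("upper bound:", upper)
```

On this example the empirical 1-NN error should land between the Bayes error and the upper bound, up to sampling noise.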

[Image: Park213 Old Kiwi.jpg]

Figure 1: [Image: Lec19 fish Old Kiwi.PNG]

