[[Category:ECE302Fall2008_ProfSanghavi]]
[[Category:probabilities]]
[[Category:ECE302]]
[[Category:cheat sheet]]
=[[ECE302]] Cheat Sheet number 3=
 
==Covariance==
 
* <math>COV(X,Y)=E[(X-E[X])(Y-E[Y])]\!</math>
 
* <math>COV(X,Y)=E[XY]-E[X]E[Y]\!</math>
 
X and Y are uncorrelated if <math>COV(X,Y) = 0\!</math>.

If X and Y are independent, then they are uncorrelated. The converse is not always true.

If either X or Y is multiplied by a large constant, the magnitude of the covariance also increases. A measure of correlation on an absolute scale is the correlation coefficient.
  
 
==Correlation Coefficient==
 
<math>\rho(X,Y)= \frac {cov(X,Y)}{\sqrt{var(X)} \sqrt{var(Y)}} \,</math>
 
<math>  = \frac{E[XY]-E[X]E[Y]}{\sqrt{var(X)} \sqrt{var(Y)}} </math>

<math>-1 \leq \rho(X,Y) \leq 1\!</math>

If <math>\rho(X,Y) = 0\!</math>, then X and Y are uncorrelated.

If <math>\rho(X,Y) = \pm 1\!</math>, then X and Y are perfectly (linearly) correlated.
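
As a quick numeric check of these formulas, here is a minimal Python sketch (all sample data below is made up for illustration):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: Y is a noisy linear function of X, so X and Y
# should be strongly (but not perfectly) correlated.
x = rng.normal(size=100_000)
y = 2.0 * x + rng.normal(scale=0.5, size=100_000)

# COV(X,Y) = E[XY] - E[X]E[Y], estimated from samples
cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)

# rho(X,Y) = cov(X,Y) / (sqrt(var(X)) sqrt(var(Y)))
rho = cov_xy / (np.sqrt(np.var(x)) * np.sqrt(np.var(y)))

print(cov_xy)  # close to 2.0, since cov(X, 2X + noise) = 2 var(X)
print(rho)     # close to +1, and always inside [-1, 1]
</syntaxhighlight>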
  
 
==Markov Inequality==
 
Loosely speaking: if a nonnegative RV has a small mean, then the probability that it takes a large value must also be small.
 
* <math>P(X \geq a) \leq E[X]/a\!</math>   
 
for all a > 0
 
EXAMPLE:

On average it takes 1 hour to catch a fish. What is an upper bound on the probability that it will take at least 3 hours?

SOLUTION:

Using Markov's inequality, where <math>E[X]\!</math> = 1 and a = 3:

<math> P(X \geq 3) \leq \frac {E[X]}{3} = \frac{1}{3}</math>

so 1/3 is an upper bound on the probability that it will take at least 3 hours to catch a fish.
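
To see how loose the bound can be, here is a minimal simulation sketch. Markov's inequality only needs the mean, but a simulation needs a full distribution, so the exponential below is an assumption, not part of the problem:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# ASSUMPTION: catch time ~ exponential with mean 1 hour.
# (The problem only states E[X] = 1; the distribution is made up.)
times = rng.exponential(scale=1.0, size=1_000_000)

empirical = np.mean(times >= 3)  # actual P(X >= 3) for this choice
markov_bound = 1.0 / 3           # E[X]/a with a = 3

print(empirical)     # about exp(-3) ~ 0.05
print(markov_bound)  # 0.333...: valid for ANY nonnegative X with mean 1
</syntaxhighlight>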
  
 
==Chebyshev Inequality==
 
"Any RV is likely to be close to its mean"
 
:<math>\Pr(\left|X-E[X]\right|\geq C)\leq\frac{var(X)}{C^2}.</math>
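
A minimal sketch checking the bound numerically; the distribution is an arbitrary choice, since Chebyshev only uses the mean and variance:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary distribution: exponential with E[X] = 2, var(X) = 4.
x = rng.exponential(scale=2.0, size=1_000_000)

for C in (2.0, 4.0, 8.0):
    empirical = np.mean(np.abs(x - x.mean()) >= C)
    bound = x.var() / C**2
    print(C, empirical, "<=", bound)  # the bound always holds
</syntaxhighlight>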
 
==Weak Law of Large Numbers==
The weak law of large numbers states that the sample average converges in probability towards the expected value:
:<math>\overline{X}_n \, \xrightarrow{P} \, \mu \qquad\textrm{for}\qquad n \to \infty.</math>

Writing the sample mean as <math>M_n = \frac{X_1 + \cdots + X_n}{n}</math>:

<math>E[M_n] = \frac{nE[X]}{n} = E[X]</math>

<math>Var(M_n) = Var\left(\tfrac{X_1}{n}\right) + \cdots + Var\left(\tfrac{X_n}{n}\right) = \frac{Var(X)}{n}</math>

so by Chebyshev, <math>\Pr\left[\left|M_n - E[X]\right| \geq \epsilon\right] \leq \frac{Var(M_n)}{\epsilon^2} = \frac{Var(X)}{n\epsilon^2} \to 0</math> as <math>n \to \infty</math>.
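
A minimal simulation of this convergence (fair coin flips are an arbitrary example; E[X] = 0.5, var(X) = 0.25):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

flips = rng.integers(0, 2, size=100_000)  # fair coin: E[X] = 0.5

for n in (10, 1_000, 100_000):
    m_n = flips[:n].mean()          # M_n = (X_1 + ... + X_n)/n
    cheby = 0.25 / (n * 0.05**2)    # bound on Pr[|M_n - 0.5| >= 0.05]
    print(n, m_n, min(cheby, 1.0))  # M_n -> 0.5 as the bound -> 0
</syntaxhighlight>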
  
 
==ML Estimation Rule==
 
<math>\hat a_{ML} = \text{argmax}_a ( f_{X}(x_i;a))</math> continuous
  
 
<math>\hat a_{ML} = \text{argmax}_a ( Pr(x_i;a))</math> discrete
 
If X is binomial(n, p), where X is the number of heads in n tosses, then for any observed value k:

<math>\hat p_{ML}(k) = k/n</math>

If X is exponential with rate <math>\lambda</math>, then the ML estimate of <math>\lambda</math> is:

<math>\hat \lambda_{ML} = \frac{1}{ \overline{X}}</math>
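
A minimal sketch checking both formulas on simulated data (the true parameter values below are made up):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Binomial(n, p): observe k heads in n tosses, estimate p by k/n.
n, p_true = 1_000, 0.3
k = rng.binomial(n, p_true)
print(k / n)           # p_hat_ML, close to 0.3

# Exponential(lambda): the ML estimate of the rate is 1/(sample mean).
lam_true = 2.0
x = rng.exponential(scale=1.0 / lam_true, size=10_000)
print(1.0 / x.mean())  # lambda_hat_ML, close to 2.0
</syntaxhighlight>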
  
 
==MAP Estimation Rule==
 
<math>\hat \theta_{MAP} = \text{argmax}_\theta ( f_{\theta|X}(\theta|x))</math>

Expanding with Bayes' rule and dropping the denominator <math>f_X(x)</math>, which does not depend on <math>\theta</math>, this is equivalent to:

<math>\hat \theta_{MAP} = \text{argmax}_\theta ( f_{X|\theta}(x|\theta)f_{\theta}(\theta))</math>

For the discrete case:

<math>\hat \theta_{MAP} = \text{argmax}_\theta ( P_{X|\theta}(x|\theta)P_{\theta}(\theta))</math>
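
A minimal sketch of the discrete rule, maximizing <math>P_{X|\theta}(x|\theta)P_{\theta}(\theta)</math> over a small grid; the coin-bias scenario and the prior values are made up for illustration:

<syntaxhighlight lang="python">
import numpy as np
from math import comb

# Made-up scenario: estimate a coin's bias theta after seeing
# k = 7 heads in n = 10 tosses, with a prior that favors a fair coin.
thetas = np.array([0.25, 0.50, 0.75])
prior = np.array([0.25, 0.50, 0.25])  # P(theta), assumed values

k, n = 7, 10
likelihood = comb(n, k) * thetas**k * (1 - thetas)**(n - k)

# argmax of the posterior numerator P(x|theta) P(theta)
theta_map = thetas[np.argmax(likelihood * prior)]
print(theta_map)  # 0.75: here the data outweighs the mild prior
</syntaxhighlight>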
  
 
==Bias of an Estimator, and Unbiased estimators==
 
An estimator is unbiased if:
<math>E[\hat a_{ML}] = a</math> for all values of a

'''A biased estimator can be made unbiased''': Let X be uniform on (0, a). From a single observation, the ML estimate is <math>\hat a_{ML} = X</math>, so <math>E[\hat a_{ML}] = E[X] = a/2 \neq a</math>, which makes it biased. Rescaling to <math>\hat a = 2X</math> gives <math>E[\hat a] = 2E[X] = a</math>, which makes it unbiased.
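
A minimal simulation of this debiasing, using one observation per trial as in the example (the true a below is an arbitrary choice):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

a_true = 4.0
x = rng.uniform(0.0, a_true, size=1_000_000)  # one observation per trial

print(x.mean())        # E[X] ~ a/2 = 2.0: the ML estimate X is biased
print((2 * x).mean())  # E[2X] ~ a = 4.0: the rescaled estimate is unbiased
</syntaxhighlight>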
  
 
==Confidence Intervals, and how to get them via Chebyshev==
 
<math> \theta \text{ is unknown and fixed}</math>

<math>\hat \theta \text{ is random and should be close to } \theta \text{ most of the time}</math>

If <math>\Pr\left[\,\left|\hat \theta - \theta\right| \leq E\,\right] \geq (1-a)</math>, then we say we have <math>(1-a)</math> confidence in the interval <math>[\hat \theta - E, \hat \theta + E]</math>.

Confidence level of <math>(1-a)</math> if <math>\Pr[\hat \theta - \delta < \theta < \hat \theta + \delta] \geq (1-a)</math> for all <math>\theta</math>.
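
A minimal sketch of getting such an interval from Chebyshev: with <math>\hat \theta = M_n</math>, Chebyshev gives <math>\Pr[|M_n - \theta| \geq \delta] \leq \frac{Var(X)}{n\delta^2}</math>, so setting the right side equal to a and solving for <math>\delta</math> yields a (1-a) interval. The distribution below is an arbitrary choice:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary example: X ~ uniform(0, 2), so theta = E[X] = 1, var(X) = 1/3.
n, var_x, theta = 1_000, 1 / 3, 1.0
a = 0.05  # target: 95% confidence

# Solve var(X)/(n delta^2) = a for the half-width delta.
delta = np.sqrt(var_x / (n * a))

trials, covered = 10_000, 0
for _ in range(trials):
    m_n = rng.uniform(0.0, 2.0, size=n).mean()
    covered += (m_n - delta < theta < m_n + delta)

print(covered / trials)  # >= 0.95; Chebyshev intervals are conservative
</syntaxhighlight>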
==Definition of the term Unbiased estimators==
The ML estimator is said to be UNBIASED if its expected value is the true value for all true values.
(i.e., an estimate <math>\hat \theta</math> for parameter <math>\theta</math> is said to be unbiased if <math>E(\hat \theta) = \theta</math> for all <math>\theta</math>)

==A few equations from previous material==

::If X and Y are independent (only the <math>E[XY]\!</math> formula below needs independence):
*<math>E[X] = \int^\infty_{-\infty}x f_X(x)dx\!</math>
*<math>E[XY] = E[X]E[Y]\!</math>
*<math>Var(X) = E[X^2] - (E[X])^2\!</math>

Marginal PDF
*<math>E[g(X)]=\int^\infty_{-\infty} g(x)f_X(x) dx</math>
*<math>f_X(x) = \int^\infty_{-\infty} f_{XY}(x,y) dy</math>
*<math>f_Y(y) = \int^\infty_{-\infty} f_{XY}(x,y) dx</math>
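
As a quick numeric check of the marginalization formula, here is a sketch that integrates out y from a made-up joint density f(x,y) = x + y on the unit square, whose exact marginal is f_X(x) = x + 1/2:

<syntaxhighlight lang="python">
import numpy as np

# Made-up joint density on the unit square: f(x,y) = x + y.
def f_xy(x, y):
    return x + y

# f_X(x) = integral over y of f(x,y), done with a simple Riemann sum.
y = np.linspace(0.0, 1.0, 100_001)
dy = y[1] - y[0]
for x in (0.2, 0.5, 0.8):
    f_x = np.sum(f_xy(x, y[:-1])) * dy  # left Riemann sum over [0, 1)
    print(x, f_x)                       # close to x + 0.5
</syntaxhighlight>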
----
[[Main_Page_ECE302Fall2008sanghavi|Back to ECE302 Fall 2008 Prof. Sanghavi]]
