Revision as of 05:56, 3 April 2008 by Pclough (Talk)

Course Topics

This page and its subtopics discuss parametric estimators.

Lectures discussing Parametric Estimators: Lecture 7_OldKiwi and Lecture 8_OldKiwi


MORE ON THE MLE: Issues related to the properties and computational efficiency of the Maximum Likelihood Estimator

The maximum likelihood estimator (MLE) is probably the most important parameter estimator in classical statistics. The reason is that, under standard regularity conditions, the MLE is asymptotically efficient: as the data sample grows, the estimator becomes unbiased and its variance approaches the smallest value attainable by any unbiased estimator. Furthermore, if $ \hat \theta $ is the MLE of the parameter $ \theta $ computed from $ n $ independent, identically distributed samples, then $ \sqrt{n}({\hat \theta}-\theta) $ converges in distribution to $ \mathcal{N}(0,v(\theta)) $, where $ v(\theta) $ is the Cramer-Rao bound for a single observation (http://en.wikipedia.org/wiki/Cram%C3%A9r-Rao_inequality).
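The limiting distribution can be checked numerically. The sketch below is an illustration only and is not taken from the lectures: it assumes an exponential model with rate $ \theta $, for which the MLE is the reciprocal of the sample mean and $ v(\theta) = \theta^2 $. The function names are hypothetical.

```python
import math
import random
import statistics


def exponential_rate_mle(samples):
    """MLE of the rate of an exponential distribution: 1 / sample mean."""
    return len(samples) / sum(samples)


def scaled_mle_errors(true_rate, n, replications, seed=0):
    """Return sqrt(n) * (rate_hat - true_rate) for many independent data sets."""
    rng = random.Random(seed)
    errors = []
    for _ in range(replications):
        samples = [rng.expovariate(true_rate) for _ in range(n)]
        rate_hat = exponential_rate_mle(samples)
        errors.append(math.sqrt(n) * (rate_hat - true_rate))
    return errors


if __name__ == "__main__":
    true_rate = 2.0  # assumed value, chosen only for the demonstration
    errors = scaled_mle_errors(true_rate, n=500, replications=2000)
    # The per-observation Fisher information is 1 / rate**2,
    # so the limiting variance v(rate) is rate**2.
    print("empirical mean:    ", statistics.fmean(errors))
    print("empirical variance:", statistics.variance(errors))
    print("v(theta) = rate^2: ", true_rate ** 2)
```

For large $ n $ the empirical variance should be close to $ \theta^2 $ and the empirical mean close to zero. For small $ n $ the agreement is visibly worse, which is the practical meaning of "asymptotically".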

But what is an efficient estimator? An estimator $ {\hat \theta} $ is efficient if:

  1. $ \hat \theta $ is an unbiased estimator.
  2. $ \hat \theta $ achieves the Cramer-Rao Lower Bound (CRLB).


The CRLB is a lower bound on the variance of any unbiased estimator of a parameter. An unbiased estimator that achieves the CRLB therefore has the smallest possible variance and is the Minimum Variance Unbiased Estimator (MVUE). The converse need not hold: an MVUE can exist without achieving the CRLB.

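So the MLE is an important estimator because:

  1. If an efficient estimator (an unbiased estimator that achieves the CRLB) exists, the MLE procedure will find it.
  2. If no efficient estimator exists, the MLE still achieves the CRLB asymptotically, that is, as the sample size grows.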
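As a concrete illustration (an example chosen here, not taken from the lectures), consider $ n $ independent samples $ x_1, \dots, x_n $ from $ \mathcal{N}(\mu,\sigma^2) $ with $ \sigma^2 $ known. The derivation below finds the MLE of $ \mu $ and compares its variance with the CRLB.

```latex
\begin{align}
\ln p(x;\mu) &= -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2 \\
\frac{\partial \ln p(x;\mu)}{\partial \mu} &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\mu) = 0
  \;\Rightarrow\; \hat\mu = \frac{1}{n}\sum_{i=1}^{n} x_i \\
I(\mu) &= -E\!\left[\frac{\partial^2 \ln p(x;\mu)}{\partial \mu^2}\right] = \frac{n}{\sigma^2}
  \;\Rightarrow\; \mathrm{CRLB} = \frac{1}{I(\mu)} = \frac{\sigma^2}{n} \\
E[\hat\mu] &= \mu, \qquad \mathrm{Var}(\hat\mu) = \frac{\sigma^2}{n} = \mathrm{CRLB}
\end{align}
```

In this model the MLE is the sample mean. It is unbiased and meets the bound for every $ n $, so it is efficient without any appeal to large samples.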

Therefore, if the pdf of the model is known, the MLE is often a good candidate estimator, since it can be computed (although this might not be an easy task) and it is "optimal" for a large enough data set (although how large is large enough is not always easily answered). The MLE does have some disadvantages in practice. First, it is not the best method for small data sets and can give highly erroneous results in some cases. Second, the computation can be extremely difficult and often has to be carried out with a numerical method such as:

1. Brute Force Method (i.e., evaluate the likelihood on a very fine grid of parameter values and take the maximum). Although this can be done, it is computationally very inefficient.

2. Iterative Methods (e.g., Newton-Raphson, which is not guaranteed to converge; a good initial guess is needed).

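3. Scoring Method (Newton-Raphson with the second derivative of the log-likelihood replaced by its expected value, i.e., minus the Fisher information).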

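4. Expectation-Maximization (EM): never decreases the likelihood from one iteration to the next and, under mild conditions, converges to a stationary point of the likelihood, typically a local maximum. A good method for complicated vector-parameter cases.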
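The first three methods in the list can be compared on a small problem. The following is a minimal sketch, assuming a Cauchy location model with unit scale (chosen here only because its MLE has no closed form and its Fisher information is known to be $ n/2 $). All function names, sample sizes, and tolerances are hypothetical choices for illustration.

```python
import math
import random
import statistics


def sample_cauchy(theta, n, seed=0):
    """Draw n Cauchy(location=theta, scale=1) samples by inverse transform."""
    rng = random.Random(seed)
    return [theta + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


def log_likelihood(theta, xs):
    """Cauchy log-likelihood in theta, up to an additive constant."""
    return -sum(math.log(1.0 + (x - theta) ** 2) for x in xs)


def score(theta, xs):
    """First derivative of the log-likelihood with respect to theta."""
    return sum(2.0 * (x - theta) / (1.0 + (x - theta) ** 2) for x in xs)


def second_derivative(theta, xs):
    """Second derivative of the log-likelihood with respect to theta."""
    return sum(2.0 * ((x - theta) ** 2 - 1.0) / (1.0 + (x - theta) ** 2) ** 2
               for x in xs)


def grid_search_mle(xs, low, high, step):
    """Brute force: evaluate the log-likelihood on a grid, keep the best point."""
    n_points = int(round((high - low) / step)) + 1
    grid = [low + i * step for i in range(n_points)]
    return max(grid, key=lambda theta: log_likelihood(theta, xs))


def newton_raphson_mle(xs, theta0, tol=1e-10, max_iter=100):
    """Newton-Raphson: theta <- theta - score / second derivative."""
    theta = theta0
    for _ in range(max_iter):
        curvature = second_derivative(theta, xs)
        if curvature == 0.0:
            return theta, False
        step = -score(theta, xs) / curvature
        theta += step
        if abs(step) < tol:
            return theta, True
    return theta, False  # no convergence from this starting point


def fisher_scoring_mle(xs, theta0, tol=1e-10, max_iter=500):
    """Scoring: use the Fisher information in place of the observed curvature."""
    fisher_information = len(xs) / 2.0  # exact for the unit-scale Cauchy model
    theta = theta0
    for _ in range(max_iter):
        step = score(theta, xs) / fisher_information
        theta += step
        if abs(step) < tol:
            return theta, True
    return theta, False


if __name__ == "__main__":
    xs = sample_cauchy(theta=1.0, n=200, seed=0)
    start = statistics.median(xs)  # a robust initial guess
    print("grid search:     ", grid_search_mle(xs, start - 2.0, start + 2.0, 0.01))
    print("Newton (median): ", newton_raphson_mle(xs, start))
    print("Newton (far off):", newton_raphson_mle(xs, start + 3.0, max_iter=50))
    print("Fisher scoring:  ", fisher_scoring_mle(xs, start))
```

Each function returns its estimate, and the two iterative ones also return a flag saying whether the stopping tolerance was reached. Starting Newton-Raphson far from the maximum may fail to converge, which is why the flag is worth checking. The grid search needs one likelihood evaluation per grid point, so its cost grows quickly with the resolution and with the number of parameters.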
