# Proof of Fisher Information

The variance of a random variable Y can also be written as the difference between the expected value of the square of Y and the square of the expected value of Y. This identity is proven below:

$ \begin{align} \operatorname{Var}(Y) &= E[(Y-E[Y])^2]\\ &= E[Y^2-2YE[Y]+(E[Y])^2]\\ &= E[Y^2]-2(E[Y])^2+(E[Y])^2\\ &= E[Y^2] - (E[Y])^2 \end{align} $

*Note: This identity is used so often in statistics that I give the derivation above without further commentary. If you would like to know more, check our "More Sources" tab.*
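
As a quick sanity check on the variance identity, here is a minimal sketch that computes both sides exactly. The fair six-sided die and the helper name `expectation` are my own assumptions for illustration, not part of the proof.

```python
from fractions import Fraction

# Hypothetical example: Y is a fair six-sided die, so each outcome has probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)

def expectation(g):
    """E[g(Y)] for the discrete distribution above."""
    return sum(g(y) * prob for y in outcomes)

mean = expectation(lambda y: y)
var_definition = expectation(lambda y: (y - mean) ** 2)   # E[(Y - E[Y])^2]
var_identity = expectation(lambda y: y ** 2) - mean ** 2  # E[Y^2] - (E[Y])^2

print(var_definition, var_identity)
```

Both expressions come out to the same exact fraction, 35/12, because `Fraction` avoids floating-point rounding.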

Applying the variance identity above to the score function, we can write:

$ I(θ) = \operatorname{Var}(s(θ;X)) = E[(s(θ;X))^2] - (E[s(θ;X)])^2 $

As you can see, we already have the $ E[(s(θ;X))^2] $ term, which is part of our definition of Fisher information. From here, we can show by integration that $ (E[s(θ;X)])^2 $ equals 0, which reduces the expression to that definition.

Recall that the score function is the gradient with respect to θ of the natural log of the likelihood function of θ given the data X. It is denoted as follows:

$ s(θ;X) = \nabla [\ln(L(θ,X))] $

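From here, we take the expected value of the score, using the fact that the likelihood is the density evaluated at the data, $ L(θ,x) = f(x;θ) $:

$ \begin{align} E[s(θ;X)] &= E[\nabla [\ln(L(θ,X))]]\\ &=\int \nabla [\ln(L(θ,x))] f(x;θ)dx\\ &=\int \frac{1}{f(x;θ)}\nabla [f(x;θ)]f(x;θ)dx\\ &=\int \nabla [f(x;θ)]dx\\ &=\nabla \left[\int f(x;θ)dx\right]\\ &=\nabla[1]\\ &=0 \end{align} $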

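*Note: $ \int f(x;θ)dx $ equals 1 because a probability density function always integrates to 1 over its whole domain. Moving the gradient outside the integral assumes the density is regular enough for differentiation and integration to be interchanged.*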
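
To see the zero-mean property of the score numerically, here is a rough sketch under an assumed model: X is normal with unknown mean and known standard deviation. The values of `MU` and `SIGMA` and the midpoint-rule helper `integrate` are illustrative choices of mine, not part of the proof.

```python
import math

# Assumed model for illustration: X ~ Normal(mu, sigma^2) with sigma known,
# so theta = mu and the score is d/dmu ln f(x; mu) = (x - mu) / sigma^2.
MU, SIGMA = 1.5, 2.0

def density(x, mu=MU, sigma=SIGMA):
    """Normal probability density f(x; mu)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def score(x, mu=MU, sigma=SIGMA):
    """Score function s(mu; x) for the normal model above."""
    return (x - mu) / sigma ** 2

def integrate(g, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

# The tails beyond 10 standard deviations contribute a negligible amount.
lo, hi = MU - 10 * SIGMA, MU + 10 * SIGMA
total_probability = integrate(density, lo, hi)
expected_score = integrate(lambda x: score(x) * density(x), lo, hi)

print(f"integral of f  ~ {total_probability:.6f}")
print(f"E[s(theta; X)] ~ {expected_score:.6f}")
```

The first number should be very close to 1 and the second very close to 0, up to numerical integration error.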

Coming back to our initial identity, $ I(θ) = E[(s(θ;X))^2] - (E[s(θ;X)])^2 $, we plug 0 into the second term and get:

$ \begin{align} I(θ) &= E[(s(θ;X))^2] - (E[s(θ;X)])^2\\ &=E[(s(θ;X))^2]-(0)^2\\ &= E[(s(θ;X))^2] \end{align} $
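
The final result can be checked exactly on a small example. The sketch below assumes a Bernoulli model with p = 3/10, which is my own choice for illustration; because X is discrete, a sum over the two outcomes stands in for the integral used in the proof.

```python
from fractions import Fraction

# Assumed model for illustration: X ~ Bernoulli(p), with pmf f(x; p) = p^x (1 - p)^(1 - x).
# The score is d/dp ln f(x; p) = x / p - (1 - x) / (1 - p).
p = Fraction(3, 10)
pmf = {0: 1 - p, 1: p}

def score(x):
    """Score function s(p; x) for the Bernoulli model above."""
    return Fraction(x) / p - Fraction(1 - x) / (1 - p)

def expectation(g):
    """E[g(X)]; a sum replaces the integral because X is discrete."""
    return sum(g(x) * prob for x, prob in pmf.items())

mean_score = expectation(score)                        # E[s]
second_moment = expectation(lambda x: score(x) ** 2)   # E[s^2]
variance = second_moment - mean_score ** 2             # Var(s)

print(mean_score, second_moment, variance, 1 / (p * (1 - p)))
```

This prints `0 100/21 100/21 100/21`: the score has mean 0, so its variance and its second moment coincide, and both match the known Bernoulli value 1/(p(1-p)).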
