ECE600 F13 Stochastic Convergence mhossain - Rhea

Topic 18: Stochastic Convergence

Stochastic Convergence

We will now consider infinite sequences of random variables. We will discuss what it means for such a sequence to converge. This will lead to some very important results: the laws of large numbers and the Central Limit Theorem.

Consider a sequence X $$ _1 $$ ,X $$ _2 $$ ,..., where each X $$ _i $$ is a random variable on (S,F,P). We will call this a random sequence (or a discrete-time random process).

Notation $\qquad$ X $$ _n $$ may refer to either the sequence itself or to the nth element in the sequence. We may also use {X $$ _n $$ } to denote the sequence, or X $$ _n $$ , n ≥ 1.

The sequence X $$ _n $$ maps S to the set of all sequences of real numbers, so for a fixed S, X $_1(\omega)$ ,X $_2(\omega)$ ,... is a sequence of real numbers.

Fig 1: The mapping from the sample space to the reals under X

$ _j $

.

Before looking at convergence, recall the meaning of convergence or a sequence of real numbers.

Definition $\qquad$ A sequence of real numbers x $$ _1 $$ ,x $$ _2 $$ ,... converges to a number x ∈ R if ∀ $\epsilon$ > 0, ∃ an n $_{\epsilon}$ ∈ N such that

|x_n-x|<\epsilon\qquad\forall n\geq n_{\epsilon}

If there is such an x ∈ R, we say

\lim_{n\rightarrow\infty}x_n=x

or

x_n\rightarrow\infty\;\mbox{as}\;n\rightarrow\infty

For a random sequence X $$ _n $$ , the issue of convergence is more complicated since X $$ _n $$ is a function of $\omega$ ∈ S.

First look at a motivating example.

Example $\qquad$ Let X $$ _k $$ = s + W $$ _k $$ , where s ∈ R and W $$ _k $$ is a random sequence with E[W $$ _k $$ ] = 0 ∀k = 1,2,.... W $$ _k $$ can be viewed as a noise sequence if we want to know the value of s.

Let

Y_n=\frac{1}{n}\sum_{k=1}^nX_k

Then, E[Y $$ _n $$ ] = s ∀n. But Y $$ _n $$ is a random variable, so we cannot expect Y $$ _n $$ = s ∀n. However, we intuitively expect Y $$ _n $$ to be a better estimate of s as n increases. Does Y $$ _n $$ → s as n → ∞ ? If so, in what sense?

Types of Convergence

Since X $_n(\omega)$ is generally a different sequence for very $\omega$ ∈ S, what does it mean for X $$ _n $$ to converge? We will discuss different ways in which X $$ _n $$ can converge.

Definition $\qquad$ X $$ _n $$ converges everywhere if the sequence X $_1(\omega)$ ,X $_2(\omega)$ ,... converges to some value X $(\omega)$ ∀ $\omega$ ∈ S. We also call this sure convergence or convergence surely.

Fig 2: Convergence for all

\omega

in S.

Notation

X_n\overset{e}{\rightarrow}X

Definition $\qquad$ X $$ _n $$ converges almost everywhere or almost surely, if X $_n(\omega)$ → X $(\omega)$ for some A ∈ S with P(A) = 1. Also known as convergence with probability 1.

Notation

X_n\overset{a.e.}{\rightarrow}X,\quad X_n\overset{a.s.}{\rightarrow}X,\quad X_n\overset{w.p.1}{\rightarrow}X

Definition $\qquad$ X $$ _n $$ converges in mean square to X if

E\left[|X_n-X|^2\right]\rightarrow 0

as

n\rightarrow\infty

Notation

X_n\overset{ms}{\rightarrow}X

Definition X $$ _n $$ converges in probability to X if ∀ $\epsilon$ > 0

P(|X_n-X|>\epsilon)\rightarrow 0

as

n\rightarrow\infty

Note that P(|X $$ _n $$ - X| > $\epsilon$ ) is a sequence of real numbers.

Notation

X_n\overset{p}{\rightarrow}X

Definition $\qquad$ X $$ _n $$ converges in distribution to X if

F_{X_n}(x)\rightarrow F_X(x)

as

n\rightarrow\infty

Note that for a fixed x ∈ R, F $$ _n $$ (x) is a sequence of real numbers.

Notation

X_n\overset{d}{\rightarrow}X

We will generally assume that if X $$ _n $$ converges in distribution, then

\frac{F_{X_n}(x)}{dx}\rightarrow\frac{F_X(x)}{dx}

or
$f_{X_n}(x)\rightarrow f_X(x)$
as

n\rightarrow\infty

although this is not true in every instance, as seen in the next example.

Example $\qquad$ Let X $$ _n $$ have pdf

f_{X_n}(x) = 1+\cos 2\pi nx\quad0\leq x\leq 1,\;\;n=1,2,...

Then

F_{X_n}(x) = \begin{cases} 0 & x<0 \\ x+\frac{\sin 2\pi nx}{2\pi n} & 0\leq x\leq 1 \\ 1 & x>1 \end{cases}

Now

F_{X_n}(x)\rightarrow F_X(x)

as
$n\rightarrow\infty$

\forall x\in\mathbb R

where

F_X(x)=\begin{cases} 0 & x<0 \\ x & 0\leq x\leq 1 \\ 1 & x>1 \end{cases}

However, f $_{Xn}$ does not converge for x ∈ (0,1).

Cauchy Criterion for Convergence

We will now discuss a method for showing that X $$ _n $$ → X in the mean square sense for some random variables X without knowing what X is. Frist, consider a sequence of real numbers.

Definition If x $$ _n $$ is a sequence of real numbers and

|x_{m+n}-x_n|\rightarrow 0

as

n\rightarrow \infty \quad \forall m>0

then x $$ _n $$ converges iff it is a Cauchy sequence. This is known as the Cauchy criterion for convergence.

The Cauchy Criterion for mean square convergence of X $$ _n $$ $\qquad$ It can be shown that a random sequence X $$ _n $$ converges in the mean square sense iff

E[|X_{n+m}-X_n|^2]\rightarrow 0\quad\forall m>0

as

n\rightarrow\infty

It can be shown that the following relationships between types of convergences hold:

Fig 3: Relationships implied by the different types of convergence. Note that the box represents all possible random sequences

Some Important Random Sequences

First, we need the Chebyshev Inequality:
Let X be a random variable with mean $\mu$ and variance $\sigma^2$ . Then,

P(|X_\mu|\geq\epsilon)\leq\frac{\sigma^2}{\epsilon^2}\quad\forall\epsilon>0

and

g_2(x) = \frac{(x-\mu)^2}{\epsilon^2}

Proof $\quad$ let

g_1(x) = I_{\{x\in\mathbb R:|x-\mu|\geq\epsilon\}}(x)

for a fixed ε > 0. Then,

\begin{align} E[g_1(x)]&=\int_{-\infty}^{\infty}g_1(x)f_X(x)dx \\ &=P(|X-\mu|\geq\epsilon) \end{align}

and

E[g_2(X)]=E\left[\frac{(X-\mu)^2}{\epsilon^2}\right]=\frac{\sigma^2}{\epsilon^2}

Now consider $\phi$ (x) = g $$ _2 $$ (x) - g $$ _1 $$ (x)

Fig 4: g

$ _1 $

and g

$ _2 $

as functions of x. Note that g

$ _2 $

(x) ≥ g

$ _1 $

(x) for all x.

We have that $\phi$ (x) ≥ 0 ∀x ∈ R. So,

0\leq E[\phi(X)]=E[g_2(X)]-E[g_1(X)]=\frac{\sigma^2}{\epsilon^2}-P(|X-\mu|\geq\epsilon)

\Rightarrow P(|X_\mu|\geq\epsilon)\leq\frac{\sigma^2}{\epsilon^2}

The Laws of Large Numbers

Recall that the previous example where

X_k=s+W_k \

where s ∈ R and E[W $$ _k $$ ] = 0. Let

Y_n=\frac{1}{n}\sum_{k=1}^n X_k

Does Y $$ _n $$ converge to s? If so, in what sense? The laws of large numbers address this question.

Theorem $\qquad$ The Weak Law of Large Numbers
Let X $$ _n $$ be a sequence of iid random variables with mean $\mu$ and variance $\sigma^2$ . Let

Y_n=\frac{1}{n}\sum_{k=1}^n X_k

Y $$ _n $$ is the sample mean.
Then, for any $\epsilon$ < 0,

P(|Y_n-\mu|\geq\epsilon)\rightarrow0

as

n\rightarrow\infty

i.e., Y $$ _n $$ → $\mu$ in probability.

Proof

\begin{align} E[Y_n]&=\mu \\ \mbox{Var}(Y_n)&=\frac{\sigma^2}{n} \end{align}

Using the Chebyshev Inequality, we have that

P(|Y_n-\mu|\geq\epsilon)\leq\frac{\mbox{Var}(Y_n)}{\epsilon^2}=\frac{\sigma^2}{\epsilon^2n}\;\overset{n\rightarrow\infty}{\longrightarrow}\;0

Theorem $\qquad$ Strong Law of Large Numbers
Let X $$ _n $$ be a sequence of iid random variables with mean $\mu$ and variance $\sigma^2$ . Then

Y_n=\frac{1}{n}\sum_{k=1}^nX_k\overset{a.e.}{\longrightarrow}\mu

as

n\rightarrow\infty

The proof is omitted as it is beyond the scope of this class. Nevertheless, note that the strong law of large numbers implies the weak law of large numbers.

So for example

X_k=s+W_k \

if the W $$ _k $$ 's are independent and have the same variance $\sigma^2$ , then the sample mean Y $$ _n $$ converges to s with probability 1.

But the laws of large numbers tell us something more fundamental about our axiomatic-based probability measure. Consider a random variable X and a set A ⊂ B(R). Let X $$ _1 $$ ,X $$ _2 $$ ,... be an iid sequence of random variables with the same distribution as X. Now let

Y_k=\begin{cases} 1 & \mbox{if} \;X_k\in A \\ 0 & \mbox{if} \;X_k\notin A \end{cases}

Then

E[Y_k] = P(X_k\in A) = P(X\in A)

The Y $$ _k $$ 's are iid with with mean P(X ∈ A) and the relative frequency of the even {X ∈ A} is

\frac{1}{n}\sum_{k=1}^nY_k

The strong law pf large number tells us that

\frac{1}{n}\sum_{k=1}^nY_k\rightarrow P(X\in A)

with probability 1 as

n\rightarrow\infty

So with probability 1, the relative frequency approach to probability converges to the axiom-based P(X ∈ A).

The Central Limit Theorem

Let X $$ _n $$ be a sequence of iid random variables with finite mean $\mu$ and variance $\sigma^2$ . Then if

Z_n = \frac{\sum_{j=1}^nX_j-n\mu}{\sigma\sqrt{n}}

then,

Z_n\overset{d}{\longrightarrow}Z

as

n\rightarrow\infty

where Z is N(0,1).

i.e.

F_{Z_n}(z)\rightarrow\Phi(z)=\int_{-\infty}^z\frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}dx

Proof $\qquad$ We will use the following Lemma

Lemma $\qquad$ Let Z $$ _n $$ be a sequence of random variables with cdfs F $_{Zn}$ and mgfs $\Phi_{Zn}$ (s), and let Z be a random variable with cdf F $$ _Z $$ and mgf $\Phi_Z$ . Then if $\Phi_{Zn}$ (s) → $\Phi_Z$ ∀s ∈ C, then F $_{Zn}$ (z) → F $$ _Z $$ (z) ∀z ∈ R.

So it is sufficient to show that

\Phi_{Z_n}(s)\rightarrow e^{\frac{s^2}{2}}

Consider

\Phi_X\left(\frac{s}{\sqrt n}\right)=E\left[e^{\frac{sX_k}{\sqrt n}}\right]\quad\forall k

Now considering the case $\mu=0$ , $\sigma^2=1$ , we have

\begin{align} \Phi_{Z_n}(s) &= E\left[e^{s\sum_{k=1}^n\frac{X_k}{\sqrt n}}\right] \\ &=E \left[\prod_{k=1}^ne^{s\frac{X_k}{\sqrt n}}\right] \\ &=\prod_{k=1}^nE\left[e^{s\frac{X_k}{\sqrt n}}\right] \\ &=\prod_{k=1}^n\Phi_X\left(\frac{s}{\sqrt n}\right) \\ &=\left(\Phi_X\left(\frac{s}{\sqrt n}\right)\right)^n \end{align}

Let L(s) = log $$ |phi_X $$ (s). Then, to show that

\Phi_{Z_n}(s)\rightarrow e^{\frac{s^2}{2}}

it is sufficient to show that

nL\left(\frac{s}{\sqrt n}\right)=\log\left[\left(\Phi_X\left(\frac{s}{\sqrt n}\right)\right)^n\right]\rightarrow\frac{s^2}{2}

Will need

\begin{align} & L(0)=0 \\ & L'(0)=\mu=0 \\ & L''(0)=E\left[X^2\right]=1 \\ \end{align}

Now

\begin{align} \lim_{n\rightarrow\infty} nL\left(\frac{s}{\sqrt n}\right)&=\lim_{n\rightarrow\infty}\frac{L\left(\frac{s}{\sqrt n}\right)}{n^{-1}} \\ &= \lim_{n\rightarrow\infty} \frac{-L'\left(\frac{s}{\sqrt n}\right)n^{-\frac{3}{2}}s}{-2n^{-2}} \quad (\mbox{L}'\mbox{Hopital}) \\ &=\lim_{n\rightarrow\infty} \frac{L'\left(\frac{s}{\sqrt n}\right)s}{2n^{-\frac{1}{2}}} \\ &= \lim_{n\rightarrow\infty} \frac{-L''\left(\frac{s}{\sqrt n}\right)n^{-\frac{3}{2}}s^2}{-2n^{-\frac{3}{2}}} \quad (\mbox{L}'\mbox{Hopital again}) \\ &=\lim_{n\rightarrow\infty} \frac{L''\left(\frac{s}{\sqrt n}\right)s^2}{2} \\ &=\frac{s^2}{2} \end{align}

For general $\mu$ , $\sigma^2$ , use the result for $\mu$ = 0, $\sigma^2$ = 1 with

Y_k = \frac{X_k-\mu}{\sigma}

References

M. Comer. ECE 600. Class Lecture. Random Variables and Signals. Faculty of Electrical Engineering, Purdue University. Fall 2013.

Questions and comments

If you have any questions, comments, etc. please post them on this page

Back to all ECE 600 notes

ECE600 F13 Stochastic Convergence mhossain - Rhea

Contents

Stochastic Convergence

Types of Convergence

Cauchy Criterion for Convergence

Some Important Random Sequences

The Laws of Large Numbers

The Central Limit Theorem

References

Questions and comments

Alumni Liaison