Revision as of 12:13, 21 May 2014 by Rhea (Talk | contribs)


Back to all ECE 600 notes

The Comer Lectures on Random Variables and Signals

Slectures by Maliha Hossain


Topic 17: Random Vectors



Random Vectors

Definition $ \qquad $ Let X$ _1 $,..., X$ _n $ be n random variables on (S,F,P). The column vector

$ \underline{X} = [X_1,...,X_n]^T $

is a random vector (RV) on (S,F,P).

Fig 1: The mapping from the sample space to the event space under X$ _j $.


We can view X($ \omega $) as a point in R$ ^n $ ∀$ \omega $ ∈ S.
Much of what we need to work with random vectors we can get by a simple extension of what we have developed for n = 2.

For example:

  • The cumulative distribution function of X is
$ F_{\underline X}(\underline x) = P(X_1\leq x_1,...,X_n\leq x_n)\;\;\forall\underline x = [x_1,...,x_n]^T\in\mathbb R^n $
and the probability density function of X is
$ f_{\underline X}(\underline x) = \frac{\partial^nF_{\underline X}(\underline x)}{\partial x_1...\partial x_n} $
  • For any D ⊂ R$ ^n $ such that D ∈ B(R$ ^n $),
$ P(\underline X \in D) = \int_D f_{\underline X}(\underline x)d\underline x $
Note that B(R$ ^n $) is the $ \sigma $-field generated by the collection of all open n-dimensional hypercubes (more formally, k-cells) in R$ ^n $.
  • The formula for the joint pdf of two functions of two random variables can be extended to find the pdf of n functions of n random variables (see Papoulis).
  • The random variables X$ _1 $,..., X$ _n $ are statistically independent if the events {X$ _1 $ ∈ A$ _1 $},..., {X$ _n $ ∈ A$ _n $} are independent ∀A$ _1 $, ..., A$ _n $ ∈ B(R). An equivalent definition is that X$ _1 $,..., X$ _n $ are independent if
$ f_{\underline X}(\underline x)=\prod_{i=1}^nf_{X_i}(x_i)\;\;\forall\underline x\in\mathbb R^n $
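
As a rough numerical illustration of the joint cdf, the sketch below estimates $ F_{\underline X}(\underline x) $ from samples and compares it with the product of the marginal cdfs, which is what independence implies. The choice of two independent Uniform(0,1) components, the sample size, the seed, and the helper names `empirical_joint_cdf` and `draw_independent_uniforms` are all assumptions made for this example; they do not come from the lecture.

```python
import random

def empirical_joint_cdf(samples, x):
    """Fraction of sample vectors whose every component is <= the matching entry of x."""
    hits = sum(1 for s in samples if all(sj <= xj for sj, xj in zip(s, x)))
    return hits / len(samples)

def draw_independent_uniforms(m, n, seed=0):
    """Draw m vectors with n independent Uniform(0,1) components (illustrative choice)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n)] for _ in range(m)]

samples = draw_independent_uniforms(m=20000, n=2)
x = [0.3, 0.7]

estimate = empirical_joint_cdf(samples, x)
# For independent Uniform(0,1) components, F_X(x) = F_X1(x1) * F_X2(x2) = x1 * x2.
exact = x[0] * x[1]
print(estimate, exact)
```

The estimate should land close to the product, with the gap shrinking as the number of samples grows.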



Random Vectors: Moments

We will spend some time on moments of random vectors. We will be especially interested in pairwise covariances/correlations.
The correlation between X$ _j $ and X$ _k $ is denoted R$ _{jk} $, so

$ R_{jk} \equiv E[X_jX_k] $

and the covariance is C$ _{jk} $:

$ C_{jk}\equiv E[(X_j-\mu_{X_j})(X_k-\mu_{X_k})] $

For a random vector X, we define the correlation matrix RX as

$ R_{\underline X}=\begin{bmatrix} R_{11} & \cdots & R_{1n} \\ \vdots & & \vdots \\ R_{n1} & \cdots & R_{nn} \end{bmatrix} $

and the covariance matrix CX as

$ C_{\underline X}=\begin{bmatrix} C_{11} & \cdots & C_{1n} \\ \vdots & & \vdots \\ C_{n1} & \cdots & C_{nn} \end{bmatrix} $

The mean vector $ \mu $X is

$ \mu_{\underline X}=[\overline X_1,\cdots,\overline X_n]^T $

Note that the correlation matrix and the covariance matrix can be written as

$ \begin{align} R_{\underline X}&=E[\underline X\underline X^T] \\ C_{\underline X}&=E[(\underline X-\mu_{\underline X})(\underline X-\mu_{\underline X})^T] \end{align} $

Note that $ \mu $X, RX and CX are the moments we most commonly use for the random vectors.
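
The definitions above can be tried out on data. The following is a minimal sketch that replaces each expectation with a sample average; the two-component sampling model in `draw_samples`, the seed, and all function names are hypothetical choices for illustration. It also checks the standard identity $ C_{\underline X} = R_{\underline X} - \mu_{\underline X}\mu_{\underline X}^T $, which follows from expanding the product in the definition of the covariance matrix.

```python
import random

def draw_samples(m, seed=0):
    """Draw m samples of a 2-component vector with correlated entries (illustrative model)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(m):
        x1 = rng.gauss(1.0, 1.0)
        x2 = 0.5 * x1 + rng.gauss(0.0, 1.0)
        samples.append([x1, x2])
    return samples

def mean_vector(samples):
    """Sample estimate of the mean vector."""
    m, n = len(samples), len(samples[0])
    return [sum(x[j] for x in samples) / m for j in range(n)]

def correlation_matrix(samples):
    """Sample estimate of R_X = E[X X^T], entry (j,k) = E[X_j X_k]."""
    m, n = len(samples), len(samples[0])
    return [[sum(x[j] * x[k] for x in samples) / m for k in range(n)]
            for j in range(n)]

def covariance_matrix(samples):
    """Sample estimate of C_X = E[(X - mu)(X - mu)^T]."""
    m, n = len(samples), len(samples[0])
    mu = mean_vector(samples)
    return [[sum((x[j] - mu[j]) * (x[k] - mu[k]) for x in samples) / m
             for k in range(n)]
            for j in range(n)]

samples = draw_samples(5000)
mu = mean_vector(samples)
R = correlation_matrix(samples)
C = covariance_matrix(samples)

# C_jk should equal R_jk - mu_j * mu_k up to floating point rounding.
for j in range(2):
    for k in range(2):
        print(j, k, C[j][k], R[j][k] - mu[j] * mu[k])
```

Because the same 1/m normalization is used throughout, the identity holds for the sample estimates up to rounding, not just in the limit.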

We need to discuss an important property of RX, but first, a definition from Linear Algebra.

Definition $ \qquad $ An n × n matrix B with b$ _{ij} $ as its i,j$ ^{th} $ entry is non-negative definite (NND) (or positive semidefinite) if

$ \sum_{i=1}^n\sum_{j=1}^n x_i x_j b_{ij} \geq 0 $

for all real vectors [x$ _1 $,...,x$ _n $] ∈ R$ ^n $.

That is to say, the product x$ ^T $Bx is non-negative for every real vector x ∈ R$ ^n $.

Theorem $ \qquad $ For any random vector X, RX is NND.

Proof: $ \qquad $ Let a be an arbitrary real vector in R$ ^n $, and let

$ Y = \underline a^T\underline X = \underline X^T\underline a $

be a scalar random variable. Then

$ \begin{align}0\leq E[Y^2]&=E[\underline a^T\underline X\underline X^T\underline a] \\ &=\underline a^TE[\underline X\underline X^T]\underline a \\ &=\underline a^TR_{\underline X}\underline a \end{align} $

So

$ 0\leq \underline a^TR_{\underline X}\underline a = \sum_{i=1}^n\sum_{j=1}^na_ia_jR_{ij} $

and thus, RX is NND.

Note: CX is also NND.
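
The proof can be mirrored numerically. In the sketch below, the correlation matrix is estimated from samples, and for a few arbitrary vectors a the quadratic form $ \underline a^TR_{\underline X}\underline a $ is compared with the sample average of $ Y^2 $, where $ Y = \underline a^T\underline X $. The three-component sampling model, the seed, and the helper name `quadratic_form` are assumptions made only for this example.

```python
import random

def quadratic_form(a, B):
    """Compute sum_i sum_j a_i a_j b_ij, i.e. a^T B a."""
    n = len(a)
    return sum(a[i] * a[j] * B[i][j] for i in range(n) for j in range(n))

rng = random.Random(1)
n = 3
# Illustrative samples of a 3-component random vector.
samples = [[rng.uniform(-1, 2), rng.uniform(0, 3), rng.gauss(0, 1)]
           for _ in range(2000)]
m = len(samples)

# Sample estimate of R_X = E[X X^T].
R = [[sum(x[i] * x[j] for x in samples) / m for j in range(n)]
     for i in range(n)]

results = []
for _ in range(5):
    a = [rng.uniform(-1, 1) for _ in range(n)]
    via_matrix = quadratic_form(a, R)
    # Sample average of Y^2 with Y = a^T X, which can never be negative.
    via_y = sum(sum(ai * xi for ai, xi in zip(a, x)) ** 2 for x in samples) / m
    results.append((via_matrix, via_y))
    print(via_matrix, via_y)
```

The two columns agree up to rounding, and the second is an average of squares, which is why the quadratic form cannot go negative.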



Characteristic Functions of Random Vectors

Definition $ \qquad $ Let X be a random vector on (S,F,P). Then the characteristic function of X is

$ \Phi_{\underline X}(\underline\Omega)=E\left[e^{i\sum_{j=1}^n\omega_jX_j}\right] $

where

$ \underline\Omega = [\omega_1,...,\omega_n]^T \in \mathbb R^n $

The characteristic function ΦX is extremely useful for finding pdfs of sums of random variables.
Let

$ Z=\sum_{j=1}^nX_j $

Then

$ \Phi_Z(\omega)=E\left[e^{i\omega\sum_{j=1}^nX_j}\right] = \Phi_{\underline X}(\omega,...,\omega) $

If X$ _1 $,..., X$ _n $ are independent, then

$ \Phi_Z(\omega)=\prod_{j=1}^n\Phi_{X_j}(\omega) $

If, in addition, X$ _1 $,..., X$ _n $ are identically distributed with common characteristic function Φ$ _X $, then

$ \Phi_Z(\omega) = (\Phi_X(\omega))^n \ $
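
A small check of the i.i.d. result, under an assumed example that is not part of the lecture: take X$ _1 $,..., X$ _n $ to be independent Bernoulli(p), so that Z is Binomial(n,p). The sketch computes Φ$ _Z $ directly from the binomial pmf and compares it with (Φ$ _X $)$ ^n $. The values of n and p and the function names are illustrative.

```python
import cmath
from math import comb

def phi_bernoulli(w, p):
    """Characteristic function E[e^{iwX}] of a Bernoulli(p) random variable."""
    return (1 - p) + p * cmath.exp(1j * w)

def phi_binomial_from_pmf(w, n, p):
    """E[e^{iwZ}] for Z ~ Binomial(n, p), summed directly over the pmf."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * cmath.exp(1j * w * k)
               for k in range(n + 1))

n, p = 6, 0.3
checks = []
for w in (-2.0, 0.5, 3.0):
    power_form = phi_bernoulli(w, p) ** n          # (Phi_X(w))^n
    direct_form = phi_binomial_from_pmf(w, n, p)   # Phi_Z(w) from the pmf of the sum
    checks.append(abs(power_form - direct_form))
    print(w, power_form, direct_form)
```

The agreement here is just the binomial theorem, which is the discrete counterpart of the product formula above.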



Gaussian Random Vectors

Definition $ \qquad $ Let X be a random vector on (S,F,P). Then X is Gaussian and X$ _1 $,..., X$ _n $ are said to be jointly Gaussian iff

$ Z=a_0+\sum_{j=1}^na_jX_j $

is a Gaussian random variable ∀[a$ _0 $,..., a$ _n $] ∈ R$ ^{n+1} $.

Now we will show that the characteristic function of a Gaussian random vector X is

$ \Phi_{\underline X}(\underline\Omega) = e^{i\underline\Omega^T\mu_{\underline X}-\frac{1}{2}\underline\Omega^TC_{\underline X}\underline\Omega} $

where $ \mu $X is the mean vector of X and CX is the covariance matrix.

Proof $ \qquad $ Let

$ Z = \sum_{j=1}^n\omega_jX_j $

for

$ \underline\Omega \in \mathbb R^n $


Then Z is a Gaussian random variable since X is Gaussian. So

$ \Phi_Z(\omega)=e^{i\omega\mu_Z}e^{-\frac{1}{2}\omega^2\sigma_Z^2} $

where

$ \begin{align} \mu_Z&=E\left[\sum_{j=1}^n\omega_jX_j\right] \\ &=\sum_{j=1}^n\omega_j\mu_{X_j} \\ &=\underline\Omega^T\mu_{\underline X} \end{align} $

and

$ \begin{align} \sigma_Z^2 &= \mbox{Var}\left(\sum_{j=1}^n\omega_jX_j\right) \\ &= \sum_{j=1}^n\sum_{k=1}^n\sigma_{jk}\omega_j\omega_k \\ &=\underline\Omega^TC_{\underline X}\underline\Omega \end{align} $

where

$ \sigma_{jk} = E[(X_j-\mu_{X_j})(X_k-\mu_{X_k})] = C_{jk} $

and CX is the covariance matrix of X.

Now

$ \begin{align} \Phi_{\underline X}(\underline\Omega)&= E\left[e^{i\sum_{j=1}^n\omega_jX_j}\right] \\ &= E\left[e^{iZ}\right] \\ &= \Phi_Z(1) \end{align} $

Setting $ \omega $ = 1 in Φ$ _Z $ and plugging in the expressions for $ \mu_Z $ and $ \sigma_Z $$ ^2 $ gives

$ \Phi_{\underline X}(\underline\Omega) =e^{i\underline\Omega^T\mu_{\underline X}}e^{-\frac{1}{2}\underline\Omega^TC_{\underline X}\underline\Omega} $
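
The closed form can be sanity-checked against the earlier result for sums. For a Gaussian vector, Z = X$ _1 $ + ... + X$ _n $ is a Gaussian random variable with mean equal to the sum of the entries of the mean vector and variance equal to the sum of all entries of the covariance matrix, so Φ$ _Z $($ \omega $) should match the vector characteristic function evaluated at ($ \omega $,...,$ \omega $). The sketch below does this for n = 2; the particular mean vector, covariance matrix, and function names are assumptions chosen for illustration.

```python
import cmath

def gaussian_cf(omega, mu, C):
    """exp(i Omega^T mu - 0.5 Omega^T C Omega) for a Gaussian random vector."""
    n = len(omega)
    linear = sum(omega[j] * mu[j] for j in range(n))
    quad = sum(omega[j] * C[j][k] * omega[k] for j in range(n) for k in range(n))
    return cmath.exp(1j * linear - 0.5 * quad)

def scalar_gaussian_cf(w, mean, var):
    """Characteristic function of a scalar Gaussian with the given mean and variance."""
    return cmath.exp(1j * w * mean - 0.5 * w * w * var)

# Illustrative mean vector and covariance matrix (symmetric, positive definite).
mu = [1.0, -2.0]
C = [[2.0, 0.6],
     [0.6, 1.0]]

w = 0.7
lhs = gaussian_cf([w, w], mu, C)                      # Phi_X(w, w)
mean_Z = sum(mu)                                      # mean of Z = X1 + X2
var_Z = sum(C[j][k] for j in range(2) for k in range(2))  # variance of Z
rhs = scalar_gaussian_cf(w, mean_Z, var_Z)            # Phi_Z(w)
print(lhs, rhs)
```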

Note that we can use the equation

$ f_{\underline X}(\underline x) = \frac{1}{(2\pi)^n}\int_{\mathbb R^n}\Phi_{\underline X}(\underline\Omega)e^{-i\underline\Omega^T\underline x}d\underline\Omega $
to show that if X is Gaussian and CX is invertible, then
$ f_{\underline X}(\underline x) = \frac{1}{\sqrt{(2\pi)^n|C_{\underline X}|}}e^{-\frac{1}{2}(\underline x-\mu_{\underline X})^TC_{\underline X}^{-1}(\underline x-\mu_{\underline X})} $

$ \forall \underline x\in\mathbb R^n $

Note that if X$ _1 $,..., X$ _n $ are pairwise uncorrelated, then the covariance matrix CX is diagonal with entries $ \sigma_j^2 = C_{jj} $, and the pdf of the Gaussian random vector X becomes

$ f_{\underline X}(\underline x) = \frac{e^{-\frac{1}{2}\sum_{j=1}^n\frac{(x_j-\mu_{X_j})^2}{\sigma_j^2}}}{\sqrt{(2\pi)^n\prod_{j=1}^n\sigma_j^2}} $

In particular, we can find the joint characteristic function and the joint pdf of the jointly Gaussian random variables X and Y using the forms for a Gaussian random vector with n = 2, X$ _1 $ = X and X$ _2 $ = Y.
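
For n = 2 the general Gaussian pdf can be evaluated by hand, since the determinant and inverse of a 2 × 2 covariance matrix have simple closed forms. The sketch below does this and checks that, when the covariance matrix is diagonal, the joint pdf equals the product of the two scalar Gaussian pdfs. The numerical values and the function names `gaussian_pdf_2d` and `gaussian_pdf_1d` are hypothetical, chosen only for this example.

```python
import math

def gaussian_pdf_2d(x, mu, C):
    """Gaussian pdf for n = 2, using the explicit inverse of a 2 x 2 covariance matrix."""
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    inv = [[C[1][1] / det, -C[0][1] / det],
           [-C[1][0] / det, C[0][0] / det]]
    d = [x[0] - mu[0], x[1] - mu[1]]
    quad = sum(d[j] * inv[j][k] * d[k] for j in range(2) for k in range(2))
    return math.exp(-0.5 * quad) / math.sqrt((2 * math.pi) ** 2 * det)

def gaussian_pdf_1d(x, mean, var):
    """Scalar Gaussian pdf with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

mu = [1.0, -2.0]
x = [0.4, -1.5]

# Uncorrelated case: diagonal covariance, so the joint pdf should factor.
C_diag = [[2.0, 0.0],
          [0.0, 0.5]]
joint = gaussian_pdf_2d(x, mu, C_diag)
product = gaussian_pdf_1d(x[0], mu[0], 2.0) * gaussian_pdf_1d(x[1], mu[1], 0.5)
print(joint, product)

# Correlated case: the same function applies, but the pdf no longer factors.
C_corr = [[2.0, 0.6],
          [0.6, 0.5]]
joint_corr = gaussian_pdf_2d(x, mu, C_corr)
print(joint_corr)
```

The factorization in the diagonal case is the familiar fact that jointly Gaussian random variables that are uncorrelated are also independent.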



