Topic 6: Random Variables: Distributions

How do we find, compute and model P(X ∈ A) for a random variable X for all A ∈ B(R)? We use three different functions:

1. the cumulative distribution function (cdf)
2. the probability density function (pdf)
3. the probability mass function (pmf)

We will discuss these in this order, although we could come at this discussion in a different way and a different order and arrive at the same place.

Definition $\quad$ The cumulative distribution function (cdf) of X is defined as

\begin{align} F_X(x) &= P_X((-\infty,x])\;\forall x\in \mathbb R \\ &= P(X^{-1}((-\infty, x])) \\ &=P(\{ \omega\in\mathcal S:\;X(\omega)\leq x \}) \end{align}

Notation $\quad$ Normally, we write this as

$F_X(x) = P(X\leq x)$

So $F_X(x)$ tells us $P_X(A)$ if A = (-∞,x] for some real x.
What about other A ∈ B(R)? It can be shown that any A ∈ B(R) can be written as a countable sequence of set operations (unions, intersections, complements) on intervals of the form (-∞,x$_n$], so we can use the probability axioms to find $P_X(A)$ from $F_X$ for any A ∈ B(R). This is not normally how we do things in practice, as will be discussed later.

Can an arbitrary function $F_X$ be a valid cdf? No, it cannot.
Properties of a valid cdf:
1.

$\lim_{x\rightarrow\infty}F_X(x) = 1\;\;\mbox{and}\;\;\lim_{x\rightarrow -\infty}F_X(x) = 0$

This is because

$\lim_{x\rightarrow\infty}F_X(x) = P(\{ \omega\in\mathcal S:\;X(\omega)\leq\infty \})=1$

and

$\lim_{x\rightarrow -\infty}F_X(x)= P(\varnothing)= 0$

2. For any $x_1,x_2\in\mathbb R$ such that $x_1<x_2$,

$F_X(x_1)\leq F_X(x_2)$

i.e. $F_X(x)$ is a nondecreasing function.

3. $F_X$ is continuous from the right, i.e.

$F_X(x^+) \equiv \lim_{\epsilon\rightarrow0,\epsilon>0}F_X(x+\epsilon)=F_X(x)\;\;\forall x\in\mathbb R$

Proof: First, we need some results from analysis and measure theory:
(i) For a sequence of sets $A_1, A_2,...$, if $A_1\supset A_2\supset ...$, then

$\lim_{n\rightarrow\infty}A_n = \bigcap_{n=1}^{\infty}A_n$

(ii) If $A_1\supset A_2\supset ...$, then

$P(\lim_{n\rightarrow\infty}A_n) = \lim_{n\rightarrow\infty}P(A_n)$

(iii) We can write $F_X(x^+)$ as

$F_X(x^+) = \lim_{n\rightarrow\infty}F_X(x+\frac{1}{n})$

Now let

$A_n = \{X\leq x+\frac{1}{n}\}$

Then

\begin{align} F_X(x^+) &= \lim_{n\rightarrow\infty}P(X\leq x+\frac{1}{n}) \\ &=\lim_{n\rightarrow\infty}P(A_n) \\ &=P(\lim_{n\rightarrow\infty}A_n) \\ &=P(\bigcap_{n=1}^{\infty}A_n) \\ &=P(\bigcap_{n=1}^{\infty}\{X\leq x+\frac{1}{n}\})\\ &=F_X(x) \end{align}

4. $P(X>x) = 1-F_X(x)$ for all x ∈ R

5. If $x_1 < x_2$, then

$P(x_1<X\leq x_2) = F_X(x_2) - F_X(x_1)\;\forall x_1,x_2\in\mathbb R$

6. $P(\{X=x\})= F_X(x) - F_X(x^-)$, where

$F_X(x^-) = \lim_{\epsilon\rightarrow 0,\epsilon>0} F_X(x-\epsilon)$
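
The cdf properties above can be checked on a small example. The following is a minimal sketch, assuming a fair six-sided die as the random variable; the function names `die_cdf` and `left_limit` are hypothetical and chosen only for illustration. It uses exact fractions so that the jump $F_X(x) - F_X(x^-)$ is not blurred by rounding.

```python
from fractions import Fraction
import math

def die_cdf(x):
    """cdf F_X(x) = P(X <= x) of a fair six-sided die (illustrative example)."""
    k = min(max(math.floor(x), 0), 6)  # number of faces with value <= x
    return Fraction(k, 6)

def left_limit(F, x, eps=Fraction(1, 10**9)):
    """Approximate F(x^-) by evaluating F slightly to the left of x."""
    return F(x - eps)

# Property 1: the cdf is 0 far to the left and 1 far to the right.
assert die_cdf(-100) == 0 and die_cdf(100) == 1

# Property 2: nondecreasing, checked on a grid of points.
grid = [Fraction(n, 4) for n in range(-8, 40)]
assert all(die_cdf(a) <= die_cdf(b) for a, b in zip(grid, grid[1:]))

# Property 3: right-continuity at the jump x = 3.
assert die_cdf(3 + Fraction(1, 10**9)) == die_cdf(3)

# Property 5: P(2 < X <= 5) = F(5) - F(2)
p_interval = die_cdf(5) - die_cdf(2)

# Property 6: P(X = 3) = F(3) - F(3^-)
p_point = die_cdf(3) - left_limit(die_cdf, 3)

print(p_interval, p_point)
```

Note that the left limit is only approximated here by a small fixed offset, which is enough for a piecewise-constant cdf whose jumps are far apart.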

## The Probability Density Function

Definition $\quad$ The probability density function (pdf) of a random variable X is the derivative of the cdf of X,

$f_X(x) = \frac{dF_X(x)}{dx}$

at points where $F_X$ is differentiable.
From the Fundamental Theorem of Calculus, we then have that

$F_X(x)=\int_{-\infty}^xf_X(r)dr\;\;\forall x\in\mathbb R$

Important note: the cdf $F_X$ might not be differentiable everywhere. At points where $F_X$ is not differentiable, we can use the Dirac delta function to define $f_X$.

Definition $\quad$ The Dirac Delta Function $\delta(x)$ is the function satisfying the properties:
1.

$\delta(x) = 0 \;\forall x\neq0$

2.

$\int_{-\infty}^{\infty} \delta(x)dx = \int_{-\epsilon}^{\epsilon}\delta(x)dx = 1\;\forall\epsilon>0$

If $F_X$ is not differentiable at a point, use $\delta(x)$ at that point to represent $f_X$.

Why do we do this? Consider the step function $u(x)$, which is discontinuous and thus not differentiable at $x=0$. This is a common type of discontinuity we see in cdfs. The derivative of $u(x)$ is defined as

$\frac{du(x)}{dx}=\lim_{h\rightarrow 0}\frac{u(x+h)-u(x)}{h}$

This limit does not exist at $x=0$.

Let's look at the function

$g(x) = \frac{u(x+h)-u(x)}{h}$

It looks like this:

Fig 1: g(x) for h>0

For any x ≠ 0, we have that

$\frac{u(x+h)-u(x)}{h}=0$

for small enough h.
Also, ∀ $\epsilon$ > 0,

$\int_{-\epsilon}^{\epsilon}g(x)dx= 1\; \forall\, 0<h<\epsilon$

So, in the limit, the function g(x) has the properties of the $\delta$-function as h tends to 0. A similar argument can be made for h<0.
So this is why it is sometimes written that

$\frac{du(x)}{dx} = \delta(x)$

Since we will only work with non-differentiable functions that have step discontinuities as cdfs, we write

$f_X(x) = \frac{dF_X(x)}{dx}$

with the understanding that $d/dx$ is not necessarily the traditional definition of the derivative.
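
The limiting argument for the step function can be illustrated numerically. This is a rough sketch under stated assumptions: the step function is taken to be 1 for $x\geq 0$, the helper `integrate` is a hypothetical midpoint-rule routine, and the values of $h$ and $\epsilon$ are arbitrary choices. It shows that the difference quotient vanishes away from 0 while its area over $(-\epsilon,\epsilon)$ stays at 1 as $h$ shrinks.

```python
def u(x):
    """Unit step function: 1 for x >= 0, else 0 (assumed convention)."""
    return 1.0 if x >= 0 else 0.0

def g(x, h):
    """Difference quotient of the step function for a fixed h > 0."""
    return (u(x + h) - u(x)) / h

def integrate(f, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

eps = 0.5
# Area under g over (-eps, eps) for several h < eps.
areas = {h: integrate(lambda x, h=h: g(x, h), -eps, eps) for h in (0.1, 0.01, 0.001)}

# Away from x = 0 the quotient is zero once h is small enough.
away_right = g(0.2, 0.1)
away_left = g(-0.2, 0.1)

print(areas, away_right, away_left)
```

The rectangle of width $h$ and height $1/h$ gets narrower and taller as $h$ decreases, which is the picture behind writing the derivative of the step as $\delta(x)$.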

Properties of the pdf:
1. (proof: the derivative of the nondecreasing function $F_X(x)$ must be non-negative)

$f_X(x)\geq 0\;\forall x\in \mathbb R$

2. (proof: use the fact that the limit of F$_X$(x) as x goes to infinity is 1)

$\int_{-\infty}^{\infty}f_X(x)dx = 1$

3. If $x_1<x_2$, then

$P(x_1<X\leq x_2) = \int_{x_1}^{x_2}f_X(x)dx$
(proof: use the fact that P(x$_1$ < X ≤ x$_2$) = F$_X$(x$_2$) - F$_X$(x$_1$))

Some notes:

• We introduced the concept of a pdf in our discussion of probability spaces. We could have defined the pdf of a random variable X as a function $f_X$ satisfying properties 1 and 2 above, and then define $F_X$ in terms of $f_X$.
• $f_X(x)$ is not a probability for a fixed x. Instead, it gives the "probability density", so it must be integrated to give a probability.
• In practice, to compute probabilities of random variable X, we normally use
$P(X\in A) = \int_{A}f_X(x)dx$
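
As a numerical illustration of the pdf properties, here is a minimal sketch assuming a hypothetical pdf $f_X(x)=2x$ on $[0,1]$ (zero elsewhere), whose cdf is $F_X(x)=x^2$ on that interval. The pdf, the helper `integrate`, and the interval endpoints are all illustrative choices, not taken from the notes.

```python
def f(x):
    """Hypothetical pdf: f_X(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def F(x):
    """Matching cdf from integrating f: F_X(x) = x^2 on [0, 1]."""
    if x < 0.0:
        return 0.0
    return min(x, 1.0) ** 2

def integrate(func, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    dx = (b - a) / n
    return sum(func(a + (i + 0.5) * dx) for i in range(n)) * dx

total = integrate(f, -1.0, 2.0)   # property 2: total area should be close to 1
p_int = integrate(f, 0.2, 0.5)    # property 3: P(0.2 < X <= 0.5) from the pdf
p_cdf = F(0.5) - F(0.2)           # the same probability from the cdf

print(total, p_int, p_cdf)
```

Both routes to $P(0.2<X\leq 0.5)$ agree, while $f_X(0.9)=1.8$ exceeds 1, which is consistent with the density not being a probability.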

## Continuous and Discrete Random Variables

A random variable X having a cdf that is continuous everywhere is called a continuous random variable.

Fig:2 a possible cdf for a continuous random variable

A random variable X having a piece-wise constant cdf is a discrete random variable.

Fig:3 a possible cdf for a discrete random variable

A random variable X whose cdf is neither continuous everywhere nor piece-wise constant is a mixed random variable.

Fig:4 a possible cdf for a mixed random variable

We will consider only discrete and continuous random variables in this course.

Note:

• For a continuous random variable X, P(X=x)=0 ∀x ∈ R, because $F_X(x)-F_X(x^-) = 0$ ∀x ∈ R.
• We can use the cdf/pdf functions for a discrete random variable X, but generally, we do not. Instead, we use the pmf.

## Probability Mass Function

The probability mass function (pmf) of random variable X is the function

$p_X(x) \equiv p_X(\{x\})=P(X=x)$

We will use this function when X is discrete.

Although P(X=x) exists for every x ∈ R, we normally define $p_X(x)$ only on a subset $\mathcal R\subset\mathbb R$.

Definition $\quad$ Given a random variable $X:\mathcal S\rightarrow\mathbb R$, let $\mathcal R = X(\mathcal S)$ be the range space of X (note that the codomain of X is still $\mathbb R$).

Example: Let X be the sum of the values rolled on two fair dice. Then $\mathcal R = \{2,3,...,12\}$.

We define the pmf only on $\mathcal R$, so in the example above, we would have $p_X(x) = P(X=x)$ $\forall x\in\mathcal R$.

We can now consider the probability space for a discrete random variable to be $(\mathcal R,\mathcal P(\mathcal R),p_X)$, where $\mathcal P(\mathcal R)$ is the power set of $\mathcal R$.
Note that if X is continuous, we define the cdf/pdf for all x ∈ R. For example, if X = V$^2$, where V is a voltage (so X is proportional to power), then $\mathcal R=[0,\infty)$, but we still define $f_X(x)$ $\forall x\in\mathbb R$.

Properties of the pmf:
1. $p_X(x) \geq 0\;\forall x\in\mathcal R$

2. $\sum_{x\in\mathcal R}p_X(x) = 1$

These properties can be derived from the axioms. Note that we could simply define the pmf to be a function having these two properties and then create a probability mapping P(.) in such a way that P(.) will satisfy the axioms. We discussed this in our lectures on probability spaces. This is what we do in practice.
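
The two-dice example can be worked out by direct enumeration. The sketch below assumes the 36 ordered outcomes are equally likely; the names `outcomes`, `pmf` and `prob` are hypothetical. It builds the pmf on the range space and then uses it to compute the probability of an event by summation.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# X maps each outcome to the sum of the faces; count how often each sum occurs.
counts = Counter(a + b for a, b in outcomes)

# pmf defined only on the range space {2, ..., 12}.
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

def prob(event):
    """P(X in event), obtained by summing the pmf over the event."""
    return sum((pmf[x] for x in event if x in pmf), Fraction(0))

print(pmf[7], prob({2, 12}), sum(pmf.values()))
```

Values outside the range space contribute nothing to `prob`, which mirrors defining the pmf only on the range space.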

## 1. Gaussian Random Variable

Fig:5 The pdf of a random variable that is Gaussian distributed with mean $\mu$ and variance $\sigma^2$.

The Gaussian (or Normal) continuous random variable $X$ has pdf

$f_X(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\;\;\forall x\in\mathbb R;\;\sigma,\mu\in\mathbb R, \sigma>0$

and cdf $F_X(x)$, where

$F_X(x)=\int_{-\infty}^x\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(r-\mu)^2}{2\sigma^2}}dr$

(no closed-form solution)
Let

$\phi(x) = \int_{-\infty}^x\frac{1}{\sqrt{2\pi}}e^{-\frac{r^2}{2}}dr$

Then write

$F_X(x) = \phi(\frac{x-\mu}{\sigma})$

Use the table to find values of ϕ.

Notation for a Gaussian random variable $X$: $X\sim N(\mu,\sigma^2)$

Gaussian random variables are used to model

• certain types of noise
• sums of a large number of independent random variables
• continuous random variables with no prior information about distribution (default assumption)
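
In place of a table, the standardized cdf can be evaluated numerically. The sketch below relies on the standard identity $\phi(z)=\frac{1}{2}\left(1+\mathrm{erf}(z/\sqrt 2)\right)$, which is not stated in these notes, together with `math.erf` from the Python standard library. The values of $\mu$ and $\sigma$ are arbitrary illustrative choices.

```python
import math

def phi(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gaussian_cdf(x, mu, sigma):
    """F_X(x) = phi((x - mu) / sigma) for X ~ N(mu, sigma^2)."""
    return phi((x - mu) / sigma)

def gaussian_pdf(x, mu, sigma):
    """Gaussian pdf with mean mu and standard deviation sigma > 0."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

mu, sigma = 3.0, 2.0  # illustrative parameters
# P(mu - sigma < X <= mu + sigma) = F_X(mu + sigma) - F_X(mu - sigma)
p_one_sigma = gaussian_cdf(mu + sigma, mu, sigma) - gaussian_cdf(mu - sigma, mu, sigma)

print(phi(0.0), p_one_sigma)
```

Because of the standardization step, the one-sigma probability does not depend on the particular $\mu$ and $\sigma$ chosen.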

## 2. Uniform Random Variable

Continuous Case:

$f_X(x) = \begin{cases} \frac{1}{x_2-x_1} & \mbox{if }x_1\leq x\leq x_2 \\ 0 & \mbox{else} \end{cases}\quad x_1,x_2 \in\mathbb R;\;\forall x\in\mathbb R$

Fig:6 The pdf of a continuous random variable that is uniformly distributed with parameters a=$x_1<x_2$=b.

$F_X(x) = \begin{cases} 1 & \mbox{if }x\geq x_2 \\ \frac{x-x_1}{x_2-x_1} & \mbox{if }x_1\leq x\leq x_2 \\ 0 & \mbox{if }x<x_1 \end{cases}$

Discrete Case

$\mathcal R = \{x_1,...,x_n\}$ for some integer n ≥ 1.

$p_X(x_i) = \frac{1}{n}\;\forall x_i\in\mathcal R$

Used to model:

• Continuous case - random variables where $P(X\in(a,b))$ depends only on $b-a$, $\forall a,b\in\mathbb R$ with $x_1\leq a<b\leq x_2$
• Discrete case - random variables whose values are equally likely to occur.
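
The defining feature of the continuous uniform case, that interval probabilities depend only on interval length, can be checked from the cdf. This is a small sketch with hypothetical endpoints $x_1=2$, $x_2=6$ and hypothetical function names; the discrete case is included as a one-line pmf over an assumed set of six values.

```python
from fractions import Fraction

def uniform_cdf(x, x1, x2):
    """cdf of a continuous uniform random variable on [x1, x2]."""
    if x < x1:
        return 0.0
    if x >= x2:
        return 1.0
    return (x - x1) / (x2 - x1)

def interval_prob(a, b, x1, x2):
    """P(a < X <= b) = F_X(b) - F_X(a)."""
    return uniform_cdf(b, x1, x2) - uniform_cdf(a, x1, x2)

x1, x2 = 2.0, 6.0  # illustrative endpoints
p_left = interval_prob(2.5, 3.5, x1, x2)     # interval of length 1 near x1
p_right = interval_prob(4.75, 5.75, x1, x2)  # interval of length 1 near x2

# Discrete case: n equally likely values, each with probability 1/n.
values = [1, 2, 3, 4, 5, 6]
discrete_pmf = {v: Fraction(1, len(values)) for v in values}

print(p_left, p_right, sum(discrete_pmf.values()))
```

Both intervals have length 1 inside $[x_1,x_2]$, so both probabilities equal $1/(x_2-x_1)$.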

## 3. Binomial Random Variable (Discrete)

$R = \{0,1,...,n\}$

$p_X(k) = {n\choose k}p^k(1-p)^{n-k}\;\;k = 0,...,n$

p ∈ [0,1], n ≥ 1, n finite.

Used to model the number of successes in n independent Bernoulli trials, each with success probability p.
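
A short sketch of the binomial pmf follows, assuming illustrative parameters $n=10$ and $p=0.3$ and using `math.comb` (Python 3.8 or later) for the binomial coefficient. It checks that the pmf sums to 1 over the range space and computes one cumulative probability.

```python
import math

def binomial_pmf(k, n, p):
    """p_X(k) = C(n, k) p^k (1 - p)^(n - k) for k = 0, ..., n."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3  # illustrative parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)             # should be 1 up to rounding
p_at_most_2 = sum(pmf[:3])   # P(X <= 2) = p_X(0) + p_X(1) + p_X(2)

print(total, p_at_most_2)
```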

## 4. Exponential (Continuous)

Fig:7 The pdf of a continuous random variable that is exponentially distributed with parameter $\lambda$.

$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \mbox{if }x\geq 0 \\ 0 & \mbox{else} \end{cases}\quad \lambda\in\mathbb R;\;\lambda>0$

Used to model

• times between arrival of customers (or other things)
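
Integrating the exponential pdf from 0 to $x$ gives the cdf $F_X(x)=1-e^{-\lambda x}$ for $x\geq 0$, a closed form that is not written out above. The sketch below uses it with a hypothetical rate $\lambda=2$ and cross-checks it against a midpoint-rule integral of the pdf.

```python
import math

def exponential_pdf(x, lam):
    """f_X(x) = lam * exp(-lam * x) for x >= 0, zero otherwise."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exponential_cdf(x, lam):
    """Integral of the pdf up to x: 1 - exp(-lam * x) for x >= 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

lam = 2.0  # hypothetical rate parameter
p_within_half = exponential_cdf(0.5, lam)            # P(X <= 0.5)
p_longer_than_one = 1.0 - exponential_cdf(1.0, lam)  # P(X > 1)

# Midpoint-rule check of the closed-form cdf at x = 0.5.
n = 50_000
dx = 0.5 / n
numeric = sum(exponential_pdf((i + 0.5) * dx, lam) for i in range(n)) * dx

print(p_within_half, p_longer_than_one, numeric)
```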

## 5. Rayleigh (Continuous)

Fig:8 The pdf of a continuous random variable with a Rayleigh distribution.

$f_X(x) = \begin{cases} \frac{x}{\sigma^2}e^{-\frac{x^2}{2\sigma^2}} & x\geq 0 \\ 0 & x<0 \end{cases}$

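Used to model the square root of a sum of squares (e.g. the magnitude of a complex number whose real and imaginary parts are independent zero-mean Gaussian random variables with the same variance $\sigma^2$).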
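
The "square root of a sum of squares" interpretation can be illustrated by simulation. This sketch assumes the two squared terms are independent zero-mean Gaussian random variables with common standard deviation $\sigma$, and it compares the empirical cdf of their magnitude with $1-e^{-x^2/(2\sigma^2)}$, the cdf obtained by integrating the Rayleigh pdf. The seed, sample size and $\sigma$ are arbitrary choices.

```python
import math
import random

def rayleigh_cdf(x, sigma):
    """cdf from integrating the Rayleigh pdf: 1 - exp(-x^2 / (2 sigma^2)), x >= 0."""
    return 1.0 - math.exp(-(x * x) / (2.0 * sigma * sigma)) if x >= 0 else 0.0

random.seed(0)       # fixed seed so the run is repeatable
sigma = 1.5          # illustrative parameter
n_samples = 20_000

# Magnitude sqrt(A^2 + B^2) of two independent N(0, sigma^2) samples.
samples = [
    math.hypot(random.gauss(0.0, sigma), random.gauss(0.0, sigma))
    for _ in range(n_samples)
]

x = sigma
empirical = sum(s <= x for s in samples) / n_samples  # fraction of samples <= x
theoretical = rayleigh_cdf(x, sigma)

print(empirical, theoretical)
```

The two numbers should agree only approximately, since the empirical value carries sampling error that shrinks as the number of samples grows.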

## 6. Laplace (Continuous)

$f_X(x) = \frac{\alpha}{2}e^{-\alpha|x|};\quad\alpha>0$
Fig:9 The pdf of a continuous random variable with a Laplace distribution.

Used to model prediction errors.

## 7. Poisson (Discrete)

$\mathcal R = \{0,1,2,...\}$
$p_X(k)=e^{-\lambda}\frac{\lambda^k}{k!};\quad\lambda>0,\;k=0,1,2,...$

Used to model the number of occurrences of events in time or space.
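
A brief sketch of the Poisson pmf, assuming an illustrative $\lambda=4$. Since the range space is infinite, the sum is truncated at $k=49$, where the remaining tail is negligible for this $\lambda$.

```python
import math

def poisson_pmf(k, lam):
    """p_X(k) = exp(-lam) * lam^k / k! for k = 0, 1, 2, ..."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 4.0  # illustrative parameter
p_zero = poisson_pmf(0, lam)                             # P(X = 0) = exp(-lam)
partial_sum = sum(poisson_pmf(k, lam) for k in range(50))  # truncated total mass
p_at_most_2 = sum(poisson_pmf(k, lam) for k in range(3))   # P(X <= 2)

print(p_zero, partial_sum, p_at_most_2)
```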

## 8. Geometric (Discrete)

Form 1:

$\mathcal R = \{0,1,2,...\}$

$p_X(k)=p(1-p)^k,\quad k=0,1,2,...$

or

Form 2:

$\mathcal R = \{1,2,3,...\}$

$p_X(k)=p(1-p)^{k-1},\quad k=1,2,3,...$

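Used to model the number of Bernoulli trials until the first success: Form 1 counts the failures before the first success, and Form 2 counts the trials up to and including the first success.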
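
The two forms differ only by a shift of one in $k$. The following sketch assumes an illustrative success probability $p=0.25$ and hypothetical function names. It checks the shift relation and that each form sums to approximately 1 when truncated after 200 terms.

```python
def geometric_form1(k, p):
    """Form 1: p_X(k) = p (1 - p)^k for k = 0, 1, 2, ..."""
    return p * (1 - p) ** k

def geometric_form2(k, p):
    """Form 2: p_X(k) = p (1 - p)^(k - 1) for k = 1, 2, 3, ..."""
    return p * (1 - p) ** (k - 1)

p = 0.25  # illustrative success probability

# Form 2 evaluated at k + 1 equals Form 1 evaluated at k.
shift_ok = all(
    abs(geometric_form2(k + 1, p) - geometric_form1(k, p)) < 1e-15
    for k in range(20)
)

# Truncated sums over each range space.
total1 = sum(geometric_form1(k, p) for k in range(200))
total2 = sum(geometric_form2(k, p) for k in range(1, 201))

print(shift_ok, total1, total2)
```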
