The principles for how to generate random samples from a Gaussian distribution - Rhea

The principle of how to generate a Gaussian random variable

A slecture by ECE student Joonsoo Kim

Partly based on the ECE662 Spring 2014 lecture material of Prof. Mireille Boutin.

1 1. Introduction
2 2. Inverse transform sampling
3 3. Box Muller Transform
4 4. Marsaglia polar method
5 5. Generate a Gaussian random variable using a normal distributed random variable.
6 6. Sample C code for how to generate a Gaussian random variable based on Marsaglia polar method
7 Questions and comments

1. Introduction

In this slecture, we will explain the principle of how to generate Gaussian random samples. Even though there are more general methods to generate random samples which have any distribution, we will focus on the simple method such as Box Muller transform to generate Gaussian random samples in this slecture. To do this, we will introduce inverse transform sampling which is a bas method for pseudo random number sampling first. Then, we will explain Gaussian random sample generation method based on Box Muller transform. Finally, we will introduce Marsaglia polar method which improves Box Muller transform.

2. Inverse transform sampling

Let U be a random variable which is uniformly distributed on the interval [0, 1]. And let F be a continuous CDF(cumulative distribution function) of a random variable, X which we want to generate. Then, inverse CDF is defined by

F^{-1}(u)=inf\{ x|F(x)\leq u, \quad u\in [0, 1] \}

Using this inverse CDF, we can generate random variable X as following

X <- F^{-1}(U)\quad

Now, we will apply this to generate a random variable X which has the exponential distribution

f(x) = \lambda exp(-\lambda x) \quad where \quad x \geq 0

Let's compute CDF of X

F(x) = \int_{-\infty}^x \lambda exp(-\lambda x') dx' = \int_0^x \lambda exp(-\lambda x') dx' = [-exp(-\lambda x')]_0^x = 1-exp(-\lambda x) \leq u

Then, we can compute x as

x = -\lambda^{-1}ln(1-u) \quad

In other words, we can get an exponentially distributed random variable X using an uniformly distributed random variable U

X = -\lambda^{-1}ln(1-U) \quad

This method can be applied to generate any random variable if the CDF of it is easily calculated. If it is not, we cannot use this method. Instead, we can use other methods such as rejection sampling, importance sampling, Hastings-Metropolis sampling, etc. However, we do not cover those sampling methods here.

3. Box Muller Transform

Before explaing Box Muller transform, we will apply Inverse transform sampling to generate a normal distributed random variable.

F(x) = \int_{-\infty}^x \frac{1}{\sqrt{2\pi}} exp(-\frac{1}{2} x'^2) dx'

This integration is not straightforward to make an algebraic expression in terms of elementary functions such as exponentials and logs. Therefore, we use Box Muller transform method to generate a normal distributed random variable.

Let (X, Y) be a pair of independent normal distributed random variable. Then, we can express F(x,y) as follows

F(x,y) = \int_{-\infty}^y \int_{-\infty}^x \{ \frac{1}{\sqrt{2\pi}} exp(-\frac{1}{2} x'^2) \} \{ \frac{1}{\sqrt{2\pi}} exp(-\frac{1}{2} y'^2)\} dx'dy' = \int_{-\infty}^y \int_{-\infty}^x \frac{1}{2\pi} exp(-\frac{1}{2} (x'^2 + y'^2) ) dx'dy'

Now, define random variable R and θ as follows

R = \sqrt{X^2 + Y^2}, \quad \theta = tan(\frac{Y}{X}) \quad where \quad \theta \in [0, 2\pi]

X and Y can be expressed using R and θ as follows

X = Rcos(\theta) ,\quad Y = Rsin(\theta)

Then, we can express F(x,y) using r and θ (dxdy = rdrdθ)

F(x,y) = F(r,\theta) = \int_{-\infty}^\theta \int_{-\infty}^r \frac{1}{2\pi} exp(-\frac{1}{2} r'^2 ) r' dr'd\theta' = \int_{0}^\theta \int_{0}^r \frac{1}{2\pi} exp(-\frac{1}{2} r'^2 ) r' dr'd\theta' = \frac{1}{2\pi} \int_{-\infty}^\theta [-exp(-\frac{1}{2} r'^2 )]_0^r d\theta'

= \frac{\theta}{2\pi} \{ 1-exp(-\frac{1}{2} r^2 ) \} = F(r)F(\theta)

Let $$ U_1 $$ and $$ U_2 $$ be uniformly distributed random variables such that $U_1,U_2 \in [0, 1]$ . Then we can say as follows

F(\theta) \leq u_1, \quad F(r) \leq u_2

Since $U_2 \in [0, 1]$ , we can say $1-U_2 \in [0, 1]$ . Then

F(\theta) \leq u_1, \quad F(r) \leq 1-u_2

Therefore, we can get $$ u_1 $$ and $$ u_2 $$ as follows

u_1 = \frac{\theta}{2\pi}, \quad \theta = 2\pi u_1

1-u_2 = 1-exp(-\frac{1}{2} r^2 ), \quad u_2 = exp(-\frac{1}{2} r^2), \quad r = \sqrt{-2ln(u_2)}

Now, we can generate X and Y using $$ U1 $$ and $$ U2 $$ .

X = Rcos(\theta) = \sqrt{-2ln(U_2)}cos( 2\pi U_1 ),

Y = Rsin(\theta) = \sqrt{-2ln(U_2)}sin( 2\pi U_1 ),

4. Marsaglia polar method

To remove cosine and sine functions used in the Box Muller transforms, this Marsaglia which is generally used to generate normal distributed random variable is introduced. To avoid using trigonometric functions of Box Muller transforms, let's consider polar coordinates. For this, we consider random variables, $$ W_1 $$ , $$ W_2 $$ which are uniformly distributed on [-1, 1] such that $$ W_1^2+W_2^2 < 1 $$ It can be generated as follows.

W_1 = 2U_1-1 ,\quad W_2 = 2U_2-1 \quad where \quad 0\leq W_1^2+W_2^2 < 1

Then, we can substitute $cos( 2\pi U_1 )$ with $W_1/\sqrt{W_1^2+W_2^2}$ and $sin( 2\pi U_1 )$ with $W_2/\sqrt{W_1^2+W_2^2}$ .

In addition, since $$ W_1^2+W_2^2 $$ is uniformly distributed on [0, 1), $\sqrt{-2ln(U_2)}$ is replaced with $\sqrt{-2ln(W_1^2+W_2^2)}$ . Therefore, X and Y are generated by

X = \sqrt{-2ln(W_1^2+W_2^2)} \frac{W_1}{\sqrt{W_1^2+W_2^2}},

Y = \sqrt{-2ln(W_1^2+W_2^2)} \frac{W_2}{\sqrt{W_1^2+W_2^2}},

5. Generate a Gaussian random variable using a normal distributed random variable.

Let G : Gaussian random variable, $\mu$ : mean of G, and $\sigma$ : standard deviation of G Then, G can be generated by X which is a normal distributed random variable

G = X*\sigma + \mu \quad

6. Sample C code for how to generate a Gaussian random variable based on Marsaglia polar method

/************************************************************************/
 
	gaussian -- generate a Gaussian random variable   
 
	            with mean 'u' and standard deviation 'd'
 
/************************************************************************/
 
double gaussian(double u, double d)
 
{
 
        static double t = 0.0;
 
        double x, w1, w2, r;
 
        if( t == 0 )
 
     {
 
               do{
                      w1 = 2.0 * rnd() - 1.0;
 
                      w2 = 2.0 * rnd() - 1.0;
 
                      r = w1 * w1 + w2 * w2;
 
                  }
 
                while( r >= 1.0 );
 
                r = sqrt( -2.0*log(r) / r );
 
                t = v2 * r;
 
                return( u + w1 * r * d );
 
       }
 
        else
 
        {
 
                x = t;
 
                t = 0.0;
 
                return( u + x * d);
 
        }
 
}

Questions and comments

If you have any questions, comments, etc. please post them on this page.

Back to ECE662, Spring 2014