(22 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 
[[Category:math]]
 
[[Category:math]]
 
[[Category:tutorial]]
 
[[Category:tutorial]]
 +
[[Category:bayes rule]]
 +
[[Category:conditional probability]]
 +
[[Category:math squad]]
 +
 
== Bayes' Theorem ==
 
== Bayes' Theorem ==
by Maliha Hossain
+
by [[user:Mhossain | Maliha Hossain]], proud Member of [[Math_squad | the Math Squad]].
<pre> keyword: probability, Bayes' Theorem, Bayes' Rule </pre>
+
----
 +
<pre>keyword: probability, Bayes' Theorem, Bayes' Rule </pre>
  
 
'''INTRODUCTION'''
 
'''INTRODUCTION'''
  
Bayes' Theorem (or Bayes' Rule) allows us to calculate P(A|B) from P(B|A) given that  P(A) and P(B) are also known, where A and B are events. In this tutorial, we will derive Bayes' Theorem and illustrate it with a few examples.  
+
Bayes' Theorem (or Bayes' Rule) allows us to calculate P(A|B) from P(B|A) given that  P(A) and P(B) are also known. In this tutorial, we will derive Bayes' Theorem and illustrate it with a few examples. After going over the examples, if you have any questions or if you find any mistakes please leave me a comment at the end of the relevant section.  
  
Note that this tutorial assumes familiarity with conditional probability and the axioms of probability.  
+
Note that this tutorial assumes familiarity with conditional probability and the axioms of probability. If you are interested in the derivation of the conditional distributions for continuous and discrete random variables, you may wish to go over Professor Mary Comer's [[ECE600_F13_rv_conditional_distribution_mhossain|notes]] on the subject.  
 
<pre> Contents
 
<pre> Contents
 
- Bayes' Theorem
 
- Bayes' Theorem
 
- Proof
 
- Proof
- Example 1: Quality Control
+
- Example Problems
- Example 2: The False Positive Paradox
+
- Example 3
+
 
- References
 
- References
 
</pre>
 
</pre>
Line 31: Line 34:
 
== Proof ==
 
== Proof ==
  
We will now derive Bayes'e Theorem as it is expressed in the second form, which simply takes the expression one step further than the first.
+
We will now derive Bayes' Theorem as it is expressed in the second form, which simply takes the expression one step further than the first.
  
 
Let <math>A</math> and <math>B_j</math> be as defined above. By definition of the conditional probability, we have that
 
Let <math>A</math> and <math>B_j</math> be as defined above. By definition of the conditional probability, we have that
Line 39: Line 42:
 
Multiplying both sides by <math>P[B_j]</math>, we get
 
Multiplying both sides by <math>P[B_j]</math>, we get
  
<math>P[A\cap B_j] = P[A|B_j]P[B_j]</math>
+
<math>P[A\cap B_j] = P[A|B_j]P[B_j] \ </math>
  
 
Using the same argument as above, we have that
 
Using the same argument as above, we have that
  
<math>P[B_j|A] = \frac{P[B_j\cap A]}{P[A]}</math>
+
<math>
 +
\begin{align}
 +
P[B_j|A] & = \frac{P[B_j\cap A]}{P[A]} \\
  
<math>\Rightarrow P[B_j\cap A] = P[B_j|A]P[A]</math>
+
\Rightarrow P[B_j\cap A] &= P[B_j|A]P[A]
 +
\end{align}
 +
</math>
  
 
Because of the commutativity property of intersection, we can say that
 
Because of the commutativity property of intersection, we can say that
  
<math>P[B_j|A]P[A] = P[A|B_j]P[B_j]</math>
+
<math> P[B_j|A]P[A] = P[A|B_j]P[B_j] \ </math>
  
 
Dividing both sides by <math>P[A]</math>, we get
 
Dividing both sides by <math>P[A]</math>, we get
  
<math>P[B_j|A] = \frac{P[A|B_j]P[B_j]}{P[A]}</math>
+
<math> P[B_j|A] = \frac{P[A|B_j]P[B_j]}{P[A]}</math>
  
 
Finally, the denominator can be broken down further using the theorem of total probability so that we have the following expression
 
Finally, the denominator can be broken down further using the theorem of total probability so that we have the following expression
Line 60: Line 67:
 
----
 
----
  
== Example 1: Quality Control ==
+
== Example Problems ==
  
The following problem has been adapted from a few practice problems from chapter 2 of ''Probability, Statistics, and Random Processes for Electrical Engineering'' by Alberto Leon-Garcia. The example illustrates how Bayes' Theorem plays a role in quality control.
+
[[bayes_theorem_eg1_S13|Example 1: Quality Control]]
  
A manufacturer produces a mix of "good" chips and "bad" chips. The proportion of good chips whose lifetime exceeds <math>t</math> seconds decreases exponentially at the rate <math>\alpha</math>. The proportion of bad chips whose lifetime exceeds <math>t</math> seconds decreases much faster, at the rate <math>1000\alpha</math>.
+
[[bayes_theorem_eg2_S13|Example 2: False Positive Paradox]]
Suppose that the fraction of bad chips is <math>p</math> and the fraction of good chips is <math>1 - p</math>.
+
  
Let <math>C</math> be the event that the chip is functioning after <math>t</math> seconds.
+
[[bayes_theorem_eg3_S13|Example 3: Monty Hall Problem]]
Let <math>G</math> be the event that the chip is good.
+
----
Let <math>B</math> be the event that the chip is bad.
+
  
Here's what we can infer from the problem statement thus far:
+
== References ==
  
the probability that the lifetime of a good chip exceeds <math>t</math>: <math>P[C|G] = e^{-\alpha t}</math>
+
* Alberto Leon-Garcia, ''Probability, Statistics, and Random Processes for Electrical Engineering,''  Third Edition
 +
----
  
the probability that the lifetime of a bad chip exceeds <math>t</math>: <math>P[C|B] = e^{-1000\alpha t}</math>
+
==Questions and comments==
  
So by the theorem of total probability, we have that
+
If you have any questions, comments, etc. please post them below:
  
<math>P[C] = P[C|G]P[G] + P[C|B]P[B]</math>
+
* Comment / question 1
  
<math> = e^{-\alpha t}(1-p) + e^{-1000\alpha t}p</math>
 
 
Now suppose that in order to weed out the bad chips, every chip is tested for <math>t</math> seconds prior to leaving the factory. The chips that fail are discarded and the remaining chips are sent out to customers. Can you find the value of <math>t</math> for which 99% of the chips sent out to customers are good?
 
 
The problem requires that we find the value of <math>t</math> such that
 
 
<math>P[G|C] = .99</math>
 
 
We find <math>P[G|C]</math> by applying Bayes' Theorem
 
 
<math>P[G|C] = \frac{P[C|G]P[G]}{P[C|G]P[G] + P[C|B]P[B]}</math>
 
 
<math>= \frac{e^{-\alpha t}(1-p)}{e^{-\alpha t}(1-p) + e^{-1000\alpha t}p}</math>
 
 
<math>= \frac{1}{1 + \frac{pe^{-1000\alpha t}}{e^{-\alpha t}(1-p)}} = .99</math>
 
 
The above equation can be solved for <math>t</math>
 
 
<math>t = \frac{1}{999\alpha}\ln\left(\frac{99p}{1-p}\right)</math>
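
As a quick numerical sanity check, the closed-form value of <math>t</math> can be substituted back into Bayes' Theorem to confirm that <math>P[G|C]</math> comes out to 0.99. The sketch below is illustrative only: the function names and the values <math>p = 0.1</math> and <math>\alpha = 10^{-4}</math> are assumptions chosen for the demonstration, not part of the original problem.

```python
import math


def burn_in_time(p: float, alpha: float) -> float:
    """Test duration t = ln(99p / (1 - p)) / (999 * alpha) from the example."""
    return math.log(99 * p / (1 - p)) / (999 * alpha)


def prob_good_given_working(t: float, p: float, alpha: float) -> float:
    """P[G|C] from Bayes' Theorem with P[C|G] = e^(-alpha t), P[C|B] = e^(-1000 alpha t)."""
    good = math.exp(-alpha * t) * (1 - p)   # P[C|G] P[G]
    bad = math.exp(-1000 * alpha * t) * p   # P[C|B] P[B]
    return good / (good + bad)


# Hypothetical values, chosen only for illustration
p, alpha = 0.1, 1e-4
t = burn_in_time(p, alpha)
print(t, prob_good_given_working(t, p, alpha))
```

Note that the logarithm is positive only when <math>p > 0.01</math>. For a smaller fraction of bad chips, at least 99% of the chips are already good before any testing, so the formula returns a non-positive <math>t</math>.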
 
 
----
 
----
  
== Example 2: The False Positive Paradox ==
+
[[Math_squad|Back to Math Squad page]]
 
+
The false positive paradox occurs when false positive tests are more probable than true positive tests. The lower the incidence of the condition in the overall population, the more likely it is that a given positive test is a false positive.
+
 
+
The following example illustrates how you would calculate the probability that a positive test result is a false positive. For a more visual explanation of the false positive paradox, you may want to check out this YouTube [http://www.youtube.com/watch?v=D8VZqxcu0I0 video].
+
 
+
A manufacturer claims that its product can detect drug use among athletes 97% of the time (i.e. the test will show a positive 97% of the time given that the athletes used drugs). However, there is a 10% chance of a false alarm (i.e. non drug users will show positive results 10% of the time). Given that only 5% of the team actually use drugs, what is the probability that an athlete who tested positive is a non user?
+
 
+
Let <math>D</math> be the event that the athlete used drugs.
+
 
+
Therefore, <math>D'</math> is the event that the athlete did not use drugs.
+
 
+
Let <math>Y</math> be the event that the test result was positive.
+
 
+
Therefore, a negative result is described by the event <math>Y'</math>.
+
 
+
From the problem statement, we can infer the following.
+
 
+
<math>P[D] = 0.05</math>
+
 
+
<math>P[D'] = 1 - P[D] = 0.95</math>
+
 
+
<math>P[Y|D] = 0.97</math> (i.e. the probability of a positive test given the athlete used drugs)
+
 
+
<math>P[Y'|D] = 1 - P[Y|D] = 0.03</math> (i.e. the probability of a negative test given the athlete used drugs)
+
  
<math>P[Y|D'] = 0.1</math> (i.e. the probability that the test was positive given the athlete did not take drugs)
 
  
<math>P[Y'|D'] = 1 - P[Y|D'] = 0.9</math> (i.e. the probability of a negative test given the athlete did not use drugs)
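
The quantities listed above are enough to answer the question by applying Bayes' Theorem to <math>P[D'|Y]</math>. The short sketch below does the arithmetic; the variable names are chosen here for illustration, and the numbers are the ones given in the problem statement.

```python
# Values taken from the problem statement
p_d = 0.05                 # P[D]: athlete used drugs
p_not_d = 1 - p_d          # P[D']
p_y_given_d = 0.97         # P[Y|D]: positive test given drug use
p_y_given_not_d = 0.10     # P[Y|D']: positive test given no drug use

# Theorem of total probability: P[Y] = P[Y|D]P[D] + P[Y|D']P[D']
p_y = p_y_given_d * p_d + p_y_given_not_d * p_not_d

# Bayes' Theorem: P[D'|Y] = P[Y|D']P[D'] / P[Y]
p_not_d_given_y = p_y_given_not_d * p_not_d / p_y

print(round(p_y, 4), round(p_not_d_given_y, 3))
```

Under these numbers <math>P[Y] = 0.1435</math> and <math>P[D'|Y] \approx 0.66</math>, so roughly two out of three positive results come from athletes who did not use drugs.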
+
<div style="font-family: Verdana, sans-serif; font-size: 14px; text-align: justify; width: 70%; margin: auto; border: 1px solid #aaa; padding: 2em;">
 +
The Spring 2013 Math Squad was supported by an anonymous [https://www.projectrhea.org/learning/donate.php gift] to [https://www.projectrhea.org/learning/about_Rhea.php Project Rhea]. If you enjoyed reading these tutorials, please help Rhea "help students learn" with a [https://www.projectrhea.org/learning/donate.php donation] to this project. Your [https://www.projectrhea.org/learning/donate.php contribution] is greatly appreciated.
 +
</div>

Latest revision as of 13:08, 25 November 2013


Bayes' Theorem

by Maliha Hossain, proud Member of the Math Squad.


keyword: probability, Bayes' Theorem, Bayes' Rule 

INTRODUCTION

Bayes' Theorem (or Bayes' Rule) allows us to calculate P(A|B) from P(B|A) given that P(A) and P(B) are also known. In this tutorial, we will derive Bayes' Theorem and illustrate it with a few examples. After going over the examples, if you have any questions or if you find any mistakes please leave me a comment at the end of the relevant section.

Note that this tutorial assumes familiarity with conditional probability and the axioms of probability. If you are interested in the derivation of the conditional distributions for continuous and discrete random variables, you may wish to go over Professor Mary Comer's notes on the subject.

 Contents
- Bayes' Theorem
- Proof
- Example Problems
- References

Bayes' Theorem

Let $ B_1, B_2, \ldots, B_n $ be a partition of the sample space $ S $, i.e. $ B_1, B_2, \ldots, B_n $ are mutually exclusive events whose union equals $ S $. Suppose that the event $ A $ occurs, with $ P[A] > 0 $. Then, by Bayes' Theorem, we have that

$ P[B_j|A] = \frac{P[A|B_j]P[B_j]}{P[A]}, \quad j = 1, 2, \ldots, n $

Bayes' Theorem is also often expressed in the following form:

$ P[B_j|A] = \frac{P[A|B_j]P[B_j]}{\sum_{k=1}^n P[A|B_k]P[B_k]} $
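
The second form translates directly into a few lines of code. The following is a minimal sketch, not a reference implementation: the function name `bayes_posterior` and the three-event partition used to exercise it are hypothetical and serve only to illustrate the formula.

```python
from typing import List, Sequence


def bayes_posterior(priors: Sequence[float], likelihoods: Sequence[float]) -> List[float]:
    """Return P[B_j|A] for every j, given P[B_j] (priors) and P[A|B_j] (likelihoods)."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Numerator of Bayes' Theorem for each j: P[A|B_j] * P[B_j]
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    # Denominator: P[A], obtained by summing the numerators over the partition
    p_a = sum(joint)
    if p_a == 0:
        raise ValueError("P[A] is zero, so P[B_j|A] is undefined")
    return [x / p_a for x in joint]


# Hypothetical partition B_1, B_2, B_3 (values are illustrative only)
posterior = bayes_posterior(priors=[0.5, 0.3, 0.2], likelihoods=[0.1, 0.4, 0.7])
print(posterior)
```

Because the denominator is the sum of the numerators, the returned probabilities always add up to 1, which makes a convenient check on any hand calculation.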


Proof

We will now derive Bayes' Theorem as it is expressed in the second form, which simply takes the expression one step further than the first.

Let $ A $ and $ B_j $ be as defined above. By definition of the conditional probability, we have that

$ P[A|B_j] = \frac{P[A\cap B_j]}{P[B_j]} $

Multiplying both sides by $ P[B_j] $, we get

$ P[A\cap B_j] = P[A|B_j]P[B_j] \ $

Using the same argument as above, we have that

$ \begin{align} P[B_j|A] & = \frac{P[B_j\cap A]}{P[A]} \\ \Rightarrow P[B_j\cap A] &= P[B_j|A]P[A] \end{align} $

Because intersection is commutative, $ A\cap B_j = B_j\cap A $, so the two expressions for the probability of the intersection must be equal:

$ P[B_j|A]P[A] = P[A|B_j]P[B_j] \ $

Dividing both sides by $ P[A] $, we get

$ P[B_j|A] = \frac{P[A|B_j]P[B_j]}{P[A]} $

Finally, the denominator can be broken down further using the theorem of total probability so that we have the following expression

$ P[B_j|A] = \frac{P[A|B_j]P[B_j]}{\sum_{k=1}^n P[A|B_k]P[B_k]} $
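
For readers who want the total probability step spelled out, the expansion of the denominator can be sketched as follows. It uses only the partition property of $ B_1, B_2, \ldots, B_n $ stated earlier and the product form $ P[A\cap B_k] = P[A|B_k]P[B_k] $ obtained in the proof.

```latex
\begin{align}
A &= A \cap S = A \cap \left( \bigcup_{k=1}^{n} B_k \right) = \bigcup_{k=1}^{n} (A \cap B_k) \\
P[A] &= \sum_{k=1}^{n} P[A \cap B_k]
  \qquad \text{(the events } A \cap B_k \text{ are mutually exclusive)} \\
     &= \sum_{k=1}^{n} P[A|B_k]\,P[B_k]
\end{align}
```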


Example Problems

Example 1: Quality Control

Example 2: False Positive Paradox

Example 3: Monty Hall Problem


References

  • Alberto Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, Third Edition

Questions and comments

If you have any questions, comments, etc. please post them below:

  • Comment / question 1



The Spring 2013 Math Squad was supported by an anonymous gift to Project Rhea. If you enjoyed reading these tutorials, please help Rhea "help students learn" with a donation to this project. Your contribution is greatly appreciated.
