
Revision as of 18:54, 29 April 2014


ROC curve and Neyman-Pearson Criterion

A slecture by ECE student

Partly based on the ECE662 Spring 2014 lecture material of Prof. Mireille Boutin.


 1. Outline of the slecture

The Receiver Operating Characteristic (ROC) curve is an important tool for visualizing the performance of a binary classifier. ROC curves originated in signal detection theory, which was developed during World War II for radar analysis [2]. This slecture covers:

  • Basics in measuring binary classification
  • A quick example about ROC in binary classification
  • Some statistics behind ROC curves
  • Neyman-Pearson Criterion



 2. Introduction

A ROC curve graphically shows the trade-off between the true positive rate (TPR) and the false positive rate (FPR). Now assume that we have a two-class prediction problem (binary classification), in which the outcomes are labeled either as class 1 (C1) or class 2 (C2). There are four possible outcomes from a binary classifier. If the outcome of a prediction is C1 and the actual value is also C1, it is called a true positive (TP); however, if the actual value is C2, it is called a false positive (FP). Conversely, a true negative (TN) occurs when both the prediction and the actual value are C2, and a false negative (FN) occurs when the prediction is C2 while the actual value is C1. The outcomes can be summarized in a contingency table, or confusion matrix:

A Confusion Matrix

 Actual class \ Predicted class |          C1          |          C2          | Total
 C1                             | True Positives (TP)  | False Negatives (FN) | P
 C2                             | False Positives (FP) | True Negatives (TN)  | N
 Total                          | P'                   | N'                   | All
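As an illustration (a minimal Python sketch, not part of the original slecture), the four counts can be tallied directly from lists of actual and predicted labels; here C1 is treated as the positive class:

```python
# Tally the four outcomes of a binary classifier into confusion-matrix counts.
# Class "C1" is treated as the positive class, "C2" as the negative class.

def confusion_counts(actual, predicted, positive="C1"):
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1          # hit
            else:
                fp += 1          # false alarm
        else:
            if a == positive:
                fn += 1          # miss
            else:
                tn += 1          # correct rejection
    return tp, fn, fp, tn

# Hypothetical example: 6 samples, 4 of class C1 and 2 of class C2
actual    = ["C1", "C1", "C1", "C1", "C2", "C2"]
predicted = ["C1", "C1", "C2", "C1", "C1", "C2"]
print(confusion_counts(actual, predicted))  # (3, 1, 1, 1)
```

The row sums give P and N, and the column sums give P' and N', as in the table above.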

Each element in the matrix gives the frequency of an outcome. In signal detection theory, each outcome has a different term: a true positive is also known as a hit, while a false positive is known as a false alarm; a true negative and a false negative are known as a correct rejection and a miss, respectively.
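From these counts, TPR = TP/P (the hit rate) and FPR = FP/N (the false-alarm rate). Sweeping a decision threshold over the classifier's scores and recording (FPR, TPR) at each threshold traces out the ROC curve. A minimal sketch, using hypothetical scores not from the slecture:

```python
# Trace ROC points by sweeping a decision threshold over classifier scores.
# TPR = TP / P (hit rate); FPR = FP / N (false-alarm rate).

def roc_points(scores, labels, positive=1):
    pos = sum(1 for y in labels if y == positive)   # P: actual positives
    neg = len(labels) - pos                          # N: actual negatives
    points = []
    # Use each observed score as a candidate threshold, highest first.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == positive)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y != positive)
        points.append((fp / neg, tp / pos))          # (FPR, TPR)
    return points

# Hypothetical scores: higher score = more confidence in the positive class
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1,   1,   0,   1,   0,   0]
print(roc_points(scores, labels))
```

Lowering the threshold moves the operating point up and to the right: both the hit rate and the false-alarm rate increase, which is exactly the trade-off the ROC curve depicts.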



Reference

[1] Mireille Boutin, "ECE662: Statistical Pattern Recognition and Decision Making Processes," Purdue University, Spring 2014.
[2] Jiawei Han. 2005. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
[3] Richard O. Duda, Peter E. Hart, and David G. Stork. 2000. Pattern Classification. Wiley-Interscience.
[4] Detection Theory. http://www.ece.iastate.edu/~namrata/EE527_Spring08/l5c_2.pdf.
[5] The Neyman-Pearson Criterion. http://cnx.org/content/m11548/1.2/.



Questions and comments

If you have any questions, comments, etc. please post them on this page.


