An Introduction To ROC Curve (Receiver Operating Characteristics)


Ming-Chang Lee, Department of Information Management, Yu Da College of Business


Outline
1. Introduction
2. Create an ROC curve
3. Area Under an ROC Curve (AUC)
4. R-package ROCR demo
5. References


1. Introduction
History:
- Signal detection theory: hit rates and false alarm rates
Development:
- Diagnostic systems
- Medical decision making
- Machine learning


Classifier Performance
Problem: two-class classification
- Input: an instance I, whose actual class is in {p, n} (p: positive class, n: negative class).
- Classification model: maps the instance I to a predicted class in {Y, N}.


Confusion matrix (Contingency table)


Given a classifier and an instance, there are four possible outcomes:

                     True class
Predicted class      p (positive)       n (negative)
Y                    True Positives     False Positives
N                    False Negatives    True Negatives
Total                P                  N

P = True Positives + False Negatives
N = False Positives + True Negatives
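As a rough illustration of how the four cells of the table are tallied, the sketch below counts them from paired true and predicted labels. The function name `confusion_counts` and the sample labels are hypothetical, made up only for this example.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Tally TP, FN, FP, TN from true labels ('p'/'n') and predicted labels ('Y'/'N')."""
    pairs = Counter(zip(actual, predicted))
    return {
        "TP": pairs[("p", "Y")],  # positive instance predicted Y
        "FN": pairs[("p", "N")],  # positive instance predicted N
        "FP": pairs[("n", "Y")],  # negative instance predicted Y
        "TN": pairs[("n", "N")],  # negative instance predicted N
    }

# Made-up labels, for illustration only.
actual = ["p", "p", "p", "n", "n", "n", "n"]
predicted = ["Y", "Y", "N", "Y", "N", "N", "N"]

counts = confusion_counts(actual, predicted)
P = counts["TP"] + counts["FN"]  # column total for the true class p
N = counts["FP"] + counts["TN"]  # column total for the true class n
print(counts, "P =", P, "N =", N)
```

Note that N as a column total (all negative instances) is a different quantity from N as a predicted label.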


Performance index
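Counts from the confusion matrix: TP, FN, FP, TN.

$$\mathrm{TPR} = \frac{TP}{P} = \mathrm{Recall}, \qquad \mathrm{FPR} = \frac{FP}{N}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Accuracy} = \frac{TP + TN}{P + N}$$
$$\mathrm{Sensitivity} = \mathrm{Recall}, \qquad \mathrm{Specificity} = 1 - \mathrm{FPR}$$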
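The indices above can be computed directly from the four counts. The following is a minimal sketch, assuming all denominators are nonzero; the function name `performance_indices` and the counts passed to it are illustrative, not taken from a real experiment.

```python
def performance_indices(tp, fn, fp, tn):
    """Compute the performance indices from the four confusion-matrix counts."""
    p = tp + fn   # all actual positives
    n = fp + tn   # all actual negatives
    tpr = tp / p  # true positive rate = recall = sensitivity
    fpr = fp / n  # false positive rate
    return {
        "TPR": tpr,
        "FPR": fpr,
        "Precision": tp / (tp + fp),
        "Accuracy": (tp + tn) / (p + n),
        "Specificity": 1 - fpr,
    }

# Illustrative counts only.
idx = performance_indices(tp=5, fn=5, fp=1, tn=9)
for name, value in idx.items():
    print(f"{name}: {value:.3f}")
```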

ROC curve
Y axis: TPR (benefits, TP); X axis: FPR (costs, FP)
Corners of the ROC space: (0,0), (0,1), (1,0), (1,1)

Compare ROC curve

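[Figure: ROC graph comparing classifiers, with the diagonal y = x]

- (0,0): the number of positive predictions is 0, so there are no FP errors but also no TP.
- (0,1): perfect classification (classifier D in the figure).
- A location further to the northwest is better.
- Near the x axis and on the left side: "conservative" classifiers (e.g., A vs. B).
- Near the upper right-hand side: "liberal" classifiers.
- Lower right triangle (below y = x): worse than random guessing.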

2. Create an ROC curve


A ranking or scoring classifier can be used with a threshold to produce a binary classifier. If the classifier output is above the threshold, the classifier produces a Y; otherwise it produces an N.
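The thresholding step can be sketched in a few lines. The function name `classify` and the scores are hypothetical. The strict comparison follows the wording "above the threshold"; some formulations use "greater than or equal" instead.

```python
def classify(scores, threshold):
    """Turn scores into Y/N labels: Y when the score is above the threshold."""
    return ["Y" if s > threshold else "N" for s in scores]

# Hypothetical classifier scores.
scores = [0.9, 0.7, 0.54, 0.4, 0.1]

low = classify(scores, 0.5)   # a lower threshold labels more instances Y
high = classify(scores, 0.8)  # a higher threshold labels fewer instances Y
print(low)
print(high)
```

Each choice of threshold yields a different binary classifier, and therefore a different point in ROC space.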


Use thresholds to create ROC curve


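If threshold = 0.54:

              Predicted Y    Predicted N    Total
Actual p           5              5           10
Actual n           1              9           10
Total              6             14           20

$$x = \mathrm{FPR} = \frac{1}{10} = 0.1, \qquad y = \mathrm{TPR} = \frac{5}{10} = 0.5$$

This threshold gives the ROC point (0.1, 0.5).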


Conceptual Algorithm

Inputs:
- L: the set of test instances;
- f(i): the probabilistic classifier's estimate that instance i is positive;
- min and max: the smallest and largest values returned by f;
- increment: the smallest difference between any two f values.
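The algorithm itself is not reproduced in the text, so the following is only one plausible reading of the inputs listed above: sweep a threshold from min to max in steps of increment and record one (FPR, TPR) point per step. The function name, the "score at or above the threshold counts as Y" rule, the tolerance `eps`, and the sample data are all assumptions. The sketch also assumes at least two distinct scores and at least one instance of each class.

```python
import math

def roc_points_by_threshold(instances, f):
    """Sweep a threshold from min(f) to max(f); return one (FPR, TPR) point per step.

    instances: list of (item, is_positive) pairs; f: scoring function on items.
    """
    labeled = [(f(item), pos) for item, pos in instances]
    scores = sorted({s for s, _ in labeled})
    lo, hi = scores[0], scores[-1]
    # increment: smallest difference between any two distinct f values
    increment = min(b - a for a, b in zip(scores, scores[1:]))
    P = sum(1 for _, pos in labeled if pos)
    N = len(labeled) - P
    steps = math.floor((hi - lo) / increment + 1e-9)
    eps = 1e-12  # guards against floating-point rounding in the threshold grid
    points = []
    for k in range(steps + 1):
        t = lo + k * increment
        tp = sum(1 for s, pos in labeled if pos and s >= t - eps)
        fp = sum(1 for s, pos in labeled if not pos and s >= t - eps)
        points.append((fp / N, tp / P))
    return points

# Made-up data: each item is its own score, so f is the identity.
data = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.4, False), (0.3, False)]
points = roc_points_by_threshold(data, f=lambda score: score)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Every threshold requires a full pass over the instances, which is why this form is conceptual rather than practical.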


Practical Algorithm
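The practical algorithm is named here but not reproduced in the text. A commonly used efficient approach, described in Fawcett's ROC notes, sorts the instances by decreasing score and sweeps through them once; the sketch below follows that idea and may differ in detail from the version shown on the slide. The function name and data are hypothetical, and at least one instance of each class is assumed.

```python
def roc_points_sorted(scored):
    """scored: list of (score, is_positive). Returns ROC points from (0, 0) to (1, 1)."""
    P = sum(1 for _, pos in scored if pos)
    N = len(scored) - P
    ordered = sorted(scored, key=lambda pair: pair[0], reverse=True)
    points = []
    tp = fp = 0
    prev_score = None
    for score, pos in ordered:
        # Emit a point only when the score changes, so tied scores share one point.
        if score != prev_score:
            points.append((fp / N, tp / P))
            prev_score = score
        if pos:
            tp += 1
        else:
            fp += 1
    points.append((fp / N, tp / P))  # all instances counted: this is (1, 1)
    return points

# Made-up scores and labels.
scored = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.4, False), (0.3, False)]
curve = roc_points_sorted(scored)
print(curve)
```

After the sort, a single pass produces the whole curve, instead of one pass per threshold.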


3. Area Under an ROC Curve (AUC)


- AUC (Bradley, 1997)
- Related to the Wilcoxon test of ranks
- Area: classifier B > A
- Average performance: B > A
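To make the link between the area and the rank test concrete, the sketch below computes AUC in two ways: by the trapezoidal rule over ROC points, and as the fraction of (positive, negative) pairs in which the positive instance scores higher, which is the rank-statistic view. Function names and data are hypothetical, and the ROC points are hard-coded to match the made-up scores.

```python
def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve given (FPR, TPR) points."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

def auc_rank(scored):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 1/2."""
    pos = [s for s, is_pos in scored if is_pos]
    neg = [s for s, is_pos in scored if not is_pos]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up scores and labels, plus the ROC points they produce.
scored = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.4, False), (0.3, False)]
roc = [(0, 0), (0, 1 / 3), (0, 2 / 3), (1 / 3, 2 / 3), (1 / 3, 1), (2 / 3, 1), (1, 1)]

area = auc_trapezoid(roc)
rank = auc_rank(scored)
print(f"trapezoid AUC = {area:.4f}, rank-based AUC = {rank:.4f}")
```

On this data both computations give 8/9: of the nine positive-negative pairs, only the pair (0.6, 0.7) is ranked in the wrong order.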


4. R demo: ROCR package

- Package: ROCR
- Plot an ROC curve
- Plot: SVM vs. Neural Network


5. References
1. Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition, 30(7), 1145-1159.
2. Fawcett, T. (2003). ROC Graphs: Notes and Practical Considerations for Data Mining Researchers. HP Laboratories technical report.
3. Witten, I. H. and Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques, Second Edition. Morgan Kaufmann.
4. The magnificent ROC: http://www.anaesthetist.com/mnm/stats/roc/
THANKS
Q&A
Web: http://web.ydu.edu.tw/~alan9956/ Email: alan9956@webmail.ydu.edu.tw