Professional Documents
Culture Documents
Pid Fuzzy Logic
Pid Fuzzy Logic
Confusion Matrix
● Confusion M atrix:
PREDICTED CLASS
Class=Yes Class=No
Class=Yes a b
ACTUAL
CLASS Class=No c d
PREDICTED CLASS
Class=Yes Class=No
Class=Yes a b
ACTUAL (TP) (FN)
CLASS
Class=No c d
(FP) (TN)
a+d TP + TN
Accuracy = =
a + b + c + d TP + TN + FP + FN
02/03/2018 Introduction to Data Mining, 2nd Edition 5
Alternative Measures
PREDICTED CLASS
Class=Yes Class=No
Class=Yes a b
ACTUAL
CLASS Class=No c d
a
Precision (p) =
a+c
a
Recall (r) =
a+b
2rp 2a
F - measure (F) = =
r + p 2a + b + c
02/03/2018 Introduction to Data Mining, 2nd Edition 8
Alternative Measures
10
Precision (p) = = 0.5
PREDICTED CLASS 10 + 10
Class=Yes Class=No 10
Recall (r) = =1
10 + 0
Class=Yes 10 0 2 *1* 0.5
ACTUAL F - measure (F) = = 0.62
CLASS Class=No 10 980 1 + 0.5
990
Accuracy = = 0.99
1000
Alternative Measures
10
Precision (p) = = 0.5
PREDICTED CLASS 10 + 10
Class=Yes Class=No 10
Recall (r) = =1
10 + 0
Class=Yes 10 0 2 *1* 0.5
ACTUAL F - measure (F) = = 0.62
CLASS Class=No 10 980 1 + 0.5
990
Accuracy = = 0.99
1000
PREDICTED CLASS 1
Precision (p) = =1
1+ 0
Class=Yes Class=No
1
Recall (r) = = 0.1
Class=Yes 1 9 1+ 9
ACTUAL 2 * 0.1*1
CLASS Class=No 0 990 F - measure (F) = = 0.18
1 + 0.1
991
Accuracy = = 0.991
1000
02/03/2018 Introduction to Data Mining, 2nd Edition 10
Alternative Measures
PREDICTED CLASS
Precision (p) = 0.8
Class=Yes Class=No
Recall (r) = 0.8
Class=Yes 40 10 F - measure (F) = 0.8
ACTUAL
CLASS Class=No 10 40 Accuracy = 0.8
Alternative Measures
PREDICTED CLASS
Precision (p) = 0.8
Class=Yes Class=No
Recall (r) = 0.8
Class=Yes 40 10 F - measure (F) = 0.8
ACTUAL
CLASS Class=No 10 40 Accuracy = 0.8
PREDICTED CLASS
Class=Yes Class=No Precision (p) =~ 0.04
Class=Yes 40 10 Recall (r) = 0.8
ACTUAL F - measure (F) =~ 0.08
CLASS Class=No 1000 4000
Accuracy =~ 0.8
PREDICTED CLASS
Yes No
ACTUAL
CLASS Yes TP FN
No FP TN
Alternative Measures
PREDICTED CLASS
Precision (p) =~ 0.04
Class=Yes Class=No
TPR = Recall (r) = 0.8
Class=Yes 40 10 FPR = 0.2
ACTUAL
CLASS Class=No 1000 4000 F - measure (F) =~ 0.08
Accuracy =~ 0.8
PREDICTED CLASS
Class=Yes Class=No Precision (p) = 0.5
Class=Yes 10 40
TPR = Recall (r) = 0.2
ACTUAL FPR = 0.2
CLASS Class=No 10 40
PREDICTED CLASS
Precision (p) = 0.5
Class=Yes Class=No
TPR = Recall (r) = 0.5
Class=Yes 25 25
ACTUAL FPR = 0.5
Class=No 25 25
CLASS
(TPR,FPR):
● (0,0): declare e verything
to be negative class
● (1,1): declare e verything
to be positive class
● (1,0): ideal
● Diagonal line:
– Random guessing
– Below diagonal line:
u prediction is o pposite
of the true class
x2 < 12.63
x1 < 7.24
x2 < 8.64
x1 < 13.29 x2 < 17.35
x1 < 12.11
x2 < 1.38 x1 < 6.56 x1 < 2.15
0.059 0.220
x1 < 18.88
x1 < 7.24
x2 < 8.64 0.071
0.107
x1 < 12.11
x2 < 1.38 0.727
0.164
x1 < 18.88
0.143 0.669 0.271
0.654 0
x2 < 12.63
x1 < 7.24
x2 < 8.64 0.071
0.107
x1 < 12.11
x2 < 1.38 0.727
0.164
x1 < 18.88
0.143 0.669 0.271
0.654 0
At threshold t:
TPR=0.5, FNR=0.5, FPR=0.12, TNR=0.88
02/03/2018 Introduction to Data Mining, 2nd Edition 21
● No model consistently
outperform the other
● M1 is better for
small FPR
● M2 is better for
large FPR
● Random guess:
§ Area = 0.5
TP 5 4 4 3 3 3 3 2 2 1 0
FP 5 5 4 4 3 2 1 1 0 0 0
TN 0 0 1 1 2 3 4 4 5 5 5
FN 0 1 1 2 2 2 2 3 3 4 5
TPR 1 0.8 0.8 0.6 0.6 0.6 0.6 0.4 0.4 0.2 0
ROC Curve:
● Cost-sensitive classification
– Misclassifying rare class as majority class is
more expensive than misclassifying majority
as rare class
● Sampling-based approaches
Cost Matrix
PREDICTED CLASS
Class=Yes Class=No
ACTUAL
CLASS Class=Yes f(Yes, Yes) f(Yes,No)
Sampling-based Approaches