Lecture 9
The Confusion Matrix
• Type I error (false positive): the model predicts the positive class when the true class is negative.
• Type II error (false negative): the model predicts the negative class when the true class is positive.
Performance Metrics
Gull Ahmed is the President of CESD and wants to know what people are saying
about his performance on LinkedIn. He builds a system that detects posts about CESD:
• Positive class: posts about CESD
• Negative class: all other posts

In one day LinkedIn has 1 million posts:
• 100 posts are about CESD
• the other 999,900 are not related to CESD

Gull's team created a classifier that classifies every post as "not related to CESD":

TRUE NEGATIVE: 999,900    FALSE NEGATIVE: 100    FALSE POSITIVE: 0    TRUE POSITIVE: 0

Accuracy = (tp + tn) / (tp + fp + tn + fn) = (0 + 999,900) / (0 + 0 + 999,900 + 100) = 99.99%

A BAD CLASSIFIER! Accuracy is not a good metric when the goal is to discover a rare
event, or whenever the classes are not at least roughly balanced in frequency.
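The arithmetic above can be checked with a short Python sketch (the counts are taken directly from the example):

```python
# Confusion-matrix counts from the CESD example:
# the classifier labels every one of the 1,000,000 posts as negative.
tp, fp, tn, fn = 0, 0, 999_900, 100

accuracy = (tp + tn) / (tp + fp + tn + fn)
print(f"accuracy = {accuracy:.2%}")  # accuracy = 99.99%
```

Despite the near-perfect accuracy, the classifier finds zero of the 100 relevant posts.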
Performance Metrics
• Accuracy = (tp + tn) / (tp + fp + tn + fn) = My Correct Answers / All Questions
• In what fraction of all cases am I correct in my classification?

• Precision = tp / (tp + fp) = True positives / My positives
• How much should you trust me when I say that something tests positive?
• What fraction of my positives are true positives?
Performance Metrics
• Recall = Sensitivity = tp / (tp + fn) = True positives / Real positives
• How much of reality has been covered by my positive output?

• Specificity = tn / (tn + fp) = True negatives / Real negatives
• How much of reality has been covered by my negative output?
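The four definitions can be written down as small Python functions (a minimal sketch using the tp/fp/tn/fn counts defined above):

```python
def precision(tp, fp):
    # Of everything I called positive, what fraction was truly positive?
    return tp / (tp + fp)

def recall(tp, fn):
    # Also called sensitivity: of the real positives, what fraction did I find?
    return tp / (tp + fn)

def specificity(tn, fp):
    # Of the real negatives, what fraction did I correctly reject?
    return tn / (tn + fp)

def accuracy(tp, fp, tn, fn):
    # Fraction of all predictions that were correct.
    return (tp + tn) / (tp + fp + tn + fn)
```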
Performance Metrics
• You are shown the marks of 21 students: 10 pass and 11 fail. Your task is to
  accept all passing students and reject the failing ones.

                 Actual Pass   Actual Fail
  Predicted Pass      5             2
  Predicted Fail      5             9
Performance Metrics
                 Actual Pass   Actual Fail
  Predicted Pass      5             2
  Predicted Fail      5             9

Precision = tp / (tp + fp) = 5/7 ≈ 71.4%
Specificity = tn / (tn + fp) = 9/11 ≈ 81.8%
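Plugging the counts from this table into the formulas (a quick check of the numbers above):

```python
# Pass/fail exercise: 21 students, tp = 5, fp = 2, fn = 5, tn = 9
tp, fp, fn, tn = 5, 2, 5, 9

precision   = tp / (tp + fp)   # 5/7  ≈ 71.4%
specificity = tn / (tn + fp)   # 9/11 ≈ 81.8%
recall      = tp / (tp + fn)   # 5/10 = 50.0%
print(f"precision={precision:.1%} specificity={specificity:.1%} recall={recall:.1%}")
```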
Performance Metrics
Actual Pass Actual Fail Actual Pass Actual Fail
Predicted Pass 1 0 Predicted Pass 10 11
Predicted Fail 9 11 Predicted Fail 0 0
𝑡𝑝 1 𝑡𝑝 10
𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = = = 100% 𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = = = 47.6%
𝑡𝑝 +𝑓𝑝 1 𝑡𝑝 +𝑓𝑝 21
𝑡𝑝 1 𝑡𝑝 10
𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦 = = = 10% 𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦 = = = 100%
𝑡𝑝 +𝑓𝑛 10 𝑡𝑝 +𝑓𝑛 10
𝑡𝑝 0
𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = = = 0%
𝑡𝑝 +𝑓𝑝 0
Actual Pass Actual Fail
Predicted Pass 0 0 𝑡𝑝 0
𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦 = = = 0%
𝑡𝑝 +𝑓𝑛 1
Predicted Fail 1 20
𝑡𝑝 + 𝑡𝑛 20
𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = = = 95%
𝑡𝑝 +𝑓𝑝 +𝑡𝑛 +𝑓𝑛 21
How do we know what is better?? A combined measure 9
Performance Metrics
• A combined measure that assesses the precision/recall (P/R) tradeoff is the
  F measure (F1 score):

  F = 2 / (1/P + 1/R) = 2PR / (P + R)

For the classifier with perfect precision but poor recall:

                 Actual Pass   Actual Fail
  Predicted Pass      1             0
  Predicted Fail      9            11

Precision = tp / (tp + fp) = 1/1 = 100%
Sensitivity = tp / (tp + fn) = 1/10 = 10%

F = 2PR / (P + R) = 2(1)(0.1) / (1 + 0.1) ≈ 0.18
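The F1 computation can be sketched as a small helper (with the usual convention that F1 is 0 when both P and R are 0):

```python
def f1(p, r):
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Perfect precision but only 10% recall still scores poorly:
print(round(f1(1.0, 0.1), 2))  # 0.18
```

The harmonic mean is dominated by the smaller of the two values, which is exactly why a degenerate classifier cannot score well on F1 the way it can on accuracy.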
Performance Metrics
                 Actual Pass   Actual Fail
  Predicted Pass      0             0
  Predicted Fail      1            20

Precision = tp / (tp + fp) = 0/0 (undefined; taken as 0%)
Sensitivity = tp / (tp + fn) = 0/1 = 0%
Accuracy = (tp + tn) / (tp + fp + tn + fn) = 20/21 ≈ 95%

F = 2PR / (P + R) = 2(0)(0) / (0 + 0) = 0 (by convention)

                 Actual Pass   Actual Fail
  Predicted Pass     10            11
  Predicted Fail      0             0

Precision = tp / (tp + fp) = 10/21 ≈ 0.476
Sensitivity = tp / (tp + fn) = 10/10 = 100%

F = 2PR / (P + R) = 2(0.476)(1) / (0.476 + 1) ≈ 0.65
F-β Measure

• F = 1 / (α/P + (1 − α)/R)

• F = (β² + 1)PR / (β²P + R), where β² = (1 − α)/α

• β > 1 weights recall more heavily, β < 1 weights precision more heavily,
  and β = 1 recovers the balanced F1 score.
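The weighted form can be sketched directly from the second formula above:

```python
def f_beta(p, r, beta):
    # Weighted harmonic mean of precision and recall.
    # beta > 1 weights recall more heavily; beta < 1 weights precision.
    b2 = beta ** 2
    return (b2 + 1) * p * r / (b2 * p + r)

print(round(f_beta(1.0, 0.1, beta=1), 2))  # 0.18 (same as F1)
print(round(f_beta(1.0, 0.1, beta=2), 2))  # 0.12 (recall weighted more)
```

With β = 2 the same classifier scores even lower, since its weak point (recall) now counts for more.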
Helpful Materials
• Book: Speech and Language Processing, Topics 4.7–4.9
• https://www.youtube.com/watch?v=Btdly0kKoic&list=PLnvLVSNZy9VLfLalXwCY0IasyKTKZboBQ&index=11
• https://www.javatpoint.com/precision-and-recall-in-machine-learning
• https://machinelearningmastery.com/precision-recall-and-f-measure-for-imbalanced-classification/
• https://blog.paperspace.com/deep-learning-metrics-precision-recall-accuracy/