Lecture 9
The Confusion Matrix
• Type I error (false positive): the model predicts the positive class when the true class is negative.
• Type II error (false negative): the model predicts the negative class when the true class is positive.
Performance Metrics
Gull Ahmed is the President of CESD and wants to know what people are saying
about his performance on LinkedIn. He builds a system that detects posts about CESD:
• Positive class: posts about CESD
• Negative class: all other posts

In one day LinkedIn has 1 million posts:
• 100 posts are about CESD
• the other 999,900 are not related to CESD

Gull's team created a classifier that classifies every post as "not related to CESD":

TRUE NEGATIVE: 999,900    FALSE NEGATIVE: 100    FALSE POSITIVE: 0    TRUE POSITIVE: 0

Accuracy = (tp + tn) / (tp + fp + tn + fn) = (0 + 999,900) / (0 + 0 + 999,900 + 100) = 99.99%

A BAD CLASSIFIER! Accuracy is not a good metric when the goal is to discover a rare
event, or whenever the classes are not at least roughly balanced in frequency.
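The arithmetic above can be checked with a short Python sketch (the counts are taken directly from the example):

```python
# Confusion-matrix counts from the CESD example:
# the classifier labels every one of the 1,000,000 posts as negative.
tp, fp, tn, fn = 0, 0, 999_900, 100

accuracy = (tp + tn) / (tp + fp + tn + fn)
print(f"accuracy = {accuracy:.2%}")  # accuracy = 99.99%
```

Despite the near-perfect accuracy, the classifier finds zero of the 100 relevant posts.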
Performance Metrics
• Accuracy = (tp + tn) / (tp + fp + tn + fn) = My Correct Answers / All Questions
• In what fraction of all cases am I correct in my classification?

• Precision = tp / (tp + fp) = True positives / My positives
• How much should you trust me when I say that something tests positive?
• What fraction of my positives are true positives?
Performance Metrics
• Recall = Sensitivity = tp / (tp + fn) = True positives / Real positives
• How much of reality has been covered by my positive output?

• Specificity = tn / (tn + fp) = True negatives / Real negatives
• How much of reality has been covered by my negative output?
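The four definitions can be written down as small Python functions (a minimal sketch using the tp/fp/tn/fn counts defined above):

```python
def precision(tp, fp):
    # Of everything I called positive, what fraction was truly positive?
    return tp / (tp + fp)

def recall(tp, fn):
    # Also called sensitivity: of the real positives, what fraction did I find?
    return tp / (tp + fn)

def specificity(tn, fp):
    # Of the real negatives, what fraction did I correctly reject?
    return tn / (tn + fp)

def accuracy(tp, fp, tn, fn):
    # Fraction of all predictions that were correct.
    return (tp + tn) / (tp + fp + tn + fn)
```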
Performance Metrics
• You are shown the marks of 21 students: 10 pass and 11 fail. Your task is to
  accept all passing students and reject the failing ones.

                 Actual Pass   Actual Fail
  Predicted Pass      5             2
  Predicted Fail      5             9
Performance Metrics
                 Actual Pass   Actual Fail
  Predicted Pass      5             2
  Predicted Fail      5             9

Precision = tp / (tp + fp) = 5/7 ≈ 71.4%
Specificity = tn / (tn + fp) = 9/11 ≈ 81.8%
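Plugging the counts from this table into the formulas (a quick check of the numbers above):

```python
# Pass/fail exercise: 21 students, tp = 5, fp = 2, fn = 5, tn = 9
tp, fp, fn, tn = 5, 2, 5, 9

precision   = tp / (tp + fp)   # 5/7  ≈ 71.4%
specificity = tn / (tn + fp)   # 9/11 ≈ 81.8%
recall      = tp / (tp + fn)   # 5/10 = 50.0%
print(f"precision={precision:.1%} specificity={specificity:.1%} recall={recall:.1%}")
```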
Performance Metrics
Actual Pass Actual Fail Actual Pass Actual Fail
Predicted Pass 1 0 Predicted Pass 10 11
Predicted Fail 9 11 Predicted Fail 0 0
𝑡𝑝 1 𝑡𝑝 10
𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = = = 100% 𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = = = 47.6%
𝑡𝑝 +𝑓𝑝 1 𝑡𝑝 +𝑓𝑝 21
𝑡𝑝 1 𝑡𝑝 10
𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦 = = = 10% 𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦 = = = 100%
𝑡𝑝 +𝑓𝑛 10 𝑡𝑝 +𝑓𝑛 10
𝑡𝑝 0
𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = = = 0%
𝑡𝑝 +𝑓𝑝 0
Actual Pass Actual Fail
Predicted Pass 0 0 𝑡𝑝 0
𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦 = = = 0%
𝑡𝑝 +𝑓𝑛 1
Predicted Fail 1 20
𝑡𝑝 + 𝑡𝑛 20
𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = = = 95%
𝑡𝑝 +𝑓𝑝 +𝑡𝑛 +𝑓𝑛 21
How do we know what is better?? A combined measure 9
Performance Metrics
• A combined measure that assesses the precision/recall (P/R) tradeoff is the
  F measure (F1 score):

  F = 2 / (1/P + 1/R) = 2PR / (P + R)

For the classifier with perfect precision but poor recall:

                 Actual Pass   Actual Fail
  Predicted Pass      1             0
  Predicted Fail      9            11

Precision = tp / (tp + fp) = 1/1 = 100%
Sensitivity = tp / (tp + fn) = 1/10 = 10%

F = 2PR / (P + R) = 2(1)(0.1) / (1 + 0.1) ≈ 0.18
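The F1 computation can be sketched as a small helper (with the usual convention that F1 is 0 when both P and R are 0):

```python
def f1(p, r):
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Perfect precision but only 10% recall still scores poorly:
print(round(f1(1.0, 0.1), 2))  # 0.18
```

The harmonic mean is dominated by the smaller of the two values, which is exactly why a degenerate classifier cannot score well on F1 the way it can on accuracy.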
Performance Metrics
                 Actual Pass   Actual Fail
  Predicted Pass      0             0
  Predicted Fail      1            20

Precision = tp / (tp + fp) = 0/0 (undefined; taken as 0%)
Sensitivity = tp / (tp + fn) = 0/1 = 0%
Accuracy = (tp + tn) / (tp + fp + tn + fn) = 20/21 ≈ 95%

F = 2PR / (P + R) = 2(0)(0) / (0 + 0) = 0 (by convention)

                 Actual Pass   Actual Fail
  Predicted Pass     10            11
  Predicted Fail      0             0

Precision = tp / (tp + fp) = 10/21 ≈ 0.476
Sensitivity = tp / (tp + fn) = 10/10 = 100%

F = 2PR / (P + R) = 2(0.476)(1) / (0.476 + 1) ≈ 0.65
F-β Measure

• F = 1 / (α/P + (1 − α)/R)

• F = (β² + 1)PR / (β²P + R), where β² = (1 − α)/α

• β > 1 weights recall more heavily, β < 1 weights precision more heavily,
  and β = 1 recovers the balanced F1 score.
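The weighted form can be sketched directly from the second formula above:

```python
def f_beta(p, r, beta):
    # Weighted harmonic mean of precision and recall.
    # beta > 1 weights recall more heavily; beta < 1 weights precision.
    b2 = beta ** 2
    return (b2 + 1) * p * r / (b2 * p + r)

print(round(f_beta(1.0, 0.1, beta=1), 2))  # 0.18 (same as F1)
print(round(f_beta(1.0, 0.1, beta=2), 2))  # 0.12 (recall weighted more)
```

With β = 2 the same classifier scores even lower, since its weak point (recall) now counts for more.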
Helpful Materials
• Book: Speech and Language Processing, Topics 4.7–4.9
• https://www.youtube.com/watch?v=Btdly0kKoic&list=PLnvLVSNZy9VLfLalXwCY0IasyKTKZboBQ&index=11
• https://www.javatpoint.com/precision-and-recall-in-machine-learning
• https://machinelearningmastery.com/precision-recall-and-f-measure-for-imbalanced-classification/
• https://blog.paperspace.com/deep-learning-metrics-precision-recall-accuracy/