Professional Documents
Culture Documents
UNIT 4: Evaluation Metrics
Evaluation metrics are used to measure the quality of a statistical or
machine learning model.
Many different evaluation metrics are available to test a model.
Some examples are:
1. Classification accuracy
2. Logarithmic loss
3. Confusion matrix
Classification accuracy is the ratio of the number of correct predictions to
the total number of input samples.
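As a minimal sketch of this definition, accuracy can be computed directly as the fraction of matching labels. The labels below are invented purely for illustration.

```python
# Classification accuracy = correct predictions / total input samples.
# y_true and y_pred are made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)
print(accuracy)  # 5 of 6 predictions match
```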
Logarithmic loss, also called log loss, works by penalizing false
classifications.
A confusion matrix gives us a matrix as output and describes the complete
performance of the model.
Evaluation typically involves using a combination of these individual
metrics to test a model or algorithm.
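To illustrate how these individual metrics combine, the sketch below builds a 2x2 confusion matrix by hand and derives several metrics from its four counts. The labels are invented, and precision (not listed above) is included only because the F1 score is defined from it.

```python
# A minimal sketch of a binary confusion matrix (1 = positive, 0 = negative).
# Labels are illustrative, not from a real model.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # true negatives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

# Several metrics are combinations of these four counts:
recall = tp / (tp + fn)            # also called sensitivity / TPR
specificity = tn / (tn + fp)
precision = tp / (tp + fp)
f1 = 2 * precision * recall / (precision + recall)
print(tp, tn, fp, fn, recall, specificity, f1)
```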
Evaluation metrics for Classification, Regression & Clustering
For classification, some of the important evaluation metrics are:
1. Confusion matrix
2. Recall, Sensitivity & Specificity
3. F1 score
4. AUC-ROC (Area under the ROC curve)
The ROC curve is a plot of the FPR (false positive rate) on the x-axis
against the TPR (true positive rate) on the y-axis for different thresholds
ranging from 0.0 to 1.0.
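The threshold sweep described above can be sketched in a few lines: for each threshold, predictions are thresholded and one (FPR, TPR) point of the ROC curve is computed. The probabilities and labels are illustrative, not from a real model.

```python
# Sketch: sweep thresholds over predicted probabilities and compute the
# (FPR, TPR) pairs that make up a ROC curve. Toy data only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

def roc_point(threshold):
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and t == 1 for p, t in zip(preds, y_true))
    fp = sum(p == 1 and t == 0 for p, t in zip(preds, y_true))
    tpr = tp / sum(y_true)                  # true positive rate
    fpr = fp / (len(y_true) - sum(y_true))  # false positive rate
    return fpr, tpr

for thr in (0.0, 0.35, 0.4, 0.8, 1.1):
    print(thr, roc_point(thr))
```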
Evaluation metrics for classification
AUC-ROC (Area under the ROC curve)
The AUC-ROC plot is one of the most popular metrics for assessing a
machine learning model's predictive capability.
Some reasons for using the AUC-ROC plot:
1. The curves of different machine learning models can be compared at
different thresholds.
2. The model's predictive capability is summarized by the area under the
curve (AUC).
3. AUC is scale-invariant: it measures the rank ordering of predictions
rather than their absolute values.
4. AUC reflects the quality of the model's predictions irrespective of
which classification threshold has been chosen.
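The scale-invariance in point 3 can be made concrete: AUC equals the probability that a randomly chosen positive sample is ranked above a randomly chosen negative one, so any monotonic rescaling of the scores leaves it unchanged. A sketch with invented scores:

```python
# Sketch: AUC via pairwise rank comparison of positives vs. negatives.
# Monotonic rescaling of the scores leaves the value unchanged, which is
# the scale-invariance property. Toy data only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

def auc(y, s):
    pos = [si for yi, si in zip(y, s) if yi == 1]
    neg = [si for yi, si in zip(y, s) if yi == 0]
    # Count positive-negative pairs ranked correctly (ties count half).
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc(y_true, scores))                     # 0.75
print(auc(y_true, [s * 100 for s in scores]))  # still 0.75 after rescaling
```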
Evaluation metrics for classification
Logarithmic Loss
Logarithmic loss, or log loss, works by penalizing false classifications.
It is the negative average of the log of the corrected predicted
probabilities for each instance, where the corrected probability is the
probability assigned to the instance's true class.
Log loss is well suited to multi-class classification problems, since it
takes into account the probabilities for all classes present in the sample.
Minimizing the log loss of a particular classifier yields better
predictive performance.
Evaluation metrics for classification
Logarithmic Loss
The mathematical formula is given below:

\[ \text{Log Loss} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij}\,\log(p_{ij}) \]

where N is the number of samples, M is the number of classes, y_{ij} is 1
if sample i belongs to class j (and 0 otherwise), and p_{ij} is the
predicted probability that sample i belongs to class j.
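A minimal sketch of this computation for the binary case, where the "corrected" probability is the probability assigned to the true class. The labels and probabilities are invented for illustration.

```python
import math

# Sketch: binary log loss as the negative average log of the corrected
# predicted probabilities. Toy values only.
y_true = [1, 0, 1, 1]
p_pred = [0.9, 0.1, 0.8, 0.6]  # predicted probability of class 1

# Corrected probability: probability assigned to the true class.
corrected = [p if y == 1 else 1 - p for y, p in zip(y_true, p_pred)]
log_loss = -sum(math.log(c) for c in corrected) / len(corrected)
print(log_loss)  # small value: the model assigns high probability to truth
```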
Evaluation metrics for Classification, Regression & Clustering
Silhouette
The silhouette score shows to what extent the distance between objects of
the same cluster differs from the mean distance between objects in
different clusters.
Values of the silhouette score lie between -1 and +1.
If the value is close to +1, it corresponds to a good clustering result
with dense, well-defined clusters.
If the value is close to -1, it represents bad clustering.
Therefore, the higher the silhouette value, the better the results from
clustering.