Image Processing

Lecture 10: Image Analysis


Feature recognition and classification

Modified by:
Assoc. Prof. Dr Hossam Mahmoud Moftah
Faculty of Computers and Artificial Intelligence, Beni-Suef University
Boundary Descriptors Techniques (lab assignment using python)
 There are several simple geometric measures that can be useful for describing a boundary (a Python sketch follows this slide):
 Length
 the number of pixels along a boundary gives a rough approximation of its length
 Diameter (Major Axis)
 the longest line between two points on the boundary
 Minor Axis
 the line perpendicular to the major axis
 Eccentricity
 ratio of the major axis to the minor axis

Adapted from Rushin Shah, Site Supervisor at Adani Shantigram
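A minimal Python sketch of these boundary descriptors (assuming scikit-image is available; the elliptical toy mask and variable names are illustrative, and eccentricity is computed as the major/minor ratio used on this slide rather than scikit-image's own ellipse eccentricity):

    import numpy as np
    from skimage import measure

    # Toy binary mask containing one elliptical region (illustrative only).
    mask = np.zeros((100, 100), dtype=np.uint8)
    rr, cc = np.ogrid[:100, :100]
    mask[((rr - 50) / 30) ** 2 + ((cc - 50) / 15) ** 2 <= 1] = 1

    props = measure.regionprops(measure.label(mask))[0]
    length = props.perimeter           # boundary length (rough pixel-count approximation)
    major = props.major_axis_length    # diameter (major axis)
    minor = props.minor_axis_length    # minor axis, perpendicular to the major axis
    eccentricity = major / minor       # ratio of major to minor axis
    print(length, major, minor, eccentricity)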


Region Descriptors Techniques
 Shape features (lab assignment using python)
 Area and perimeter

Adapted from Idar Dyrdal
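A minimal sketch of these two shape features (assuming scikit-image; the square mask is a toy example):

    import numpy as np
    from skimage import measure

    mask = np.zeros((64, 64), dtype=np.uint8)
    mask[16:48, 16:48] = 1                  # toy 32x32 square region

    props = measure.regionprops(measure.label(mask))[0]
    print(props.area)       # area: number of pixels in the region (1024 here)
    print(props.perimeter)  # perimeter: length of the region boundary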


Region Descriptors Techniques
 Texture features
 One of the simplest sets of statistical features for texture description consists of the following histogram-based descriptors of the image (or region):
 mean, variance (or its square root, the standard deviation), skew, energy (used as a measure of uniformity), and entropy

Adapted from Rushin Shah, Site Supervisor at Adani Shantigram
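A minimal Python sketch of these descriptors for the lab assignment (assuming an 8-bit grayscale image; the function name and the small epsilon guard in the skew are my own):

    import numpy as np

    def histogram_features(gray):
        """Histogram-based texture descriptors for an 8-bit grayscale image."""
        hist, _ = np.histogram(gray, bins=256, range=(0, 256))
        p = hist / hist.sum()                      # normalized histogram p(i)
        levels = np.arange(256)
        mean = np.sum(levels * p)
        variance = np.sum((levels - mean) ** 2 * p)
        std = np.sqrt(variance)
        skew = np.sum((levels - mean) ** 3 * p) / (std ** 3 + 1e-12)
        energy = np.sum(p ** 2)                    # uniformity: 1 for a constant image
        entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
        return mean, variance, std, skew, energy, entropy

    print(histogram_features(np.random.randint(0, 256, (64, 64))))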


Region Descriptors Techniques

Adapted from Rushin Shah, Site Supervisor at Adani Shantigram


Region Descriptors Techniques
 Histogram-based (statistical) features
 The energy descriptor
 provides another measure of how the pixel values are distributed along the gray-level range: images with a single constant value have maximum energy (i.e., energy = 1); images with few gray levels will have higher energy than those with many gray levels. The energy descriptor can be calculated as

 energy = Σ [p(i)]², summed over the gray levels i = 0, …, L−1

 where p(i) is the normalized histogram value for gray level i

Adapted from Rushin Shah, Site Supervisor at Adani Shantigram


Region Descriptors Techniques

Adapted from Rushin Shah, Site Supervisor at Adani Shantigram


Region Descriptors Techniques
 Histogram-based (statistical) features
 Roughness
 The variance is sometimes used as a normalized descriptor of roughness (R), defined as:

 R = 1 − 1 / (1 + σ²)

 where σ² is the variance normalized to the [0, 1] interval.

 R = 0 for areas of constant intensity, that is, smooth texture.

Adapted from Rushin Shah, Site Supervisor at Adani Shantigram


Region Descriptors Techniques

Adapted from Rushin Shah, Site Supervisor at Adani Shantigram


Texture Features Example

Adapted from Rushin Shah, Site Supervisor at Adani Shantigram


Texture Features Example

 The texture with the highest uniformity (energy) has the lowest entropy

Adapted from Rushin Shah, Site Supervisor at Adani Shantigram


Texture Features - Gray level co-occurrence Matrix

Adapted from Rushin Shah, Site Supervisor at Adani Shantigram


Texture Features - Gray level co-occurrence Matrix
(lab assignment using python)

Adapted from Rushin Shah, Site Supervisor at Adani Shantigram
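A minimal GLCM sketch for the lab assignment (assuming scikit-image ≥ 0.19, where the functions are named graycomatrix/graycoprops; the tiny 4-level image is a made-up textbook-style example):

    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    # Tiny image with 4 gray levels (values 0..3).
    img = np.array([[0, 0, 1, 1],
                    [0, 0, 1, 1],
                    [0, 2, 2, 2],
                    [2, 2, 3, 3]], dtype=np.uint8)

    # Count co-occurring pixel pairs one step to the right (distance=1, angle=0).
    glcm = graycomatrix(img, distances=[1], angles=[0], levels=4,
                        symmetric=True, normed=True)

    for prop in ("contrast", "homogeneity", "energy", "correlation"):
        print(prop, graycoprops(glcm, prop)[0, 0])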


Classification

 To design a classifier it is essential to have a training set of images
 supervised learning:
 the classes to which the images belong are known
 unsupervised learning:
 the classes are unknown
 training the classifier is the process of using data to determine the best set of features for a classifier

Adapted from Geoff Dougherty, Image Processing for Medical Applications, CAMBRIDGE, 2009
Difference between Classification and Regression in Machine
Learning

Adapted from Jason Brownlee PhD, https://machinelearningmastery.com/about/




Statistical classification

 There are two general approaches to statistical classification: parametric and nonparametric
 Parametric Methods:
 require probability distributions and estimate parameters derived from them, such as the mean and standard deviation, to provide a compact representation of the classes
 a fixed set of parameters is used to determine a probability model; this is also common in machine learning
 Examples: Logistic Regression, Naïve Bayes Model, etc.
Adapted from Geoff Dougherty, Image Processing for Medical Applications, CAMBRIDGE, 2009,
https://www.geeksforgeeks.org/difference-between-parametric-and-non-parametric-methods/
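As an illustrative sketch of a parametric classifier (assuming scikit-learn; the single-feature toy data are hypothetical), Gaussian Naïve Bayes is parametric in exactly this sense, summarizing each class by an estimated mean and variance:

    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    X = np.array([[1.0], [1.2], [0.9], [3.0], [3.2], [2.9]])  # single feature x
    y = np.array([0, 0, 0, 1, 1, 1])                          # two classes

    model = GaussianNB().fit(X, y)
    print(model.theta_)            # estimated per-class means
    print(model.predict([[2.0]]))  # assign a new x to the more probable class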
Statistical classification

Adapted from Geoff Dougherty, Image Processing for Medical Applications, CAMBRIDGE, 2009,
Statistical classification

 Parametric Methods:
 Consider the case where there are just two classes, class 1 (ω1) and class 2 (ω2), and a single feature, x.
 We have a training set, i.e. representative examples from both classes,
 so that we can measure the feature for both classes and construct probability distributions for each
 These are formally known as the probability density functions or class-conditional probabilities (Appendix B.3)
 p(x|ω1) and p(x|ω2), i.e. the probabilities of measuring the value x, given that the feature is in class 1 or class 2, respectively.
Adapted from Geoff Dougherty, Image Processing for Medical Applications, CAMBRIDGE, 2009,
Statistical classification

 Parametric Methods:
 If we have a large number of examples in each class, then the probability density functions will be Gaussian in shape (the Central Limit Theorem)
 The classification problem is: given another feature measurement, x, to which class does this feature belong?
 We want the posterior probability, P(ωi|x), i.e. the probability that, given a feature value of x, the feature belongs to class ωi.
 Probability theory, and specifically Bayes' Rule, relates the posterior probabilities to the class-conditional probabilities or likelihoods (the derivation is given in Appendix B.3):
Adapted from Geoff Dougherty, Image Processing for Medical Applications, CAMBRIDGE, 2009,
Statistical classification

 Parametric Methods:

 P(ωi|x) = p(x|ωi) · P(ωi) / p(x)

 where P(ωi) is the a priori or prior probability (i.e. the probability of being in class ω1 or ω2 based on the relative numbers of those classes in the population, prior to taking the test)

 and p(x) is often considered a mere scaling factor (the evidence) that guarantees that the posterior probabilities sum to unity

Adapted from Geoff Dougherty, Image Processing for Medical Applications, CAMBRIDGE, 2009,
Statistical classification

 Parametric Methods:

 We want to maximize the posterior probability, P(ωi|x),
 which is the same as maximizing p(x|ωi) · P(ωi), since p(x) is the same for both classes.
 Bayes' decision rule is:

 decide ω1 if p(x|ω1) · P(ω1) > p(x|ω2) · P(ω2); otherwise decide ω2

Adapted from Geoff Dougherty, Image Processing for Medical Applications, CAMBRIDGE, 2009,
Statistical classification

 Nonparametric Methods:
 In non-parametric methods, there is no need to make any assumption about parameters for the population being studied.

 Non-parametric methods are gaining popularity

 Examples: KNN, Decision Tree Model, etc.

Adapted from https://www.geeksforgeeks.org/difference-between-parametric-and-non-parametric-methods/


k-nearest-neighbor (k-NN) classifier
(lab assignment using python for medical application)
 K nearest neighbors (KNN) is a simple algorithm that stores all available cases and classifies new cases based on a similarity measure (distance function)

 A case is classified by a majority vote of its neighbors, with the case being assigned to the class most common among its K nearest neighbors as measured by a distance function.

 If K=1, then the case is simply assigned to the class of its nearest neighbor

Adapted from Bing Liu, CS583, UIC
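A minimal sketch for the lab assignment (assuming scikit-learn; the training data are the toy "acid durability / strength" points from the worked example later in this lecture):

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    X_train = np.array([[7, 7], [7, 4], [3, 4], [1, 4]])
    y_train = np.array(["BAD", "BAD", "GOOD", "GOOD"])

    knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean")
    knn.fit(X_train, y_train)
    print(knn.predict([[3, 7]]))  # majority vote of the 3 nearest neighbors -> ['GOOD']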


k-nearest-neighbor (k-NN) classifier

 Distance Function Measurements:
 Most common: Euclidean distance

 d(x, y) = √( Σᵢ (xᵢ − yᵢ)² )

 To classify a new input vector x, examine the k closest training data points to x and assign the object to the most frequently occurring class

Adapted from https://www.cut-the-knot.org/pythagoras/DistanceFormula.shtml

Adapted from David Sontag, New York University and Vibhav Gogate, Carlos Guestrin,
Mehryar Mohri, & Luke Zettlemoyer
k-nearest-neighbor (k-NN) classifier

Adapted from David Sontag, New York University and Vibhav Gogate, Carlos Guestrin,
Mehryar Mohri, & Luke Zettlemoyer
k-nearest-neighbor (k-NN) classifier

Adapted from Carla P. Gomes, gomes@cs.cornell.edu

KNN Example

Points   X1 (Acid Durability)   X2 (Strength)   Y (Classification)
P1       7                      7               BAD
P2       7                      4               BAD
P3       3                      4               GOOD
P4       1                      4               GOOD
P5       3                      7               ?

Adapted from Anand Bhosale presentation, International Institute of Information Technology, Innovation and Leadership
Euclidean Distance From Each Point

Adapted from Anand Bhosale presentation, International Institute of Information Technology, Innovation and Leadership
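With d = √((x1−a)² + (x2−b)²), the Euclidean distances from P5(3, 7) work out as follows:

d(P5, P1) = √((7−3)² + (7−7)²) = √16 = 4
d(P5, P2) = √((7−3)² + (4−7)²) = √25 = 5
d(P5, P3) = √((3−3)² + (4−7)²) = √9 = 3
d(P5, P4) = √((1−3)² + (4−7)²) = √13 ≈ 3.61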
3 Nearest Neighbours

Point                             P1      P2      P3      P4
                                  (7,7)   (7,4)   (3,4)   (1,4)
Euclidean distance from P5(3,7)   4       5       3       √13 ≈ 3.61
Class                             BAD     BAD     GOOD    GOOD

The 3 nearest neighbours of P5 are P3, P4, and P1: two GOOD and one BAD.

Adapted from Anand Bhosale presentation, International Institute of Information Technology, Innovation and Leadership
KNN Classification

Points   X1 (Durability)   X2 (Strength)   Y (Classification)
P1       7                 7               BAD
P2       7                 4               BAD
P3       3                 4               GOOD
P4       1                 4               GOOD
P5       3                 7               GOOD

Adapted from Anand Bhosale presentation, International Institute of Information Technology, Innovation and Leadership
KNN pseudocode

Adapted from Anand Bhosale presentation, International Institute of Information Technology, Innovation and Leadership
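A from-scratch sketch of the k-NN procedure in Python (the function and variable names are my own): compute the Euclidean distance to every training point, take the k smallest, and return the majority vote.

    import numpy as np
    from collections import Counter

    def knn_predict(X_train, y_train, x, k=3):
        """Classify x by majority vote among its k nearest training points."""
        distances = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to each training point
        nearest = np.argsort(distances)[:k]              # indices of the k closest points
        votes = Counter(y_train[i] for i in nearest)
        return votes.most_common(1)[0][0]

    X = np.array([[7, 7], [7, 4], [3, 4], [1, 4]])
    y = np.array(["BAD", "BAD", "GOOD", "GOOD"])
    print(knn_predict(X, y, np.array([3, 7]), k=3))      # -> GOOD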
Unsupervised methods

 With unsupervised classification, the class labels are unknown, and the data are plotted to see whether they cluster naturally.

 Examples of unsupervised methods:
 k-means clustering (see lecture 5)
 Hierarchical clustering

Adapted from Anand Bhosale presentation, International Institute of Information Technology, Innovation and Leadership
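A minimal sketch of both methods (assuming scikit-learn; the unlabeled toy points are hypothetical):

    import numpy as np
    from sklearn.cluster import KMeans, AgglomerativeClustering

    # Unlabeled feature vectors; we look for natural clusters.
    X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])

    print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X))  # k-means labels
    print(AgglomerativeClustering(n_clusters=2).fit_predict(X))            # hierarchical labels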
Measuring Classification Performance

Adapted from https://en.wikipedia.org/wiki/Confusion_matrix


The Confusion Matrix
(lab assignment using python for medical application)

Adapted from https://glassboxmedicine.com/2019/02/17/measuring-performance-the-confusion-matrix/
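A minimal sketch for the lab assignment (assuming scikit-learn; the label vectors are hypothetical, with 1 = diseased and 0 = healthy):

    from sklearn.metrics import confusion_matrix

    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

    # Rows are actual classes, columns are predicted classes:
    # [[TN, FP],
    #  [FN, TP]]
    print(confusion_matrix(y_true, y_pred))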


A receiver operating characteristic (ROC) curve
(lab assignment using python for medical application)
 The ROC curve is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.

 The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.

Adapted from https://en.wikipedia.org/wiki/Receiver_operating_characteristic
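A minimal sketch for the lab assignment (assuming scikit-learn; the labels and classifier scores are hypothetical):

    from sklearn.metrics import roc_curve, roc_auc_score

    y_true  = [0, 0, 1, 1, 0, 1, 1, 0]
    y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.3]  # e.g. predicted probabilities

    fpr, tpr, thresholds = roc_curve(y_true, y_score)  # TPR and FPR at each threshold
    print(fpr, tpr)
    print(roc_auc_score(y_true, y_score))              # area under the ROC curve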


A receiver operating characteristic (ROC) curve

Adapted from https://en.wikipedia.org/wiki/Receiver_operating_characteristic
Confusion Matrix example

Adapted from Poornima Singh, Sanjay Singh, and Gayatri S Pandi-Jain, Effective heart disease prediction system using data mining techniques, Int J Nanomedicine, 2018
The End
