
FACULTY DEVELOPMENT PROGRAMME ON
BIOINSPIRED MACHINE LEARNING
(SPONSORED BY DBT)

PRESENTED BY

Dr. K. MEENA
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
SCHOOL OF COMPUTING
VEL TECH RANGARAJAN DR. SAGUNTHALA R&D INSTITUTE OF SCIENCE AND TECHNOLOGY
OUTLINE
• Introduction to Machine Learning
• Applications
• Machine Learning Types
• Guidelines for Designing ML Experiments
• ANN, KNN (Supervised Learning)
• Hierarchical Clustering (Unsupervised)
• Practical Demo
ARTIFICIAL INTELLIGENCE, MACHINE LEARNING & DEEP LEARNING

Deep learning is a subset of machine learning, which is in turn a subset of artificial intelligence.
WHAT IS MACHINE LEARNING?

MACHINE LEARNING IS…
the study of computer algorithms that improve their performance at some task automatically through experience (Mitchell, 1997).
ML TYPES

• Supervised learning
• Unsupervised learning
• Semi-supervised learning
• Reinforcement learning
LEARNING SYSTEM MODEL
SUPERVISED LEARNING

 Training data include the desired outputs.
 The algorithm generates a function that maps inputs to the desired outputs.
 Example - classification: the learner is required to learn a function that maps a vector into one of several classes by looking at input-output examples of the function.
SUPERVISED LEARNING ALGORITHM

• Linear Regression
• Nearest Neighbor
• Artificial Neural Network (ANN)
• Gaussian Naive Bayes
• Decision Trees
• Support Vector Machine (SVM)
• Random Forest
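
A minimal sketch of this supervised workflow, using one of the algorithms listed above (Gaussian Naive Bayes); scikit-learn and its bundled Iris dataset are illustrative choices, not named in the slides:

```python
# Supervised-learning sketch: fit a classifier on labeled
# examples, then predict labels for unseen inputs.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)            # inputs and desired outputs
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = GaussianNB()
model.fit(X_train, y_train)                  # learn the input -> output mapping
print("test accuracy:", model.score(X_test, y_test))
```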
UNSUPERVISED LEARNING
 Training data do not include the desired outputs.
 Clustering is an unsupervised learning task.
 There is no target value to shoot for.
 Identify groups of "similar" data points that are "dissimilar" from the others.
 Partition the data into groups (clusters) that satisfy these constraints:
   Points in the same cluster should be similar.
   Points in different clusters should be dissimilar.
UNSUPERVISED LEARNING ALGORITHM

• k-means clustering
• Association Rules
• Hierarchical Clustering
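
A small sketch of two of the listed algorithms; scikit-learn and the synthetic blob data are illustrative assumptions, not from the slides:

```python
# Unsupervised sketch: cluster unlabeled points with k-means
# and with agglomerative (hierarchical) clustering.
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)  # labels unused

kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
hier_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)

print(kmeans_labels[:10], hier_labels[:10])
```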
DIFFERENCES:
 Supervised learning: discover patterns in the data that relate data attributes to a target (class) attribute.
 These patterns are then used to predict the values of the target attribute in future data instances.
 Unsupervised learning: the data have no target attribute.
 We want to explore the data to find some intrinsic structure in them.
SEMI-SUPERVISED LEARNING
 Training data include only a few desired outputs.
 Unlabeled data, when used in conjunction with a small amount of labeled data, can produce considerable improvement in learning accuracy.
 Labeling is expensive and difficult.
 Labeling can be unreliable (e.g., segmentation applications may need multiple experts).
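
A sketch of the idea, assuming scikit-learn's LabelSpreading (one of several semi-supervised methods; the slides do not name a specific one). Unlabeled points are marked with -1:

```python
# Semi-supervised sketch: a few labeled points plus many
# unlabeled ones can still train a useful classifier.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelSpreading

X, y = load_iris(return_X_y=True)
y_partial = y.copy()
rng = np.random.RandomState(0)
unlabeled = rng.rand(len(y)) < 0.9           # hide ~90% of the labels
y_partial[unlabeled] = -1                    # -1 marks "no desired output"

model = LabelSpreading().fit(X, y_partial)
print("accuracy on the hidden labels:",
      (model.transduction_[unlabeled] == y[unlabeled]).mean())
```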
REINFORCEMENT LEARNING

 Rewards come from a sequence of actions.
 Decision making (robot, chess machine).
 Learn actions that maximize the payoff.
 There is not much information in a payoff signal, and the payoff is often delayed.
 Learning from reinforcement or (occasional) rewards is the most general form of learning.
 We only get feedback in the form of how well we are doing, not what we should be doing.
 There is no supervised output, only a delayed reward.
REINFORCEMENT LEARNING CONTD…

 Receive rewards from sequential actions.
 Learn a policy of how to act given an observation of the world.
 Every action has some impact on the environment.
 The environment provides feedback that guides the learning algorithm.
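
A minimal sketch of one reinforcement-learning method, tabular Q-learning, on a hypothetical 5-state corridor where the reward arrives only at the goal; the environment and all constants here are illustrative, not from the slides:

```python
# Q-learning sketch: reward arrives only at the right end of a
# 5-state corridor, so the agent learns from a delayed payoff.
import random

random.seed(0)
N_STATES, ACTIONS = 5, (-1, +1)              # actions: move left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1        # learning rate, discount, exploration

for _ in range(500):                         # episodes
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy action selection
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == N_STATES - 1 else 0.0   # delayed reward at the goal
        best_next = max(Q[(s_next, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# the learned policy moves right (+1) in every non-terminal state
print([max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)])
```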
GUIDELINES FOR DESIGNING ML EXPERIMENTS

1. Aim of the Study
  Objective
  Expected error

2. Selection of the Response Variable
  Performance metrics
  Confusion matrix
  Accuracy
  Precision
  Recall
CONFUSION MATRIX

For a binary classifier, the standard 2x2 layout:

                      Predicted Positive     Predicted Negative
Actual Positive       True Positive (TP)     False Negative (FN)
Actual Negative       False Positive (FP)    True Negative (TN)
ACCURACY

Accuracy = (TP + TN) / (TP + TN + FP + FN), the fraction of all cases that are classified correctly.
PRECISION

Precision = TP / (TP + FP), the fraction of predicted positives that are truly positive.
RECEIVER OPERATING CHARACTERISTIC (ROC)

The ROC curve plots the true positive rate against the false positive rate as the classification threshold varies.
RECALL

Recall = TP / (TP + FN), the fraction of actual positives that are correctly identified.
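
A sketch tying the metrics above together, assuming scikit-learn; the labels and scores here are made up for illustration:

```python
# Metrics sketch: confusion matrix, accuracy, precision, recall,
# and the points of the ROC curve, on made-up binary labels.
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, roc_curve)

y_true  = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred  = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.6, 0.3, 0.95, 0.05]

print(confusion_matrix(y_true, y_pred))      # rows: actual, cols: predicted
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
fpr, tpr, thresholds = roc_curve(y_true, y_score)  # ROC curve points
```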
GUIDELINES FOR DESIGNING ML EXPERIMENTS (CONTD…)

3. Choice of Factors and Levels
4. Choice of Experimental Design
5. Performing the Experiment
6. Statistical Analysis of the Data
7. Conclusions and Recommendations
APPLICATIONS OF MACHINE LEARNING
MACHINE LEARNING IN COMPUTER SCIENCE

[Figure: machine learning at the hub of related areas]
• Speech/Audio Processing
• Planning and Locomotion
• Vision/Image Processing
• Natural Language Processing
• Biomedical/Chemical Informatics
• Financial Modeling
• Human-Computer Interaction
• Analytics
SAMPLE APPLICATIONS
• Web Search
• Computational Biology
• Finance
• E-commerce
• Space Exploration
• Robotics
• Information Extraction
• Social Networks
• Debugging Software
• [Your Favorite Area]

SUCCESSFUL APPLICATIONS OF ML
 Learning to recognize spoken words - SPHINX (Lee, 1989)
 Learning to drive an autonomous vehicle - ALVINN (Pomerleau, 1989)
 Learning to classify celestial objects (Fayyad et al., 1995)
 Learning to play world-class backgammon - TD-GAMMON (Tesauro, 1992)
 Designing the morphology and control structure of electro-mechanical artifacts - GOLEM (Lipson & Pollack, 2000)
MACHINE LEARNING APPLICATIONS
 Computer vision and robotics:
  detection, recognition, and categorization of objects
  face recognition
  tracking objects (rigid and articulated) in video
  modeling visual attention
 Speech recognition
 Information retrieval, Web search, Google ads...
MACHINE LEARNING APPLICATIONS
 Biology and medicine:
  drug discovery
  computational genomics (analysis and design)
  medical imaging and diagnosis
 Financial industry:
  fraud detection
  credit approval
  price and market prediction
MACHINE LEARNING APPLICATIONS
 Automating employee access granting and revocation
  Amazon, using its large dataset of employee roles and employee access levels, trains a machine learning algorithm that predicts which employees should be granted access to which resources,
  minimizing the human involvement required to grant or revoke employee access.
MACHINE LEARNING APPLICATIONS
 Protecting Animals
  Cornell University - an algorithm to identify whales in the ocean from audio recordings so that ships can avoid hitting them.
  Oregon State University - an algorithm that determines which bird species are present in a given audio recording collected in field conditions.
MACHINE LEARNING APPLICATIONS
 Identifying Heart Failure
  A machine learning algorithm that combs through physicians' free-form text notes (in electronic health records) and synthesizes the text using Natural Language Processing (NLP), much as a cardiologist can read through another physician's notes and reach the same conclusions.
 Predicting Hospital Re-admissions - Additive Analytics
  A predictive model that identifies which patients are at high risk of readmission; it can predict emergency room admissions before they happen, improving care outcomes and reducing costs.
ALGORITHMS: K NEAREST NEIGHBORS
SIMPLE ANALOGY…
• Tell me about your friends (who your neighbors are) and I will tell you who you are.
INSTANCE-BASED LEARNING

It's very similar to a desktop!!
KNN – DIFFERENT NAMES

• K-Nearest Neighbors
• Memory-Based Reasoning
• Example-Based Reasoning
• Instance-Based Learning
• Lazy Learning
WHAT IS KNN?

• A powerful classification algorithm used in pattern recognition.
• K nearest neighbors stores all available cases and classifies new cases based on a similarity measure (e.g., a distance function).
• One of the top data mining algorithms used today.
• A non-parametric, lazy learning algorithm (an instance-based learning method).
KNN: CLASSIFICATION APPROACH

• An object (a new instance) is classified by a majority vote of its neighbors' classes.
• The object is assigned to the most common class amongst its K nearest neighbors (measured by a distance function).
DISTANCE MEASURE

[Figure: compute the distance from the test record to the training records, then choose the k "nearest" records.]
DISTANCE BETWEEN NEIGHBORS

• Calculate the distance between the new example (E) and all examples in the training set.
• Euclidean distance between two examples:
  – X = [x1, x2, x3, ..., xn]
  – Y = [y1, y2, y3, ..., yn]
  – The Euclidean distance between X and Y is defined as

    $D(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
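
The same distance written as a plain Python function (a direct transcription of the formula above):

```python
# Euclidean distance: D(X, Y) = sqrt(sum over i of (x_i - y_i)^2)
import math

def euclidean(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

print(euclidean([35, 35, 3], [37, 50, 2]))   # ~15.17, cf. the example below
```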
K-NEAREST NEIGHBOR ALGORITHM

• All the instances correspond to points in an n-dimensional feature space.
• Each instance is represented by a set of numerical attributes.
• The training data consist of a set of vectors and a class label associated with each vector.
• Classification is done by comparing the feature vectors of the K nearest points.
• Select the K nearest examples to E in the training set.
• Assign E to the most common class among its K nearest neighbors.
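
A from-scratch sketch of these steps; the tiny training set and its class labels below are made up purely for illustration:

```python
# k-NN sketch: store all training vectors, rank them by Euclidean
# distance to the query E, and take a majority vote among the K nearest.
from collections import Counter
import math

def euclidean(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, class_label) pairs."""
    nearest = sorted(train, key=lambda ex: euclidean(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]        # most common class wins

train = [([1.0, 1.1], "A"), ([1.2, 0.9], "A"),
         ([4.8, 5.1], "B"), ([5.0, 4.9], "B"), ([4.9, 5.3], "B")]
print(knn_predict(train, [1.1, 1.0], k=3))   # -> "A"
```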
3-KNN: EXAMPLE

sqrt[(35 - 37)² + (35 - 50)² + (3 - 2)²] = 15.16
sqrt[(22 - 37)² + (50 - 50)² + (2 - 2)²] = 15
sqrt[(63 - 37)² + (200 - 50)² + (1 - 2)²] = 152.23
sqrt[(59 - 37)² + (170 - 50)² + (1 - 2)²] = 122
sqrt[(25 - 37)² + (40 - 50)² + (4 - 2)²] = 15.74

Predicted class for the query: YES (majority vote of the 3 nearest neighbors, i.e., the examples at distances 15, 15.16, and 15.74).
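
The five distances above can be checked directly; the feature vectors and the query (37, 50, 2) come from the example, though what the three features represent is not stated on the slide:

```python
# Reproducing the five distances from the 3-KNN example.
import math

examples = [(35, 35, 3), (22, 50, 2), (63, 200, 1), (59, 170, 1), (25, 40, 4)]
query = (37, 50, 2)

for ex in examples:
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(ex, query)))
    print(ex, round(d, 2))   # 15.17, 15.0, 152.24, 122.0, 15.75 (slide truncates)
```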
HOW TO CHOOSE K?
• If K is too small, the classifier is sensitive to noise points.
• A larger K works well, but too large a K may include majority points from other classes.
• A rule of thumb is K < sqrt(n), where n is the number of examples.

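A common way to pick K in practice, sketched with scikit-learn's cross-validation (an illustrative approach and dataset, not prescribed by the slide):

```python
# Choose K by cross-validating several candidates, staying below
# the rule-of-thumb bound K < sqrt(n).
import math
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
k_max = int(math.sqrt(len(X)))               # rule of thumb: K < sqrt(n)
scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k),
                             X, y, cv=5).mean()
          for k in range(1, k_max + 1, 2)}   # odd K avoids ties in binary voting
print(max(scores, key=scores.get), scores)
```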
[Figure: a query point x with (a) its 1-nearest neighbor, (b) 2-nearest neighbors, (c) 3-nearest neighbors.]

The k-nearest neighbors of a record x are the data points that have the k smallest distances to x.
STRENGTHS OF KNN
• Very simple and intuitive.
• Can be applied to data from any distribution.
• Gives good classification if the number of samples is large enough.

WEAKNESSES OF KNN
• Takes more time to classify a new example: the distance from the new example to every stored example must be calculated and compared.
• Choosing k may be tricky.
• Needs a large number of samples for accuracy.
