
MACHINE LEARNING

20CS3020AA – 3-0-4-4

III-I CSE-H

1
Machine Learning

2
What is Machine Learning?
[Diagram: machines that learn from experience vs. machines that follow instructions]

Experience == Data
3
Machine Learning
Machine learning is the systematic study of
algorithms and systems that improve their
knowledge or performance with experience.

4
Need for Machine Learning

5
Examples

6
Application of Machine Learning

7
Applications of Machine Learning

8
CO-1: Syllabus & Course Outcome
CO1: Introduction: Learning, Types of Machine Learning,
Supervised Learning: The Machine Learning Process,
Performance Measures, The Bias-Variance Tradeoff,
Learning with Trees: Using Decision Trees, Constructing
Decision Trees, Classification and Regression Trees (CART),
Turning Data into Probabilities: The Naïve Bayes’ Classifier,
Bayesian Networks. The EM Algorithm: Estimate Means of K
Gaussians, General Statement of EM Algorithm.

Course Outcome: Understand the basic terminology and measurements of Machine Learning and apply Machine Learning techniques using Tree and Bayesian models.
9
CO-2: Syllabus & Course Outcome
Neural Networks: The Brain and The Neuron, Neural
Networks, The Perceptron, Linear Separability, The Multi-
Layer Perceptron: Going Forwards, Going Backwards:
Back-Propagation of Error, The Multi-Layer Perceptron in
Practice, Deriving Back-Propagation. Support Vector
Machines: Optimal Separation, Kernels, The Support
Vector Machine Algorithm, Extensions to the SVM
Course Outcome: Build Neural Network and SVM
algorithms for solving classification and prediction
problems.
10
CO-3: Syllabus & Course Outcome
Dimensionality Reduction: The Curse of Dimensionality,
Linear Discriminant Analysis (LDA), Principal Components
Analysis (PCA), Evolutionary Learning: The Genetic
Algorithm (GA), Generating Offspring: Genetic Operators,
Using Genetic Algorithms. Ensemble Learning: Boosting,
Bagging and Random Forests.

Course Outcome: Apply Dimensionality reduction methods, Evolutionary learning and Ensemble methods to solve classification problems.
11
CO-4: Syllabus & Course Outcome
Unsupervised Learning: The k-means algorithm,
Hierarchical Clustering, The Self-Organising Feature Map.
Explanation-Based Learning, Reinforcement Learning and
Evaluating Hypotheses: Introduction, Learning Task, Q
Learning, Non Deterministic Rewards and Actions. Active
Reinforcement Learning, Generalization in
Reinforcement Learning.
Course Outcome: Illustrate different unsupervised models, Analytical, Explanation-Based and reinforcement learning methods.
12
MOOCs courses
• Machine Learning:
https://www.coursera.org/learn/machine-learning

• Applied Machine Learning Using Python:
https://www.coursera.org/programs/minor-

13
End Semester Summative Evaluation

Evaluation Type: End Semester Summative Evaluation (Total = 40%)

Component               Duration (Hours)  Weightage/Marks   CO1   CO2   CO3   CO4   CO5
Skill Sem-End Exam      180               Weightage: 10      -     -     -     -    10
                                          Max Marks: 50      -     -     -     -    50
End Semester Exam       180               Weightage: 20      5     5     5     5     -
                                          Max Marks: 100    25    25    25    25     -
Lab End Semester Exam   180               Weightage: 10      -     -     -     -    10
                                          Max Marks: 50      -     -     -     -    50

14
In Semester Formative Evaluation
Evaluation Type: In Semester Formative Evaluation (Total = 30%)

Component                              Duration (Hours)  Weightage/Marks   CO1     CO2     CO3     CO4     CO5
Global Challenges                      90                Weightage: 7.5    1.875   1.875   1.875   1.875    -
                                                         Max Marks: 100   25      25      25      25       -
Continuous Evaluation - Lab Exercise   120               Weightage: 7.5    -       -       -       -      7.5
                                                         Max Marks: 100    -       -       -       -      100
MOOCs Review                           90                Weightage: 7.5    1.875   1.875   1.875   1.875    -
                                                         Max Marks: 100   25      25      25      25       -
Skilling Continuous Evaluation         120               Weightage: 7.5    -       -       -       -      7.5
                                                         Max Marks: 100    -       -       -       -      100
15
In Semester Summative Evaluation

Evaluation Type: In Semester Summative Evaluation (Total = 30%)

Component                                   Duration (Hours)  Weightage/Marks   CO1    CO2    CO3    CO4    CO5
Skill In-Sem Exam                           120               Weightage: 5       -      -      -      -      5
                                                              Max Marks: 50      -      -      -      -     50
Semester-in Exam-I                          90                Weightage: 5       2.5    2.5    -      -      -
                                                              Max Marks: 50     25     25      -      -      -
Semester-in Exam-II                         90                Weightage: 5       -      -      2.5    2.5    -
                                                              Max Marks: 50      -      -     25     25      -
Lab In-Semester Exam                        120               Weightage: 5       -      -      -      -      5
                                                              Max Marks: 50      -      -      -      -     50
MOOCs Certification                         120               Weightage: 5       1.25   1.25   1.25   1.25   -
                                                              Max Marks: 100    25     25     25     25      -
Leaderboard Ranking for Global Challenges   120               Weightage: 5       1.25   1.25   1.25   1.25   -
                                                              Max Marks: 100    25     25     25     25      -

16
Text Books
• Stephen Marsland, “Machine Learning: An
Algorithmic Perspective”, CRC Press, (2009).

• Tom M. Mitchell, “Machine Learning”,
McGraw Hill, 1997.

17
Reference Books :
• 1. Peter Harrington, “Machine Learning in Action”, Manning
Publications.
• 2. Ethem Alpaydin, “Introduction to Machine Learning”, The
MIT Press, (2010).
• 3. Mark Lutz, “Programming Python”, O'Reilly.
• 4. Wesley J. Chun, “Core Python Programming”, 2nd Edition,
Pearson, 2007 Reprint.

Web Links :
• 1. Data Science and Machine Learning:
https://www.edx.org/course/data-science-machinelearning
• 2. Machine Learning:
https://www.ocw.mit.edu/courses/6-867-machine-learning-fall-2006/
18
TYPES OF MACHINE LEARNING
TECHNIQUES

19
Supervised learning
• Supervised learning: a training set of examples
with the correct responses (targets) is provided
and, based on this training set, the algorithm
generalises to respond correctly to all possible
inputs. This is also called learning from
exemplars.

20
SUPERVISED LEARNING

21
Unsupervised learning
• In unsupervised learning, correct responses are
not provided.
• The algorithm tries to identify similarities
between the inputs so that inputs that have
something in common are categorised together.
• The statistical approach to unsupervised
learning is known as density estimation.

22
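A minimal sketch of this idea, assuming scikit-learn and synthetic data invented for the example: k-means groups unlabelled inputs purely by similarity, without ever seeing targets.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic, unlabelled 2-D inputs: two loose groups of points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)),
               rng.normal(3, 0.5, (50, 2))])

# k-means categorises inputs that have something in common,
# without ever seeing correct responses (targets).
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])       # cluster assignment for the first 10 inputs
print(km.cluster_centers_)   # one centre per discovered group
```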
UNSUPERVISED LEARNING

23
Reinforcement learning
This is somewhere between supervised and
unsupervised learning.
The algorithm is told when the answer is wrong,
but is not told how to correct it. It has to
explore and try out different possibilities until it
works out how to get the answer right.
Reinforcement learning is sometimes called learning
with a critic, because of this monitor that scores the
answer but does not suggest improvements.

24
REINFORCEMENT LEARNING

25
Evolutionary learning
Biological evolution can be seen as a learning
process: biological organisms adapt to improve
their survival rates and chance of having
offspring in their environment. We can model
this in a computer, using an idea of fitness,
which corresponds to a score for how good the
current solution is.

26
Evolutionary learning

27
SUPERVISED LEARNING
• There is a set of data (the training data) that
consists of input data with target data attached;
the target is the answer that the algorithm
should produce.
• This is usually written as a set of data (xi, ti),
where the inputs are xi, the targets are ti, and
the index i indicates that we have many pieces
of data, with i running from 1 to some upper
limit N.
28
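To make the (xi, ti) notation concrete, here is a small sketch with invented values: the training set is stored as parallel NumPy arrays, one row per example.

```python
import numpy as np

# Toy training set of N = 5 pairs (x_i, t_i): each input x_i is a
# vector, and t_i is the answer the algorithm should produce for it.
X = np.array([[0.1, 1.0], [0.2, 0.9], [0.8, 0.1],
              [0.9, 0.2], [0.5, 0.5]])
t = np.array([0, 0, 1, 1, 0])

N = X.shape[0]  # i runs from 1 to N
for i in range(N):
    print(f"x_{i+1} = {X[i]}, t_{i+1} = {t[i]}")
```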
Regression
• Regression problem in statistics: fit a
mathematical function describing a curve, so
that the curve passes as close as possible to all
of the datapoints. It is generally a problem of
function approximation or interpolation:
working out the value between values that we
know. The problem is how to work out the
value of the function for inputs we have not seen.

29
Regression
• Suppose that we have to tell the value of the
output (which we will call y since it is not a
target datapoint) when x = 0.44.

30
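A hedged sketch of the two kinds of prediction the next slide compares, on made-up datapoints (the slide's actual figure data isn't reproduced here): straight-line interpolation and a cubic polynomial fit, both evaluated at x = 0.44.

```python
import numpy as np

# Made-up (x, t) datapoints standing in for the figure.
x = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
t = np.array([0.0, 0.7, 0.9, 0.5, 0.1, 0.4])

# Option 1: connect the points with straight lines.
y_linear = np.interp(0.44, x, t)

# Option 2: fit a cubic polynomial and evaluate it at x = 0.44.
coeffs = np.polyfit(x, t, deg=3)
y_cubic = np.polyval(coeffs, 0.44)

print(y_linear, y_cubic)
```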
Regression

[Figure: four panels (a)-(d)]

(a): A few datapoints from a sample problem.
(c): Two possible ways to predict the values between the known
datapoints: connecting the points with straight lines, or using a
cubic approximation (which in this case misses all of the points).
(b) & (d): Two more complex approximators that pass through the
points, although (d) is rather better than (b).
31
Classification
• The classification problem consists of taking
input vectors and deciding which of N classes
they belong to, based on training from
exemplars of each class.
• The most important point about the
classification problem is that it is discrete.
• Each example belongs to precisely one class,
and the set of classes covers the whole possible
output space.
32
Example : Coin Classifier
• Let’s consider how to set up a coin classifier.
Features:
• When the coin is pushed into the slot, the machine takes a few
measurements of it.
• These could include the diameter, the weight, and possibly the
shape, and are the features that will generate our input vector.
• In this case, our input vector will have three elements, each of
which will be a number showing the measurement of that
feature (choosing a number to represent the shape would
involve an encoding, for example that 1=circle, 2=hexagon, etc.).
• We can also consider the image, density etc., as features.

33
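A hedged sketch of that input vector and a simple classifier; the coin classes, measurements, and the nearest-mean decision rule are all invented for illustration, not the slides' prescription.

```python
import numpy as np

# Shape encoding, as the slide suggests: 1 = circle, 2 = hexagon, ...
CIRCLE, HEXAGON = 1, 2

# Invented class means: [diameter (mm), weight (g), shape code].
class_means = {
    "coin A": np.array([23.0, 6.0, CIRCLE]),
    "coin B": np.array([27.0, 7.7, CIRCLE]),
}

def classify(features):
    """Assign the coin to the class whose mean feature vector is nearest."""
    return min(class_means,
               key=lambda c: np.linalg.norm(features - class_means[c]))

# Measurements taken when a coin is pushed into the slot.
coin = np.array([26.8, 7.6, CIRCLE])
print(classify(coin))  # -> "coin B"
```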
Example : Coin Classifier

[Figure, two panels]
Left: A set of straight-line decision boundaries for a classification problem.
Right: An alternative set of decision boundaries that separates the plusses
from the lightning strikes better, but requires a line that isn't straight.
34
Methods
• Methods will aim to find decision boundaries
that can be used to separate out the different
classes.
• Given the features that are used as inputs to
the classifier, we need to identify some values
of those features that will enable us to decide
which class the current input is in.

35
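As a hedged illustration of a decision boundary (the weights and bias below are arbitrary, chosen only for the example), a linear classifier decides the class from which side of the line w.x + b = 0 the input falls on:

```python
import numpy as np

# Arbitrary parameters of a straight-line decision boundary.
w = np.array([1.5, -2.0])  # feature weights
b = 0.25                   # bias, shifts the boundary

def decide(x):
    """Class 1 on one side of the line w.x + b = 0, class 0 on the other."""
    return int(np.dot(w, x) + b > 0)

print(decide(np.array([1.0, 0.2])))  # -> 1
print(decide(np.array([0.0, 1.0])))  # -> 0
```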
MACHINE LEARNING STEPS

36
Semi-Supervised Learning
• The area of semi-supervised learning attempts to
deal with the need for large amounts of labelled
data; it is a combination of both supervised and
unsupervised learning.
• It applies when the available data contains small
amounts of labelled data and large quantities of
unlabelled data.
• The procedure starts by clustering the similar
data; then, depending on the labelled data available
in each cluster, it provides labels for the
unlabelled data (see the sketch below).
37
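A minimal sketch of that cluster-then-label procedure, assuming scikit-learn and synthetic data; labels of -1 mark unlabelled points, all values are invented, and each cluster is assumed to contain at least one labelled point.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Two synthetic groups; only four points carry labels (-1 = unlabelled).
X = np.vstack([rng.normal(0, 0.4, (30, 2)),
               rng.normal(3, 0.4, (30, 2))])
y = np.full(60, -1)
y[0], y[1], y[30], y[31] = 0, 0, 1, 1   # the small labelled portion

# Step 1: cluster the similar data.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Step 2: give each cluster's unlabelled points the majority label
# of its labelled members (assumes every cluster has some).
for c in np.unique(clusters):
    members = clusters == c
    known = y[members & (y != -1)]
    y[members & (y == -1)] = np.bincount(known).argmax()

print(y)
```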
38
THE MACHINE LEARNING PROCESS
• Data Collection and Preparation
• Feature Selection
• Algorithm Choice
• Parameter and Model Selection
• Training
• Evaluation

39
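A hedged end-to-end sketch of these steps, using scikit-learn's built-in iris data; the particular feature choice, algorithm, and parameter value are arbitrary examples, not the slides' prescription.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Data collection and preparation
X, y = load_iris(return_X_y=True)
# Feature selection: keep only petal length and width (columns 2 and 3)
X = X[:, 2:4]
# Algorithm choice and parameter/model selection: k-NN with k = 3
model = KNeighborsClassifier(n_neighbors=3)
# Training (on one split of the data)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model.fit(X_train, y_train)
# Evaluation (on the held-out split)
print(accuracy_score(y_test, model.predict(X_test)))
```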
TERMINOLOGY
• Inputs: An input vector is the data given as one input to
the algorithm. Written as x, with elements xi, where i
runs from 1 to the number of input dimensions, m.

• Weights: wij are the weighted connections between
nodes i and j. For neural networks these weights are
analogous to the synapses in the brain. They are
arranged into a matrix W.

• Outputs: The output vector is y, with elements yj, where j
runs from 1 to the number of output dimensions, n. We
can write y(x, W) to remind ourselves that the output
depends on the inputs to the algorithm and the current
weights.
40
TERMINOLOGY

• Targets: The target vector t, with elements tj, where j
runs from 1 to the number of output dimensions, n,
is the extra data that we need for supervised
learning, since it provides the ‘correct’ answers
that the algorithm is learning about.

• Error: E, a function that computes the inaccuracies of
the network as a function of the outputs y and
targets t.
41
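A hedged numerical sketch of this notation, with invented shapes and values; a simple matrix product stands in for the algorithm, and sum-of-squares is used as one common choice for the error function E.

```python
import numpy as np

m, n = 3, 2                      # input and output dimensions
x = np.array([0.5, -1.0, 2.0])   # input vector x, elements x_i
W = np.full((m, n), 0.1)         # weight matrix W, entries w_ij
t = np.array([1.0, 0.0])         # target vector t, elements t_j

def y(x, W):
    """Output y(x, W): depends on the inputs and the current weights."""
    return x @ W

E = 0.5 * np.sum((y(x, W) - t) ** 2)   # sum-of-squares error
print(y(x, W), E)
```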
The Curse of Dimensionality
• The dimension of data means the number of
features that can represent an observation or
data point, the so-called feature length.
• Dimension space means the available values of
each dimension.
• The curse of dimensionality refers to non-intuitive
properties of data observed when working in
high-dimensional space.
• High dimensionality also has an impact on
practical run-time issues.
42
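One of those non-intuitive properties can be seen in a few lines (a hedged demo on uniform random data): as the dimension grows, pairwise distances concentrate, so 'near' and 'far' neighbours become hard to tell apart.

```python
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    X = rng.random((200, d))                  # 200 random points in d dimensions
    dists = np.linalg.norm(X[0] - X[1:], axis=1)
    # The spread of distances shrinks relative to their mean as d grows.
    print(d, (dists.max() - dists.min()) / dists.mean())
```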
The Curse of Dimensionality

Correlation analysis, clustering, information value, variance inflation factor,
and principal component analysis (PCA) are some of the ways in which the
number of dimensions can be reduced.
43
TESTING MACHINE LEARNING ALGORITHMS

• The purpose of learning is to get better at predicting
the outputs.
• The only way to know how successfully the
algorithm has learnt is to compare the predictions
with known target labels, which is how the training
is done for supervised learning.
• We can observe the error that the algorithm makes
on the training set.
• If we want the algorithm to generalise to examples
that were not seen in the training set, we need
some different data: a test set.
44
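A hedged sketch of why the test set matters, on synthetic data with scikit-learn (the model and data are invented for the example): a flexible model can score almost perfectly on the data it trained on while doing noticeably worse on unseen data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((300, 5))
y = (X[:, 0] + 0.3 * rng.standard_normal(300) > 0.5).astype(int)  # noisy labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Error on the training set alone says little about generalisation.
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy: ", model.score(X_test, y_test))
```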
Performance-Measures: Confusion Matrix

45
Performance-Measures: Confusion Matrix

46
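Since the matrix itself isn't reproduced here, the following is a from-scratch sketch for a two-class problem (the labels and predictions are made up), together with the accuracy, precision, and recall usually read off it:

```python
import numpy as np

# Made-up true targets and predictions for a two-class problem.
t      = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 0])

tp = np.sum((y_pred == 1) & (t == 1))   # true positives
tn = np.sum((y_pred == 0) & (t == 0))   # true negatives
fp = np.sum((y_pred == 1) & (t == 0))   # false positives
fn = np.sum((y_pred == 0) & (t == 1))   # false negatives

confusion = np.array([[tp, fp],
                      [fn, tn]])
print(confusion)
print("accuracy :", (tp + tn) / len(t))
print("precision:", tp / (tp + fp))
print("recall   :", tp / (tp + fn))
```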
Bias - Variance
• Bias and variance are used in supervised machine
learning, in which an algorithm learns from training
data or a sample data set of known quantities.
• The correct balance of bias and variance is vital to
building machine-learning algorithms that create
accurate results from their models.
• Bias is the amount that a model’s prediction differs
from the target value, compared to the training data.
• Variance describes how much a random variable
differs from its expected value.
47
Underfitting/Overfitting
• Any ML model is said to be underfitting if it cannot
capture the underlying trend of the data, i.e., the model
performs poorly even on the training data, and
consequently on testing data as well.
• Reasons for Underfitting:
• High bias and low variance
• The size of the training dataset used is not enough.
• The model is too simple.
• Training data is not cleaned and contains noise.

48
Overfitting
• If we train for too long, then we will overfit
the data: when a model is trained on a large
collection of data for too long, it starts
learning from the noise and inaccurate data
entries in the data set.
• Testing with test data then results in high
variance. The model that we learn will be much
too complicated, and won't be able to
generalise (see the sketch below).
49
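A hedged demonstration of both failure modes on synthetic data: a degree-1 polynomial is too simple and underfits a curved trend, while a degree-9 polynomial passes through every noisy training point (near-zero training error) yet does badly against the underlying function.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(10)  # noisy targets
x_new = np.linspace(0, 1, 100)
t_new = np.sin(2 * np.pi * x_new)                          # underlying trend

for degree in (1, 3, 9):
    coeffs = np.polyfit(x, t, degree)
    train_mse = np.mean((np.polyval(coeffs, x) - t) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_new) - t_new) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```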
Underfitting/Overfitting

[Figure: three fits showing Underfitting, Good Balance, and Overfitting]

50
THE BIAS-VARIANCE TRADEOFF

51
