Department of Computer Science and Engineering

Course Code: KAI-601


Course Name: Machine Learning Techniques
Faculty Name: Kaleemur rehman
Email :Kaleem.rehman@glbitm.ac.in
Department of Computer Science and Engineering
Vision
To build strong teaching environment that responds to the needs of industry and challenges of the society

Mission
• M1 : Developing strong mathematical & computing skill set among the students.
• M2 : Extending the role of computer science and engineering in diverse areas like Internet of Things (IoT),
Artificial Intelligence & Machine Learning and Data Analytics.
• M3 : Imbibing the students with a deep understanding of professional ethics and high integrity to serve the
Nation.
• M4 : Providing an environment to the students for their growth both as individuals and as globally competent
Computer Science professionals, with encouragement for innovation & start-up culture.

Subject: Machine Learning Techniques (KAI-601)


Course Outcome

CO’S TITLE

CO1 Understand the key concerns that are common to all software development processes.

CO2 Select appropriate process models, approaches and techniques to manage a given software development process.

CO3 Able to elicit requirements for a software product and translate these into a documented design.

CO4 Recognize the importance of software reliability and how we can design dependable software, and what measures
are used.

CO5 Understand the principles and techniques underlying the process of inspecting and testing software and making it
free of errors and tolerable.

CO6 Understanding the latest advances and its applications in software engineering and testing.

Syllabus
Unit-I :
Introduction:
Learning, Types of Learning, Well defined learning problems, Designing a Learning System, History of ML, Introduction
of Machine Learning Approaches – (Artificial Neural Network, Clustering, Reinforcement Learning, Decision Tree
Learning, Bayesian networks, Support Vector Machine, Genetic Algorithm), Issues in Machine Learning and Data
Science Vs Machine Learning

Unit-II :
REGRESSION:
Linear Regression and Logistic Regression. BAYESIAN LEARNING - Bayes theorem, Concept learning, Bayes Optimal
Classifier, Naïve Bayes classifier, Bayesian belief networks, EM algorithm. SUPPORT VECTOR MACHINE: Introduction,
Types of support vector kernel – (Linear kernel, polynomial kernel, and Gaussian kernel), Hyperplane – (Decision
surface), Properties of SVM, and Issues in SVM.

Unit-III :
DECISION TREE LEARNING - Decision tree learning algorithm, Inductive bias, Inductive inference with decision trees,
Entropy and information theory, Information gain, ID-3 Algorithm, Issues in Decision tree learning. INSTANCE-BASED
LEARNING – k-Nearest Neighbour Learning, Locally Weighted Regression, Radial basis function networks, Case-based
learning.

Unit-IV :
ARTIFICIAL NEURAL NETWORKS :
Perceptrons, Multilayer perceptron, Gradient descent and the Delta rule, Multilayer networks, Derivation of
Backpropagation Algorithm, Generalization, Unsupervised Learning – SOM Algorithm and its variants; DEEP LEARNING
- Introduction, concept of convolutional neural network, Types of layers – (Convolutional layers, Activation
function, pooling, fully connected), Concept of Convolution (1D and 2D) layers, Training of network, Case studies
of CNN, e.g., Diabetic Retinopathy, Building a smart speaker, Self-driving car, etc.

Unit-V :
REINFORCEMENT LEARNING:
Introduction to Reinforcement Learning, Learning Task, Example of Reinforcement Learning in Practice, Learning
Models for Reinforcement – (Markov Decision process, Q Learning - Q Learning function, Q Learning Algorithm),
Application of Reinforcement Learning, Introduction to Deep Q Learning. GENETIC ALGORITHMS: Introduction,
Components, GA cycle of reproduction, Crossover, Mutation, Genetic Programming, Models of Evolution and
Learning, Applications

Topics to be Covered

➢ Introduction to ML

➢ Types of Learning

➢ Well defined learning problems

➢ Designing a Learning System

➢ History of ML

➢ Introduction of Machine Learning Approaches

➢ Issues in Machine Learning and Data Science Vs Machine Learning


Introduction

• ML is a subset of AI

• It focuses mainly on designing systems that learn and make predictions based on experience.

• Machine learning is an application of artificial intelligence (AI) that provides systems the ability to
automatically learn and improve from experience without being explicitly programmed.

• Machine learning focuses on the development of computer programs that can access data and use it to learn
for themselves.

Machine learning (ML) is a field of computer science that gives computers the ability to
automatically learn without being explicitly programmed.

• Learning is the ability to improve one's behaviour based on experience.


• Build computer systems that automatically improve with experience.
• What are the fundamental laws that govern all learning processes?
• Machine Learning explores algorithms that can
• learn from data / build a model from data
• use the model for prediction, decision making or solving some tasks

• E.g., diagnosing a disease (how?)
• Computer vision
• Robotic control

Types

There are three types of learning, distinguished by the kind of supervision the learner receives:
I. Supervised Learning
II. Unsupervised Learning
III. Reinforcement Learning

Types of ML
Supervised Learning

• Supervised learning is a type of machine learning that uses labeled data to train machine learning models. In
labeled data, the output is already known. The model just needs to map the inputs to the respective outputs.
• An example of supervised learning is to train a system that identifies the image of an animal.
• Applications: speech recognition, weather forecasting, biometrics, the health care sector, the retail sector.
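
The idea of mapping labeled inputs to known outputs can be illustrated with a minimal sketch. It assumes a 1-nearest-neighbour rule, which is only one of many possible supervised learners, and the animal measurements (weight in kg, height in cm) are hypothetical values invented for the example.

```python
import math

# Labeled training data: (features, label). The features are hypothetical
# (weight in kg, height in cm) measurements used only for illustration.
training_data = [
    ((4.0, 25.0), "cat"),
    ((5.0, 28.0), "cat"),
    ((30.0, 60.0), "dog"),
    ((35.0, 65.0), "dog"),
]


def predict(features):
    """Return the label of the closest labeled example (1-nearest neighbour)."""
    nearest = min(training_data, key=lambda pair: math.dist(pair[0], features))
    return nearest[1]


print(predict((4.5, 26.0)))   # lies next to the "cat" examples
print(predict((32.0, 62.0)))  # lies next to the "dog" examples
```

The labels are what make this supervised: the learner never has to discover the categories, it only has to reproduce the known input-to-output mapping.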

Unsupervised Learning
• Unsupervised learning is a type of machine learning that uses unlabeled data to train machines. Unlabeled data
doesn’t have a fixed output variable. The model learns from the data, discovers the patterns and features in the
data, and returns the output.
• An example of an unsupervised learning technique uses images of vehicles to separate buses from trucks. The
model learns by identifying the parts of a vehicle, such as the length and width of the vehicle, the front and
rear end covers, roof hoods, the types of wheels used, etc. Based on these features, the model groups each vehicle
as a bus or a truck.
• E.g., the banking sector, retail, etc.

Reinforcement Learning

• Reinforcement learning trains a machine to take suitable actions and maximize its rewards in a particular
situation. It uses an agent and an environment to produce actions and rewards. The agent has a start and an end
state, but there might be different paths for reaching the end state, as in a maze. In this learning technique,
there is no predefined target variable.
• An example of reinforcement learning is to train a machine that can identify the shape of an object, given a
list of different objects. In this example, the model tries to predict the shape of the object, which is a square.

Well-posed Learning Problems

Learning: the ability to improve one's behavior based on experience.

Machine Learning Definition: A computer program is said to learn from experience E with respect to some class of
tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Learning = improving with experience at some task:
• improve over task T,
• with respect to performance measure P,
• based on experience E.

Well-posed Learning Problems

Example:
A computer program that learns to play checkers might improve its performance as
measured by its ability to win at the class of tasks involving playing checkers games,
through experience obtained by playing games against itself.
A checkers learning problem:
• Task T: playing checkers
• Performance measure P: percent of games won against opponents
• Training experience E: playing practice games against itself

Well-posed Learning Problems

Examples of well-defined learning problems:

• Learning to recognize spoken words

• Learning to drive an autonomous vehicle.

• Learning to classify new astronomical structures.

• Learning to play world-class backgammon.

Well-posed Learning Problems

• Information theory

• Philosophy

• Psychology and neurobiology

• Statistics

Designing a Learning System

Let us consider designing a program to learn to play checkers:

1. Choose the training experience
2. Choose the target function
3. Choose a representation of the target function (TF)
4. Choose the parameter fitting algorithm
5. Evaluate the entire system

Designing a Learning System

1. Choosing the Training Experience:

• Different learning paradigms: reinforcement learning, supervised learning, unsupervised learning.

• Different availability of data: online/offline.

Designing a Learning System

2. Choose the target function

• Make it as easy as possible
• Integrate all prior information
• Learn only the relevant aspects

Designing a Learning System

3. Choose a representation of the target function

• Representation of the input data:
   • integrate all relevant features -> make it easy
   • but not more than necessary -> curse of dimensionality
• Representation of the function:
   • different types of functions: classification, regression, density estimation, novelty detection,
     visualization, ...
   • different models: symbolic (logical rules, decision tree, Prolog program, ...), subsymbolic (neural
     network, statistical estimator, ...), parameterized, lazy model, ...

Designing a Learning System

4. Estimate the parameters

• Optimize some error/objective on the given data:
   • linear/quadratic optimization
   • gradient descent
   • greedy algorithm, heuristics
   • discrete optimization methods such as genetic algorithms
   • statistical methods such as EM
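
As one concrete instance of the options listed above, here is a minimal sketch of gradient descent minimizing a mean squared error. The model (a straight line y = w*x + b), the learning rate, the step count and the data points are all assumptions chosen for illustration; they do not come from the slides.

```python
def fit_line(xs, ys, learning_rate=0.01, steps=5000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every training point under the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The same loop structure applies to any differentiable objective; only the error definition and its gradient change.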

Designing a Learning System

5. Evaluation

• Does the system behave well in practice?
• Generalization to data not used for training
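
Generalization to unseen data is usually checked by holding some examples back from training. The sketch below assumes a deliberately simple one-feature threshold classifier and a hypothetical, well-separated data set; the helper names (train_test_split, fit_threshold, accuracy) are illustrative and not taken from any library.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle the examples and hold out a fraction of them for testing."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train):
    """Learn a decision threshold: the midpoint between the two class means."""
    positives = [x for x, label in train if label == 1]
    negatives = [x for x, label in train if label == 0]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2


def accuracy(threshold, examples):
    """Fraction of examples whose predicted class matches the true label."""
    correct = sum((x > threshold) == (label == 1) for x, label in examples)
    return correct / len(examples)


# Hypothetical data: class 0 has values 0..9, class 1 has values 20..29.
examples = [(float(x), 0) for x in range(10)] + [(float(x), 1) for x in range(20, 30)]
train, test = train_test_split(examples)
threshold = fit_threshold(train)
print("held-out accuracy:", accuracy(threshold, test))
```

Only the accuracy on `test` says anything about generalization, because those examples played no part in choosing the threshold.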

Designing a Learning System

In order to complete the design of the learning system, we must now choose:

• The exact type of knowledge to be learned
• A representation for this target knowledge
• A learning mechanism

Designing a Learning System

2. Choosing the Target Function

• if b is a final board state that is won, then V(b) = 100
• if b is a final board state that is lost, then V(b) = -100
• if b is a final board state that is drawn, then V(b) = 0
• if b is not a final state in the game, then V(b) = V(b'), where b' is the best final board state that can be
achieved starting from b and playing optimally until the end of the game

Designing a Learning System

3. Choosing a Representation for the Target Function

• x1: the number of black pieces on the board
• x2: the number of red pieces on the board
• x3: the number of black kings on the board
• x4: the number of red kings on the board
• x5: the number of black pieces threatened by red (i.e., which can be captured on red's next turn)
• x6: the number of red pieces threatened by black

V’(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6

where w0 through w6 are numerical coefficients, or weights, to be chosen by the learning algorithm
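
The linear representation above is simply a weighted sum, which a short sketch makes concrete. The weight values and the example board features below are hypothetical; in the actual design the weights are chosen by the learning algorithm.

```python
def evaluate_board(features, weights):
    """Compute V'(b) = w0 + w1*x1 + ... + w6*x6 for one board.

    features: [x1, ..., x6] as defined on the slide.
    weights:  [w0, w1, ..., w6], with w0 as the constant term.
    """
    if len(weights) != len(features) + 1:
        raise ValueError("expected one more weight than features (w0 is the constant term)")
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


# Hypothetical weights: own pieces and kings count positively, opponent's negatively.
weights = [0.0, 1.0, -1.0, 3.0, -3.0, -0.5, 0.5]
# Hypothetical board: 12 black, 12 red, no kings, 1 black piece and 2 red pieces threatened.
features = [12, 12, 0, 0, 1, 2]
value = evaluate_board(features, weights)
print(value)
```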

Designing a Learning System

• Partial design of a checkers learning program:
   • Task T: playing checkers
   • Performance measure P: percent of games won in the world tournament
   • Training experience E: games played against itself
   • Target function: V: Board → ℝ
   • Target function representation: V’(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6

Designing a Learning System

4. Choosing a Function Approximation Algorithm

• Estimating training values
• Adjusting the weights
• LMS weight update rule
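
The slide names the LMS (least mean squares) weight update rule without stating it. A minimal sketch, assuming the usual textbook form w_i ← w_i + η (V_train(b) − V’(b)) x_i with x_0 = 1 for the constant weight, is given below. The learning rate η, the training value and the board features are hypothetical.

```python
def predict(weights, features):
    """Current estimate V'(b) = w0 + sum(wi * xi)."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def lms_update(weights, features, target, learning_rate=0.01):
    """One LMS step: move every weight in proportion to the error and its input."""
    error = target - predict(weights, features)
    inputs = [1.0] + list(features)  # x0 = 1 so that w0 is updated as well
    return [w + learning_rate * error * x for w, x in zip(weights, inputs)]


# Hypothetical training example: a board with features x1..x6 and training value 100.
weights = [0.0] * 7
features = [3, 0, 1, 0, 0, 0]
target = 100.0
for _ in range(200):
    weights = lms_update(weights, features, target)
print(round(predict(weights, features), 6))
```

Weights attached to features that are zero on this board stay unchanged, which is the intended behaviour: a feature that did not occur cannot be blamed for the error.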

5. The Final Design

• Performance System
• Critic
• Generalizer
• Experiment Generator

Steps in designing a learning system (LS)

• Choose wisely the type of training experience to be fed to the machine learning algorithm
• The exact type of knowledge to be learned (choosing the target function, TF); initially the TF is unknown
(possibly the chosen action)
• A representation of this target knowledge
• A learning mechanism (choosing an algorithm for the TF)

Learning system examples

Eg1: Baby crying
Eg2: Email spam or not
Eg3: Credit approval

ML design
• Define the problem statement
• Identify evaluation metrics
• List necessary requirements for model development and deployment
• Select, train and evaluate ML algo
• Create a high level design of the ML system
• Optimize and scale
Advantage of Machine Learning?

• Learning and writing an algorithm
This is easy for a human brain but tough for a machine. A machine needs some time and a good amount of training
data to classify objects accurately.
• Implementation and automation
This is easy for a machine. Once it has learnt, a machine can process one million images without any fatigue,
whereas a human brain can't.
• That's why ML combined with big data is such a powerful combination

Machine Learning History

• 1950s-1960s
⁻ Samuel's algorithm for game playing
⁻ Alan Turing proposed the Turing test
⁻ First algorithms developed, based on ideas from psychology and statistics
• 1970s: Practical applications started
⁻ Symbolic concept introduction
⁻ Image recognition, but not accurate
⁻ Natural Language Processing (symbolic)

Machine Learning History

• 1980s:
⁻ Focus on experimental methodology
⁻ Advanced decision tree and rule learning
⁻ Learning, planning and problem solving
⁻ Neural networks
⁻ SVM: most accurate and efficient classification output

Machine Learning History

• 1990s: statistics-based ML
⁻ Support vector machines
⁻ Data mining
⁻ Adaptive agents and web applications
⁻ Text learning
⁻ Reinforcement learning
⁻ Bayesian network learning
• 1997: IBM's Deep Blue computer played chess against a human champion

Machine Learning History

• 2000s: Advanced ML and DL
⁻ CNN
⁻ RNN
• 2001 onwards:
⁻ AlphaGo, trained on millions of moves
⁻ Image recognition
⁻ Object detection
⁻ Speech recognition (e.g., Siri, 2010)
⁻ Recommender systems (all OTT platforms)
⁻ Medical diagnosis and prediction

Machine Learning History

• Popularity of this field in recent times and the reasons behind it:
   • New software/algorithms
      • Neural networks
      • Deep learning
   • New hardware
      • GPUs
   • Cloud enabled
   • Availability of big data

Issues in Machine Learning

• Data quality:
   • ML is only as good as the data you provide it, and you need a lot of data. The accuracy of ML is driven by
     the quality of the data.
• Transparency:
   • It is often very difficult to make definitive statements on how well a model is going to generalize in new
     environments.
• Manpower:
   • Having data is not enough; it must be used in a way that does not introduce bias into the model.
     Organizations need to change how they think about software development and how they collect and use data,
     and make sure they have enough skill sets in the organization.

Issues in Machine Learning

• What algorithms exist for learning general target functions from specific training
examples? In what settings will particular algorithms converge to the desired
function, given sufficient training data? Which algorithms perform best for which
types of problems and representations?

• How much training data is sufficient? What general bounds can be found to
relate the confidence in learned hypotheses to the amount of training experience
and the character of the learner's hypothesis space?

Issues in Machine Learning

➢When and how can prior knowledge held by the learner guide the process of
generalizing from examples? Can prior knowledge be helpful even when it is only
approximately correct?

➢What is the best strategy for choosing a useful next training experience, and how
does the choice of this strategy alter the complexity of the learning problem?

➢What is the best way to reduce the learning task to one or more function
approximation problems? Put another way, what specific functions should the
system attempt to learn? Can this process itself be automated?

Issues in Machine Learning

➢ How can the learner automatically alter its representation to improve its ability to represent and learn the
target function?

Data Science Vs Machine Learning (Case Study)
• Case Study: How Would the Three Be Used Together?
• Suppose we were building a self-driving car, and were working on the specific problem of stopping at stop
signs. We would need skills drawn from all three of these fields.

• ML: The car has to recognize a stop sign using its cameras. We construct a dataset of millions of photos of
street side objects, and train an algorithm to predict which have stop signs in them.

• AI: Once our car can recognize stop signs, it needs to decide when to take the action of applying the brakes.
It’s dangerous to apply them too early or too late, and we need it to handle varying road conditions (for
example, to recognize on a slippery road that it is not slowing down quickly enough), which is a problem of
control theory.

• Data science: In street tests we find that the car’s performance isn’t
good enough, with some false negatives in which it drives right by a
stop sign. After analysing the street test data, we gain the insight that
the rate of false negatives depends on the time of day: it is more likely
to miss a stop sign before sunrise or after sunset. We realize that
most of our training data included only objects in full daylight, so we
construct a better dataset including night time images and go back to
the machine learning step.

Introduction of Machine Learning Approaches

• Decision Tree Learning:
• A decision tree belongs to the deterministic family of classifiers and is a model-based technique.
• The most common implementation of the decision tree algorithm uses a binary tree structure for classification.
• It is in the nature of the decision tree algorithm to overfit the training data.
• DT pruning: Experimenting with the depth of the tree to avoid overfitting is known as decision tree pruning.

Introduction of Machine Learning Approaches

• Genetic algorithms
• Genetic algorithms are stochastic search algorithms which act on a population of possible solutions.
• They are loosely based on the mechanics of population genetics and selection.
• The potential solutions are encoded as ‘genes’: strings of characters from some alphabet.

Introduction of Machine Learning Approaches

• Abstraction of real biological evolution.
• Solves complex problems (such as NP-hard problems).
• Focuses on optimization.
• Population (search size) of possible solutions for a given problem.
• From a group of individuals, the best will survive.
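
The ingredients named above (a population of string-encoded solutions, selection of the best, and stochastic variation) can be put together in a small sketch. The toy objective (maximize the number of "1" characters in a bit string), the population size, the mutation rate and the truncation-style selection are all assumptions made for illustration.

```python
import random


def fitness(chromosome):
    """Toy objective: the more '1' genes, the fitter the individual."""
    return chromosome.count("1")


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent joined to the tail of the other."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(chromosome, rng, rate=0.02):
    """Flip each gene independently with a small probability."""
    return "".join(
        ("1" if gene == "0" else "0") if rng.random() < rate else gene
        for gene in chromosome
    )


def run_ga(length=20, population_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    population = [
        "".join(rng.choice("01") for _ in range(length)) for _ in range(population_size)
    ]
    for _ in range(generations):
        # Selection: the fitter half survives unchanged and becomes the parent pool.
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        # Reproduction: refill the population with mutated offspring of random parents.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


best = run_ga()
print(best, fitness(best))
```

Because the surviving half is carried over unchanged, the best fitness in the population can never decrease from one generation to the next.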

Introduction of Machine Learning Approaches

• Reinforcement Learning: A reinforcement learning algorithm, or agent, learns by interacting with its environment.
• Reinforcement learning is a goal-driven, highly adaptive machine learning technique in the field of artificial
intelligence, in which there are two basic elements: state and action.
• Performing an action in a certain state is a strategy. The learner must constantly explore to generate an optimal
strategy.

Introduction of Machine Learning Approaches

• Different from supervised and unsupervised learning, it regards learning as a process of interaction between
agents and the environment through exploration and evaluation.
• The agent selects an action to be applied to the environment by sensing the current state of the environment.
• After the environment accepts the action, the state changes, and a reward is given to the agent.

Introduction of Machine Learning Approaches

• According to the new state of the environment, the agent continues to select the next action, and this is
repeated until it reaches the terminal state.
• The goal of reinforcement learning is to maximize the accumulated rewards by adjusting strategies.

Introduction of Machine Learning Approaches

Learning Models of Reinforcement

• There are two important learning models in reinforcement learning:
   • Markov Decision Process
   • Q learning
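
A minimal sketch of tabular Q learning on a tiny Markov decision process may help connect the two models. Everything specific here is an assumption made for illustration: the five-state corridor environment, the reward of 1 for reaching the goal, the epsilon-greedy exploration, and the values of the learning rate (alpha) and discount factor (gamma).

```python
import random

N_STATES = 5              # states 0..4 laid out in a corridor
ACTIONS = (-1, +1)        # move left or move right
GOAL = N_STATES - 1       # reaching the right end terminates the episode


def step(state, action):
    """Environment dynamics: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy choice: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            # Q learning update: move Q(s, a) toward reward + discounted best future value.
            future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in every non-terminal state according to the learned Q table."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]


q_table = train()
print(greedy_policy(q_table))
```

The states, actions, transitions and rewards in `step` form the Markov decision process; the Q table is what the agent learns about it through interaction.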

Introduction of Machine Learning Approaches

Support Vector Machine: the SVM is derived from the ‘Maximal Margin Classifier’, which is only suitable for
binary classification.

Maximal Margin Classifier: in a p-dimensional space, the separating hyperplane has dimension p - 1:
Dimension: p = 1, p = 2
Hyperplane: p - 1 = 0 (a point), p - 1 = 1 (a line)

In its basic form, the support vector machine works with linearly separable data.

In the 1990s the SVM was used in place of neural networks as a powerful algorithm, but today it is mostly applied
to simpler data.

Introduction of Machine Learning Approaches

• Clustering:
• The process of grouping similar items together.
• Objects within a cluster should be very similar to each other, but
• should be very different from the objects of other clusters.
• We can say that intra-cluster similarity between objects is high and inter-cluster similarity is low.
• An important human activity, used from early childhood in distinguishing between different items such as cars
and cats, animals and plants, etc.

Introduction of Machine Learning Approaches

• Assumes documents are real-valued vectors.
• Clusters are based on centroids (aka the centre of gravity or mean) of the points in a cluster c.
• Reassignment of instances to clusters is based on distance to the current cluster centroids.
   • (Or one can equivalently phrase it in terms of similarities.)
• Select K random docs {s1, s2, …, sK} as seeds.
• Until clustering converges (or another stopping criterion is met):
   • For each doc di:
      • Assign di to the cluster cj such that dist(di, sj) is minimal.
   • (Next, update the seeds to the centroid of each cluster.) For each cluster cj:
      • Set sj to the centroid of cj.

Introduction of Machine Learning Approaches

• Simplest case: one numeric attribute A
   • Distance(X,Y) = |A(X) – A(Y)|
• Several numeric attributes:
   • Distance(X,Y) = Euclidean distance between X and Y
• Are all attributes equally important?
   • Weighting the attributes might be necessary
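
The three distance notions above can be written down directly. This is a small sketch; the function names and the particular weighting scheme (a per-attribute weight inside the squared sum) are assumptions, since the slide only says that weighting might be necessary.

```python
import math


def distance_1d(a_x, a_y):
    """One numeric attribute: absolute difference of the attribute values."""
    return abs(a_x - a_y)


def euclidean(x, y):
    """Several numeric attributes: straight-line distance between the two points."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def weighted_euclidean(x, y, weights):
    """Euclidean distance in which each attribute contributes according to its weight."""
    return math.sqrt(sum(w * (xi - yi) ** 2 for w, xi, yi in zip(weights, x, y)))


print(distance_1d(3, 7))
print(euclidean((0, 0), (3, 4)))
print(weighted_euclidean((0, 0), (3, 4), (1, 0)))  # second attribute ignored
```

Setting a weight to zero removes an attribute from the comparison entirely, which shows how strongly the choice of weights can change which items count as "near".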

Introduction of Machine Learning Approaches

• Simple Clustering: K-means
1. Works with numeric data only
2. Pick a number (K) of cluster centers (at random)
3. Assign every item to its nearest cluster center (e.g. using Euclidean distance)
4. Move each cluster center to the mean of its assigned items
5. Repeat steps 3 and 4 until convergence (change in cluster assignments less than a threshold)
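
The K-means steps listed above translate almost line by line into code. The sketch below makes a few assumptions: initial centers are drawn from the data points with a fixed random seed, convergence means that no assignment changed at all, and the six two-dimensional points are hypothetical.

```python
import math
import random


def k_means(points, k, max_iterations=100, seed=0):
    """Cluster numeric points into k groups; returns (centers, assignments)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick K initial centers at random
    assignments = None
    for _ in range(max_iterations):
        # Assign every item to its nearest cluster center (Euclidean distance).
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no item changed cluster
        assignments = new_assignments
        # Move each cluster center to the mean of its assigned items.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centers[j] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centers, assignments


# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, assignments = k_means(points, k=2)
print(assignments)
print(centers)
```

Which group receives label 0 and which receives label 1 depends on the random initial centers, so only the grouping itself is meaningful, not the label values.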

Applications of Clustering
• Marketing
• Biology
• Libraries
• Insurance
• City planning
• Earthquake Studies
• Image processing
• Genetics
• Finance eg. stock market
• Customer service
• Manufacturing
• Medical diagnosis
• Fraud detection
• Traffic analysis
• Social network analysis
• Climate analysis
• Sports analysis

Introduction of Machine Learning Approaches

• Termination conditions
   • Several possibilities, e.g.,
      • A fixed number of iterations.
      • Doc partition unchanged.
      • Centroid positions don’t change.

Introduction of Machine Learning Approaches

• How Many Clusters?
   • Number of clusters K is given
      • Partition n docs into a predetermined number of clusters
   • Finding the “right” number of clusters is part of the problem
      • Given docs, partition into an “appropriate” number of subsets.
      • E.g., for query results, the ideal value of K is not known up front, though the UI may impose limits.
   • Can usually take an algorithm for one flavor and convert to the other.

Introduction of Machine Learning Approaches
• Hierarchical Clustering
• Build a tree-based hierarchical taxonomy (dendrogram) from a set of documents.

animal
   vertebrate: fish, reptile, amphib., mammal
   invertebrate: worm, insect, crustacean

• One approach: recursive application of a partitional clustering algorithm

Introduction of Machine Learning Approaches

• Starts with each doc in a separate cluster,
• then repeatedly joins the closest pair of clusters, until there is only one cluster.
• The history of merging forms a binary tree or hierarchy.
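
The bottom-up procedure just described can be sketched in a few lines. This version assumes Euclidean distance between points and the single-link rule (distance between the two closest members) to decide which clusters are "closest"; the four points are hypothetical, and the naive search over all pairs is chosen for readability, not efficiency.

```python
import math


def single_link_distance(cluster_a, cluster_b):
    """Distance between two clusters = distance between their two closest members."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)


def agglomerative(points):
    """Merge clusters bottom-up and return the merge history."""
    clusters = [[p] for p in points]  # start with each point in its own cluster
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        i, j = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda pair: single_link_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return merges


# Hypothetical data: two nearby pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
history = agglomerative(points)
for left, right in history:
    print(left, "+", right)
```

Each entry of the history joins exactly two clusters, which is why the result can be read as a binary tree.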

Introduction of Machine Learning Approaches

• Many variants to defining closest pair of clusters:
   • Single-link
      • Similarity of the most cosine-similar points
   • Complete-link
      • Similarity of the “furthest” points, the least cosine-similar
   • Centroid
      • Clusters whose centroids (centres of gravity) are the most cosine-similar
   • Average-link
      • Average cosine between pairs of elements
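
The four criteria differ only in how pairwise cosine similarities are combined, which a short sketch makes explicit. One assumption should be noted: the average-link function here averages over pairs taken across the two clusters, while some formulations also include pairs inside the merged cluster. The example vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def centroid(cluster):
    """Component-wise mean of the vectors in a cluster."""
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))


def single_link(a, b):
    """Similarity of the most similar cross-cluster pair."""
    return max(cosine(u, v) for u in a for v in b)


def complete_link(a, b):
    """Similarity of the least similar ("furthest") cross-cluster pair."""
    return min(cosine(u, v) for u in a for v in b)


def centroid_link(a, b):
    """Similarity of the two cluster centroids."""
    return cosine(centroid(a), centroid(b))


def average_link(a, b):
    """Mean similarity over all cross-cluster pairs (assumed variant)."""
    sims = [cosine(u, v) for u in a for v in b]
    return sum(sims) / len(sims)


# Hypothetical clusters of two-dimensional document vectors.
cluster_a = [(1.0, 0.0)]
cluster_b = [(1.0, 0.0), (0.0, 1.0)]
for name, link in [("single", single_link), ("complete", complete_link),
                   ("centroid", centroid_link), ("average", average_link)]:
    print(name, round(link(cluster_a, cluster_b), 4))
```

Because these are similarities rather than distances, the "closest" pair of clusters is the one with the highest score under the chosen criterion.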

Questions

Q1: List out the types of machine learning.
Q2: Define artificial neural network.
Q3: What is a spline?
Q4: Distinguish between classification and regression.
Q5: Define Bayesian network.
Q6: Explain applications and challenges of Machine Learning.
Q7: What is overfitting, and how can you avoid it?
Q8: Explain a well-posed learning system with an example.
Q9: Differentiate between AI, ML and Data Science.
Q10: Differentiate between the following:
a) Supervised Learning and Unsupervised Learning
b) Bias and Variance
