University Institute of Engineering

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
Bachelor of Engineering (Computer Science & Engineering)
Subject Name: Machine Learning
Subject Code: CST-316
Topic: Machine Learning Types
Lecture-1.2
By: Baljeet Kaur Nagra
DISCOVER . LEARN . EMPOWER
Course Outcomes
CO-1: Apply the basic concepts of Machine Learning and statistical learning to deal with real-life problems.

CO-2

CO-3:

CO-4:

CO-5:

Course Objectives

• To provide a comprehensive foundation to modern Machine Learning and Optimization methodology with applications.
• To study learning processes: supervised and unsupervised, deterministic and statistical.
• To understand the history and development of Machine Learning.
• To understand techniques and practical trends of Machine Learning, knowledge of Machine learners, and ensemble learning.
Syllabus
• UNIT-I
• Chapter-1
Fundamentals of Machine Learning: Introduction to Machine Learning (ML), Different Types of Machine Learning, Machine Learning Life Cycle: Data Discovery, Exploratory Analysis, Data Preparation, Model Planning, Model Building, Model Evaluation, Real-World Case Study. Foundation of ML: ML Techniques.
Deep Learning
• Deep learning is a class of machine learning algorithms that uses multiple layers (artificial neural networks) to progressively extract higher-level features from the raw input.
• For example, in image processing, lower layers may identify edges, while higher layers may identify concepts relevant to a human, such as digits, letters, or faces.
CONTENTS
• Terminology
• Process
• Machine Learning Types
• Supervised Learning
• Classification
• Regression
• Unsupervised Learning
• Clustering
• Association
• Semi-supervised Learning
• Reinforcement Learning
Terminology
• Dataset: A set of data examples that contain features important to solving the problem.
• Feature: With respect to a dataset, a feature represents an attribute and value combination. Colour is an attribute; “colour is blue” is a feature.
• Model: A data structure that stores a representation of a dataset (weights and biases). Models are created/learned when you train an algorithm on a dataset.
• Noise: Any irrelevant information or randomness in a dataset which obscures the underlying pattern.
• Outlier: An observation that deviates significantly from the other observations in the dataset.
• Test set: A set of observations used at the end of model training and validation to assess the predictive power of your model. How generalizable is your model to unseen data?
• Training set: A set of observations used to generate machine learning models.
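As a sketch of this terminology, the split between a training set and a held-out test set might look like the following (the dataset, attributes, and labels are made up for illustration):

```python
import random

# A toy dataset: each example is (features, label). "colour" and "weight"
# are attributes; a value such as colour == "blue" is a feature.
dataset = [
    ({"colour": "blue", "weight": 4.1}, "toy"),
    ({"colour": "red", "weight": 2.3}, "fruit"),
    ({"colour": "blue", "weight": 3.9}, "toy"),
    ({"colour": "red", "weight": 2.1}, "fruit"),
    ({"colour": "green", "weight": 2.2}, "fruit"),
    ({"colour": "blue", "weight": 4.4}, "toy"),
]

def train_test_split(data, test_ratio=1 / 3, seed=42):
    """Shuffle the examples and hold out a test set for final evaluation."""
    data = data[:]
    random.Random(seed).shuffle(data)
    n_test = round(len(data) * test_ratio)
    return data[n_test:], data[:n_test]   # (training set, test set)

train_set, test_set = train_test_split(dataset)
print(len(train_set), len(test_set))  # 4 2
```

The test set is only used at the very end; evaluating on the training set would say nothing about generalization to unseen data.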
Process
• Data Collection: Collect the data that the algorithm will learn from.

• Data Preparation: Format and engineer the data into the optimal format, extracting
important features and performing dimensionality reduction.

• Training: Also known as the fitting stage, this is where the Machine Learning algorithm
actually learns by showing it the data that has been collected and prepared.

• Evaluation: Test the model to see how well it performs.

• Tuning: Fine-tune the model to maximise its performance.

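The five stages above can be sketched end-to-end with a tiny least-squares line fit standing in for the learning algorithm (the numbers are illustrative, not real data):

```python
# A tiny end-to-end pass through the five stages.

# 1. Data Collection: raw (temperature, visitors) observations.
raw = [(20, 105), (25, 150), (30, 195), (35, 240), (22, 123)]

# 2. Data Preparation: split into an input feature x and a target y.
xs = [x for x, _ in raw]
ys = [y for _, y in raw]

# 3. Training (fitting): closed-form least squares for y = a*x + b.
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
b = mean_y - a * mean_x

# 4. Evaluation: mean absolute error of the fitted line.
mae = sum(abs((a * x + b) - y) for x, y in zip(xs, ys)) / n

# 5. Tuning: nothing to tune for a straight line, but in practice you
#    would adjust hyperparameters here and repeat steps 3-4.
print(round(a, 2), round(b, 2), round(mae, 2))  # 9.0 -75.0 0.0
```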
Approaches
• Supervised Learning

• Unsupervised Learning

• Semi-supervised Learning

• Reinforcement Learning

Supervised Learning
• In supervised learning, the goal is to learn the mapping (the rules) between a set
of inputs and outputs.
• For example, the inputs could be the weather forecast, and the outputs would be the
visitors to the beach.
• The goal in supervised learning would be to learn the mapping that describes the relationship between temperature and other weather conditions and the number of beach visitors.
• So here the number of visitors (the dependent variable) depends on the weather conditions (the independent variables).

• Example labelled data of past input and output pairs is provided during the learning process to teach the model how it should behave; hence the name ‘supervised’ learning.
• For the beach example, new inputs of forecast temperature can then be fed in, and the machine learning algorithm will output a prediction for the future number of visitors.
Supervised Learning
• Being able to adapt to new inputs and make predictions is the crucial generalisation part
of machine learning.
• In training, we want to maximise generalisation, so the supervised model defines the real
‘general’ underlying relationship.
• If the model is over-trained, we cause over-fitting to the examples used and the model
would be unable to adapt to new, previously unseen inputs.
• A side effect to be aware of in supervised learning is that the supervision we provide introduces bias into the learning.
• The model can only imitate exactly what it was shown, so it is very important to show it reliable, unbiased examples.
• Also, supervised learning usually requires a lot of data before it learns.
• Obtaining enough reliably labelled data is often the hardest and most expensive part of
using supervised learning.
Supervised Learning
• The output from a supervised Machine Learning model could be a category from a finite set, e.g. [low, medium, high] for the number of visitors to the beach.
• This is called a classification problem.

• The output from a supervised Machine Learning model could instead be a numeric value within a range, e.g. [500-2000] for the number of visitors to the beach.
• This is called a regression problem.

• Supervised learning is therefore of two types: Classification and Regression.
Supervised Learning-Classification
• Classification is used to group the similar data points into different sections in order to
classify them.
• Machine Learning is used to find the rules that explain how to separate the different data
points.
• Classification algorithms focus on using data and answers to discover rules that linearly separate data points.
• Linear separability is a key concept in machine learning.
• Classification approaches try to find the best way to separate data points with a line.
• The lines drawn between classes are known as the decision boundaries.
• The entire area that is chosen to define a class is known as the decision surface.
• The decision surface defines that if a data point falls within its boundaries, it will be
assigned a certain class.
Supervised Learning-Classification
• Binary Classification
• Multi-Class Classification
• Multi-Label Classification
• Imbalanced Classification

(Figure: two linearly separable classes, CLASS 1 and CLASS 2, divided by a decision boundary.)
Binary Classification
• Binary Classification refers to those classification tasks that have two class labels.
• Example: Email spam detection (spam or not).

• Typically, binary classification tasks involve one class that is the normal state and
another class that is the abnormal state.
• For example “not spam” is the normal state and “spam” is the abnormal state.

• Another example is “cancer not detected” is the normal state of a task that involves a
medical test and “cancer detected” is the abnormal state.

• The class for the normal state is assigned the class label 0, and the class for the abnormal state is assigned the class label 1.
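A minimal sketch of binary classification under these conventions (0 = normal, 1 = abnormal), assuming a made-up one-feature spam dataset where the feature is a count of suspicious words:

```python
# Toy binary classifier: label 0 = "not spam" (normal), 1 = "spam" (abnormal).
train = [(0, 0), (1, 0), (2, 0), (5, 1), (7, 1), (8, 1)]  # (word count, label)

def fit_threshold(examples):
    """Pick the integer threshold with the fewest training errors."""
    best_t, best_err = 0, len(examples)
    for t in range(0, 10):
        err = sum((1 if x >= t else 0) != y for x, y in examples)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

t = fit_threshold(train)
predict = lambda x: 1 if x >= t else 0
print(t, predict(6), predict(1))  # 3 1 0
```

Real spam filters use far richer features, but the two-label structure (normal vs abnormal) is the same.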
Multi-Class Classification
• Multi-Class Classification refers to those classification tasks that have more than two
class labels.
• Examples include:
• Face classification.
• Plant species classification.
• Optical character recognition.

• Examples are classified as belonging to one among a range of known classes.

• The number of class labels may be very large on some problems.


• For example, a model may predict a photo as belonging to one among thousands or tens
of thousands of faces in a face recognition system.
Multi-Label Classification
• Multi-Label Classification refers to those classification tasks that have two or more
class labels, where one or more class labels may be predicted for each example.

• Consider the example of photo classification, where a given photo may have multiple
objects in the scene and a model may predict the presence of multiple known objects in
the photo, such as “bicycle,” “apple,” “person,” etc.

• This is unlike binary classification and multi-class classification, where a single class
label is predicted for each example.

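One common way to represent multi-label targets is a 0/1 indicator vector over the known classes, so several entries can be 1 at once (class names taken from the photo example above):

```python
# Multi-label targets: each photo may carry several labels simultaneously,
# unlike binary/multi-class where exactly one label is predicted.
classes = ["bicycle", "apple", "person"]

def to_indicator(labels):
    """Encode a set of labels as a 0/1 vector over the known classes."""
    return [1 if c in labels else 0 for c in classes]

print(to_indicator({"person", "bicycle"}))  # [1, 0, 1]
print(to_indicator({"apple"}))              # [0, 1, 0]
```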
Imbalanced Classification
• Imbalanced Classification refers to classification tasks where the number of examples
in each class is unequally distributed.
• Typically, imbalanced classification tasks are binary classification tasks where the
majority of examples in the training dataset belong to the normal class and a minority of
examples belong to the abnormal class.

• Examples include:
• Fraud detection.
• Outlier detection.
• Medical diagnostic tests.

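One common mitigation is to weight classes inversely to their frequency during training. The sketch below uses the n / (k * count) "balanced" weighting heuristic (the same formula scikit-learn uses for class_weight="balanced"); the label counts are hypothetical:

```python
from collections import Counter

# Hypothetical fraud-detection labels: 0 = normal (majority), 1 = fraud (rare).
labels = [0] * 95 + [1] * 5

counts = Counter(labels)
n, k = len(labels), len(counts)

# "Balanced" class weights: n / (k * count), so the rare class gets a
# proportionally larger weight in the loss during training.
weights = {c: n / (k * counts[c]) for c in counts}
print({c: round(w, 2) for c, w in weights.items()})  # {0: 0.53, 1: 10.0}
```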
Supervised Learning-Regression
• The difference between classification and regression is that regression outputs a number rather than a class.

• Therefore, regression is useful for number-based prediction problems such as stock market prices, the temperature for a given day, or the probability of an event.
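A minimal regression sketch for the beach example, predicting a number rather than a class by averaging the k nearest past observations (the (temperature, visitors) pairs are hypothetical):

```python
# k-nearest-neighbour regression on made-up (temperature, visitors) history.
history = [(18, 90), (21, 120), (24, 160), (27, 210), (30, 260), (33, 300)]

def knn_predict(temp, k=2):
    """Average the visitor counts of the k closest past temperatures."""
    nearest = sorted(history, key=lambda p: abs(p[0] - temp))[:k]
    return sum(v for _, v in nearest) / k

print(knn_predict(25))  # averages the 24C and 27C days -> 185.0
```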
Unsupervised Learning
• In unsupervised learning, only input data is provided in the examples.
• There are no labelled example outputs to aim for.
• But it may be surprising to know that it is still possible to find many interesting and
complex patterns hidden within data without any labels.
• An example of unsupervised learning in real life would be sorting different colour coins
into separate piles. Nobody taught you how to separate them, but by just looking at their
features such as colour, you can see which colour coins are associated and cluster them
into their correct groups.

• Unsupervised learning can be harder than supervised learning, as the removal of supervision means the problem becomes less well defined. The algorithm has a less focused idea of what patterns to look for.
Unsupervised Learning
• Unsupervised machine learning finds all kinds of unknown patterns in data.
• Unsupervised methods help you to find features which can be useful for categorization.
• It can take place in real time, with the input data analyzed and grouped as it arrives, without pre-assigned labels.
• It is easier to obtain unlabeled data from a computer than labeled data, which needs manual intervention.

• Unsupervised Learning is of two types: Clustering and Association.

Unsupervised Learning-Clustering
• Unsupervised learning is mostly used for clustering.
• Clustering is the act of creating groups with differing characteristics.
• Clustering attempts to find various subgroups within a dataset.
• As this is unsupervised learning, we are not restricted to any set of labels and are free to
choose how many clusters to create.
• This is both a blessing and a curse.
• Picking a model that has the correct number of clusters (complexity) has to be conducted
via an empirical model selection process.

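A minimal 1-D k-means sketch of clustering, assuming illustrative coin "colour values"; the two classic steps (assign each point to its nearest centre, then move each centre to its cluster mean) are repeated until stable:

```python
# 1-D k-means: group points into k piles with no labels provided.
points = [1.0, 1.2, 0.9, 5.0, 5.2, 4.8, 9.1, 8.9, 9.0]

def kmeans_1d(xs, centres, steps=10):
    for _ in range(steps):
        # Assignment step: attach each point to its nearest centre.
        clusters = [[] for _ in centres]
        for x in xs:
            i = min(range(len(centres)), key=lambda j: abs(x - centres[j]))
            clusters[i].append(x)
        # Update step: move each centre to the mean of its cluster
        # (keep the old centre if a cluster ends up empty).
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres

print([round(c, 2) for c in kmeans_1d(points, [0.0, 4.0, 10.0])])
# [1.03, 5.0, 9.0]
```

Note that the number of clusters k (here 3, via the three starting centres) is chosen by us, which is exactly the model-selection burden mentioned above.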
Unsupervised Learning-Association
• In Association Learning you want to uncover the rules that describe your data.
• For example, if a person watches video A they will likely watch video B.

• Association rules are perfect for examples such as this where you want to find related
items.

• A common example is Market Basket Analysis:
• Market Basket Analysis is one of the key techniques used by large retailers to uncover associations between items. It works by looking for combinations of items that occur together frequently in transactions. In other words, it allows retailers to identify relationships between the items that people buy.
• Association rules are widely used to analyze retail basket or transaction data, and are intended to identify strong rules discovered in transaction data using measures of interestingness.
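The two standard "interestingness" measures behind association rules, support and confidence, can be sketched in a few lines over hypothetical transactions:

```python
# Market-basket sketch: support and confidence for the rule {bread} -> {butter}.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "jam"},
    {"bread", "butter", "jam"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """Of the transactions containing lhs, the fraction also containing rhs."""
    return support(lhs | rhs) / support(lhs)

print(support({"bread", "butter"}))              # 3 of 5 baskets -> 0.6
print(round(confidence({"bread"}, {"butter"}), 2))  # 0.75
```

A "strong" rule is then one whose support and confidence both exceed chosen thresholds.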
Semi-supervised Learning
• Semi-supervised learning is a mix between supervised and unsupervised approaches.
• The learning process isn’t closely supervised with example outputs for every single
input, but we also don’t let the algorithm do its own thing and provide no form of
feedback.
• Semi-supervised learning takes the middle road.
• By being able to mix together a small amount of labelled data with a much larger
unlabeled dataset it reduces the burden of having enough labelled data.
• Therefore, it opens up many more problems to be solved with machine learning.
• Example:
• Internet Content Classification: Labelling every webpage manually is an impractical, unfeasible process, so semi-supervised learning algorithms are used. Even the Google search algorithm uses a variant of semi-supervised learning to rank the relevance of a webpage for a given query.
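One common semi-supervised recipe is self-training: fit on the few labelled points, let the model pseudo-label only the unlabelled points it is confident about, and refit. A minimal sketch on illustrative 1-D data (not the specific algorithm used by any search engine):

```python
# Self-training with a threshold classifier on 1-D data.
labelled = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
unlabelled = [1.5, 2.2, 7.5, 8.8, 5.1]

def fit(data):
    """A threshold classifier: split halfway between the two class means."""
    m0 = sum(x for x, y in data if y == 0) / sum(1 for _, y in data if y == 0)
    m1 = sum(x for x, y in data if y == 1) / sum(1 for _, y in data if y == 1)
    return (m0 + m1) / 2

threshold = fit(labelled)                 # fit on the small labelled set
# Pseudo-label only the unlabelled points far from the boundary (confident).
confident = [(x, 1 if x > threshold else 0)
             for x in unlabelled if abs(x - threshold) > 2.0]
threshold = fit(labelled + confident)     # refit on labelled + pseudo-labels
print(round(threshold, 2))                # the ambiguous point 5.1 is left out
```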
Reinforcement Learning
• In this approach, occasional positive and negative feedback is used to reinforce
behaviours.
• Think of it like training a dog: good behaviours are rewarded with a treat and become more common, while bad behaviours are punished and become less common.
• This reward-motivated behaviour is key in reinforcement learning.
• It is less common and much more complex, but it has generated incredible results.
• It doesn’t use labels as such, and instead uses rewards to learn.

Reinforcement Learning
• This is very similar to how we as humans also learn.
• Throughout our lives, we receive positive and negative signals and constantly learn from
them.
• The chemicals in our brain are one of many ways we get these signals.
• When something good happens, the neurons in our brains provide a hit of positive
neurotransmitters such as dopamine which makes us feel good and we become more
likely to repeat that specific action.
• We don’t need constant supervision to learn like in supervised learning.
• By only giving the occasional reinforcement signals, we still learn very effectively.

Reinforcement Learning
• One of the most exciting parts of Reinforcement Learning is that it is a first step away from training on static datasets and towards being able to use dynamic, noisy, data-rich environments.
• This brings Machine Learning closer to a learning style used by humans. The world is
simply our noisy, complex data-rich environment.
• Games are very popular in Reinforcement Learning research. They provide ideal data-
rich environments.
• The scores in games are ideal reward signals to train reward-motivated behaviours.
Additionally, time can be sped up in a simulated game environment to reduce overall
training time.
• A Reinforcement Learning algorithm just aims to maximise its rewards by playing the
game over and over again. If you can frame a problem with a frequent ‘score’ as a
reward, it is likely to be suited to Reinforcement Learning.
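This reward-driven loop can be sketched with tabular Q-learning, one classic reinforcement learning algorithm, on a made-up five-state corridor "game" where reaching the last state scores 1:

```python
import random

# Tabular Q-learning on a tiny corridor: states 0-4, actions move
# right (+1) or left (-1), reward 1 only for reaching state 4.
N_STATES, ACTIONS = 5, (1, -1)
alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount, exploration
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
rng = random.Random(0)

for episode in range(500):               # play the game over and over
    s = 0
    while s != 4:
        # Epsilon-greedy: usually exploit the best known action, sometimes explore.
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == 4 else 0.0      # the occasional reinforcement signal
        # Q-learning update: nudge Q(s, a) toward r + gamma * best future value.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

greedy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(4)]
print(greedy)  # the learned greedy policy: +1 (move right) in every state
```

No labels are provided anywhere; the policy emerges purely from maximising the accumulated reward.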
References
• Books and Journals
• Understanding Machine Learning: From Theory to Algorithms, by Shai Shalev-Shwartz and Shai Ben-David, Cambridge University Press, 2014.
• Introduction to Machine Learning – the Wikipedia Guide, by Osman Omer.

• Video Link-
• https://www.youtube.com/watch?v=9f-GarcDY58
• https://www.youtube.com/watch?v=GwIo3gDZCVQ

• Web Link-
• https://data-flair.training/blogs/types-of-machine-learning-algorithms/
• https://towardsdatascience.com/machine-learning-an-introduction-23b84d51e6d0
• https://towardsdatascience.com/introduction-to-machine-learning-f41aabc55264

THANK YOU
