6 - CSE3013 - Learning Systems
Learning Systems
Dr. Pradeep K V
Assistant Professor (Sr.)
School of Computer Science and Engineering
VIT - Chennai
Definition-1
It is a system of computer algorithms that can learn from examples through
self-improvement, without being explicitly coded by a programmer.
Definition-2
It is all about teaching computers to learn from data so that they can make
decisions, make predictions, or identify patterns without being explicitly
programmed to do so.
Definition-3
Machine learning enables a machine to automatically learn from data, improve
performance from experiences, and predict things without being explicitly
programmed.
A Machine Learning system learns from historical data, builds prediction
models, and, whenever it receives new data, predicts the output for it.
In unsupervised learning, we provide data and let the machine find the patterns
in the dataset. For instance, we provide 3 different shapes (circles, triangles,
and squares) and let the machine group them. Such a technique is called
clustering.
In reinforcement learning, the machine is commonly referred to as an agent, and
the agent receives a reward (or a penalty) based on each of its actions. It then
learns which actions maximize the rewards and minimize the penalties.
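The reward-driven loop described above can be sketched with a very small agent. This is only an illustration under stated assumptions: the action names, the fixed reward values, and the epsilon-greedy strategy (explore at random occasionally, otherwise pick the best-known action) are not from the slides.

```python
import random

def run_agent(rewards, steps=500, epsilon=0.2, seed=0):
    """Epsilon-greedy agent that learns which action earns the highest reward.

    `rewards` maps each action to a fixed reward (a negative value is a penalty).
    """
    rng = random.Random(seed)
    actions = list(rewards)
    value = {a: 0.0 for a in actions}  # running estimate of each action's reward
    count = {a: 0 for a in actions}    # how often each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)          # explore a random action
        else:
            action = max(actions, key=value.get)  # exploit the best estimate
        reward = rewards[action]                  # environment responds
        count[action] += 1
        # Incremental mean: move the estimate towards the observed reward.
        value[action] += (reward - value[action]) / count[action]

    return max(actions, key=value.get), value

# Hypothetical environment: one penalised, one neutral, one rewarded action.
best_action, estimates = run_agent({"left": -1.0, "stay": 0.0, "right": 1.0})
print(best_action, estimates)
```

After enough steps the agent settles on the action with the highest estimated reward, which is the behaviour the definition describes.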
After being trained on data, the goal of our trained model is to
generalize to unseen data as accurately as possible.
If the model yields very accurate results on training data but fails to
generalize to unseen data, it’s called over-fitting because the model
over-fits the training data.
If the model doesn’t predict accurately even on training data, the model
has not learned the underlying pattern, which is known as under-fitting.
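To make the distinction concrete, the sketch below compares training accuracy with accuracy on unseen data for three toy models. The dataset, the noisy labels, and the three models are all invented for illustration; they are not from the slides.

```python
def accuracy(model, data):
    """Fraction of (x, label) pairs that the model predicts correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

# Assumed true rule: label is 1 when x >= 5.
# Two training labels (at x=2 and x=7) are deliberately noisy.
train = [(0, 0), (1, 0), (2, 1), (3, 0), (4, 0),
         (5, 1), (6, 1), (7, 0), (8, 1), (9, 1)]
# Unseen, noise-free points lying between the training points.
test = [(x + 0.2, int(x + 0.2 >= 5)) for x in range(10)]

def memorizer(x):
    """Over-fits: copies the label of the nearest training point, noise included."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def always_zero(x):
    """Under-fits: ignores the input entirely."""
    return 0

def threshold(x):
    """Matches the underlying rule, so it generalizes."""
    return int(x >= 5)

for name, model in [("memorizer", memorizer),
                    ("always_zero", always_zero),
                    ("threshold", threshold)]:
    print(f"{name:12s} train={accuracy(model, train):.1f} "
          f"test={accuracy(model, test):.1f}")
```

The memorizer is perfect on the training set but worse on unseen data (over-fitting), the constant model is poor on both (under-fitting), and the threshold model does best on unseen data even though it disagrees with the noisy training labels.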
Augmentation:
Machine learning that assists humans with their day-to-day tasks,
personally or commercially, without having complete control of the output.
Such machine learning is used in different ways, for example in virtual
assistants, data analysis, and software solutions. The primary use is to
reduce errors due to human bias.
Automation:
Machine learning that works entirely autonomously in any field, without
the need for any human intervention. For example, robots performing the
essential process steps in manufacturing plants.
Finance industry
Machine learning is growing in popularity in the finance industry. Banks
mainly use ML to find patterns in their data, and also to prevent
fraud.
Government organizations
Governments make use of ML to manage public safety and utilities.
Take the example of China, with its large-scale face recognition: the
government uses artificial intelligence to deter jaywalking.
Healthcare industry
Healthcare was one of the first industries to use machine learning, with
image detection.
Marketing
AI is used broadly in marketing thanks to abundant access to data.
Before the age of mass data, researchers developed advanced mathematical
tools like Bayesian analysis to estimate the value of a customer. With the
boom of data, marketing departments rely on AI to optimize customer
relationships and marketing campaigns.
1 Gathering Data : the first step is to identify and obtain all the data
relevant to the problem. The quantity and quality of the collected data
determine the quality of the output. In general, the more good-quality
data there is, the more accurate the prediction will be.
Identify the various data sources (structured or unstructured; files,
databases, the Internet)
Collect data
Integrate the data obtained from different sources into one coherent set
of data (the dataset)
2 Data Preparation : a step where we put our data into a suitable place
and prepare it for use in machine learning training.
Data exploration: used to understand the nature of the data that we have
to work with. We need to understand the characteristics, format, and
quality of the data. A better understanding of the data leads to a more
effective outcome. Here we look for correlations, general trends, and
outliers.
Data pre-processing: the next step is to pre-process the data for
analysis.
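The data exploration step above mentions correlations and outliers. A minimal sketch of both checks follows; the column names and values are made up, the Pearson coefficient is one common correlation measure, and the 1.5 x IQR rule is one common (but not the only) outlier convention.

```python
from statistics import mean, quantiles

def pearson(xs, ys):
    """Pearson correlation coefficient (assumes neither column is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return [v for v in values if v < q1 - k * spread or v > q3 + k * spread]

# Hypothetical columns of a small dataset.
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 64, 70, 73]
response_ms = [10, 12, 11, 13, 12, 95]

print("correlation:", round(pearson(hours_studied, exam_score), 3))
print("outliers:", iqr_outliers(response_ms))
```

A coefficient close to 1 suggests a strong positive linear trend between the two columns, and any flagged values deserve a closer look before pre-processing.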
6 Test Model : we check the accuracy of the trained model by providing
a test dataset to it. Testing determines the percentage accuracy of the
model against the requirements of the project or problem.
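As a rough sketch of this step, percentage accuracy can be computed by comparing the model's predictions with the true labels of the test dataset. The labels, predictions, and the 75% requirement below are hypothetical values chosen for illustration.

```python
def accuracy_percent(y_true, y_pred):
    """Percentage of test examples whose predicted label equals the true label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)

# Hypothetical test labels and model predictions.
y_true = ["spam", "ham", "spam", "ham", "ham"]
y_pred = ["spam", "ham", "ham", "ham", "ham"]

REQUIRED_ACCURACY = 75.0  # assumed project requirement, in percent
score = accuracy_percent(y_true, y_pred)
print(f"accuracy = {score:.1f}%, meets requirement: {score >= REQUIRED_ACCURACY}")
```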
What is a Dataset?
A dataset is a collection of data in which the data is arranged in some order.
A dataset can contain anything from a simple array of values to a database
table.
Note: These datasets are large, so a fast internet connection is needed to
download them.
Labeled data is input data that has already been tagged with the
appropriate output.
Supervised learning is the process of providing a machine learning model with
input data together with the correct output. The goal is to find a mapping
function that maps the input variable (X) to the output variable (Y).
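One simple instance of such a mapping function is a straight line fitted by least squares. The sketch below is an illustration only: the slides do not prescribe a model, and the data points are invented so that Y = 2X + 1 exactly.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept; returns the function f."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Hypothetical labelled data: inputs X with their correct outputs Y.
X = [1, 2, 3, 4]
Y = [3, 5, 7, 9]

f = fit_line(X, Y)  # the learned mapping X -> Y
print(f(5))         # prediction for an input that was not in the training data
```

Here the learned function plays the role of the mapping from X to Y: it is built from labelled pairs and then applied to a new input.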
In the real world, supervised learning can be used for risk assessment, image
classification, fraud detection, spam filtering, etc.
Models are trained using labelled datasets, where the model learns about each
type of data. After the training process is completed, the model is tested on
test data (a held-out subset of the labelled dataset) and predicts the output.
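One common way to obtain test data is to hold out part of the labelled dataset before training, so that the model is evaluated on examples it has not seen. A minimal sketch, assuming a 25% test fraction and a fixed shuffle seed (both arbitrary choices, not from the slides):

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the labelled rows and hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (training rows, test rows)

# Hypothetical labelled rows: (number_of_sides, label).
rows = [(3, "triangle"), (4, "square"), (6, "hexagon"), (3, "triangle"),
        (4, "square"), (6, "hexagon"), (3, "triangle"), (4, "square")]

train_rows, test_rows = train_test_split(rows)
print(len(train_rows), "training rows,", len(test_rows), "test rows")
```

The model is fitted on `train_rows` only, and `test_rows` is used afterwards to check its predictions.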
The machine has already been trained on all types of shapes, so when it
encounters a new one, it classifies it based on the number of sides and
predicts the output.
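The shape example can be sketched as a tiny classifier that learns, from labelled examples, which label goes with each number of sides. The specific shapes, the majority-vote rule, and the "unknown" fallback are assumptions made for this illustration.

```python
from collections import Counter, defaultdict

def train(examples):
    """Learn the most frequent label for each number of sides."""
    votes = defaultdict(Counter)
    for sides, label in examples:
        votes[sides][label] += 1
    return {sides: counts.most_common(1)[0][0] for sides, counts in votes.items()}

def predict(model, sides):
    """Classify a new shape by its number of sides."""
    return model.get(sides, "unknown")

# Hypothetical labelled training data: (number_of_sides, label).
labelled_shapes = [(3, "triangle"), (4, "square"), (6, "hexagon"),
                   (3, "triangle"), (4, "square")]

model = train(labelled_shapes)
print(predict(model, 4))  # a shape type seen during training
print(predict(model, 5))  # a shape type never seen during training
```

The second call shows the model's limit: a shape type that never appeared in the labelled data cannot be classified correctly.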
With the help of supervised learning, the model can predict the output on
the basis of prior experiences.
In supervised learning, we can have an exact idea about the classes of
objects.
Supervised learning models help us to solve various real-world problems
such as fraud detection, spam filtering, etc.
Supervised learning models are not well suited to handling very complex tasks.
Supervised learning may not predict the correct output if the test data is
very different from the training dataset.
Training requires a lot of computation time.
In supervised learning, we need enough knowledge about the classes of
objects.
However, there may be many cases where we do not have labelled data and
must find hidden patterns in the given dataset. Unsupervised learning
techniques are required to solve such types of cases in machine learning.
It is a type of ML in which models are trained using an unlabeled dataset and
are allowed to act on that data without any supervision.
Suppose we are given a dataset containing images of various types of cats and
dogs. The algorithm is never trained with labels for this dataset, so it has no
prior idea of the dataset’s characteristics.
Its task is to identify the image features on its own, which it does by
clustering the image dataset into groups according to the similarities
between images.
This unlabeled data is fed to the ML model in order to train it. First, the
model interprets the raw data to find hidden patterns, and then a suitable
algorithm such as k-means clustering or hierarchical clustering is applied.
The algorithm divides the data objects into groups according to the
similarities and differences between the objects.
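The grouping step can be sketched with a bare-bones k-means implementation. This is a simplified illustration, not a production algorithm: the 2-D points are invented, the centroids are initialised naively from the first k points, and the iteration count is fixed.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=10):
    """Group unlabeled points into k clusters by similarity (distance)."""
    centroids = list(points[:k])  # naive initialisation (an assumption)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels, centroids

# Hypothetical unlabeled data forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

No labels are supplied at any point; the groups emerge only from the distances between the data objects.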
K-means clustering
KNN (k-nearest neighbors)
Hierarchical clustering
Anomaly detection
Neural Networks
Principal Component Analysis
Independent Component Analysis
Apriori algorithm
Singular value decomposition