Differences Between AI vs. Machine Learning vs. Deep Learning


Artificial Intelligence is the concept of creating smart, intelligent machines.

Machine Learning is a subset of artificial intelligence that helps you build AI-driven applications.

Deep Learning is a subset of machine learning that uses vast volumes of data and complex algorithms to train a model.

Artificial Intelligence?

Artificial intelligence, commonly referred to as AI, is the process of imparting data, information, and human intelligence to machines. The main goal of artificial intelligence is to develop self-reliant machines that can think and act like humans. These machines can mimic human behaviour and perform tasks by learning and problem-solving. Most AI systems simulate natural intelligence to solve complex problems.

Amazon Echo is a smart speaker that uses Alexa, the virtual assistant AI technology developed by Amazon. Amazon Alexa is capable of voice interaction, playing music, setting alarms, playing audiobooks, and giving real-time information such as news, weather, sports, and traffic reports.

Types of Artificial Intelligence

Reactive Machines - These are systems that only react. They don't form memories, and they don't use any past experiences to make new decisions.

Limited Memory - These systems reference the past, and information is added over a period of time. The referenced information is short-lived.

Theory of Mind - This covers systems that are able to understand human emotions and how they affect decision making. They are trained to adjust their behaviour accordingly.

Self-awareness - These systems are designed and created to be aware of themselves. They understand their own internal states, predict other people's feelings, and act appropriately.

Applications of Artificial Intelligence

- Machine translation, such as Google Translate
- Self-driving vehicles, such as Google's Waymo
- AI robots, such as Sophia and Aibo
- Speech recognition applications, such as Apple's Siri or OK Google

Machine Learning? (past data -> learns from the past data -> predicts the output)

Machine learning is a discipline of computer science that uses computer algorithms and analytics to build predictive models that can solve business problems.

How Does Machine Learning Work?

Machine learning accesses vast amounts of data (both structured and unstructured) and learns from it to predict the future, using multiple algorithms and techniques.

Types of Machine Learning

Machine learning algorithms are classified into three main categories:

1. Supervised Learning

In supervised learning, the data is already labelled, which means you know the target variable. Using this method of learning, systems can predict future outcomes based on past data. It requires at least an input and an output variable to be given to the model for it to be trained. Some examples of supervised learning include linear regression, logistic regression, support vector machines, Naive Bayes, and decision trees.
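To make the idea concrete, here is a minimal supervised learning sketch in Python with scikit-learn (an assumed dependency); the synthetic dataset and the choice of logistic regression are illustrative, not part of the original notes:

```python
# Supervised learning sketch: fit a model on labelled data, predict on new data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Labelled data: X holds the input features, y the known target variable.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression()
model.fit(X_train, y_train)          # learn from past (labelled) data
print(model.score(X_test, y_test))   # predict outcomes on unseen data
```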

2. Unsupervised Learning

Unsupervised learning algorithms employ unlabelled data to discover patterns in the data on their own. The systems are able to identify hidden features in the input data. Once the data is more readable, the patterns and similarities become more evident. Some examples of unsupervised learning include k-means clustering, hierarchical clustering, and anomaly detection.
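A minimal unsupervised learning sketch using k-means from scikit-learn; the blob dataset is synthetic and the cluster count is illustrative:

```python
# Unsupervised learning sketch: k-means groups unlabelled points on its own.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # labels ignored

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)        # cluster assignments found from data alone
print(labels[:10])
print(kmeans.cluster_centers_)        # the discovered group centres
```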
3. Reinforcement Learning

The goal of reinforcement learning is to train an agent to complete a task within an uncertain environment. The agent receives observations and a reward from the environment and sends actions back to the environment. The reward measures how successful an action is with respect to completing the task goal. Examples of reinforcement learning algorithms include Q-learning and deep Q-learning neural networks.
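The loop described above (observations, actions, rewards) can be made concrete with a toy Q-learning sketch in NumPy; the five-state corridor environment and all hyperparameters here are invented for illustration:

```python
# Toy Q-learning: an agent on a 1-D corridor earns a reward only at the right end.
import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))   # value of each action in each state
alpha, gamma, epsilon = 0.1, 0.9, 0.2

rng = np.random.default_rng(0)
for episode in range(500):
    s = 0
    while s != n_states - 1:          # episode ends at the goal state
        # epsilon-greedy: mostly exploit the best known action, sometimes explore
        a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0   # reward from the environment
        # Q-learning update: move Q(s, a) toward reward + discounted future value
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q)  # Q should come to favour action 1 (move right) in every state
```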

Machine Learning Applications

- Sales forecasting for different products
- Fraud analysis in banking
- Product recommendations
- Stock price prediction

Deep Learning?

Deep learning is a subset of machine learning that deals with algorithms inspired by the structure and function of the human brain. Deep learning algorithms can work with an enormous amount of both structured and unstructured data. Deep learning's core concept lies in artificial neural networks, which enable machines to make decisions.

The major difference between deep learning and machine learning is the way data is presented to the machine. Machine learning algorithms usually require structured data, whereas deep learning networks work on multiple layers of artificial neural networks.

Deep Learning Applications

- Cancer tumour detection
- Caption bots for captioning images
- Music generation
- Image colouring
- Object detection
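To illustrate the "multiple layers of artificial neural networks" idea from the deep learning section above, here is a hedged sketch using scikit-learn's MLPClassifier; real deep learning systems use dedicated frameworks and far more data, so treat this as a shallow stand-in with invented layer sizes:

```python
# A small multi-layer neural network: two hidden layers between input and output.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(64, 32),  # two stacked layers of neurons
                    max_iter=500, random_state=0)
net.fit(X, y)
print(net.predict(X[:5]))
```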


APPLICATIONS OF MACHINE LEARNING

1. Image Recognition:

Image recognition is one of the most common applications of machine learning. It is used to identify objects, persons, places, digital images, etc. A popular use case of image recognition and face detection is the automatic friend tagging suggestion: Facebook provides a feature of auto friend tagging suggestions. Whenever we upload a photo with our Facebook friends, we automatically get a tagging suggestion with names, and the technology behind this is machine learning's face detection and recognition algorithm. It is based on the Facebook project named "DeepFace," which is responsible for face recognition and person identification in pictures.
2. Speech Recognition

While using Google, we get the option "Search by voice"; this comes under speech recognition, and it's a popular application of machine learning. Speech recognition is the process of converting voice instructions into text, and it is also known as "speech to text" or "computer speech recognition." At present, machine learning algorithms are widely used in various speech recognition applications. Google Assistant, Siri, Cortana, and Alexa use speech recognition technology to follow voice instructions.
3. Traffic prediction:

If we want to visit a new place, we take the help of Google Maps, which shows us the correct path with the shortest route and predicts the traffic conditions. It predicts traffic conditions, such as whether traffic is clear, slow-moving, or heavily congested, in two ways:

- Real-time location of the vehicle from the Google Maps app and sensors
- Average time taken on past days at the same time

Everyone who uses Google Maps is helping to make the app better. It takes information from the user and sends it back to its database to improve performance.

4. Product recommendations:

Machine learning is widely used by various e-commerce and entertainment companies, such as Amazon and Netflix, for product recommendations. Whenever we search for a product on Amazon, we start getting advertisements for the same product while surfing the internet on the same browser, and this is because of machine learning. Google understands user interest using various machine learning algorithms and suggests products according to customer interest. Similarly, when we use Netflix, we find recommendations for entertainment series, movies, etc., and this is also done with the help of machine learning.
5. Self-driving cars:

One of the most exciting applications of machine learning is self-driving cars. Machine learning plays a significant role in self-driving cars. Tesla, a leading car manufacturer, is working on self-driving cars and uses an unsupervised learning method to train car models to detect people and objects while driving.

6. Email Spam and Malware Filtering:

Whenever we receive a new email, it is automatically filtered as important, normal, or spam. We always receive important mail in our inbox, marked with the important symbol, and spam emails in our spam box, and the technology behind this is machine learning. Below are some spam filters used by Gmail:

- Content filter
- Header filter
- General blacklists filter
- Rules-based filters
- Permission filters

Some machine learning algorithms, such as the Multi-Layer Perceptron, decision trees, and the Naive Bayes classifier, are used for email spam filtering and malware detection.
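Since Naive Bayes is named among the spam filtering algorithms, here is a minimal sketch of that idea with scikit-learn; the four-email corpus and its labels are made up purely for illustration:

```python
# Naive Bayes spam filtering sketch: bag-of-words features + MultinomialNB.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["win money now", "meeting at noon", "cheap money offer", "lunch tomorrow?"]
labels = [1, 0, 1, 0]                   # 1 = spam, 0 = normal

vec = CountVectorizer()
X = vec.fit_transform(emails)           # word-count features from the email text

clf = MultinomialNB()
clf.fit(X, labels)
print(clf.predict(vec.transform(["free money meeting"])))  # 1 means spam
```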
7. Virtual Personal Assistant:

We have various virtual personal assistants, such as Google Assistant, Alexa, Cortana, and Siri. As the name suggests, they help us find information using our voice instructions. These assistants can help us in various ways just by voice instruction, such as playing music, calling someone, opening an email, scheduling an appointment, etc.

8. Online Fraud Detection:

Machine learning is making our online transactions safe and secure by detecting fraudulent transactions. Whenever we perform an online transaction, there are various ways a fraudulent transaction can take place, such as fake accounts, fake IDs, and money stolen in the middle of a transaction. To detect this, a feed-forward neural network helps us by checking whether a transaction is genuine or fraudulent. For each genuine transaction, the output is converted into hash values, and these values become the input for the next round. Each genuine transaction follows a specific pattern, which changes for a fraudulent transaction; the network therefore detects the fraud and makes our online transactions more secure.

9. Stock Market trading:

Machine learning is widely used in stock market trading. In the stock market, there is always a risk of ups and downs in shares, so machine learning's long short-term memory (LSTM) neural network is used for the prediction of stock market trends.

10. Medical Diagnosis:

In medical science, machine learning is used for diagnosing diseases. With this, medical technology is growing very fast and is able to build 3D models that can predict the exact position of lesions in the brain. It helps in finding brain tumours and other brain-related diseases easily.

11. Automatic Language Translation:

Nowadays, if we visit a new place and are not aware of the language, it is not a problem at all; machine learning helps us here too by converting text into languages we know. Google's GNMT (Google Neural Machine Translation) provides this feature; it is a neural machine learning system that translates text into our familiar language, and this is called automatic translation. The technology behind automatic translation is a sequence-to-sequence learning algorithm, which is used with image recognition to translate text from one language to another.
MODULE 3 – FEATURE EXTRACTION AND SELECTION

Feature Selection

Concepts & Techniques

Simply speaking, feature selection is about selecting a subset of features out of the original features in order to reduce model complexity, enhance the computational efficiency of the model, and reduce the generalization error introduced by noise from irrelevant features. The following are some of the important feature selection techniques (a sketch follows at the end of this section):

- Regularization techniques, such as L1 norm regularization, which drive most feature weights to zero.
- Feature importance techniques, such as fitting an estimator like the Random Forest algorithm and selecting features based on an attribute such as feature_importances_.
- Greedy search algorithms, such as the following, which are useful for algorithms (such as K-nearest neighbours, K-NN) where regularization techniques are not supported:
  - Sequential forward selection
  - Sequential floating forward selection
  - Sequential backward selection
  - Sequential floating backward selection

According to the training data used (labeled, unlabeled, or partially labeled), feature selection methods can be divided into supervised, unsupervised, and semi-supervised models. According to their relationship with learning methods, feature selection methods can be classified into the following:

- Filter methods: The filter model only considers the association between the feature and the class label.
- Wrapper methods.
- Embedded methods: In embedded methods, the features are selected during the training process of the learning model, and the feature selection result is output automatically when the training process is finished. Training a Lasso regression model is a classic example of an embedded method for feature selection.

According to the evaluation criterion, feature selection methods can be derived from correlation, Euclidean distance, consistency, dependence, and information measures. According to the type of output, feature selection methods can be divided into feature rank (weighting) and subset selection models.
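A hedged sketch of the techniques listed above, using scikit-learn (SequentialFeatureSelector requires version 0.24 or later); the synthetic dataset and all parameter values are illustrative:

```python
# Three feature selection approaches: L1 weights, tree importances, greedy search.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           random_state=0)

# L1 regularization: many feature weights are driven to exactly zero.
l1 = LogisticRegression(penalty="l1", solver="liblinear").fit(X, y)
print((l1.coef_ == 0).sum(), "weights zeroed by the L1 norm")

# Feature importance from a Random Forest estimator.
rf = RandomForestClassifier(random_state=0).fit(X, y)
print(rf.feature_importances_)

# Sequential forward selection for K-NN, which has no built-in regularization.
sfs = SequentialFeatureSelector(KNeighborsClassifier(), n_features_to_select=3,
                                direction="forward")
sfs.fit(X, y)
print(sfs.get_support())   # mask of the selected feature subset
```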

Feature Extraction

Concepts & Techniques

Feature extraction is about extracting/deriving information from the original feature set to create a new feature subspace. The primary idea behind feature extraction is to compress the data while maintaining most of the relevant information. As with feature selection techniques, these techniques are also used to reduce the number of features from the original feature set in order to reduce model complexity and model overfitting, enhance model computation efficiency, and reduce generalization error. The following are different types of feature extraction techniques:

- Principal component analysis (PCA) for unsupervised data compression. PCA finds the directions of maximum variance in high-dimensional data and projects the data onto a new subspace with equal or fewer dimensions than the original one. This can be explained with the example of identifying the Taj Mahal (7th wonder of the world) from a top view or side view, based on the dimensions in which there is maximum variance; the resulting principal components (e.g. PCA1 and PCA2) are those directions of maximum variance.
- Linear discriminant analysis (LDA) as a supervised dimensionality reduction technique for maximizing class separability.
- Nonlinear dimensionality reduction via kernel principal component analysis (KPCA).
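A minimal sketch of the three techniques just listed, with scikit-learn on a synthetic dataset (the component counts are illustrative):

```python
# Feature extraction: PCA (unsupervised), LDA (supervised), KPCA (nonlinear).
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

X_pca = PCA(n_components=2).fit_transform(X)    # directions of maximum variance
X_lda = LinearDiscriminantAnalysis(n_components=1).fit_transform(X, y)  # uses labels
X_kpca = KernelPCA(n_components=2, kernel="rbf").fit_transform(X)       # nonlinear

print(X_pca.shape, X_lda.shape, X_kpca.shape)   # new, smaller feature subspaces
```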
7 steps to making a machine learning model

Building a model can be broken down into 7 major steps:

1. Collecting Data: machines initially learn from the data that you give them. ...
2. Preparing the Data: after you have your data, you have to prepare it. ...
3. Choosing a Model ...
4. Training the Model ...
5. Evaluating the Model ...
6. Parameter Tuning ...
7. Making Predictions
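A hedged end-to-end sketch of these steps with scikit-learn; the iris dataset, the model choice, and the tuning grid are all stand-ins:

```python
# The 7 steps compressed into one runnable example.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)                                 # 1. collect data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)   # 2. prepare data

model = make_pipeline(StandardScaler(),
                      LogisticRegression(max_iter=1000))          # 3. choose a model
model.fit(X_tr, y_tr)                                             # 4. train
print("accuracy:", model.score(X_te, y_te))                       # 5. evaluate

grid = GridSearchCV(model, {"logisticregression__C": [0.1, 1, 10]},
                    cv=5)                                         # 6. parameter tuning
grid.fit(X_tr, y_tr)

print(grid.best_estimator_.predict(X_te[:5]))                     # 7. make predictions
```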
Pruning: Getting an Optimal Decision Tree

Pruning is the process of deleting unnecessary nodes from a tree in order to get the optimal decision tree. A too-large tree increases the risk of overfitting, and a small tree may not capture all the important features of the dataset. A technique that decreases the size of the learning tree without reducing accuracy is therefore known as pruning. There are mainly two types of tree pruning technique used:

- Cost complexity pruning
- Reduced error pruning

Advantages of the Decision Tree

- It is simple to understand, as it follows the same process a human follows when making a decision in real life.
- It can be very useful for solving decision-related problems.
- It helps to think through all the possible outcomes of a problem.
- There is less need for data cleaning compared to other algorithms.

Disadvantages of the Decision Tree

- The decision tree contains lots of layers, which makes it complex.
- It may have an overfitting issue, which can be addressed using the Random Forest algorithm.
- With more class labels, the computational complexity of the decision tree may increase.
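Cost complexity pruning, mentioned above, is available directly in scikit-learn's decision trees; a minimal sketch, with a stand-in dataset, showing how larger alpha values prune the tree down:

```python
# Cost complexity pruning: sweep ccp_alpha and watch the tree shrink.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The pruning path lists the effective alphas at which nodes would be removed.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_tr, y_tr)

for alpha in path.ccp_alphas[::10]:     # sample a few alphas along the path
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_tr, y_tr)
    print(f"alpha={alpha:.4f}  leaves={tree.get_n_leaves()}  "
          f"test accuracy={tree.score(X_te, y_te):.3f}")
```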
Difference between pre-pruning and post-pruning

Decision trees are notorious for overfitting. Pruning is a regularization method that penalizes the length of the tree, i.e. it increases the value of the cost function. Pruning is of two types:

Post-pruning (backward pruning): The full tree is generated, and then the non-significant branches are pruned/removed. Cross-validation is performed at every step to check whether the addition of a new branch leads to an increase in accuracy. If not, the branch is converted to a leaf node.

Pre-pruning (forward pruning): This approach stops non-significant branches from being generated. It terminates the generation of a new branch based on a given condition.
KNN algorithm

K-Nearest Neighbour (KNN) is one of the simplest machine learning algorithms, based on the supervised learning technique. The K-NN algorithm assumes similarity between the new case/data and the available cases, and puts the new case into the category that is most similar to the available categories. K-NN stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can easily be classified into a well-suited category using the K-NN algorithm. The K-NN algorithm can be used for regression as well as classification, but it is mostly used for classification problems. K-NN is a non-parametric algorithm, which means it makes no assumptions about the underlying data. It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead, it stores the dataset and performs an action on it at classification time.

Example: Suppose we have an image of a creature that looks similar to both a cat and a dog, and we want to know whether it is a cat or a dog. For this identification we can use the KNN algorithm, as it works on a similarity measure. Our KNN model will find the features of the new image that are similar to the cat and dog images, and based on the most similar features it will put the image in either the cat or the dog category.

How does K-NN work?

The K-NN working can be explained on the basis of the below algorithm:

Step 1: Select the number K of neighbours.
Step 2: Calculate the Euclidean distance from the new point to the existing data points.
Step 3: Take the K nearest neighbours as per the calculated Euclidean distance.
Step 4: Among these K neighbours, count the number of data points in each category.
Step 5: Assign the new data point to the category for which the number of neighbours is maximum.
Step 6: Our model is ready.

For example, if the 3 nearest neighbours of a new point all come from category A, the new data point must belong to category A.

How to select the value of K in the K-NN algorithm?

Below are some points to remember while selecting the value of K:

- There is no particular way to determine the best value for K, so we need to try several values to find the best of them. The most preferred value for K is 5.
- A very low value for K, such as K=1 or K=2, can be noisy and expose the model to the effects of outliers.
- Large values for K are good, but the model may run into difficulties.

Advantages of the KNN Algorithm:
- It is simple to implement.
- It is robust to noisy training data.
- It can be more effective if the training data is large.

Disadvantages of the KNN Algorithm:
- It always needs to determine the value of K, which may be complex at times.
- The computation cost is high because of calculating the distance between the new data point and all the training samples.
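A minimal K-NN sketch with scikit-learn; since the text notes there is no fixed rule for picking K, this version simply tries several candidate values with cross-validation (the dataset and candidates are illustrative):

```python
# K-NN classification, choosing K by 5-fold cross-validation.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

for k in (1, 3, 5, 7, 9):
    knn = KNeighborsClassifier(n_neighbors=k)   # Euclidean distance by default
    score = cross_val_score(knn, X, y, cv=5).mean()
    print(f"K={k}: mean accuracy {score:.3f}")
```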

Pros of Factor Analysis

Measurable Attributes

The first and foremost pro of factor analysis is that it is open to all measurable attributes. Be it subjective or objective, any kind of attribute can be worked on with this statistical technique. Unlike some statistical models that only work on objective attributes, factor analysis goes well with both subjective and objective attributes.

Cost-Effective

While data research and data mining algorithms can cost a lot, this statistical model is surprisingly cost-effective and does not take many resources to work with. It can therefore be adopted by any beginner or experienced professional in light of its cost-effective and easy approach to data mining and data reduction.

Flexible Approach

While many machine learning algorithms are rigid and constricted to a single approach, factor analysis does not work that way. Rather, this statistical model has a flexible approach to multivariate datasets that lets one obtain relationships or correlations between various variables and their underlying components.

Cons of Factor Analysis

Incomprehensive Results

While there are many pros of factor analysis, there are various cons of this method as well. Primarily, factor analysis can produce incomplete results due to incomprehensive datasets. While various data points can have similar traits, some other variables or factors can go unnoticed by being isolated in a vast dataset. That said, the results of this method could be incomprehensive.

Non-Identification of Complicated Factors

Another drawback of factor analysis is that it does not identify complicated factors that underlie a dataset. While some results could clearly indicate a correlation between two variables, some complicated correlations can go unnoticed in such a method. Perhaps the non-identification of complicated factors and their relationships could be an issue for data research.

Reliant on Theory

Even though factor analysis skills can be imitated by machine learning algorithms, this method is still reliant on theory and thereby on data researchers. While many components of a dataset can be handled by a computer, some other details are required to be looked into by humans. Thus, one of the major drawbacks of factor analysis is that it is somewhat reliant on theory and cannot fully function without manual assistance.

Bagging: A homogeneous weak learners' model in which the learners learn independently from each other in parallel and are combined to determine the model average.

Boosting: Also a homogeneous weak learners' model, but it works differently from bagging. In this model, learners learn sequentially and adaptively to improve the model predictions of a learning algorithm.

Bagging

Bootstrap aggregating, also known as bagging, is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression. It decreases variance and helps to avoid overfitting. It is usually applied to decision tree methods. Bagging is a special case of the model averaging approach.

Implementation Steps of Bagging

Step 1: Multiple subsets are created from the original data set with equal tuples, selecting observations with replacement.
Step 2: A base model is created on each of these subsets.
Step 3: Each model is learned in parallel on its own training set, independently of the others.
Step 4: The final predictions are determined by combining the predictions from all the models.
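The four steps above map closely onto scikit-learn's BaggingClassifier; a minimal sketch (the `estimator` keyword assumes scikit-learn 1.2 or later, and the dataset and parameters are illustrative):

```python
# Bagging: bootstrap subsets -> parallel base models -> combined predictions.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # base model built on each subset (step 2)
    n_estimators=50,                     # number of bootstrap subsets (step 1)
    bootstrap=True,                      # sample observations with replacement
    n_jobs=-1,                           # learn the models in parallel (step 3)
    random_state=0,
)
bag.fit(X, y)
print(bag.predict(X[:5]))                # predictions combined by voting (step 4)
```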

Boosting

Boosting is an ensemble modelling technique that attempts to build a strong classifier from a number of weak classifiers. It is done by building a model from a series of weak models. First, a model is built from the training data. Then a second model is built that tries to correct the errors present in the first model. This procedure continues, and models are added, until either the complete training data set is predicted correctly or the maximum number of models has been added.
Boosting algorithm

Step 1: Initialise the dataset and assign equal weight to each data point.
Step 2: Provide this as input to the model and identify the wrongly classified data points.
Step 3: Increase the weights of the wrongly classified data points and decrease the weights of the correctly classified data points, then normalize the weights of all data points.
Step 4: If the required results have been obtained, go to step 5; otherwise, go to step 2.
Step 5: End.
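A hedged sketch of this reweighting loop using AdaBoost, a classic boosting algorithm, in scikit-learn (the `estimator` keyword assumes version 1.2+; the dataset and parameters are illustrative):

```python
# Boosting: weak learners added in sequence, each reweighting the errors of the last.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

boost = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # a weak learner (decision stump)
    n_estimators=100,    # models added sequentially, each correcting the previous
    random_state=0,
)
boost.fit(X, y)
print(boost.score(X, y))
```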

Similarities Between Bagging and Boosting

Bagging and boosting, both being commonly used methods, share the universal similarity of being classified as ensemble methods. The similarities between them:

- Both are ensemble methods for getting N learners from 1 learner.
- Both generate several training data sets by random sampling.
- Both make the final decision by averaging the N learners (or taking the majority of them, i.e. majority voting).
- Both are good at reducing variance and provide higher stability.