
Machine Learning

MISSION NO BACKLOGS

🏡 🙏README🙏
- So far I have created notes for 3 subjects which are precise, time-saving and easy to read
and understand.
- While the response and praise they received were brilliant, donations have been far less
forthcoming.
- Don't be shy in that aspect; show a little gratitude by donating. Research and creating a
structured PDF take a lot of effort and time.
- I am also a student.
- Consider donating to this noble cause. Any amount between ₹10 and ₹100 is appreciated.
These are the final notes that I have created, so I expect your utmost support. Peace

UNIT - 1(Introduction)

Define Machine Learning. Explain with a specific example


1. Machine learning is a method of teaching computers to learn from data, without explicitly
programming them.

2. It involves feeding the computer large amounts of data and allowing the computer to discover
patterns and relationships within the data.

3. Machine learning algorithms use statistical analysis to find patterns in data.

4. Machine learning can be used for a wide variety of tasks, such as image recognition, language
translation, and even playing games.

5. Machine learning can be supervised, where the data is labeled and the algorithm is trained to
predict the output based on the input data, or unsupervised, where the algorithm is not given
any labeled data and must find patterns and relationships in the data on its own.

One example of machine learning is using a program to analyze a dataset of customer data
and automatically identify patterns that can be used to predict which customers are likely to
make a purchase. The program can then use this information to send targeted promotions or
recommendations to those customers, increasing the likelihood of a sale.

How will you design a learning system? Explain with examples
I know this answer is long, but it is very easy to understand; just go through it.
There are many steps that go into designing a machine learning system, and the specific
steps will depend on the specific problem you are trying to solve. Here is a general
outline of the process:

1. Define the problem:

a. Start by clearly defining the problem you are trying to solve.

b. This will help you determine the appropriate machine learning techniques to use
and the type of data you will need to collect.

2. Collect and pre-process data:

a. Next, you will need to collect and pre-process the data that you will use to train
your machine learning model.

b. This may involve cleaning the data, normalizing it, and selecting a relevant
subset of the data to use.

3. Select a model:

a. Choose a machine learning model that is appropriate for the problem you are
trying to solve.

b. There are many different types of models to choose from, each with its own
strengths and weaknesses.

4. Train the model:

a. Use the collected and pre-processed data to train the machine learning model.

b. This involves feeding the data into the model and adjusting the model's
parameters to optimize its performance.

5. Evaluate the model:

a. Once the model has been trained, you will need to evaluate its performance to
ensure that it is accurate and reliable.

b. You can use a variety of metrics to assess the model's performance, such as
precision, recall, and accuracy.

6. Fine-tune the model:

a. If the model's performance is not satisfactory, you may need to fine-tune the
model by adjusting its parameters or collecting additional data.

7. Deploy the model:

a. Once the model is performing well, you can deploy it for use in real-world
applications.

b. This may involve integrating the model into a larger system or making it
available as a web service.

Design a learning system with an example


Suppose you are trying to build a machine learning system to predict the likelihood
that a customer will churn (cancel their subscription to your service).

1. Define the problem:

a. The problem you are trying to solve is predicting customer churn.

2. Collect and pre-process data:

a. You will need to collect data on your customers, such as their age, location,
and usage of your service.

b. You will also need to label the data to indicate which customers have
churned and which have not. You will then need to pre-process the data by
cleaning it and normalizing it.

3. Select a model:

a. You might choose a supervised learning model such as a decision tree or
logistic regression, as you have labeled data indicating which customers
have churned.

4. Train the model:

a. You will train the model using the collected and pre-processed data. This will
involve adjusting the model's parameters to optimize its performance.

5. Evaluate the model:

a. Once the model has been trained, you will need to evaluate its performance
to ensure that it is accurate and reliable.

b. You can do this by splitting the data into a training set and a test set and
using the test set to make predictions.

c. You can then compare the predictions to the actual labels to assess the
model's performance.

6. Fine-tune the model:

a. If the model's performance is not satisfactory, you may need to fine-tune the
model by adjusting its parameters or collecting additional data.

7. Deploy the model:

a. Once the model is performing well, you can deploy it in your business.

b. For example, you might use the model to predict which customers are at risk
of churning and send them targeted promotions to try to retain them as
customers.
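A minimal end-to-end sketch of this churn workflow in Python, assuming a hypothetical pandas DataFrame with illustrative column names (age, usage, tenure, churned); none of these names or values come from a real dataset:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Hypothetical, made-up customer data with a binary churn label
customers = pd.DataFrame({
    "age":    [25, 40, 31, 52, 46, 29, 60, 35],
    "usage":  [120, 30, 80, 10, 15, 200, 5, 90],
    "tenure": [12, 48, 24, 60, 36, 6, 72, 18],
    "churned": [0, 1, 0, 1, 1, 0, 1, 0],
})

X = customers[["age", "usage", "tenure"]]   # independent variables
y = customers["churned"]                    # label to predict

# Split into training and test sets for evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression()   # a simple supervised model
model.fit(X_train, y_train)    # train on the labeled data

pred = model.predict(X_test)   # predict churn for unseen customers
print(accuracy_score(y_test, pred))
```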

Differentiate between Supervised Learning, Unsupervised Learning, Reinforcement Learning

| Criteria | Supervised ML | Unsupervised ML | Reinforcement ML |
| --- | --- | --- | --- |
| Definition | Learns by using labelled data | Trained using unlabeled data without any guidance | Works by interacting with the environment |
| Type of data | Labelled data | Unlabeled data | No predefined data |
| Type of problems | Regression and classification | Association and clustering | Exploitation or exploration |
| Supervision | Extra supervision | No supervision | No supervision |
| Algorithms | Linear Regression, Logistic Regression, SVM, KNN, etc. | K-Means, C-Means, Apriori | Q-Learning, SARSA |
| Aim | Calculate outcomes | Discover underlying patterns | Learn a series of actions |
| Application | Risk evaluation, forecast sales | Recommendation systems, anomaly detection | Self-driving cars, gaming, healthcare |

Explain Linear Regression


1. Linear regression is a statistical method used to model the linear relationship between a
dependent variable and one or more independent variables.

2. It involves fitting a linear equation to the data, of the form price = β0 + β1·size + β2·bedrooms,
where β0, β1, and β2 are the model coefficients and size and bedrooms are the independent
variables.

price = β0 + β1·size + β2·bedrooms

3. The coefficients β1 and β2 represent the change in the dependent variable (price) for
each unit change in the independent variables (size and bedrooms).

4. Linear regression can be used to make predictions or estimate the effect of one variable
on the dependent variable.

5. Linear regression can be extended to handle multiple independent variables and can be
used with other types of dependent variables, such as binary variables.

6. To fit a linear regression model, you need to choose the values of the coefficients that
minimize the error between the model predictions and the observed data. This can be
done using an optimization algorithm such as gradient descent.

Linear regression is a simple and widely used method for predicting a continuous
dependent variable from one or more independent variables.
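A small illustrative sketch of fitting such a model with scikit-learn, using made-up house prices (size, bedrooms):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data: [size in sq ft, bedrooms] -> price
X = np.array([[1000, 2], [1500, 3], [1200, 2], [2000, 4], [1800, 3]])
y = np.array([200000, 290000, 235000, 380000, 340000])

model = LinearRegression()
model.fit(X, y)   # chooses β0, β1, β2 that minimise the squared error

print("β0 (intercept):", model.intercept_)
print("β1, β2 (coefficients):", model.coef_)
print("Predicted price for 1600 sq ft, 3 bedrooms:", model.predict([[1600, 3]])[0])
```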

EXPLAIN Knowledge discovery in databases (KDD)

Knowledge discovery in databases (KDD) is the process of automatically identifying patterns
and relationships in data and using those insights to make predictions or decisions.

1. KDD involves using machine learning algorithms and techniques to extract useful
insights and patterns from large datasets stored in databases.

2. It can be done using a range of techniques, such as classification, clustering, and regression.

3. It can help organizations uncover hidden trends and patterns in their data and make better
decisions based on those insights.

4. It can be used in a variety of fields and applications, such as marketing, finance,
healthcare, and cybersecurity.

5. KDD is a key step in the data mining process and typically involves several steps,
including data preparation, data selection, data transformation, data mining, pattern
evaluation, and knowledge presentation.

WHAT IS SEMMA (Sample, Explore, Modify, Model, and Assess)?

SEMMA is an acronym for a methodology for data mining that was developed by SAS
Institute.
It stands for Sample, Explore, Modify, Model, and Assess.

SEMMA is a process that can be used to guide the development of a data mining project
from start to finish.

1. Sample:

a. The first step in the SEMMA process is to select a representative sample of the
data to be used in the analysis.

2. Explore:

a. The next step is to explore the data to gain a better understanding of its
characteristics and patterns.

b. This may involve generating summary statistics, creating visualizations, and
identifying any missing or incorrect data.

3. Modify:

a. The third step is to modify the data as needed to prepare it for modeling.

b. This may involve imputing missing values, transforming variables, or selecting a
relevant subset of the data.

4. Model:

a. The fourth step is to build one or more models using the modified data.

b. This may involve applying machine learning algorithms or statistical techniques to
the data.

5. Assess:

a. The final step is to assess the performance of the model and determine whether it is
accurate and reliable.

b. This may involve evaluating the model using various metrics, such as precision,
recall, and accuracy.


SEMMA is a useful methodology for ensuring that a data mining project is well-structured
and that all necessary steps are taken to prepare the data and build accurate models. It can
be applied to a wide range of data mining projects and can help organizations gain insights
and make better decisions based on their data.

UNIT - 2 (Machine Learning Perspective of Data)

Explain different ways to handle Missing Data


Definition of Missing Data
Missing data means the absence of observations in columns. It appears as values such as "0", "NA",
"NaN", "NULL", "Not Applicable", or "None".

Dealing With Missing Data

Dealing with missing data is a common challenge in machine learning and data analysis.

Missing data can occur for a variety of reasons, such as data being lost during collection or
measurement errors.

One of the biggest impacts of missing data is that it can bias the results of machine learning
models or reduce their accuracy.

The first step in handling missing values is to look at the data carefully and find out all the
missing values.

To check for missing values in a Python pandas DataFrame, we use functions like isnull()
and notnull(), which check whether a value is missing ("NaN") and return Boolean values.
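A short pandas sketch of the detection step described above, plus two common ways of handling the missing values (dropping rows, or imputing with the column mean); the column names and values are made up:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "age":    [25, np.nan, 31, 52],
    "salary": [40000, 52000, np.nan, 61000],
})

print(df.isnull())        # True where a value is missing (NaN)
print(df.isnull().sum())  # count of missing values per column

dropped = df.dropna()            # option 1: drop rows containing missing values
imputed = df.fillna(df.mean())   # option 2: impute with the column mean
print(imputed)
```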

Explain Scales of Measurement


In the field of statistics, there are four scales of measurement that are commonly used to describe
the features (also known as variables or predictors) in a dataset: nominal, ordinal, interval, and
ratio.
These scales of measurement are important because they determine what types of statistical
techniques can be used to analyze the data.

1. Nominal scale:

a. This is the lowest level of measurement and indicates that the variables are simply
different categories with no inherent order.

b. For example, a variable that indicates the color of a car (red, blue, green) is a nominal
variable.

2. Ordinal scale:

a. This scale is similar to the nominal scale, but the variables have an inherent order.

b. For example, a variable that indicates the size of a T-shirt (small, medium, large) is an
ordinal variable.

3. Interval scale:

a. This scale is similar to the ordinal scale, but the difference between the values is equal.

b. For example, a variable that indicates the temperature in degrees Fahrenheit is an interval
variable.

4. Ratio scale:

a. This is the highest level of measurement and indicates that the variables have an inherent
order and a meaningful zero point.

b. For example, a variable that indicates the weight of an object in pounds is a ratio variable.

It's important to choose the appropriate scale of measurement when analyzing data because
different statistical techniques are appropriate for different scales. For example, you can use mean
and standard deviation to analyze interval and ratio variables, but these measures are not
appropriate for nominal or ordinal variables.

How to Handle Categorical Data?


Categorical data is data that can be divided into categories or groups.

It is common in machine learning tasks to encounter categorical data, and it is important to
handle it correctly in order to build an accurate model.

There are several ways to handle categorical data in machine learning:

1. One-hot encoding:

a. One-hot encoding is a technique that converts categorical data into numerical data by
creating a new binary column for each category and encoding the categories as 0 or 1.

b. For example, if we have a categorical variable with three categories (A, B, and C), we
would create three new binary columns, one for each category. Each row would then
have a 1 in the column corresponding to the category, and 0s in the other columns.
One-hot encoding is simple and effective, but it can lead to a high-dimensional dataset
if there are a large number of categories.

2. Label encoding:

a. Label encoding is a technique that converts categorical data into numerical data by
assigning a unique integer to each category.

b. For example, if we have the same categorical variable with three categories (A, B, and
C), we could label encode the categories as 0, 1, and 2. Label encoding is simple and
efficient, but it can introduce an artificial ordinal relationship between the categories,
which may not be meaningful.

3. Frequency encoding:

a. Frequency encoding is a technique that converts categorical data into numerical data
by encoding the categories as the frequency of the category in the dataset.

b. For example, if we have a categorical variable with three categories (A, B, and C), and
category A occurs twice as often as category B and C, we could encode the categories
as 0.5, 0.25, and 0.25. Frequency encoding is simple and can be effective, but it can
be sensitive to the distribution of the categories in the dataset.

4. Ordinal encoding:

a. Ordinal encoding is a technique that converts categorical data into numerical data by
assigning a unique integer to each category and preserving the ordinal relationship
between the categories.

b. For example, if we have a categorical variable with three categories (A, B, and C), and
category A is "less than" category B and category C is "greater than" category B, we
could ordinal encode the categories as 1, 2, and 3. Ordinal encoding is useful when the
categories have a meaningful order.
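A brief sketch of one-hot, label, and frequency encoding with pandas and scikit-learn, using a made-up categorical column:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({"category": ["A", "B", "C", "A", "B"]})

# One-hot encoding: one binary column per category
one_hot = pd.get_dummies(df["category"], prefix="cat")
print(one_hot)

# Label encoding: one integer per category (A -> 0, B -> 1, C -> 2)
le = LabelEncoder()
df["cat_label"] = le.fit_transform(df["category"])

# Frequency encoding: replace each category by its relative frequency in the data
freq = df["category"].value_counts(normalize=True)
df["cat_freq"] = df["category"].map(freq)

print(df)
```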

Explain the procedure for Normalizing Data


Normalizing data is often a necessary preprocessing step in machine learning.

This is because many machine learning algorithms assume that the features in the dataset are on
the same scale and can be sensitive to differences in the scale of the features.

Mathematically, we can calculate normalization with the below formula:

Xn = (X − Xmin) / (Xmax − Xmin)

Xn = normalized value of the feature
Xmax = maximum value of the feature
Xmin = minimum value of the feature

Example: suppose a feature has the maximum and minimum values defined above. To normalize
it, the values are shifted and rescaled so that they range between 0 and 1. This technique is also
known as Min-Max scaling.
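A minimal sketch of Min-Max scaling, computed both directly from the formula above and with scikit-learn's MinMaxScaler:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[10.0], [20.0], [15.0], [40.0]])

# Manual Min-Max normalization: Xn = (X - Xmin) / (Xmax - Xmin)
Xn_manual = (X - X.min()) / (X.max() - X.min())

# Equivalent using scikit-learn
scaler = MinMaxScaler()
Xn_sklearn = scaler.fit_transform(X)

print(Xn_manual.ravel())   # values now lie between 0 and 1
print(Xn_sklearn.ravel())
```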

Explain Correlation and Causation

Correlation refers to the relationship between two variables, and how they change with respect
to each other.

Three types of correlation: positive, negative, and no correlation.

positively correlated

it means that they tend to move in the same direction; when one variable increases, the
other variable also tends to increase.

negatively correlated

it means that they tend to move in opposite directions; when one variable increases, the
other variable tends to decrease.

No correlation

When both the variables are completely unrelated and change in one
leads to no change in other.

Causation refers to the relationship between an independent variable and a dependent variable,
where the independent variable is the cause and the dependent variable is the effect.
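A tiny NumPy sketch of measuring correlation with the Pearson correlation coefficient on made-up data; note that a strong correlation alone does not establish causation:

```python
import numpy as np

hours_studied = np.array([1, 2, 3, 4, 5, 6])
exam_score    = np.array([52, 55, 61, 68, 74, 80])

# Pearson correlation coefficient: near +1 = positive, near -1 = negative, near 0 = none
r = np.corrcoef(hours_studied, exam_score)[0, 1]
print(r)   # close to +1 here, so the variables are positively correlated
```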

UNIT - 3 (Introduction to Machine Learning
Algorithms)

What are decision trees? Explain in detail.

1. Decision trees are a type of supervised learning algorithm. This means that they are trained on
labeled data, where the correct output (i.e., the class or value) is known for each sample in the
training set.

2. Decision trees are used for classification and regression.

a. In classification, the goal is to predict a discrete class label (e.g., "spam" or "not spam")

b. in regression, the goal is to predict a continuous value (e.g., the price of a house)

3. Decision trees make predictions by constructing a tree-like model of decisions based on the
features of the data.

4. To make a prediction, we start at the root of the tree and follow the splits until we reach a leaf
node.

5. Decision trees are easy to interpret and visualize, which makes them a popular choice for data
exploration and for explaining the results of a model.

6. Decision trees can handle both numerical and categorical data. For numerical data, the splits are
based on the values of the features. For categorical data, the splits are based on the categories of
the features.

7. Decision trees are relatively simple to train and do not require much data preparation (e.g., scaling
or centering).

An example of a decision tree: suppose you want to predict whether a person is fit given
information such as their age, eating habits, and physical activity. The decision nodes are
questions like 'What's the age?', 'Does he exercise?', 'Does he eat a lot of pizzas?', and the
leaves are outcomes such as 'fit' or 'unfit'. In this case it is a binary classification problem (a
yes/no type of problem).
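A compact scikit-learn sketch of this fitness example, with made-up values for age, weekly exercise hours, and pizzas per week:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up data: [age, exercise_hours_per_week, pizzas_per_week]
X = [[25, 5, 1], [40, 0, 6], [30, 3, 2], [55, 1, 5], [22, 6, 0], [48, 0, 7]]
y = ["fit", "unfit", "fit", "unfit", "fit", "unfit"]   # class labels

tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# Print the learned splits in a human-readable form, then predict for a new person
print(export_text(tree, feature_names=["age", "exercise", "pizzas"]))
print(tree.predict([[35, 4, 1]]))
```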

Define clustering. What are the different types of clustering? Explain.

Clustering is the process of dividing a set of data points into groups, or clusters, such that the
points within a cluster are more similar to each other than to points in other clusters.
or

Grouping unlabeled data is called clustering.

Clustering is an unsupervised learning method, which means that the labels or classes for the data
are not known beforehand.

Clustering is dividing data points into homogeneous classes or clusters.

Types of Clustering Methods:

1. Partitioning Clustering

2. Density-Based Clustering

3. Distribution Model-Based Clustering

4. Hierarchical Clustering

5. Fuzzy Clustering

1. Partitioning Clustering:

It is a type of clustering that divides the data into non-hierarchical groups

Partitioning methods divide the data into a fixed number of clusters by minimizing the within-
cluster sum of squares.

Examples include k-means clustering and k-medoids clustering.

2. Density-Based Clustering

a. Density-based methods divide the data into clusters based on the density of the points.
Examples include DBSCAN and OPTICS.

b. One of the main advantages of density-based clustering is that it can handle clusters of
arbitrary shape and is not sensitive to the initial locations of the clusters

3. Hierarchical Clustering:

a. Hierarchical methods create a hierarchy of clusters, where each cluster is formed by


merging two smaller clusters.

b. There are two main types of hierarchical clustering: agglomerative (bottom-up) and divisive
(top-down).

4. Distribution Model-Based Clustering:

a. Distribution model-based methods represent the data as a mixture of multiple probabilistic
models, such as Gaussian mixture models or latent Dirichlet allocation.

b. The grouping is done by assuming some distributions commonly Gaussian Distribution.

5. Fuzzy Clustering:

a. Fuzzy clustering methods allow a point to belong to multiple clusters with different degrees
of membership.

b. Examples include fuzzy C-means clustering and the fuzzy ART algorithm.
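A short sketch of partitioning clustering (k-means, method 1 above) on made-up 2-D points, where the number of clusters k is fixed in advance:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D points forming two loose groups
X = np.array([[1, 2], [1.5, 1.8], [1.2, 2.1],
              [8, 8], [8.5, 7.9], [7.8, 8.2]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)   # cluster index assigned to each point

print(labels)                    # e.g. [0 0 0 1 1 1]
print(kmeans.cluster_centers_)   # centroid of each cluster
```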

UNIT - 4 (Model Diagnosis and Tuning)

Explain Bias and Variance.

Bias and Variance Explained in Hindi l Machine Learning Course

https://youtu.be/L8h2CHh5KJ4?t=42

In machine learning, bias and variance refer to two sources of error that can affect the
performance of a model.

Bias

refers to the error introduced by using a simplified model to represent a real-life problem.

A model with high bias may be too simple to capture the complexity of the problem,
leading to poor performance.

Variance

on the other hand, refers to the sensitivity of a model to small fluctuations in the training
data.

A model with high variance may be prone to overfitting, meaning it performs well on the
training data but poorly on new, unseen data.

In general, we want to find a balance between bias and variance to build a good model.

A model with low bias and low variance will have low generalization error and will perform
well on new data.

However, it can be challenging to achieve both low bias and low variance at the same time,
and often we have to trade off one for the other.

Explain K-Fold Cross Validation, Bagging.

Cross-Validation ll K-Fold Cross-Validation ll Explained with Example in Hindi

https://www.youtube.com/watch?v=poKFir0QGCI

Extra Resource

K-Fold Cross Validation Technique and its Essentials - Analytics Vidhya
https://www.analyticsvidhya.com/blog/2022/02/k-fold-cross-validation-technique-and-its-essentials/

K-fold cross-validation is a technique used to evaluate the performance of a machine learning
model.

It involves dividing the dataset into k groups, or folds, and training the model on k-1 folds and
testing it on the remaining one.

This process is repeated k times, with each fold serving as the test set in one of the iterations.

The performance measure is then averaged across all k iterations to provide an estimate of
the model's performance.

K-fold cross-validation is useful because it allows you to use all of the available data to
evaluate the model's performance.

It also helps to reduce the variance of the performance estimate by averaging over multiple
iterations.

However, the choice of the value of k can affect the bias and variance of the performance
estimate; common choices in practice are k = 5 or k = 10.
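A minimal scikit-learn sketch of 5-fold cross-validation on a built-in toy dataset:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 5-fold CV: train on 4 folds, test on the remaining fold, repeat 5 times
scores = cross_val_score(model, X, y, cv=5)

print(scores)          # accuracy on each of the 5 folds
print(scores.mean())   # averaged performance estimate
```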

Explain Bagging.


1. Bagging is a method used to improve the performance of a machine learning model by training
multiple models on different subsets of the training data and combining their predictions.

2. The goal of bagging is to reduce the variance of the model's predictions by aggregating the
predictions of multiple models.

3. Bagging is a type of ensemble method, which means that it combines the predictions of
multiple models to make a final prediction.

4. Bagging is short for bootstrap aggregating: each model is trained on a different bootstrapped
sample of the training data.

5. Bagging can be applied to a wide range of machine learning models, including decision trees,
neural networks, and support vector machines.

6. Bagging is simple to implement and can be effective at reducing overfitting and improving the
generalization ability of the model.

7. Bagging is often used in conjunction with other ensemble methods, such as boosting and
random forests, to further improve the performance of the model.

8. The main disadvantage of bagging is that it can be computationally expensive, as it requires
training multiple models.

9. Bagging is a relatively simple ensemble method and is often used as a baseline for
comparison with more complex methods.

10. Bagging is widely used in practice and has been applied to a variety of real-world problems,
including image classification, natural language processing, and fraud detection.
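A short sketch of bagging with scikit-learn's BaggingClassifier (whose default base model is a decision tree), on a built-in toy dataset:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 base models (decision trees by default), each trained on a bootstrapped sample
bagging = BaggingClassifier(n_estimators=50, random_state=0)
bagging.fit(X_train, y_train)

print(bagging.score(X_test, y_test))   # accuracy of the aggregated prediction
```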

Explain random forest

Random Forest Step-Wise Explanation ll Machine Learning Course Explained in Hindi

https://www.youtube.com/watch?v=WkFtIqWmX9o

1. Random forests are an ensemble machine learning model that can be used for both classification
and regression tasks.

2. They are composed of a large number of decision trees, and each tree is trained on a different
bootstrapped sample of the training data.

3. The predictions of the individual trees are combined to make a final prediction.

4. Random forests can handle high-dimensional data and are resistant to overfitting.

5. They are relatively simple to implement and can be trained using a parallelized algorithm, making
them efficient to train on large datasets.

6. In addition to classification and regression, random forests can be used for tasks such as feature
selection, outlier detection, and construction of decision boundaries.

7. Random forests are widely used in practice and have been applied to a range of real-world
problems, including credit fraud detection, customer churn prediction, and genetic classification.

How does the Random Forest algorithm work?


Random Forest works in two phases: the first is to create the random forest by combining N decision
trees, and the second is to make predictions using the trees created in the first phase.

The working process can be explained in the steps below:
Step-1: Select K random data points from the training set.

Step-2: Build the decision trees associated with the selected data points (subsets).

Step-3: Choose the number N of decision trees that you want to build.
Step-4: Repeat Steps 1 & 2.

Step-5: For new data points, find the prediction of each decision tree, and assign the new data
points to the category that wins the majority of votes.
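A minimal RandomForestClassifier sketch that mirrors these steps: N trees, each trained on a bootstrapped subset, with a majority vote for new points:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# N = 100 decision trees, each trained on a bootstrapped sample of the training data
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

print(forest.predict(X_test[:5]))    # majority vote over the 100 trees
print(forest.score(X_test, y_test))  # overall test accuracy
```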

Explain Gradient Boosting, Stacking.

GRADIENT BOOSTING

1. Gradient boosting is an ensemble machine learning algorithm that is used to improve the
performance of a model by combining the predictions of multiple weak models.

2. The weak models in a gradient boosting algorithm are usually decision trees, and the algorithm
works by fitting a decision tree to the residual errors of the previous tree.

3. The residual errors are the differences between the predicted values and the true values of the
training data.

4. By fitting the decision trees to the residual errors, the gradient boosting algorithm is able to
correct the mistakes of the previous trees and improve the overall accuracy of the ensemble.

5. It is known for its ability to achieve high accuracy and is often used as a baseline for
comparison with more complex models.

6. One of the main advantages of gradient boosting is its flexibility, as it can be applied to a wide
range of models and is not limited to decision trees.

7. Another advantage is that it is relatively insensitive to the hyperparameters of the model and
can often achieve good results with little tuning.

8. However, gradient boosting can be computationally expensive to train, as it requires fitting
many weak models to the data.
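A brief GradientBoostingClassifier sketch, in which each new shallow tree is fitted to correct the errors of the ensemble built so far:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 200 shallow trees; each one is fitted to the residual errors of the previous ones
gb = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1, max_depth=2, random_state=0)
gb.fit(X_train, y_train)

print(gb.score(X_test, y_test))
```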

STACKING

1. Stacking is an ensemble machine learning algorithm that combines the predictions of
multiple models to improve the overall accuracy of the ensemble.

2. It works by training a second-level model, called the meta-model, on the predictions of the
base models, which are trained on the original training data.

3. The base models and the meta-model can be any type of machine learning model, and the
predictions of the base models are often called "features" in the context of stacking.

4. The process of training the meta-model on the predictions of the base models is known as
"stacking the models," and it is done using a training dataset that consists of the predictions
of the base models on the original training data.

5. Stacking can be used to improve the performance of any type of machine learning model,
and it is particularly useful for improving the accuracy of models that have high bias or low
variance.

6. One of the main advantages of stacking is that it can help to reduce overfitting, as the
meta-model is trained on the predictions of the base models rather than on the original
training data.

7. Another advantage is that stacking is flexible, as it can be used with any type of machine
learning model and is not limited to a particular type of model.

8. However, stacking can be computationally expensive, as it requires training multiple
models and can be time-consuming to tune the hyperparameters of the meta-model.
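A compact StackingClassifier sketch: two base models whose predictions become the input features of a logistic-regression meta-model:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Base models whose predictions become the "features" of the meta-model
base_models = [("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
               ("svm", SVC())]

# The logistic-regression meta-model is trained on the base models' predictions
stack = StackingClassifier(estimators=base_models, final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_train, y_train)

print(stack.score(X_test, y_test))
```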

UNIT - 5 (Artificial Neural Network (ANN))

What is ANN? Explain.

1. Artificial neural networks (ANNs) are computational models inspired by the structure and
function of the human brain.

2. They are composed of interconnected units called artificial neurons, which are inspired by the
structure and function of neurons in the brain.

3. ANNs are designed to process and transmit information and are commonly used for tasks
such as image recognition, language translation, and prediction.

4. ANNs are composed of layers of artificial neurons, and the input data is transmitted through
the layers of the network by means of weighted connections between the neurons.

5. The output of the network is produced at the final layer of the network, and the process of
transmitting and transforming the input data through the layers of the network is known as
forward propagation.

6. ANNs are trained using a learning algorithm, which adjusts the weights of the connections
between the neurons based on the error between the predicted output and the true output.

7. This process is known as backpropagation.

8. ANNs are powerful machine learning tools and have been applied to a wide range of
real-world problems.

9. However, they can be computationally expensive to train, and their performance can be
sensitive to the architecture of the network and the learning algorithm used.

10. Deep ANNs, which have multiple layers of artificial neurons, are the basis of deep learning
models and are able to learn hierarchical representations of the data.

Explain Single Artificial Neuron.

Introduction To Artificial Neural Network Explained In Hindi

https://www.youtube.com/watch?v=8eaORgKmmh4

1. An artificial neuron, also known as a perceptron, is a simple computational unit that is used in
artificial neural networks.

2. It is inspired by the structure and function of neurons in the human brain and is designed to
process and transmit information.

3. A single artificial neuron consists of a set of input connections, weights associated with each
input, an activation function, and an output.

4. The input connections receive input data, and the weights are used to adjust the importance of
each input.

5. The activation function is used to transform the weighted input into an output signal.

6. The output is used to transmit the processed information to other neurons or to the final output
layer of the network.

7. Artificial neurons are the building blocks of artificial neural networks and are connected
together in layers to form a network.

8. The output of one layer serves as the input to the next layer, and the process of transmitting
and transforming the input data through the layers of the network is known as forward
propagation.

9. Single artificial neurons are simple computational units, but when connected together in a
network, they can perform complex tasks such as image classification and language
translation.

10. The performance of an artificial neural network is determined by the architecture of the
network, the choice of activation function, and the learning algorithm used to adjust the
weights of the network.
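A tiny NumPy sketch of a single neuron: a weighted sum of the inputs plus a bias, passed through a sigmoid activation (the weights, bias, and inputs are arbitrary illustrative values):

```python
import numpy as np

def sigmoid(z):
    """Activation function: squashes the weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, 0.2, 0.1])    # input values
w = np.array([0.4, 0.7, -0.2])   # weight for each input connection
b = 0.1                          # bias term

z = np.dot(w, x) + b             # weighted sum of the inputs
output = sigmoid(z)              # output signal passed on to the next layer

print(output)
```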

Explain Multilayer Perceptrons (Feedforward Neural Network).

Perceptron

The perceptron consists of a set of input connections, weights associated with each input, and
an activation function.

The perceptron is a type of artificial neural network that is used for binary classification tasks.

The input connections receive the input data, and the weights are used to adjust the
importance of each input.

The activation function is used to transform the weighted input into an output signal, which is
either a 0 or 1.

MULTI-LAYER perceptron

A multi-layer perceptron (MLP) has one input layer with one neuron (or node) per input, one
output layer with a single node for each output, and any number of hidden layers, each of
which can have any number of nodes.
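A short feedforward sketch with scikit-learn's MLPClassifier: an input layer of 4 features, two hidden layers, and an output layer, trained with backpropagation:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)   # 4 input features per sample
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hidden layers with 16 and 8 nodes; weights are adjusted by backpropagation
mlp = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0)
mlp.fit(X_train, y_train)

print(mlp.score(X_test, y_test))
```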

Explain Restricted Boltzmann Machines (RBMs).

Restricted Boltzmann Machines - Ep. 6 (Deep Learning SIMPLIFIED)

https://www.youtube.com/watch?v=puux7KZQfsE

1. Restricted Boltzmann Machines (RBMs) are a type of unsupervised machine learning model that
can be used to learn a probability distribution over the inputs.

2. They are composed of a set of visible units that represent the input data and a set of hidden units
that capture the underlying structure of the data.

3. RBMs are trained using an iterative algorithm called contrastive divergence, which adjusts the
weights of the connections between the visible and hidden units based on the difference between
the observed data distribution and the model's current estimate of the data distribution.

4. The process of adjusting the weights is repeated until the model has learned a good
representation of the data distribution.

5. RBMs are relatively simple models, but they can be powerful tools for learning a compact
representation of the data.

6. They are often used as building blocks for more complex models, such as deep belief networks,
and have been applied to a variety of tasks, including image recognition, natural language
processing, and collaborative filtering.

7. One of the main advantages of RBMs is that they can learn a distributed representation of the
data, which means that each unit in the model encodes a different aspect of the data.

8. Another advantage is that they can learn the structure of the data without being given
labeled examples, since they are trained in an unsupervised manner.
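A minimal sketch using scikit-learn's BernoulliRBM, which learns the visible-to-hidden weights with a variant of contrastive divergence on binarised digit images:

```python
from sklearn.datasets import load_digits
from sklearn.neural_network import BernoulliRBM

X, _ = load_digits(return_X_y=True)
X = (X / 16.0 > 0.5).astype(float)   # binarise pixel values for the Bernoulli visible units

# 32 hidden units learn a compact representation of the 64 visible pixels
rbm = BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)
rbm.fit(X)

hidden = rbm.transform(X[:5])   # hidden-unit activations for the first 5 images
print(hidden.shape)             # (5, 32)
```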

