Machine Learning

Emoji Creator Project
A
Summer
“Training Report”
On
“WebTek Labs Pvt. Ltd”
Submitted in Partial fulfillment
For the award of the Degree of
B.Tech in Department of Computer Science & Engineering
( With Specialization in Computer Science & Engineering )
Session – 2020-2021
Submitted To : Submitted By :
Mr. Devendra Suthar Aryan Kumar
(Assistant Professor) Roll no. 17ERACS007
Computer Science (B.Tech, 7th Sem)
Department of Computer Science And Engineering

Aravali Institute of Technical Studies, Udaipur (Raj.)
Board of Rajasthan Technical University, Kota
(September – 2019)
CERTIFICATE
CANDIDATE DECLARATION
I hereby declare that the summer training report from “WEBTEK LABS PVT. LTD,
JAIPUR, RAJASTHAN”, submitted in partial fulfilment for the award of the degree of
“Bachelor of Technology” in the Department of Computer Science, Aravali
Institute of Technical Studies, Udaipur, is a record of my own investigation
carried out under the guidance of Mr. Jitendra Singh, Department of Computer
Engineering, Aravali Institute of Technical Studies.
I have not submitted the matter presented in this report anywhere for the award of any
other degree.
Aryan Kumar
Computer Science
R no. 17ERASC007
AITS - Udaipur, Rajasthan
H.O.D DIRECTOR
Dr. Jitendra Singh Dr.Hemant Dhabhai
ACKNOWLEDGEMENT
I gave my best to this project and was able to complete it largely on my own.
Still, I would like to thank my friends, who helped me whenever a problem
occurred. I am highly indebted to
WEBTEK LABS PVT. LTD, JAIPUR, RAJASTHAN
for providing me a creative environment and all the facilities to learn advanced
thinking. WEBTEK LABS PVT. LTD is doing its best in the IT field to provide us
more opportunities to explore our ideas and make a really great project in our
field. I would also like to thank Dr. Jitendra Singh Chauhan, Head of the
Department of Computer Science and Engineering, for the support and
encouragement during the project. I perceive this project as a big milestone in
my career. I will strive to use the gained skills and knowledge in the best possible
way, and I will continue to work on their improvement in order to attain the
desired objective.
Aryan Kumar
Computer Science
R no. 17ERASC007
AITS - Udaipur, Rajasthan
COMPANY PROFILE
WebTek Labs Pvt. Ltd. is recognized as a leading IT solution providing
organization with a dynamic and fast growing team of diversely talented
individuals. Incorporated in 2001, in our aim to provide the best talent, we initially
started with Recruitment & Staffing services. We paralleled this by providing
knowledge and skill development certification training programs. The WebTek
Certified Tester (WCT) Program, which aims to provide IT companies with trained
software testers, has reached soaring heights of recognition over the years. A few
years after its inception, WebTek Labs added software development & testing
services to the portfolio.
Having partnered and worked with some of the leading names across Education,
IT, ITES, Banking, Insurance, Aviation, Retail, Healthcare, Hospitality, Media,
Manufacturing and FMCG sectors, WebTek Labs has explored business
opportunities in software solutions with the Government, Corporate and Institutes.
With over a decade of experience we create and deliver high-impact solutions,
enabling our clients to achieve their business goals and enhance their
competitiveness. In our pursuit of excellence, WebTek’s Research & Development
team consistently innovates to provide up-to-date solutions keeping in pace with
changing times. Our mission is for businesses to leverage the internet and mobility
to work smarter and grow faster. We work as your outsourcing and consulting
partner.
INDEX
College Details
Certificate
Candidate Declaration
Acknowledgements
Company Profile
Contents
List of Figures
INTRODUCTION
About
1. Machine Learning
2. Supervised Machine Learning
3. Unsupervised Machine Learning
4. Semi-Supervised Machine Learning
5. Reinforcement Machine Learning
Core Topics
6. Environment Setup For Machine Learning
7. Installing All Required Modules For Machine Learning
My Projects
8. Create your emoji with Deep Learning
9. Testing
10. Conclusion
11. References
Chapter-1
MACHINE LEARNING
What is Machine Learning?
Machine learning is a subfield of computer science that evolved from the study of pattern
recognition and computational learning theory in artificial intelligence. Machine learning
explores the construction and study of algorithms that can learn from and make predictions
on data. Such algorithms operate by building a model from example inputs in order to make
data driven predictions or decisions, rather than following strictly static program instructions.
Machine learning is closely related to and often overlaps with computational statistics, a
discipline that also specializes in prediction-making. It has strong ties to mathematical
optimization, which delivers methods, theory and application domains to the field. Machine
learning is employed in a range of computing tasks where designing and programming
explicit algorithms is infeasible. Example applications include spam filtering, optical
character recognition (OCR), search engines and computer vision. Machine learning is
sometimes conflated with data mining, although that focuses more on exploratory data analysis.
Machine learning and pattern recognition “can be viewed as two facets of the same field.”
When employed in industrial contexts, machine learning methods may be referred to as
predictive analytics or predictive modelling.
In 1959, Arthur Samuel defined machine learning as a “Field of study that gives computers
the ability to learn without being explicitly programmed”. Tom M. Mitchell provided a
widely quoted, more formal definition: “A computer program is said to learn from
experience E with respect to some class of tasks T and performance measure P, if its
performance at tasks in T, as measured by P, improves with experience E”. This definition is
notable for its defining machine learning in fundamentally operational rather than cognitive
terms, thus following Alan Turing's proposal in his paper “Computing Machinery and
Intelligence” that the question “Can machines think?” be replaced with the question “Can
machines do what we (as thinking entities) can do?”
Steps involved in Machine Learning:
A machine learning project involves the following steps −
1. Defining a Problem
2. Preparing Data
3. Evaluating Algorithms
4. Improving Results
5. Presenting Results
The best way to get started using Python for machine learning is to work through a project
end-to-end and cover the key steps like loading data, summarizing data, evaluating algorithms
and making some predictions. This gives you a replicable method that can be used dataset
after dataset.
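The steps above can be sketched end-to-end as follows. This is a minimal illustration, assuming scikit-learn is installed and using its bundled iris data; the two candidate algorithms and the 5-fold setting are assumptions chosen for the example, not part of the training material.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Steps 1-2: define the problem (classify iris flowers) and prepare the data.
X, y = load_iris(return_X_y=True)
print("samples, features:", X.shape)

# Step 3: evaluate candidate algorithms with 5-fold cross-validation.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbours": KNeighborsClassifier(n_neighbors=5),
}
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}

# Steps 4-5: keep the best-scoring candidate, fit it on all data, present a prediction.
best_name = max(scores, key=scores.get)
best_model = candidates[best_name].fit(X, y)
print(best_name, round(scores[best_name], 3), best_model.predict(X[:1]))
```

The same skeleton can be reused on another dataset by swapping the loading step and the list of candidates.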
Terminologies of Machine Learning

• Model
A model is a specific representation learned from data by applying some machine
learning algorithm. A model is also called hypothesis.
• Feature
A feature is an individual measurable property of our data. A set of numeric features
can be conveniently described by a feature vector. Feature vectors are fed as input to
the model. For example, in order to predict a fruit, there may be features like colour,
smell, taste, etc.
Note: Choosing informative, discriminating and independent features is a crucial step
for effective algorithms. We generally employ a feature extractor to extract the
relevant features from the raw data.
• Target(Label)
A target variable or label is the value to be predicted by our model. For the fruit
example discussed in the features section, the label with each set of input would be the
name of the fruit like apple, orange, banana, etc.
• Training
The idea is to give a set of inputs (features) and their expected outputs (labels), so that
after training we have a model (hypothesis) that maps new data to one of the
categories it was trained on.
• Prediction
Once our model is ready, it can be fed a set of inputs to which it will provide a
predicted output(label).
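The terms above can be made concrete with a tiny sketch. The fruit measurements below are hypothetical values invented for illustration, and the "model" is a simple 1-nearest-neighbour rule chosen only because it fits in a few lines.

```python
# Hypothetical fruit data: each feature vector is (weight in grams, redness from 0 to 1).
features = [(150, 0.9), (170, 0.8), (120, 0.1), (130, 0.2)]
# Target (label) for each feature vector.
labels = ["apple", "apple", "banana", "banana"]


def train(features, labels):
    """Training: a 1-nearest-neighbour model simply stores the labelled examples."""
    return list(zip(features, labels))


def predict(model, x):
    """Prediction: return the label of the stored example closest to x."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, nearest_label = min(model, key=lambda pair: squared_distance(pair[0], x))
    return nearest_label


model = train(features, labels)
print(predict(model, (160, 0.85)))
```

In practice the features would be scaled first, since weight in grams dominates the distance here.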
The figure shown below clarifies the above concepts:
Types of machine learning problems

There are various ways to classify machine learning problems. Here, we discuss the most
obvious ones.
1. On the basis of the nature of the learning “signal” or “feedback” available to a learning
system
•Supervised learning: The computer is presented with example inputs and their desired
outputs, given by a “teacher”, and the goal is to learn a general rule that maps inputs to
outputs. The training process continues until the model achieves the desired level of accuracy
on the training data. Some real-life examples are:
•Image Classification: You train with images/labels. Then in the future you give a new
image expecting that the computer will recognize the new object.
•Market Prediction/Regression: You train the computer with historical market data and ask
the computer to predict the new price in the future.
•Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own
to find structure in its input. It is used, for example, to cluster a population into different
groups. Unsupervised learning can be a goal in itself (discovering hidden patterns in data).
•Clustering: You ask the computer to separate similar data into clusters; this is essential in
research and science.
•High Dimension Visualization: Use the computer to help us visualize high dimension data.
•Generative Models: After a model captures the probability distribution of your input data,
it will be able to generate more data. This can be very useful to make your classifier more
robust.
A simple diagram which clarifies the concept of supervised and unsupervised learning is shown
below:
As you can see clearly, the data in supervised learning is labelled, whereas the data in
unsupervised learning is unlabelled.
•Semi-supervised learning: Problems where you have a large amount of input data and only
some of the data is labeled are called semi-supervised learning problems. These problems sit
in between supervised and unsupervised learning. For example, a photo archive where
only some of the images are labeled (e.g. dog, cat, person) and the majority are unlabeled.
•Reinforcement learning: A computer program interacts with a dynamic environment in
which it must achieve a certain goal (such as driving a vehicle or playing a game against an
opponent). The program is provided feedback in terms of rewards and punishments as it
navigates its problem space.
2. On the basis of “output” desired from a machine learned system
•Classification: Inputs are divided into two or more classes, and the learner must produce a
model that assigns unseen inputs to one or more (multi-label classification) of these classes.
This is typically tackled in a supervised way. Spam filtering is an example of classification,
where the inputs are email (or other) messages and the classes are “spam” and “not spam”.
•Regression: It is also a supervised learning problem, but the outputs are continuous rather
than discrete. For example, predicting the stock prices using historical data.
An example of classification and regression on two different datasets is shown below:
•Clustering: Here, a set of inputs is to be divided into groups. Unlike in classification, the
groups are not known beforehand, making this typically an unsupervised task.
As you can see in the example below, the given dataset points have been divided into groups
identifiable by the colours red, green and blue.
•Density estimation: The task is to find the distribution of inputs in some space.
•Dimensionality reduction: It simplifies inputs by mapping them into a lower-dimensional
space. Topic modeling is a related problem, where a program is given a list of human
language documents and is tasked to find out which documents cover similar topics.
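As a hedged illustration of dimensionality reduction, the sketch below projects 5-dimensional points onto 2 dimensions using principal component analysis (PCA) computed with NumPy. PCA is only one possible technique, and the random data and the helper name `pca_project` are assumptions made for the example.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto their first n_components principal directions."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))   # 100 synthetic points in 5 dimensions
X_2d = pca_project(X, 2)        # the same points described by 2 numbers each
print(X_2d.shape)
```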
SOME APPLICATIONS OF MACHINE LEARNING ARE:
• Vision processing
• Language processing
• Forecasting things like stock market trends, weather
• Pattern recognition
• Games
• Data mining
• Expert systems
• Robotics
How does Machine Learning Work?
A Machine Learning algorithm is trained using a training data set to create a model. When new
input data is introduced to the ML algorithm, it makes a prediction on the basis of the model.
The prediction is evaluated for accuracy and, if the accuracy is acceptable, the Machine
Learning algorithm is deployed. If the accuracy is not acceptable, the Machine Learning
algorithm is trained again and again with an augmented training data set.
This is just a very high-level example, as there are many other factors and steps involved.
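The train, evaluate and retrain cycle described above might look like the sketch below. It assumes scikit-learn and its bundled iris data; the acceptance threshold `TARGET_ACCURACY`, the training-set sizes and the choice of a k-nearest-neighbours model are all hypothetical values picked for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
# Hold back an evaluation set; the rest is the pool that training data is drawn from.
X_pool, X_eval, y_pool, y_eval = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

TARGET_ACCURACY = 0.90  # hypothetical acceptance threshold


def train_until_acceptable(sizes=(10, 30, 60, len(X_pool))):
    """Retrain with a growing training set until the accuracy is acceptable."""
    history = []
    for n in sizes:
        model = KNeighborsClassifier(n_neighbors=3).fit(X_pool[:n], y_pool[:n])
        accuracy = model.score(X_eval, y_eval)
        history.append((n, accuracy))
        if accuracy >= TARGET_ACCURACY:
            return model, history   # acceptable: this model would be deployed
    return None, history            # never reached the threshold


model, history = train_until_acceptable()
print(history)
```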
Chapter-2
SUPERVISED MACHINE LEARNING

What is Supervised Machine Learning?
In supervised learning, learning data comes with description, labels, targets or desired
outputs and the objective is to find a general rule that maps inputs to outputs. This kind of
learning data is called labeled data. The learned rule is then used to label new data with
unknown outputs.
Supervised learning is commonly used in real world applications, such as face and speech
recognition, products or movie recommendations, and sales forecasting.
Supervised learning is when the model is trained on a labelled dataset. A labelled
dataset is one which has both input and output parameters. In this type of learning, both
training and validation datasets are labelled, as shown in the figures below.
Both the above figures have labelled data set –
•Figure A: It is a dataset of a shopping store which is useful in predicting whether a
customer will purchase a particular product under consideration or not based on his/ her
gender, age and salary.
Input: Gender, Age, Salary
Output: Purchased, i.e. 0 or 1; 1 means yes, the customer will purchase, and 0 means the
customer won’t purchase it.
•Figure B: It is a Meteorological dataset which serves the purpose of predicting wind speed
based on different parameters.
Input: Dew Point, Temperature, Pressure, Relative Humidity, Wind Direction
Output: Wind Speed
Training the system:
While training the model, data is usually split in the ratio of 80:20, i.e. 80% as training data
and the rest as testing data. In the training data, we feed both input and output for that 80% of
the data. The model learns from the training data only. We use different machine learning
algorithms to build our model. By learning, we mean that the model builds some logic of its
own.
Once the model is ready, it can be tested. At the time of testing, input is fed from the
remaining 20% of the data, which the model has never seen before. The model predicts some
value, and we compare it with the actual output to calculate the accuracy.
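A minimal sketch of the 80:20 split and the accuracy check, assuming scikit-learn and its bundled iris data; the decision tree is just a stand-in model chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 80% of the rows are used for training, the remaining 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# The model learns from the training data only.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Predictions on unseen test inputs are compared with the actual outputs.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(len(X_train), len(X_test), accuracy)
```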
Types of Supervised Learning:
1. Classification: It is a Supervised Learning task where the output has defined labels
(discrete values). For example, in Figure A above, the output Purchased has defined labels, i.e.
0 or 1; 1 means the customer will purchase and 0 means the customer won’t purchase. The
goal here is to predict discrete values belonging to a particular class and evaluate on the basis
of accuracy.
It can be either binary or multi-class classification. In binary classification, the model predicts
either 0 or 1 (yes or no), but in multi-class classification, the model chooses among more
than two classes.
Example: Gmail classifies mails into more than two classes, such as social, promotions,
updates and forums.
A classification problem is when the output variable is a category, such as “red” or “blue” or
“disease” and “no disease”. A classification model attempts to draw some conclusion from
observed values. Given one or more inputs a classification model will try to predict the value
of one or more outcomes.
For example, emails are filtered as “spam” or “not spam”, and transaction data is labelled as
“fraudulent” or “authorized”. In short, classification either predicts categorical class labels or
classifies data (constructs a model) based on the training set and the values (class labels) of a
classifying attribute, and uses it in classifying new data. There are a number of classification
models, including logistic regression, decision tree, random forest, gradient-boosted tree,
multilayer perceptron, one-vs-rest, and Naive Bayes.
For example:
Which of the following is/are classification problem(s)?
• Predicting the gender of a person by his/her handwriting style
• Predicting house price based on area
• Predicting whether monsoon will be normal next year
• Predicting the number of copies of a music album that will be sold next month
Solution: Predicting the gender of a person and predicting whether monsoon will be normal
next year. The other two are regression.
We have discussed classification with some examples. The following example performs
classification on the iris dataset using RandomForestClassifier in Python.
Dataset Description
Title: Iris Plants Database
Attribute Information:
1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm
5. class:
-- Iris Setosa
-- Iris Versicolour
-- Iris Virginica
Missing Attribute Values: None
Class Distribution: 33.3% for each of 3 classes
Program:
# Python code to illustrate classification using a data set
# Importing the required libraries
import pandas as pd
# train_test_split lives in sklearn.model_selection;
# the older sklearn.cross_validation module was removed in scikit-learn 0.20
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report

# Importing the dataset
dataset = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-' +
                      'databases/iris/iris.data', sep=',', header=None)
data = dataset.iloc[:, :]

# Checking for null values
print("Sum of NULL values in each column. ")
print(data.isnull().sum())

# Separating the predicting column from the whole dataset
X = data.iloc[:, :-1].values
y = dataset.iloc[:, 4].values

# Encoding the predicting variable
labelencoder_y = LabelEncoder()
y = labelencoder_y.fit_transform(y)

# Splitting the data into test and train datasets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Using the random forest classifier for the prediction
classifier = RandomForestClassifier()
classifier = classifier.fit(X_train, y_train)
predicted = classifier.predict(X_test)

# Printing the results
print('Confusion Matrix :')
print(confusion_matrix(y_test, predicted))
print('Accuracy Score :', accuracy_score(y_test, predicted))
print('Report : ')
print(classification_report(y_test, predicted))
Output:
Sum of NULL values in each column.
1 0
2 0
3 0
4 0
5 0
Confusion Matrix:
[[16 0 0]
[ 0 17 1]
[ 0 0 11]]
Accuracy Score : 97.7
Report:
           precision  recall  f1-score  support
1          1.00       1.00    1.00      16
2          1.00       0.94    0.97      18
2          0.92       1.00    0.96      11
avg/total  0.98       0.98    0.98      45
2. Regression: It is a Supervised Learning task where the output has a continuous value.
For example, in Figure B above, the output Wind Speed does not have any discrete value but is
continuous within a particular range. The goal here is to predict a value as close to the actual
output value as our model can, and evaluation is then done by calculating the error value. The
smaller the error, the greater the accuracy of our regression model.
A regression problem is when the output variable is a real or continuous value, such as
“salary” or “weight”. Many different models can be used; the simplest is linear
regression. It tries to fit the data with the best hyper-plane which goes through the points.
Types of Regression Models:
For example:
Which of the following is a regression task?
• Predicting age of a person
• Predicting nationality of a person
• Predicting whether stock price of a company will increase tomorrow
• Predicting whether a document is related to sighting of UFOs
Solution: Predicting age of a person (because it is a real value; predicting nationality is
categorical; whether stock price will increase is a discrete yes/no answer; predicting whether a
document is related to UFOs is again a discrete yes/no answer).
Let’s take an example of linear regression. We have a Housing data set and we want to
predict the price of the house. Following is the Python code for it:
# Python code to illustrate regression using a data set
# (the original forced the 'GTKAgg' backend, which newer Matplotlib
# releases no longer ship; the default backend is used here instead)
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
import pandas as pd

# Load CSV and columns
df = pd.read_csv("Housing.csv")
Y = df['price']
X = df['lotsize']

# Convert to NumPy arrays, then reshape into single-column 2-D arrays
X = X.values.reshape(len(X), 1)
Y = Y.values.reshape(len(Y), 1)

# Split the data into training/testing sets
X_train = X[:-250]
X_test = X[-250:]

# Split the targets into training/testing sets
Y_train = Y[:-250]
Y_test = Y[-250:]

# Plot the test data
plt.scatter(X_test, Y_test, color='black')
plt.title('Test Data')
plt.xlabel('Size')
plt.ylabel('Price')
plt.xticks(())
plt.yticks(())

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(X_train, Y_train)

# Plot the fitted line over the test data
plt.plot(X_test, regr.predict(X_test), color='red', linewidth=3)
plt.show()
The output of above code is:
Some types of Supervised Learning Algorithms:

• Linear Regression
• Nearest Neighbour
• Gaussian Naive Bayes
• Decision Trees
• Support Vector Machine (SVM)
• Random Forest
Chapter-3
UNSUPERVISED MACHINE LEARNING

Introduction:
Unsupervised learning is used to detect anomalies, outliers, such as fraud or defective
equipment, or to group customers with similar behaviors for a sales campaign. It is the
opposite of supervised learning. There is no labeled data here.
When learning data contains only some indications without any description or labels, it is up
to the coder or to the algorithm to find the structure of the underlying data, to discover
hidden patterns, or to determine how to describe the data. This kind of learning data is called
unlabeled data.
Unlike supervised learning, no teacher is provided, which means no training will be given to the
machine. Therefore, the machine has to find the hidden structure in unlabelled data by
itself.
For instance, suppose it is given an image having both dogs and cats, which it has never seen
before.
The machine has no idea about the features of dogs and cats, so it cannot label them as
dogs and cats. But it can categorize them according to their similarities, patterns, and
differences, i.e., it can split the picture into two parts: the first may contain all pictures
having dogs in them and the second may contain all pictures having cats in them. Here the
machine did not learn anything beforehand, meaning there was no training data or examples.
Types of Unsupervised Machine Learning:

1) Clustering
2) Association
1) Clustering: A clustering problem is where you want to discover the inherent groupings
in the data, such as grouping customers by purchasing behaviour.
Clustering is the task of dividing the population or data points into a number of groups such
that data points in the same groups are more similar to other data points in the same group
and dissimilar to the data points in other groups. It is basically a collection of objects on the
basis of similarity and dissimilarity between them.
For example, the data points in the graph below that are clustered together can be classified
into one single group. We can distinguish the clusters, and we can identify that there are 3
clusters in the picture below.
Clusters need not be spherical, as in the following example:
[Figure: DBSCAN on density data]
These data points are clustered by using the basic concept that the data point lies within the
given constraint from the cluster centre. Various distance methods and techniques are used
for calculation of the outliers.
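A short sketch of clustering on synthetic data, assuming scikit-learn and NumPy. It uses k-means (not DBSCAN) because k-means matches the "distance from the cluster centre" idea most directly; the three blob centres and all parameter values are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Three well-separated synthetic groups of 50 points each (no labels are given).
centres = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centres])

# Ask k-means for 3 clusters; each point is assigned to its nearest cluster centre.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(points)
cluster_ids = kmeans.labels_

print(kmeans.cluster_centers_)
```

For non-spherical groups, a density-based method such as scikit-learn's `DBSCAN` is usually a better fit than k-means.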
2) Association: An association rule learning problem is where you want to discover rules
that describe large portions of your data, such as people that buy X also tend to buy Y.
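The strength of a rule such as "people who buy X also tend to buy Y" is commonly measured by its support and confidence. The sketch below computes both in plain Python; the shopping baskets are hypothetical data invented for illustration.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(itemset <= basket for basket in transactions)


def support(itemset):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    return count(antecedent | consequent) / count(antecedent)


print(support({"bread", "butter"}))       # 3 of 5 baskets
print(confidence({"bread"}, {"butter"}))  # 3 of the 4 bread baskets
```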
Chapter-4
SEMI-SUPERVISED MACHINE LEARNING
Introduction:
If some learning samples are labelled, but some others are not, then it is
semi-supervised learning. It makes use of a large amount of unlabelled data together with
a small amount of labelled data for training. Semi-supervised learning is applied in cases
where it is expensive to acquire a fully labelled dataset while it is more practical to label a
small subset. For example, it often requires skilled experts to label certain remote sensing
images, and lots of field experiments to locate oil at a particular location, while acquiring
unlabelled data is relatively easy.
Today’s Machine Learning algorithms can be broadly classified into three categories:
Supervised Learning, Unsupervised Learning and Reinforcement Learning. Setting
Reinforcement Learning aside, the primary two categories of Machine Learning problems are
Supervised and Unsupervised Learning. The basic difference between the two is that
Supervised Learning datasets have an output label associated with each tuple while
Unsupervised Learning datasets do not.
The most basic disadvantage of any Supervised Learning algorithm is that the dataset has to
be hand-labeled either by a Machine Learning Engineer or a Data Scientist. This is a very
costly process, especially when dealing with large volumes of data. The most basic
disadvantage of any Unsupervised Learning is that its application spectrum is limited.
To counter these disadvantages, the concept of Semi-Supervised Learning was introduced. In
this type of learning, the algorithm is trained upon a combination of labelled and unlabelled
data. Typically, this combination will contain a very small amount of labeled data and a very
large amount of unlabelled data. The basic procedure involved is that first, the programmer
will cluster similar data using an unsupervised learning algorithm and then use the existing
labeled data to label the rest of the unlabelled data. The typical use cases of such type of
algorithm have a common property among them: the acquisition of unlabelled data is
relatively cheap while labelling the said data is very expensive.
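The cluster-then-label procedure described above can be sketched as follows, assuming scikit-learn and NumPy. The synthetic two-group data, the use of -1 to mark unlabelled points and the helper name `propagate_labels` are assumptions made for the example.

```python
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic data: two groups of 50 points; true_labels is kept only to check the result.
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])
true_labels = np.array([0] * 50 + [1] * 50)

# Only three points per group are labelled; -1 marks an unlabelled point.
y = np.full(100, -1)
y[[0, 1, 2]] = 0
y[[50, 51, 52]] = 1

# Step 1: cluster all points with an unsupervised algorithm.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)


def propagate_labels(clusters, y):
    """Step 2: give each unlabelled point the majority label of its cluster."""
    y_full = y.copy()
    for cluster_id in np.unique(clusters):
        members = clusters == cluster_id
        known = y[members & (y != -1)]
        if len(known) > 0:
            majority = Counter(known.tolist()).most_common(1)[0][0]
            y_full[members & (y == -1)] = majority
    return y_full


y_full = propagate_labels(clusters, y)
print((y_full == true_labels).mean())
```

scikit-learn also ships dedicated estimators for this setting in its `sklearn.semi_supervised` module.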
Intuitively, one may imagine the three types of learning algorithms as Supervised learning
where a student is under the supervision of a teacher at both home and school, Unsupervised
learning where a student has to figure out a concept himself and Semi-Supervised learning
where a teacher teaches a few concepts in class and gives questions as homework which are
based on similar concepts.
A Semi-Supervised algorithm assumes the following about the data –
1. Continuity Assumption: The algorithm assumes that the points which are closer to each
other are more likely to have the same output label.
2. Cluster Assumption: The data can be divided into discrete clusters and points in the
same cluster are more likely to share an output label.
3. Manifold Assumption: The data lie approximately on a manifold of much lower
dimension than the input space. This assumption allows the use of distances and densities
which are defined on a manifold.
Practical applications of Semi-Supervised Learning –
1) Speech Analysis: Since labelling of audio files is a very intensive task, Semi-Supervised
learning is a very natural approach to solve this problem.
2) Internet Content Classification: Labelling each webpage is an impractical and
infeasible process, so Semi-Supervised learning algorithms are used instead. Even the Google
search algorithm is said to use a variant of Semi-Supervised learning to rank the relevance of a
webpage for a given query.
3) Protein Sequence Classification: Since DNA strands are typically very large in size,
Semi-Supervised learning has naturally gained ground in this field.
Chapter-5
REINFORCEMENT MACHINE LEARNING

Introduction:
Here learning data gives feedback so that the system adjusts to dynamic conditions in order
to achieve a certain objective. The system evaluates its performance based on the feedback
responses and reacts accordingly. The best-known instances include self-driving cars and
the Go-playing program AlphaGo.
Reinforcement learning is an area of Machine Learning. It is about taking
suitable action to maximize reward in a particular situation. It is employed by various
software and machines to find the best possible behaviour or path to take in a specific
situation. Reinforcement learning differs from supervised learning in that in
supervised learning the training data has the answer key with it, so the model is trained with
the correct answer itself, whereas in reinforcement learning there is no answer key and the
reinforcement agent decides what to do to perform the given task. In the absence of a training
dataset, it is bound to learn from its experience.
Example: The problem is as follows: We have an agent and a reward, with many hurdles in
between. The agent is supposed to find the best possible path to reach the reward. The
following example illustrates the problem.
The image shows a robot, a diamond and fire. The goal of the robot is to get the reward,
that is the diamond, and avoid the hurdles, that is the fire. The robot learns by trying all the
possible paths and then choosing the path which gives it the reward with the least hurdles.
Each right step will give the robot a reward and each wrong step will subtract from the reward
of the robot. The total reward will be calculated when it reaches the final reward, that is the
diamond.
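The robot, diamond and fire example can be sketched with tabular Q-learning, one common reinforcement learning algorithm (the report does not name a specific one). The 3x3 grid layout, the reward values (+10 diamond, -10 fire, -1 per step) and the learning parameters are all assumptions chosen for illustration.

```python
import random

ROWS, COLS = 3, 3
START, FIRE, DIAMOND = (0, 0), (1, 1), (2, 2)   # hypothetical grid layout
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Apply an action; return (next_state, reward, episode_finished)."""
    d_row, d_col = ACTIONS[action]
    row = min(max(state[0] + d_row, 0), ROWS - 1)   # walls keep the robot on the grid
    col = min(max(state[1] + d_col, 0), COLS - 1)
    next_state = (row, col)
    if next_state == DIAMOND:
        return next_state, 10.0, True
    if next_state == FIRE:
        return next_state, -10.0, True
    return next_state, -1.0, False


def train(episodes=3000, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Learn a table of action values Q[(state, action)] by trial and error."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0 for r in range(ROWS) for c in range(COLS) for a in ACTIONS}
    for _ in range(episodes):
        state = START
        for _ in range(50):                          # cap the episode length
            if rng.random() < epsilon:               # explore: random action
                action = rng.choice(list(ACTIONS))
            else:                                    # exploit: best known action
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from START and record the visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


q_table = train()
path = greedy_path(q_table)
print(path)
```

The small step penalty is what pushes the learned path toward the shortest route around the fire.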
Main points in Reinforcement learning –

• Input: The input should be an initial state from which the model will start.
• Output: There are many possible outputs, as there are a variety of solutions to a particular
problem.
• Training: The training is based upon the input; the model will return a state and the user
will decide to reward or punish the model based on its output.
• The model continues to learn.
• The best solution is decided based on the maximum reward.
Types of Reinforcement: There are two types of Reinforcement:
1) Positive –
Positive Reinforcement is defined as when an event that occurs due to a particular behaviour
increases the strength and the frequency of the behaviour. In other words, it has a positive
effect on the behaviour.
Advantages of positive reinforcement:
a) Maximizes Performance
b) Sustains Change for a long period of time
Disadvantages of positive reinforcement:
i) Too much Reinforcement can lead to overload of states which can diminish the results
2) Negative –
Negative Reinforcement is defined as strengthening of a behaviour because a negative
condition is stopped or avoided.
Advantages of negative reinforcement:
a) Increases Behaviour
b) Provides defiance to minimum standard of performance
Disadvantages of negative reinforcement:
i) It only provides enough to meet the minimum behaviour
Chapter-6
ENVIRONMENT SETUP FOR MACHINE LEARNING
Programming Language Setup:

PYTHON INSTALLATION
Step-1:
Open a browser and go to the official Python page, www.python.org, to download Python.
The following page will appear in your browser.
Step-2:
Click the Windows link (two lines below the Download Python 3.7.4 button).
The following page will appear in your browser.
Step-3:
Click on the Download Windows x86-64 executable installer link under the top-left Stable
Releases.
The following pop-up window titled Opening python-3.7.4-amd64.exe will appear.
Click the Save File button.
The file named python-3.7.4-amd64.exe should start downloading into your standard
download folder. This file is about 30 MB so it might take a while to download fully if you
are on a slow internet connection (it took me about 10 seconds over a cable modem).
The file should appear as
1. Move this file to a more permanent location, so that you can install Python (and
reinstall it easily later, if necessary).
2. Feel free to explore this webpage further; if you want to just continue the installation,
you can terminate the tab browsing this webpage.
3. Start the Installing instructions directly below.
Installing
Step-1:
Double-click the icon labeling the file python-3.7.4-amd64.exe.
A Python 3.7.4 (64-bit) Setup pop-up window will appear.
Ensure that the Install launcher for all users (recommended) and the Add Python 3.7 to
PATH checkboxes at the bottom are checked.
If the Python Installer finds an earlier version of Python installed on your computer, the
Install Now message may instead appear as Upgrade Now (and the checkboxes will not
appear).
Step-2:
Highlight the Install Now (or Upgrade Now) message, and then click it.
When run, a User Account Control pop-up window may appear on your screen. I could not
capture its image, but it asks: Do you want to allow this app to make changes to your
device?
Step-3:
Click the Yes button.
A new Python 3.7.4 (64-bit) Setup pop-up window will appear with a Setup
Progress message and a progress bar.
During installation, it will show the various components it is installing and move the progress
bar towards completion. Soon, a new Python 3.7.4 (64-bit) Setup pop-up window will
appear with a Setup was successful message.
Step-4:
Click the Close button.
Python should now be installed.
Verifying
To verify the installation,
1. Navigate to the directory C:\Users\Pattis\AppData\Local\Programs\Python\Python37 (or to
whatever directory Python was installed in: see the pop-up window for Installing Step-3).
2. Double-click the icon/file python.exe.
The following pop-up window will appear.
A pop-up window with the title
C:\Users\Pattis\AppData\Local\Programs\Python\Python37\python.exe appears, and inside
the window, on the first line, is the text Python 3.7.4 ... (notice that it should also say 64 bit).
Inside the window, at the bottom left, is the prompt >>>: type exit() at this prompt and press
Enter to terminate Python.
You should keep the file python-3.7.4-amd64.exe somewhere on your computer in case you need to
reinstall Python (not likely necessary).
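The same check can also be scripted instead of read off the window title. The snippet below is a minimal sketch using only the standard library; the function name describe_python is made up for this illustration, and the 3.7 threshold simply mirrors the version installed in the steps above.

```python
import platform
import struct
import sys


def describe_python():
    """Return the interpreter version string and its pointer width in bits."""
    version = platform.python_version()       # for example "3.7.4"
    bits = struct.calcsize("P") * 8            # size of a C pointer: 32 or 64
    return version, bits


version, bits = describe_python()
print(f"Python {version} ({bits}-bit) at {sys.executable}")
if sys.version_info < (3, 7):
    print("Warning: the steps in this chapter install Python 3.7.")
```

Save it as a .py file and run it from the Command Prompt with python followed by the file name; a 64-bit installation reports 64 here.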
Installing Jupyter Notebook using pip
As an existing or experienced Python user, you may wish to install Jupyter using Python’s
package manager, pip.
If you have Python 3 installed (which is recommended):
python3 -m pip install --upgrade pip
python3 -m pip install jupyter
If you have Python 2 installed:
python -m pip install --upgrade pip
python -m pip install jupyter
Congratulations, you have installed Jupyter Notebook! To run the notebook, run the
following command at the Terminal (Mac/Linux) or Command Prompt (Windows):
jupyter notebook
See Running the Notebook for more details.
Chapter-7
INSTALLING ALL REQUIRED MODULES FOR
MACHINE LEARNING
STEP-1:
Open Command Prompt.
STEP-2:
Enter the following commands in the Command Prompt shell to install the corresponding
modules.
1. PANDAS:
pip install pandas
2. NUMPY:
pip install numpy
3. SCIPY:
pip install scipy
4. MATPLOTLIB:
pip install matplotlib
5. SCIKIT LEARN:
pip install scikit-learn
6. SPEECH RECOGNITION:
pip install SpeechRecognition
7. PYTTSX3:
pip install pyttsx3
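After running the commands above, it can be useful to confirm which modules Python can actually find. The following is a small sketch, not part of the original training material; the helper name check_modules is hypothetical. Note that two packages are imported under a different name than the one given to pip (scikit-learn as sklearn, SpeechRecognition as speech_recognition).

```python
import importlib.util

# pip package name -> name used in an import statement
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "SpeechRecognition": "speech_recognition",
    "pyttsx3": "pyttsx3",
}


def check_modules(required=REQUIRED):
    """Map each pip package name to True if its module can be located."""
    return {pkg: importlib.util.find_spec(mod) is not None
            for pkg, mod in required.items()}


status = check_modules()
for pkg, ok in status.items():
    print(f"{pkg:20s} {'installed' if ok else 'MISSING'}")
```

Any package reported as MISSING can be installed again with the matching pip command from the list.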
Opening Jupyter Notebook:
Enter the following command in the Command Prompt: jupyter notebook
After this, the following window will open:
PROJECT WORK:
Why IRIS project:
• Attributes are numeric so you have to figure out how to load and handle data.
• It is a classification problem, allowing you to practice with perhaps an easier type of
supervised learning algorithm.
• It is a multi-class classification problem (multi-nominal) that may require some specialized
handling.
• It only has 4 attributes and 150 rows, meaning it is small and easily fits into memory (and a
screen or A4 page).
• All of the numeric attributes are in the same units and the same scale, not requiring any
special scaling or transforms to get started
Here is an overview of what we are going to cover:
1. Installing the Python and SciPy platform.
2. Loading the dataset.
3. Summarizing the dataset.
4. Visualizing the dataset.
5. Evaluating some algorithms.
6. Making some predictions
Whole project work explained step-by-step:

Step-1: Start Python and Check Versions
Step-2: Import Libraries
Step-3: Load Dataset
Step-4: Summarize Dataset
Step-5: Check Dimensions of Dataset
Step-6: Peek at the Data
Step-7: Statistical Summary
Step-8: Class Distribution
Step-9: Data Visualization:
Univariate Plots
Histogram plots
Multivariate Plots
Step-10: Evaluate some Algorithms
Step-11: Compare Model by plotting
Step-12: Make Prediction on validation dataset
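The steps listed above can be sketched in a few lines of scikit-learn. This is an illustrative outline rather than the code used during the training: it assumes the copy of the Iris data bundled with scikit-learn (load_iris) instead of a downloaded CSV file, compares only two classifiers, and leaves out the plotting steps.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Load the dataset and check its dimensions (rows, attributes).
iris = load_iris()
X, y = iris.data, iris.target
print(X.shape)

# Class distribution: number of rows per species.
class_counts = Counter(str(iris.target_names[i]) for i in y)
print(class_counts)

# Hold back 20% of the rows as a validation dataset.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.20, random_state=1, stratify=y)

# Evaluate some algorithms with 10-fold cross-validation on the training rows.
models = {
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=1),
}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
for name, model in models.items():
    scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} ({scores.std():.3f})")

# Fit one model and make predictions on the validation dataset.
final_model = KNeighborsClassifier().fit(X_train, y_train)
val_accuracy = accuracy_score(y_val, final_model.predict(X_val))
print(f"Validation accuracy: {val_accuracy:.3f}")
```

Stratified splitting is used so that each of the three species keeps the same share in the training and validation rows.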
Chapter-8
Create your emoji with Deep Learning
About the Dataset
The FER2013 dataset (facial expression recognition) consists of 48×48 pixel grayscale face
images. The images are centered and occupy an equal amount of space. The dataset covers
facial emotions of the following categories:
1. Angry
2. Disgust
3. Fear
4. Happy
5. Sad
6. Surprise
7. Neutral
Facial Emotion Recognition using CNN

Below, we build a convolutional neural network architecture and train the model on the
FER2013 dataset for emotion recognition from images.
Download the dataset from the above link. Extract it in the data folder with separate train and test
directories.
1. Imports (Make a file train.py):
import numpy as np
import cv2
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D
from keras.optimizers import Adam
from keras.layers import MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator
2. Initialize the training and validation generators:
train_dir = 'data/train'
val_dir = 'data/test'
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(48,48),
    batch_size=64,
    color_mode="grayscale",
    class_mode='categorical')
validation_generator = val_datagen.flow_from_directory(
    val_dir,
    target_size=(48,48),
    batch_size=64,
    color_mode="grayscale",
    class_mode='categorical')
3. Build the convolution network architecture:
emotion_model = Sequential()
emotion_model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                         input_shape=(48,48,1)))
emotion_model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))
emotion_model.add(Flatten())
emotion_model.add(Dense(1024, activation='relu'))
emotion_model.add(Dense(7, activation='softmax'))
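To see how large this network is, the layer output sizes and parameter counts can be worked out by hand. The sketch below does that arithmetic in plain Python, assuming the Keras defaults for the layers as written above (no padding and stride 1 for Conv2D, pooling stride equal to the pool size); the helper names are made up for this illustration.

```python
def conv_out(size, kernel):
    """Output width/height of an unpadded, stride-1 convolution."""
    return size - kernel + 1


def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input plus one bias, for every unit."""
    return (inputs + 1) * units


size = 48
size = conv_out(size, 3)        # Conv2D(32): 48 -> 46
size = conv_out(size, 3)        # Conv2D(64): 46 -> 44
size = size // 2                # MaxPooling2D(2, 2): 44 -> 22
flat = size * size * 64         # Flatten: 22 * 22 * 64 values

total = (conv_params(3, 1, 32) + conv_params(3, 32, 64)
         + dense_params(flat, 1024) + dense_params(1024, 7))
print(flat, total)              # 30976 31746439
```

Almost all of the roughly 31.7 million parameters sit in the first Dense layer, which connects the 30976 flattened values to 1024 units.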
4. Compile and train the model:
# Note: recent Keras releases spell these learning_rate= and model.fit();
# the names below match the older Keras API used in this report.
emotion_model.compile(loss='categorical_crossentropy',
                      optimizer=Adam(lr=0.0001, decay=1e-6),
                      metrics=['accuracy'])
emotion_model_info = emotion_model.fit_generator(
    train_generator,
    steps_per_epoch=28709 // 64,
    epochs=50,
    validation_data=validation_generator,
    validation_steps=7178 // 64)
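The two step counts in the call above come from integer division of the number of images by the batch size of 64. A short worked check, using the image counts that appear in the call (the helper name steps is only for illustration):

```python
def steps(num_images, batch_size):
    """Number of full batches in one pass over the data."""
    return num_images // batch_size


train_steps = steps(28709, 64)
val_steps = steps(7178, 64)
print(train_steps, val_steps)                  # 448 112

# Images that do not fill a complete batch in each pass.
print(28709 - train_steps * 64, 7178 - val_steps * 64)   # 37 10
```

With floor division, the last partial batch (37 training images and 10 validation images) is not counted towards the steps of an epoch.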
5. Save the model weights:
emotion_model.save_weights('model.h5')
6. Using the OpenCV Haar cascade XML, detect the bounding boxes of faces in the
webcam feed and predict the emotions:
cv2.ocl.setUseOpenCL(False)
emotion_dict = {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy", 4: "Neutral",
                5: "Sad", 6: "Surprised"}
# Load the face detector once, before the loop. Adjust the path to your own OpenCV
# install; with the opencv-python package, cv2.data.haarcascades points to this folder.
bounding_box = cv2.CascadeClassifier('/home/shivam/.local/lib/python3.6/site-packages/cv2/data/haarcascade_frontalface_default.xml')
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    num_faces = bounding_box.detectMultiScale(gray_frame, scaleFactor=1.3,
                                              minNeighbors=5)
    for (x, y, w, h) in num_faces:
        cv2.rectangle(frame, (x, y-50), (x+w, y+h+10), (255, 0, 0), 2)
        roi_gray_frame = gray_frame[y:y + h, x:x + w]
        # Resize the face to 48x48 and add the batch and channel dimensions.
        cropped_img = np.expand_dims(np.expand_dims(cv2.resize(roi_gray_frame,
                                                               (48, 48)), -1), 0)
        emotion_prediction = emotion_model.predict(cropped_img)
        maxindex = int(np.argmax(emotion_prediction))
        cv2.putText(frame, emotion_dict[maxindex], (x+20, y-60),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
    cv2.imshow('Video', cv2.resize(frame, (1200, 860),
                                   interpolation=cv2.INTER_CUBIC))
    # Press q to stop the video loop.
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
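The order of the labels in emotion_dict matters: flow_from_directory numbers the classes by sorting the sub-folder names alphanumerically, so index 0 is whichever folder sorts first. The sketch below illustrates this with hypothetical folder names; the real names depend on how the dataset archive was extracted.

```python
# Hypothetical sub-folder names under data/train (assumed for illustration).
class_folders = ["happy", "sad", "angry", "neutral", "surprised", "fearful", "disgusted"]

# Classes are numbered in sorted folder-name order.
class_indices = {name: i for i, name in enumerate(sorted(class_folders))}

# Invert the mapping to turn a predicted index back into a label.
index_to_label = {i: name.capitalize() for name, i in class_indices.items()}
print(index_to_label)
```

With these folder names the result matches emotion_dict above. On a real run, the mapping Keras actually used can be read from train_generator.class_indices and compared against the dictionary.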
Code for GUI and mapping with emojis

Create a folder named emojis and save the emojis corresponding to each of the
seven emotions in the dataset.
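Before starting the GUI, it may help to confirm that all seven images are in place, since a missing file only shows up as an error once the window is running. The check below is a small sketch using the standard library; the file names are copied from the emoji_dist mapping in the script that follows (including its spelling of surpriced.png), and the helper name missing_emojis is hypothetical.

```python
import os

# File names as they appear in the emoji_dist mapping of the GUI script.
EMOJI_FILES = ["angry.png", "disgusted.png", "fearful.png", "happy.png",
               "neutral.png", "sad.png", "surpriced.png"]


def missing_emojis(folder="emojis", names=EMOJI_FILES):
    """Return the expected image files that are not present in the folder."""
    return [n for n in names if not os.path.isfile(os.path.join(folder, n))]


missing = missing_emojis()
print("All emoji images found." if not missing else f"Missing: {missing}")
```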
import tkinter as tk
from tkinter import *
import os
import numpy as np
import cv2
from PIL import Image, ImageTk
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D
from keras.optimizers import Adam
from keras.layers import MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator
# Rebuild the same architecture as in train.py so that the saved weights load.
emotion_model = Sequential()
emotion_model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                         input_shape=(48,48,1)))
emotion_model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))
emotion_model.add(Flatten())
emotion_model.add(Dense(1024, activation='relu'))
emotion_model.add(Dense(7, activation='softmax'))
emotion_model.load_weights('model.h5')
cv2.ocl.setUseOpenCL(False)
emotion_dict = {0: " Angry ", 1: "Disgusted", 2: " Fearful ", 3: " Happy ",
                4: " Neutral ", 5: " Sad ", 6: "Surprised"}
# One emoji image per class index (0 to 6).
emoji_dist = {0: "./emojis/angry.png", 1: "./emojis/disgusted.png",
              2: "./emojis/fearful.png", 3: "./emojis/happy.png",
              4: "./emojis/neutral.png", 5: "./emojis/sad.png",
              6: "./emojis/surpriced.png"}
last_frame1 = np.zeros((480, 640, 3), dtype=np.uint8)
show_text = [0]  # index of the most recently predicted emotion
# Open the camera and load the face detector once instead of on every refresh.
cap1 = cv2.VideoCapture(0)
bounding_box = cv2.CascadeClassifier('/home/shivam/.local/lib/python3.6/site-packages/cv2/data/haarcascade_frontalface_default.xml')
def show_vid():
    global last_frame1
    if not cap1.isOpened():
        print("cant open the camera1")
    flag1, frame1 = cap1.read()
    if not flag1:
        print("Major error!")
        return
    frame1 = cv2.resize(frame1, (600, 500))
    gray_frame = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    num_faces = bounding_box.detectMultiScale(gray_frame, scaleFactor=1.3,
                                              minNeighbors=5)
    for (x, y, w, h) in num_faces:
        cv2.rectangle(frame1, (x, y-50), (x+w, y+h+10), (255, 0, 0), 2)
        roi_gray_frame = gray_frame[y:y + h, x:x + w]
        cropped_img = np.expand_dims(np.expand_dims(cv2.resize(roi_gray_frame,
                                                               (48, 48)), -1), 0)
        prediction = emotion_model.predict(cropped_img)
        maxindex = int(np.argmax(prediction))
        cv2.putText(frame1, emotion_dict[maxindex], (x+20, y-60),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
        show_text[0] = maxindex
    last_frame1 = frame1.copy()
    pic = cv2.cvtColor(last_frame1, cv2.COLOR_BGR2RGB)
    img = Image.fromarray(pic)
    imgtk = ImageTk.PhotoImage(image=img)
    lmain.imgtk = imgtk  # keep a reference so the image is not garbage-collected
    lmain.configure(image=imgtk)
    lmain.after(10, show_vid)  # schedule the next frame
    if cv2.waitKey(1) & 0xFF == ord('q'):
        exit()
def show_vid2():
    frame2 = cv2.imread(emoji_dist[show_text[0]])
    # OpenCV loads images as BGR; convert to RGB before handing them to PIL.
    pic2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2RGB)
    img2 = Image.fromarray(pic2)
    imgtk2 = ImageTk.PhotoImage(image=img2)
    lmain2.imgtk2 = imgtk2
    lmain3.configure(text=emotion_dict[show_text[0]], font=('arial',45,'bold'))
    lmain2.configure(image=imgtk2)
    lmain2.after(10, show_vid2)
if __name__ == '__main__':
    root = tk.Tk()
    img = ImageTk.PhotoImage(Image.open("logo.png"))
    heading = Label(root, image=img, bg='black')
    heading.pack()
    heading2 = Label(root, text="Photo to Emoji", pady=20,
                     font=('arial',45,'bold'), bg='black', fg='#CDCDCD')
    heading2.pack()
    lmain = tk.Label(master=root, padx=50, bd=10)
    lmain2 = tk.Label(master=root, bd=10)
    lmain3 = tk.Label(master=root, bd=10, fg="#CDCDCD", bg='black')
    lmain.pack(side=LEFT)
    lmain.place(x=50, y=250)
    lmain3.pack()
    lmain3.place(x=960, y=250)
    lmain2.pack(side=RIGHT)
    lmain2.place(x=900, y=350)
    root.title("Photo To Emoji")
    root.geometry("1400x900+100+10")
    root['bg'] = 'black'
    exitbutton = Button(root, text='Quit', fg="red", command=root.destroy,
                        font=('arial',25,'bold')).pack(side=BOTTOM)
    show_vid()
    show_vid2()
    root.mainloop()
Chapter-9
Testing
Chapter-10
CONCLUSION
In this deep learning project for beginners, we built a convolutional neural network to
recognize facial emotions and trained it on the FER2013 dataset. We then map the
recognized emotions to the corresponding emojis or avatars.
Using OpenCV's Haar cascade XML, we get the bounding boxes of the faces in the webcam feed
and then feed these boxes to the trained model for classification.
Chapter-11
REFERENCES
• Workshop / Production technology by WEBTEK LABS PVT. LTD, JAIPUR, RAJASTHAN
• Study material provided by technical training center
• Study material provided by Puja Batiya
• https://data-flair.training/blogs/create-emoji-with-deep-learning/
• https://webteklabs.com/
• https://en.wikipedia.org/wiki/MachineLearning
