Deep Learning Theory


Deep Learning

Introduction to Neural Network


A neural network is a series of algorithms that tries to recognize underlying
relationships in a set of data through a process that mimics the way the human brain
operates. In this sense, neural networks refer to systems of neurons, either organic or
artificial in nature.

Each neuron is made up of a cell body (the central mass of the cell) with a number of
connections coming off it: numerous dendrites (the cell's inputs, carrying information
toward the cell body) and a single axon (the cell's output, carrying information away).
Neurons are so tiny that you could pack about 100 of their cell bodies into a single
millimeter. (It's also worth noting in passing that neurons make up only 10–50
percent of all the cells in the brain.)

How does a neural network learn things?


Information flows through a neural network in two ways. When it's learning (being
trained) or operating normally (after being trained), patterns of information are fed into
the network via the input units, which trigger the layers of hidden units, and these in turn
arrive at the output units. This common design is called a feedforward network. Not all
units "fire" all the time. Each unit receives inputs from the units to its left, and the inputs
are multiplied by the weights of the connections they travel along. Every unit adds up all
the inputs it receives in this way and (in the simplest type of network) if the sum is more
than a certain threshold value, the unit "fires" and triggers the units it's connected to (those
on its right).
Step 1:
For each input, multiply the input value xᵢ by its weight wᵢ and sum all the products.
Weights represent the strength of the connection between neurons and decide how much
influence a given input has on the neuron's output. If the weight w₁ has a higher value
than the weight w₂, then the input x₁ will have a higher influence on the output than x₂.

The row vectors of the inputs and weights are x = [x₁, x₂, … , xₙ] and w = [w₁, w₂, … , wₙ]
respectively, and their dot product is given by

x · w = x₁w₁ + x₂w₂ + … + xₙwₙ

Step 2:
Add the bias b to the sum of the products and call the result z. The bias, also known as
the offset, is needed in most cases to move the entire activation function to the left
or right so that it can generate the required output values: z = x · w + b

Step 3:
Pass the value of z to a non-linear activation function. Activation functions introduce
non-linearity into the output of the neurons; without them the neural network would be
just a linear function. They also have a significant impact on the learning speed of the
neural network. Perceptrons use the binary step function as their activation function.
Here, however, we use the sigmoid, also known as the logistic function.
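The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a full network: the input values, weights, and bias are made-up numbers, and the helper name neuron_forward is hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Logistic function used as the activation in step 3."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_forward(x, w, b):
    """Forward pass of one neuron: weighted sum, add bias, activate."""
    z = np.dot(x, w) + b   # steps 1 and 2: z = x . w + b
    return sigmoid(z)      # step 3: non-linear activation

# Illustrative values only (assumptions, not taken from any dataset)
x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -0.25, 0.1])
b = 0.2

z = np.dot(x, w) + b
a = neuron_forward(x, w, b)
print(z, a)
```

Here z works out to 0.5 - 0.5 + 0.3 + 0.2 = 0.5, and the sigmoid maps it to a value between 0 and 1.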

Bias
This means when calculating the output of a node, the inputs are multiplied by weights, and
a bias value is added to the result. The bias value allows the activation function to be
shifted to the left or right, to better fit the data. Hence changes to the weights alter the
steepness of the sigmoid curve, whilst the bias offsets it, shifting the entire curve so it fits
better. Note also how the bias only influences the output values, it doesn’t interact with the
actual input data.
link: https://towardsdatascience.com/why-we-need-bias-in-neural-networks-db8f7e07cb98

Activation Function
It is simply the function used to compute the output of a node. It is also known as a
transfer function.
It is used to determine the output of a neural network, such as yes or no. Depending on
the function, it maps the resulting values into a range such as 0 to 1 or -1 to 1.
Activation functions can be divided into two basic types:
1. Linear Activation Function

2. Non-linear Activation Functions

1. Linear or Identity Activation Function


The function is a straight line, that is, linear. Therefore the output of the function is
not confined to any range.

Equation : f(x) = x
Range : (-infinity to infinity)
It does not help with the complexity or the varied parameters of the data usually fed to
neural networks.

2. Non-linear Activation Function


The nonlinear activation functions are the most widely used activation functions.
Nonlinearity makes the graph of the function a curve rather than a straight line.
Derivative or differential: change along the y-axis with respect to change along the
x-axis. It is also known as the slope.
Monotonic function: a function which is either entirely non-increasing or entirely
non-decreasing.

2.1 Sigmoid or Logistic Activation Function


The sigmoid function curve looks like an S-shape.
The main reason we use the sigmoid function is that its output lies between 0 and 1.
It is therefore especially useful for models that have to predict a probability as the
output. Since the probability of anything lies only between 0 and 1, sigmoid is the
right choice. The derivative of the sigmoid function lies between 0 and 0.25.
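The 0 to 0.25 bound on the derivative can be checked numerically. The sketch below uses the standard identity that the sigmoid's derivative is s(z)(1 - s(z)); the grid of z values is an arbitrary choice for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    """Derivative of the sigmoid: s(z) * (1 - s(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

z = np.linspace(-10, 10, 2001)   # arbitrary grid that includes z = 0
d = sigmoid_derivative(z)
print(d.min(), d.max())          # the maximum sits at z = 0
```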

2.2 Tanh or Hyperbolic Tangent Activation Function


tanh is similar to the logistic sigmoid but is often the better choice. The range of the
tanh function is (-1 to 1). tanh is also sigmoidal (S-shaped). The derivative of tanh lies
between 0 and 1, reaching 1 only at z = 0.
2.3 ReLU (Rectified Linear Unit) Activation Function
ReLU is currently the most widely used activation function, since it appears in almost
all convolutional neural networks and deep learning models. Its derivative takes only two
values: 1 for z > 0 and 0 for z < 0.
ReLU is half rectified (from the bottom): f(z) is zero when z is less than zero, and f(z)
is equal to z when z is greater than or equal to zero.
Range: [0 to infinity)
The function and its derivative are both monotonic.
differentiation in almost every part of Machine Learning and Deep Learning.
Leaky ReLU and dead neurons
If z < 0 the derivative becomes 0, so the newly calculated weight equals the old weight.
Such a neuron is called a dead neuron. To fix this we use leaky ReLU.
This is the dying ReLU problem: some ReLU neurons essentially die for all inputs and
remain inactive no matter what input is supplied. No gradient flows through them, and if a
neural network contains a large number of dead neurons its performance suffers. Leaky ReLU
corrects this by giving the function a small non-zero slope to the left of z = 0, which
causes a "leak" and extends the range of ReLU to negative values.
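A minimal NumPy sketch of ReLU, leaky ReLU, and their derivatives follows. The leak slope alpha = 0.01 is a commonly used default and is an assumption here, as is the convention of assigning the z = 0 case to the left branch of the derivative.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # 1 for z > 0, 0 otherwise: the zero region is where neurons can "die"
    return (z > 0).astype(float)

def leaky_relu(z, alpha=0.01):
    # alpha is the small slope used left of zero (assumed value)
    return np.where(z > 0, z, alpha * z)

def leaky_relu_grad(z, alpha=0.01):
    # gradient never reaches exactly zero, so weights keep updating
    return np.where(z > 0, 1.0, alpha)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z), relu_grad(z))
print(leaky_relu(z), leaky_relu_grad(z))
```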
ELU

SoftMax
Softmax is a very interesting activation function because it not only maps our output to a
[0,1] range but also maps each output in such a way that the total sum is 1. The output of
Softmax is therefore a probability distribution.
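As a small illustration, softmax can be written in NumPy as below. Subtracting the maximum before exponentiating is a common numerical-stability trick and does not change the result; the input scores are made-up values.

```python
import numpy as np

def softmax(z):
    """Map a vector of scores to probabilities that sum to 1."""
    shifted = z - np.max(z)      # stability trick; the output is unchanged
    e = np.exp(shifted)
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # illustrative raw outputs
p = softmax(scores)
print(p, p.sum())
```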
Optimizers
link: https://ruder.io/optimizing-gradient-descent/
link: https://heartbeat.fritz.ai/exploring-optimizers-in-machine-learning-7f18d94cd65b
Optimizers tie the loss function and the model parameters together by updating the model,
i.e. the weights and biases of each node, based on the output of the loss function.
Epoch:
One epoch is when the ENTIRE dataset is passed forward and backward through the neural
network exactly ONCE.

Iteration:
An iteration is one pass of a single batch of data through the algorithm. In the case of
neural networks, that means one forward pass and one backward pass. So every time you pass
a batch of data through the network, you complete one iteration.
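The relationship between epochs, batches, and iterations is simple arithmetic. The sketch below assumes the final partial batch is still processed (hence the ceiling); the helper name and the sample counts are illustrative.

```python
import math

def iterations_per_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

per_epoch = iterations_per_epoch(1000, 32)   # 31 full batches plus 1 partial
total = per_epoch * 5                        # iterations over 5 epochs
print(per_epoch, total)
```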

Gradient Descent
Gradient descent is an iterative optimization algorithm. It relies on the derivatives of
the loss function to find minima. Running the algorithm for many iterations and epochs
helps it reach the global minimum (or a point close to it).
• Gradient descent considers all data points when calculating the error (like a whole
population: with 1000 data points, all 1000 are used for every update).

• It requires large computational power for large data sets.

Stochastic Gradient Descent (SGD)


SGD randomly picks one data point from the whole data set at each iteration, which reduces
the computation per update enormously.
• For large data sets the number of iterations is larger, which increases the total time
and computation needed to converge.
Mini-batch Stochastic Gradient Descent
Sampling a small number of data points at each step, instead of just one, is called
"mini-batch" gradient descent. Mini-batch tries to strike a balance between the accuracy
of gradient descent and the speed of SGD.
• Because of the zig-zag movement, mini-batch and stochastic updates are noisy. To
overcome this we introduce momentum.
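The three variants differ only in how many points feed each update, so one function with a batch_size parameter can sketch all of them. This is an illustrative implementation for linear regression on synthetic, noise-free data; the learning rates, epoch counts, and the function name are assumptions chosen so the toy problem converges.

```python
import numpy as np

def minibatch_gd(X, y, batch_size, lr=0.1, epochs=200, seed=0):
    """Fit y ~ X @ w + b by gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        order = rng.permutation(n)                 # reshuffle every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            err = X[idx] @ w + b - y[idx]          # prediction error on the batch
            w -= lr * (X[idx].T @ err) / len(idx)  # gradient of 0.5 * mean(err^2)
            b -= lr * err.mean()
    return w, b

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X @ np.array([2.0, -3.0]) + 0.5                # synthetic target, no noise

w_full, b_full = minibatch_gd(X, y, batch_size=200)                   # gradient descent
w_sgd, b_sgd = minibatch_gd(X, y, batch_size=1, lr=0.01, epochs=50)   # SGD
w_mini, b_mini = minibatch_gd(X, y, batch_size=32)                    # mini-batch
print(w_full, w_sgd, w_mini)
```

With batch_size equal to the dataset size each epoch is one update; with batch_size = 1 each epoch is 200 noisy updates.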

SGD with momentum


• We reduce the noise with the help of an exponential moving average.
The exponential moving average (EMA) is best known as a technical chart indicator that
tracks the price of an investment (like a stock or commodity) over time. It is a type of
weighted moving average (WMA) that gives more weight to recent data. In SGD with momentum
the same average is applied to the gradients instead of prices.
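The smoothing effect can be sketched as follows. This uses one common formulation of momentum (an EMA of the gradients with decay beta); other texts scale the terms differently, and beta = 0.9 plus the zig-zag gradient sequence are assumptions for illustration.

```python
def ema(values, beta=0.9):
    """Exponential moving average of a sequence, starting from 0."""
    avg, out = 0.0, []
    for v in values:
        avg = beta * avg + (1.0 - beta) * v
        out.append(avg)
    return out

def momentum_step(w, v, grad, lr=0.1, beta=0.9):
    """One SGD-with-momentum update using an EMA of the gradients."""
    v = beta * v + (1.0 - beta) * grad   # smoothed gradient
    w = w - lr * v                       # step along the smoothed direction
    return w, v

zigzag = [1.0, -1.0] * 10                # noisy gradients that keep flipping sign
smoothed = ema(zigzag)
print(max(abs(s) for s in smoothed))     # much smaller swings than the raw values

w_new, v_new = momentum_step(w=1.0, v=0.0, grad=2.0)
```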

Adagrad Optimizer
Adaptive Gradient Algorithm (Adagrad) is an algorithm for gradient-based optimization. ...
It performs smaller updates. As a result, it is well suited to sparse data (NLP or image
recognition). Each parameter has its own learning rate, which improves performance on
problems with sparse gradients.
Adagrad uses a different learning rate at every iteration: the learning rate is recomputed
each time.

dense:
most of the features are non-zero

sparse:
most of the features are zero

Advantages of Using AdaGrad


• It eliminates the need to manually tune the learning rate.

• Convergence is faster and more reliable than with simple SGD when the scaling of the
weights is unequal.
• It is not very sensitive to the size of the master step.

• One disadvantage is that alpha t (the accumulated sum of squared gradients) can become
very large as the number of iterations grows, so the effective learning rate shrinks
toward zero.
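A minimal sketch of the Adagrad update shows both the per-parameter learning rate and the shrinking step. The names adagrad_step and accum are hypothetical, and lr = 0.1 and eps = 1e-8 are assumed values; accum plays the role of the accumulated squared gradients.

```python
import numpy as np

def adagrad_step(w, grad, accum, lr=0.1, eps=1e-8):
    """One Adagrad update: each parameter is scaled by its own gradient history."""
    accum = accum + grad ** 2                    # running sum of squared gradients
    w = w - lr * grad / (np.sqrt(accum) + eps)   # per-parameter effective step
    return w, accum

w = np.zeros(2)
accum = np.zeros(2)
grad = np.array([1.0, 0.0])       # second feature is "sparse": no gradient here

w, accum = adagrad_step(w, grad, accum)
first_w = w.copy()
for _ in range(99):               # 100 updates in total
    w, accum = adagrad_step(w, grad, accum)

effective_lr = 0.1 / np.sqrt(accum[0])
print(first_w, accum, effective_lr)
```

After 100 updates with the same gradient, the effective learning rate of the first parameter has dropped from 0.1 to 0.01, while the untouched second parameter has accumulated nothing.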

AdaDelta
Adadelta is a more robust extension of Adagrad that adapts learning rates based on a
moving window of gradient updates, instead of accumulating all past gradients. Compared
to Adagrad, in the original version of Adadelta you don't have to set an initial learning
rate.

RMSprop
RMSprop is a gradient based optimization technique used in training neural networks. ...
This normalization balances the step size (momentum), decreasing the step for large
gradients to avoid exploding, and increasing the step for small gradients to avoid vanishing.
ADAM Optimizer

Stochastic gradient descent vs gradient descent vs mini-batch

• Gradient descent moves smoothly and directly toward the minimum.

• Stochastic and mini-batch gradient descent take longer to get there because they
follow a zig-zag path.
Global minima & Local minima
A local minimum of a function is a point where the function value is smaller than at nearby
points, but possibly greater than at a distant point. A global minimum is a point where the
function value is smaller than at all other feasible points.
Some loss functions have both local minima and a global minimum.

Convex function and non-convex function


• A convex function has only a global minimum.

• A non-convex function can have both global and local minima.

Chain rule
The chain rule is used to compute the derivatives needed to readjust the weights during
backpropagation.
Vanishing Gradient Problem
In machine learning, the vanishing gradient problem is encountered when training artificial
neural networks with gradient-based learning methods and backpropagation.
The derivative of the sigmoid function always lies between 0 and 0.25. Because of that,
when the network has many layers, the new weights are almost equal to the old weights.
This is the cause of the vanishing gradient problem, and it is the main reason we do not
use the sigmoid activation function in every layer. It can be overcome with ReLU
(rectified linear unit).
Certain activation functions, like the sigmoid function, squash a large input space into a
small output space between 0 and 1. Therefore, a large change in the input of the sigmoid
function causes only a small change in the output. Hence, the derivative becomes small.
However, when n hidden layers use an activation like the sigmoid function, n small
derivatives are multiplied together. Thus, the gradient decreases exponentially as we
propagate down to the initial layers. A small gradient means that the weights and biases of
the initial layers will not be updated effectively with each training session. Since these
initial layers are often crucial to recognizing the core elements of the input data, it can lead
to overall inaccuracy of the whole network.
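The exponential decay can be made concrete with a back-of-the-envelope bound. The sketch assumes each sigmoid layer contributes a factor of at most 0.25 and ignores the weights (equivalently, assumes weights of magnitude about 1); the function name is hypothetical.

```python
MAX_SIGMOID_GRAD = 0.25   # largest possible value of the sigmoid derivative

def gradient_bound(n_layers):
    """Upper bound on the product of n sigmoid derivatives (weights ignored)."""
    return MAX_SIGMOID_GRAD ** n_layers

for n in (1, 5, 10, 20):
    print(n, gradient_bound(n))
```

With 10 sigmoid layers the bound is already below one millionth, which is why the earliest layers barely move.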

Exploding gradient problem


Exploding gradients are a problem where large error gradients accumulate and result in
very large updates to neural network model weights during training. This has the effect of
your model being unstable and unable to learn from your training data.

Dropout layers & Regularization


Dropout may be implemented on any or all hidden layers in the network as well as the
visible or input layer. It is not used on the output layer. The term “dropout” refers to
dropping out units (hidden and visible) in a neural network. — Dropout: A Simple Way to
Prevent Neural Networks from Overfitting.
There are two ways to overcome overfitting:
1. Regularization, 2. Dropout
Dropout: before applying dropout we select a dropout ratio p (a probability) between 0
and 1. It applies to a subset of the input features and of the activations in the hidden
layers.
In each training iteration, a fraction p of the input features and hidden activations is
selected at random and deactivated. The rest of the process is unchanged, and the new
weights are calculated as usual. In the second iteration a new random subset is selected
with respect to p, and so on. Dropout is applied only to the training data. For test data
all units stay active, and the weights obtained from training are multiplied by the keep
probability, 1 - p.
How to select p: treat it as a hyperparameter and choose it by hyperparameter tuning.
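A minimal NumPy sketch of the mechanism, treating p as the drop ratio so that the surviving fraction is 1 - p. The function names are hypothetical, and this shows the classic train/test scaling scheme rather than the "inverted" variant that many libraries use internally.

```python
import numpy as np

def dropout_train(activations, p, rng):
    """Training: zero each unit independently with probability p."""
    mask = rng.random(activations.shape) >= p   # True = unit is kept
    return activations * mask

def dropout_test(activations, p):
    """Test: keep every unit, scaled by the keep probability 1 - p."""
    return activations * (1.0 - p)

rng = np.random.default_rng(0)
a = np.ones(10000)                 # illustrative layer output
p = 0.3                            # assumed drop ratio

train_out = dropout_train(a, p, rng)
test_out = dropout_test(a, p)
print(train_out.mean(), test_out.mean())   # both close to 0.7
```

The scaling at test time keeps the expected activation the same as during training.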

Weights initialization Techniques


• Weights should be small.

• Weights should not all be the same.

• Weights should have good variance.

No single technique gives the best results in every case; the choice has to be made by
experiment.
1. Uniform distribution

2. Xavier/Glorot

a. Xavier normal

b. Xavier uniform

3. He init

a. He uniform

b. He normal
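For reference, the uniform variants can be sketched directly from their sampling limits: Xavier/Glorot uniform draws from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), and He uniform uses limit = sqrt(6 / fan_in). The layer sizes below (11 inputs, 6 units) are illustrative.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform: limit = sqrt(6 / (fan_in + fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_uniform(fan_in, fan_out, rng):
    """He uniform: limit = sqrt(6 / fan_in), often paired with ReLU."""
    limit = np.sqrt(6.0 / fan_in)
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
w_glorot = glorot_uniform(11, 6, rng)   # illustrative layer: 11 inputs, 6 units
w_he = he_uniform(11, 6, rng)
print(np.abs(w_glorot).max(), np.abs(w_he).max())
```

Both give small, non-identical weights whose spread depends on the layer size, which matches the three guidelines above.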
Types of loss function (also called cost function or error function)

1. Regression

1.1 MSE (mean squared error):
MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²

Advantages:
1. It has the form of a quadratic equation, ax^2 + bx + c.

2. Plotted against the prediction, a quadratic is a single bowl, so gradient descent
faces only one global minimum and no local minima.

Disadvantage:
It is not robust to outliers.

1.2 Absolute error loss (MAE)


MAE = (1/n) Σᵢ |yᵢ − ŷᵢ|, that is, we take the absolute value of each error.

Advantage:
1. It is more robust to outliers than MSE.
Disadvantages:
1. Optimizing MAE is harder and takes longer, because the absolute value is not
differentiable at zero.

2. It may have local minima.


1.3 Huber Loss:
Huber loss is a loss function used in robust regression that is less sensitive to outliers
in the data than the squared error loss. A variant for classification is also sometimes
used. It is a combination of MAE and MSE: quadratic for small errors and linear for large
ones.
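The three regression losses can be compared on a small made-up example that contains one outlier. The Huber threshold delta = 1.0 is an assumed value, and the data are purely illustrative.

```python
import numpy as np

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def huber(y_true, y_pred, delta=1.0):
    """Quadratic for |error| <= delta, linear beyond it."""
    err = np.abs(y_true - y_pred)
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return float(np.mean(np.where(err <= delta, quadratic, linear)))

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.5, 14.0])   # last prediction is an outlier

print(mse(y_true, y_pred), mae(y_true, y_pred), huber(y_true, y_pred))
```

The single outlier dominates MSE, while MAE and Huber grow only linearly with it.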

2. Classification
https://towardsdatascience.com/cross-entropy-for-classification-d98e7f974451
https://keras.io/api/losses/

2.1 Cross Entropy:
Cross-entropy loss, or log loss, measures the performance of a classification model whose
output is a probability value between 0 and 1. Cross-entropy loss increases as the
predicted probability diverges from the actual label. In this binary form it applies only
to binary classification.
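A short sketch of binary cross-entropy shows how the loss grows as predictions move away from the labels. The clipping constant eps is an assumption added to avoid log(0); the labels and probabilities are made up.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean log loss for labels in {0, 1} and predicted probabilities."""
    p = np.clip(y_prob, eps, 1.0 - eps)   # guard against log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

y_true = np.array([1.0, 0.0])
good = binary_cross_entropy(y_true, np.array([0.9, 0.1]))   # confident and right
bad = binary_cross_entropy(y_true, np.array([0.1, 0.9]))    # confident and wrong
print(good, bad)
```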

2.2 Multi Class Cross Entropy Loss:


How to train neural network on back propagation

how to train multi layer neural network


Multilayer networks solve the classification problem for non linear sets by employing
hidden layers, whose neurons are not directly connected to the output. The additional
hidden layers can be interpreted geometrically as additional hyper-planes, which enhance
the separation capacity of the network.

ANN

Create Artificial Neural Network using Weights initialization Tricks


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = pd.read_csv('bank.csv')

x = data.iloc[:, 3:13]
y = data.iloc[:,13]

geography = pd.get_dummies(x['Geography'], drop_first = True)


gender = pd.get_dummies(x['Gender'], drop_first = True)

data.head()

   RowNumber  CustomerId   Surname  CreditScore Geography  Gender  Age  \
0          1    15634602  Hargrave          619    France  Female   42
1          2    15647311      Hill          608     Spain  Female   41
2          3    15619304      Onio          502    France  Female   42
3          4    15701354      Boni          699    France  Female   39
4          5    15737888  Mitchell          850     Spain  Female   43

   Tenure    Balance  NumOfProducts  HasCrCard  IsActiveMember  \
0       2       0.00              1          1               1
1       1   83807.86              1          0               1
2       8  159660.80              3          1               0
3       1       0.00              2          0               0
4       2  125510.82              1          1               1

   EstimatedSalary  Exited
0        101348.88       1
1        112542.58       0
2        113931.57       1
3         93826.63       0
4         79084.10       0

geography.head()

Germany Spain
0 0 0
1 0 1
2 0 0
3 0 0
4 0 1

gender.head()

Male
0 0
1 0
2 0
3 0
4 0

x = pd.concat([x,geography,gender],axis = 1)

x.head()

   CreditScore Geography  Gender  Age  Tenure    Balance  NumOfProducts  \
0          619    France  Female   42       2       0.00              1
1          608     Spain  Female   41       1   83807.86              1
2          502    France  Female   42       8  159660.80              3
3          699    France  Female   39       1       0.00              2
4          850     Spain  Female   43       2  125510.82              1

   HasCrCard  IsActiveMember  EstimatedSalary  Germany  Spain  Male
0          1               1        101348.88        0      0     0
1          0               1        112542.58        0      1     0
2          1               0        113931.57        0      0     0
3          0               0         93826.63        0      0     0
4          1               1         79084.10        0      1     0

x = x.drop(['Geography','Gender'], axis = 1)

x.head()

CreditScore Age Tenure Balance NumOfProducts HasCrCard \


0 619 42 2 0.00 1 1
1 608 41 1 83807.86 1 0
2 502 42 8 159660.80 3 1
3 699 39 1 0.00 2 0
4 850 43 2 125510.82 1 1

IsActiveMember EstimatedSalary Germany Spain Male


0 1 101348.88 0 0 0
1 1 112542.58 0 1 0
2 0 113931.57 0 0 0
3 0 93826.63 0 0 0
4 1 79084.10 0 1 0

from sklearn.model_selection import train_test_split


x_train,x_test,y_train,y_test = train_test_split(x,y, test_size = 0.3,
random_state = 0 )

from sklearn.preprocessing import StandardScaler


sc = StandardScaler()

x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)

importing deep learning libraries


import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LeakyReLU,PReLU,ELU
from keras.layers import Dropout
initializing the ANN
classifier = Sequential()

Adding the input layer and 1st Hidden layer


classifier.add(Dense(units = 6, kernel_initializer = 'he_uniform',
activation = 'relu', input_dim = 11))

• In the code above, units = 6 is the number of neurons in the hidden layer.

• kernel_initializer = 'he_uniform' is the weight initialization technique.

• activation = 'relu' selects the ReLU activation function.

• input_dim = 11 is the number of input features.

Adding 2nd Hidden Layer


classifier.add(Dense(units = 6, kernel_initializer = 'he_uniform',
activation = 'relu'))

Adding output layer


classifier.add(Dense(units = 1, kernel_initializer = 'glorot_uniform',
activation = 'sigmoid'))

• in above units = 1 indicates one output value

compiling ANN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy',
metrics = ['accuracy'])

• optimizer = 'adam' selects the optimization algorithm.

• loss = 'binary_crossentropy' is used because the output is 0 or 1. For multiple output
classes a different loss function is used; see the Keras documentation for more
information.

Fitting the ANN to the training data


batch_size = number of data points processed per weight update
epochs = number of full passes over the training data
history = classifier.fit(x_train, y_train, validation_split = 0.33,
batch_size = 10, epochs = 100)

fit() returns a History object, which has no summary() method, so the summary is taken
from the model itself:
classifier.summary()

y_pred = classifier.predict(x_test)
y_pred = (y_pred > 0.5)


from sklearn.metrics import confusion_matrix


cm = confusion_matrix(y_test,y_pred)

# accuracy (scikit-learn expects the true labels first)
from sklearn.metrics import accuracy_score
score = accuracy_score(y_test, y_pred)

score

0.8596666666666667

How to select hidden layers and hidden neurons in an ANN using Keras Tuner
link: https://www.tensorflow.org/tutorials/keras/keras_tuner

Hyperparameters
• How many hidden layers should we have?

• How many neurons should we have in each hidden layer?

• What learning rate should we use?
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow
from tensorflow import keras
from tensorflow.keras import layers
from kerastuner.tuners import RandomSearch

df = pd.read_csv('Real.csv')

df.head()

T TM Tm SLP H VV V VM PM 2.5
0 7.4 9.8 4.8 1017.6 93.0 0.5 4.3 9.4 219.720833
1 7.8 12.7 4.4 1018.5 87.0 0.6 4.4 11.1 182.187500
2 6.7 13.4 2.4 1019.4 82.0 0.6 4.8 11.1 154.037500
3 8.6 15.5 3.3 1018.7 72.0 0.8 8.1 20.6 223.208333
4 12.4 20.9 4.4 1017.3 61.0 1.3 8.7 22.2 200.645833

x = df.iloc[:,:-1]
y = df.iloc[:,-1]

def model_builder(hp):
    # Regression model for the PM 2.5 target in the last column of Real.csv
    model = keras.Sequential()
    model.add(keras.Input(shape=(x.shape[1],)))  # one input per feature column
    # Tune the number of units in the first Dense layer
    # Choose an optimal value between 32-512
    hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
    model.add(keras.layers.Dense(units=hp_units, activation='relu'))
    model.add(keras.layers.Dense(1))  # single continuous output

    # Tune the learning rate for the optimizer
    # Choose an optimal value from 0.01, 0.001, or 0.0001
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])

    model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  loss='mean_absolute_error',
                  metrics=['mean_absolute_error'])  # logged as val_mean_absolute_error

    return model

tuner = RandomSearch(model_builder,
                     objective='val_mean_absolute_error',
                     max_trials = 5,
                     executions_per_trial = 3,
                     directory='projects',
                     project_name='intro_to_kt')

CNN (Convolutional neural network)


https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53
• Used for inputs in the form of images and for live video processing.

cerebral cortex:
The cerebral cortex is the largest site of neural integration in the central nervous system. It
plays a key role in attention, perception, awareness, thought, memory, language, and
consciousness.
visual cortex:
The visual cortex of the brain is the area of the cerebral cortex that processes visual
information. It is located in the occipital lobe. Sensory input originating from the eyes
travels through the lateral geniculate nucleus in the thalamus and then reaches the visual
cortex.
It has different areas, such as V1, V2, V3, V4, and V5, each of which plays an important
role.
• V1 is responsible for finding the edges of the image; its output goes on to V2, V3, V4,
and V5.

• V2 is responsible for orientation, spatial frequency, and colour.

• V3 seems to play a role in processing motion.

• V4 is mainly used for face recognition.

• V5 handles perceiving motion and processing complex stimuli.

• V6 handles visual stimuli associated with self-motion and wide-field stimulation.

Each area has a specific function for extracting information from images. A CNN
implements analogous layers to process images.
What is convolution:
In mathematics (in particular, functional analysis), convolution is a mathematical
operation on two functions (f and g) that produces a third function (f ∗ g) that expresses
how the shape of one is modified by the other. The term convolution refers to both the
result function and to the process of computing it.

Example: in a CNN, the image is multiplied element-wise with a filter (kernel) as the
filter slides across it, and the products are summed to extract information. The filter
may be, for example, a horizontal or vertical edge filter.
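The sliding multiply-and-sum can be sketched in NumPy. Note that CNN layers typically compute cross-correlation (the kernel is not flipped), which is what this sketch does; the 6x6 image and the vertical edge kernel are made-up examples, and the function name is hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Illustrative 6x6 image: bright left half, dark right half
image = np.zeros((6, 6))
image[:, :3] = 10.0

# Vertical edge filter
kernel = np.array([[1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0]])

edges = conv2d_valid(image, kernel)
print(edges)
```

The output is 4x4, smaller than the 6x6 input, and is non-zero only where the window straddles the bright-to-dark boundary.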

What is Padding & Stride:


Padding is a term relevant to convolutional neural networks: it refers to the number of
pixels added to an image when it is being processed by the kernel of a CNN. For example,
with zero padding every pixel value that is added has the value zero.
Stride denotes how many pixels the filter moves at each step of the convolution. By
default it is one. With a stride of 1 and no padding, the output is smaller than the
input. To keep the output the same size as the input, we use padding: adding zeros to the
input matrix symmetrically.

Use of padding:
Padding is simply the process of adding layers of zeros around our input images to avoid
the shrinking described above. If p is the number of layers of zeros added to the border
of the image, then our (n x n) image becomes an (n + 2p) x (n + 2p) image after padding.
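The effect of padding and stride on the output size follows the standard formula floor((n + 2p - f) / s) + 1 for an n x n input, an f x f filter, padding p, and stride s. A small helper (the name is hypothetical) makes the cases above concrete.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output width for an n-wide input, f-wide filter, padding p, stride s."""
    return (n + 2 * p - f) // s + 1

print(conv_output_size(6, 3))          # no padding: the output shrinks
print(conv_output_size(6, 3, p=1))     # padding 1 keeps the input size
print(conv_output_size(7, 3, s=2))     # stride 2 roughly halves the size
```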

Max Pooling:
Max pooling is a sample-based discretization process. The objective is to down-sample an
input representation (image, hidden-layer output matrix, etc.), reducing its dimensionality
and allowing for assumptions to be made about features contained in the sub-regions
binned.

Use of max pooling:


Max pooling selects the brighter pixels from the image. It is useful when the background
of the image is dark and we are interested only in the lighter pixels of the image.
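Max pooling can be sketched as taking the largest value in each window. The 2x2 window with stride 2 is a common configuration and is an assumption here, as are the input values and the function name.

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    """Down-sample a 2-D array by keeping the maximum of each window."""
    h, w = x.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = x[r:r + size, c:c + size].max()
    return out

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 1],
              [3, 4, 0, 8]], dtype=float)

pooled = max_pool2d(x)
print(pooled)
```

The 4x4 input is reduced to 2x2, keeping only the brightest value of each block.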
Data Augmentation CNN:
link: https://machinelearningmastery.com/how-to-configure-image-data-augmentation-when-training-deep-learning-neural-networks/
Data augmentation is a technique to artificially create new training data from existing
training data. This is done by applying domain-specific techniques to examples from the
training data to create new and different training examples.
Creating a data set using Data Augmentation
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import keras
from keras.preprocessing.image import (ImageDataGenerator,
    array_to_img, img_to_array, load_img)

datagen = ImageDataGenerator(rotation_range = 40, width_shift_range=0.2,
    height_shift_range=0.2, zoom_range=0.2,
    channel_shift_range=0.2, horizontal_flip= True,
    fill_mode='nearest')
img = load_img('hero1.png')  # load image
# img.show()  # see image
x = img_to_array(img)
x = x.reshape((1,) + x.shape)  # add a batch dimension

# write augmented copies to the Preview folder, stopping after 21 images
i = 0
for batch in datagen.flow(x, batch_size = 1, save_to_dir = 'Preview',
                          save_prefix = 'cat', save_format = 'jpeg'):
    i += 1
    if i > 20:
        break

Creating CNN Model and Optimize using keras tuner


import pandas as pd
import numpy as np
import os
import tensorflow as tf
import cv2
import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from tensorflow.keras import layers
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import BatchNormalization
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from kerastuner.tuners import RandomSearch
import kerastuner as kt
import PIL
import theano

link: https://vijayabhaskar96.medium.com/tutorial-on-keras-flow-from-dataframe-1fd4493d237c (loading image data from different files)
link: https://github.com/keras-team/keras-tuner
link: https://keras-team.github.io/keras-tune
# loading the images
df = pd.read_csv(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\labels.csv', dtype=str)
# data type must be as str,list or tuple

datagen = ImageDataGenerator(rescale=1./255., validation_split=0.25)
data_train = datagen.flow_from_dataframe(
    dataframe=df,
    directory=r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images',
    x_col="id",
    y_col="label",
    subset='training',
    batch_size=32,
    target_size=(128, 128),  # resize all images to a single size
    color_mode='rgb',  # must be specified for colour images
    seed=42,
    shuffle=True,
    class_mode="categorical")

Found 1235 validated image filenames belonging to 2 classes.

data_test = datagen.flow_from_dataframe(
    dataframe=df,
    directory=r"C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images",
    x_col="id",
    y_col="label",
    subset="validation",
    batch_size=32,
    target_size=(128, 128),
    color_mode='rgb',
    seed=42,
    shuffle=True,
    class_mode="categorical")

Found 411 validated image filenames belonging to 2 classes.

Building Model

create model
The model type that we will be using is Sequential. Sequential is the easiest way to build a
model in Keras. It allows you to build a model layer by layer.

model = Sequential()
adding layers
We use the ‘add()’ function to add layers to our model.
• Our first 2 layers are Conv2D layers. These are convolution layers that deal with our
input images, which are seen as 2-dimensional matrices.

• 64 in the first layer and 32 in the second layer are the number of filters (output
channels) in each layer. This number can be adjusted to be higher or lower, depending
on the size of the dataset. In our case, 64 and 32 work well, so we will stick with
this for now.

• Kernel size is the size of the filter matrix for our convolution. So a kernel size of 3
means we will have a 3x3 filter matrix.

• Activation is the activation function for the layer. The activation function we use
for our first 2 layers is ReLU, or Rectified Linear Activation. This activation
function has been shown to work well in neural networks.

• Our first layer also takes in an input shape. This is the shape of each input image,
128,128,3, with the 3 signifying that the images are RGB colour images.

• In between the Conv2D layers and the dense layer, there is a ‘Flatten’ layer. Flatten
serves as a connection between the convolution and dense layers.

• ‘Dense’ is the layer type we use for our output layer. Dense is a standard layer
type that is used in many cases for neural networks.

• We have 2 nodes in our output layer, one for each class (emergency vehicle or not).
The activation is ‘softmax’. Softmax makes the outputs sum to 1 so they can be
interpreted as probabilities. The model then makes its prediction based on which
option has the highest probability.

model.add(Conv2D(64,padding='valid', kernel_size=3, activation='relu',


input_shape=(128,128,3)))
model.add(Conv2D(32, kernel_size=3, activation='relu'))
model.add(Flatten())
model.add(Dense(2, activation='softmax'))

compile model using accuracy to measure model performance


model.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])

train the model


STEP_SIZE_TRAIN = data_train.n//data_train.batch_size
STEP_SIZE_VALID = data_test.n//data_test.batch_size
model.fit(x = data_train,steps_per_epoch=STEP_SIZE_TRAIN,
validation_data = data_test,validation_steps = STEP_SIZE_VALID,
epochs=10
)

Epoch 1/10
38/38 [==============================] - 119s 3s/step - loss: 2.0871 -
accuracy: 0.5894 - val_loss: 0.5837 - val_accuracy: 0.7005
Epoch 2/10
38/38 [==============================] - 33s 858ms/step - loss: 0.4566
- accuracy: 0.7858 - val_loss: 0.6593 - val_accuracy: 0.7448
Epoch 3/10
38/38 [==============================] - 34s 908ms/step - loss: 0.2640
- accuracy: 0.9034 - val_loss: 0.5637 - val_accuracy: 0.7188
Epoch 4/10
38/38 [==============================] - 38s 989ms/step - loss: 0.1519
- accuracy: 0.9589 - val_loss: 0.6387 - val_accuracy: 0.7396
Epoch 5/10
38/38 [==============================] - 33s 864ms/step - loss: 0.0367
- accuracy: 0.9960 - val_loss: 0.7001 - val_accuracy: 0.7396
Epoch 6/10
38/38 [==============================] - 34s 903ms/step - loss: 0.0156
- accuracy: 0.9971 - val_loss: 0.7280 - val_accuracy: 0.7552
Epoch 7/10
38/38 [==============================] - 34s 887ms/step - loss: 0.0274
- accuracy: 0.9963 - val_loss: 0.8293 - val_accuracy: 0.7474
Epoch 8/10
38/38 [==============================] - 31s 812ms/step - loss: 0.0261
- accuracy: 0.9963 - val_loss: 0.7220 - val_accuracy: 0.7708
Epoch 9/10
38/38 [==============================] - 31s 806ms/step - loss: 0.0140
- accuracy: 0.9968 - val_loss: 0.7128 - val_accuracy: 0.7370
Epoch 10/10
38/38 [==============================] - 31s 810ms/step - loss: 0.0162
- accuracy: 0.9989 - val_loss: 0.6729 - val_accuracy: 0.7578

<tensorflow.python.keras.callbacks.History at 0x15435a28d60>

import cv2

img = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\vehicle.jpg',1)
width = 128
height = 128
dim = (width, height)
resized = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)

x_val = np.array(resized) / 255

x_val = x_val.reshape(-1, 128, 128, 3)

predicting whether the given image is an emergency vehicle or not


pred = model.predict(x_val)

print(pred)
[[0.99537885 0.0046212 ]]

img1 = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\car.jpg',1)
width = 128
height = 128
dim = (width, height)
resized1 = cv2.resize(img1, dim, interpolation = cv2.INTER_AREA)

x_val1 = np.array(resized1) / 255


x_val1 = x_val1.reshape(-1, 128, 128, 3)

pred1 = model.predict(x_val1)

print(pred1)

[[0.9767471 0.02325287]]

img2 = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\car.jpg',1)
width = 128
height = 128
dim = (width, height)
resized2 = cv2.resize(img2, dim, interpolation = cv2.INTER_AREA)

x_val2 = np.array(resized2) / 255


x_val2 = x_val2.reshape(-1, 128, 128, 3)

pred2 = model.predict(x_val2)

print(pred2)

[[0.9767471 0.02325287]]

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 126, 126, 64) 1792
_________________________________________________________________
conv2d_1 (Conv2D) (None, 124, 124, 32) 18464
_________________________________________________________________
flatten (Flatten) (None, 492032) 0
_________________________________________________________________
dense (Dense) (None, 2) 984066
=================================================================
Total params: 1,004,322
Trainable params: 1,004,322
Non-trainable params: 0
_________________________________________________________________
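
The output shapes and parameter counts in the summary above can be checked by hand. The sketch below does that arithmetic for the two-convolution model (128x128x3 input, 3x3 kernels, 'valid' padding, stride 1). The helper names `conv2d_output_size`, `conv2d_params` and `dense_params` are made up for this illustration and are not Keras functions.

```python
def conv2d_output_size(size, kernel, padding="valid", stride=1):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return -(-size // stride)          # ceil(size / stride)
    return (size - kernel) // stride + 1   # 'valid': no padding

def conv2d_params(kernel, in_channels, filters):
    """kernel*kernel*in_channels weights plus one bias, per filter."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    """One weight per input plus one bias, per unit."""
    return (inputs + 1) * units

side1 = conv2d_output_size(128, 3)    # first Conv2D:  128 -> 126
side2 = conv2d_output_size(side1, 3)  # second Conv2D: 126 -> 124
flat = side2 * side2 * 32             # Flatten: 124 * 124 * 32 values

p_conv1 = conv2d_params(3, 3, 64)     # 3 input channels, 64 filters
p_conv2 = conv2d_params(3, 64, 32)    # 64 input channels, 32 filters
p_dense = dense_params(flat, 2)       # 2 output nodes
total = p_conv1 + p_conv2 + p_dense

print(side1, side2, flat)
print(p_conv1, p_conv2, p_dense, total)
```

Almost all of the parameters sit in the dense layer, because Flatten hands it 492,032 inputs. Pooling layers, as used in the later models, shrink the feature maps before flattening.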
x = tf.random.normal((4, 28, 28, 3))

x.shape

TensorShape([4, 28, 28, 3])

input_shape = (4, 28, 28, 3)

input_shape[1:]

(28, 28, 3)

building model with different layers


model = Sequential()

model.add(Conv2D(16, padding='same', kernel_size=3, activation='relu',
                 input_shape=(128,128,3)))

model.add(Conv2D(16, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.25))
model.add(Conv2D(32, 3, activation='relu'))
model.add(Conv2D(64, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.25))
model.add(Flatten())
model.add(Dense(units=128, activation='relu'))
model.add(Dropout(rate=0.25))
model.add(Dense(2, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])

STEP_SIZE_TRAIN = data_train.n//data_train.batch_size
STEP_SIZE_VALID = data_test.n//data_test.batch_size
model.fit(x = data_train,steps_per_epoch=STEP_SIZE_TRAIN,
validation_data = data_test,validation_steps = STEP_SIZE_VALID,
epochs=10
)

Epoch 1/10
38/38 [==============================] - 26s 666ms/step - loss: 0.9023
- accuracy: 0.5405 - val_loss: 0.5957 - val_accuracy: 0.6198
Epoch 2/10
38/38 [==============================] - 25s 668ms/step - loss: 0.5373
- accuracy: 0.7355 - val_loss: 0.5600 - val_accuracy: 0.7422
Epoch 3/10
38/38 [==============================] - 25s 656ms/step - loss: 0.4629
- accuracy: 0.7889 - val_loss: 0.4603 - val_accuracy: 0.7865
Epoch 4/10
38/38 [==============================] - 27s 704ms/step - loss: 0.4224
- accuracy: 0.8205 - val_loss: 0.4932 - val_accuracy: 0.7682
Epoch 5/10
38/38 [==============================] - 26s 675ms/step - loss: 0.4203
- accuracy: 0.8062 - val_loss: 0.4751 - val_accuracy: 0.7865
Epoch 6/10
38/38 [==============================] - 26s 677ms/step - loss: 0.3727
- accuracy: 0.8394 - val_loss: 0.4425 - val_accuracy: 0.8047
Epoch 7/10
38/38 [==============================] - 28s 739ms/step - loss: 0.2902
- accuracy: 0.8834 - val_loss: 0.5390 - val_accuracy: 0.7995
Epoch 8/10
38/38 [==============================] - 27s 702ms/step - loss: 0.2439
- accuracy: 0.8974 - val_loss: 0.5148 - val_accuracy: 0.8021
Epoch 9/10
38/38 [==============================] - 26s 688ms/step - loss: 0.2069
- accuracy: 0.9188 - val_loss: 0.5248 - val_accuracy: 0.7995
Epoch 10/10
38/38 [==============================] - 26s 672ms/step - loss: 0.1792
- accuracy: 0.9332 - val_loss: 0.6022 - val_accuracy: 0.7865

<tensorflow.python.keras.callbacks.History at 0x1e7a1200f70>

img = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\train.jpg',1)
width = 128
height = 128
dim = (width, height)
resized = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
x_val = np.array(resized) / 255
x_val = x_val.reshape(-1, 128, 128, 3)

pred = model.predict(x_val)

print(pred)

[[0.4878648 0.5121352]]

building model
model = Sequential()
model.add(Conv2D(32,padding='same', kernel_size=3, activation='relu',
input_shape=(128,128,3)))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=3,padding='same', activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization())
model.add(Flatten())
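model.add(Dense(64, activation='relu'))
# Dropout is applied before the output layer: placed after softmax it would randomly
# zero and rescale the class probabilities during training.
# (The run logged below had Dropout as the last layer.)
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))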

model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])

STEP_SIZE_TRAIN = data_train.n//data_train.batch_size
STEP_SIZE_VALID = data_test.n//data_test.batch_size
model.fit(x = data_train,steps_per_epoch=STEP_SIZE_TRAIN,
validation_data = data_test,validation_steps = STEP_SIZE_VALID,
epochs=10
)

Epoch 1/10
38/38 [==============================] - 22s 525ms/step - loss: 6.4985
- accuracy: 0.5817 - val_loss: 6.2706 - val_accuracy: 0.5911
Epoch 2/10
38/38 [==============================] - 20s 521ms/step - loss: 6.2590
- accuracy: 0.6425 - val_loss: 5.9911 - val_accuracy: 0.6094
Epoch 3/10
38/38 [==============================] - 20s 516ms/step - loss: 6.5250
- accuracy: 0.6183 - val_loss: 6.1508 - val_accuracy: 0.5990
Epoch 4/10
38/38 [==============================] - 19s 509ms/step - loss: 6.1739
- accuracy: 0.6448 - val_loss: 6.1508 - val_accuracy: 0.5990
Epoch 5/10
38/38 [==============================] - 20s 531ms/step - loss: 5.6002
- accuracy: 0.6697 - val_loss: 6.1908 - val_accuracy: 0.5964
Epoch 6/10
38/38 [==============================] - 20s 532ms/step - loss: 6.4007
- accuracy: 0.6061 - val_loss: 6.1982 - val_accuracy: 0.5885
Epoch 7/10
38/38 [==============================] - 19s 509ms/step - loss: 6.4062
- accuracy: 0.5876 - val_loss: 5.9796 - val_accuracy: 0.5938
Epoch 8/10
38/38 [==============================] - 21s 544ms/step - loss: 6.0466
- accuracy: 0.6506 - val_loss: 4.9934 - val_accuracy: 0.6562
Epoch 9/10
38/38 [==============================] - 19s 510ms/step - loss: 6.2585
- accuracy: 0.6384 - val_loss: 5.5318 - val_accuracy: 0.6276
Epoch 10/10
38/38 [==============================] - 20s 533ms/step - loss: 6.5785
- accuracy: 0.6166 - val_loss: 4.4664 - val_accuracy: 0.6953

<tensorflow.python.keras.callbacks.History at 0x1e7a16a28e0>

img = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\vehicle.jpg',1)
width = 128
height = 128
dim = (width, height)
resized = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
x_val = np.array(resized) / 255
x_val = x_val.reshape(-1, 128, 128, 3)

pred = model.predict(x_val)

print(pred)

[[1. 0.]]

imge = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\train.jpg',1)
width = 128
height = 128
dim = (width, height)
resize = cv2.resize(imge, dim, interpolation = cv2.INTER_AREA)
x_vale = np.array(resize) / 255
x_vale = x_vale.reshape(-1, 128, 128, 3)
pred = model.predict(x_vale)
print(pred)

[[1.0000000e+00 1.5728423e-38]]

cnn using keras tuner


import pandas as pd
import numpy as np
import os
import tensorflow as tf
import cv2
import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from tensorflow.keras import layers
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import BatchNormalization
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from kerastuner.tuners import RandomSearch
import kerastuner as kt
import PIL
import theano

df = pd.read_csv(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\labels.csv',dtype=str)

datagen=ImageDataGenerator(rescale=1./255.,validation_split=0.25)
data_train = datagen.flow_from_dataframe(dataframe= df,
    directory= r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images',
    x_col = "id",
    y_col = "label",
    subset = 'training',
    batch_size=32,
    target_size=(64, 64), # resize all images to a single size
    color_mode = 'rgb', # must be specified for color images
    seed=42,
    shuffle=True,
    class_mode="categorical")

data_test=datagen.flow_from_dataframe(
dataframe= df,
directory= r"C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images",
x_col="id",
y_col="label",
subset="validation",
batch_size=32,
target_size=(64, 64),
color_mode = 'rgb',
seed=42,
shuffle=True,
class_mode="categorical")

Found 1235 validated image filenames belonging to 2 classes.


Found 411 validated image filenames belonging to 2 classes.
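
The image counts reported above determine how many batches make up one epoch, which is what `steps_per_epoch` and `validation_steps` are set to in the `model.fit` calls. A small sketch of that arithmetic, using the counts and batch size from this notebook (the variable names are illustrative):

```python
train_images = 1235   # "Found 1235 validated image filenames" (training subset)
valid_images = 411    # "Found 411 validated image filenames" (validation subset)
batch_size = 32

# Integer division counts only the full batches.
steps_train = train_images // batch_size
steps_valid = valid_images // batch_size

# Images that do not fill a complete batch are not counted in the step total.
leftover_train = train_images % batch_size
leftover_valid = valid_images % batch_size

print(steps_train, steps_valid)
print(leftover_train, leftover_valid)
```

The training value matches the `38/38` shown in the progress bars of the training logs.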

model = Sequential()
model.add(Conv2D(16,padding='same', kernel_size=3, activation='relu',
input_shape=(64,64,3)) )

model.add(Conv2D(16, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.25))
model.add(Conv2D(32, 3, activation='relu'))
model.add(Conv2D(64, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.25))
model.add(Flatten())
model.add(Dense(units=128, activation='relu'))
model.add(Dropout(rate=0.25))
model.add(Dense(2, activation='softmax'))

model.compile(
optimizer=keras.optimizers.Adam(1e-3),
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)

STEP_SIZE_TRAIN = data_train.n//data_train.batch_size
STEP_SIZE_VALID = data_test.n//data_test.batch_size
model.fit(x = data_train,steps_per_epoch=STEP_SIZE_TRAIN,
validation_data = data_test,validation_steps = STEP_SIZE_VALID,
epochs=10
)

Epoch 1/10

----------------------------------------------------------------------
-----
InvalidArgumentError Traceback (most recent call
last)
<ipython-input-83-f39f8a4579a2> in <module>
1 STEP_SIZE_TRAIN = data_train.n//data_train.batch_size
2 STEP_SIZE_VALID = data_test.n//data_test.batch_size
----> 3 model.fit(x = data_train,steps_per_epoch=STEP_SIZE_TRAIN,
validation_data = data_test,validation_steps = STEP_SIZE_VALID,
4 epochs=10
5 )

~\anaconda3\lib\site-packages\tensorflow\python\keras\engine\
training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks,
validation_split, validation_data, shuffle, class_weight,
sample_weight, initial_epoch, steps_per_epoch, validation_steps,
validation_batch_size, validation_freq, max_queue_size, workers,
use_multiprocessing)
1098 _r=1):
1099 callbacks.on_train_batch_begin(step)
-> 1100 tmp_logs = self.train_function(iterator)
1101 if data_handler.should_sync:
1102 context.async_wait()

~\anaconda3\lib\site-packages\tensorflow\python\eager\def_function.py
in __call__(self, *args, **kwds)
826 tracing_count = self.experimental_get_tracing_count()
827 with trace.Trace(self._name) as tm:
--> 828 result = self._call(*args, **kwds)
829 compiler = "xla" if self._experimental_compile else
"nonXla"
830 new_tracing_count =
self.experimental_get_tracing_count()

~\anaconda3\lib\site-packages\tensorflow\python\eager\def_function.py
in _call(self, *args, **kwds)
886 # Lifting succeeded, so variables are initialized and
we can run the
887 # stateless function.
--> 888 return self._stateless_fn(*args, **kwds)
889 else:
890 _, _, _, filtered_flat_args = \

~\anaconda3\lib\site-packages\tensorflow\python\eager\function.py in
__call__(self, *args, **kwargs)
2940 (graph_function,
2941 filtered_flat_args) = self._maybe_define_function(args,
kwargs)
-> 2942 return graph_function._call_flat(
2943 filtered_flat_args,
captured_inputs=graph_function.captured_inputs) # pylint:
disable=protected-access
2944

~\anaconda3\lib\site-packages\tensorflow\python\eager\function.py in
_call_flat(self, args, captured_inputs, cancellation_manager)
1916 and executing_eagerly):
1917 # No tape is watching; skip to running the function.
-> 1918 return
self._build_call_outputs(self._inference_function.call(
1919 ctx, args,
cancellation_manager=cancellation_manager))
1920 forward_backward =
self._select_forward_and_backward_functions(

~\anaconda3\lib\site-packages\tensorflow\python\eager\function.py in
call(self, ctx, args, cancellation_manager)
553 with _InterpolateFunctionError(self):
554 if cancellation_manager is None:
--> 555 outputs = execute.execute(
556 str(self.signature.name),
557 num_outputs=self._num_outputs,

~\anaconda3\lib\site-packages\tensorflow\python\eager\execute.py in
quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
57 try:
58 ctx.ensure_initialized()
---> 59 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle,
device_name, op_name,
60 inputs, attrs,
num_outputs)
61 except core._NotOkStatusException as e:

InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [32,2] and labels shape [64]
[[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at <ipython-input-83-f39f8a4579a2>:3) ]] [Op:__inference_train_function_14342]

Function call stack:
train_function
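
The error above comes from a mismatch between the label format and the loss. The generators use class_mode="categorical", which yields one-hot labels of shape (32, 2) per batch, while 'sparse_categorical_crossentropy' expects one integer class index per image. The `[64]` in the message is consistent with 32 x 2 one-hot values being treated as 64 labels. Two ways to make them agree are to use loss='categorical_crossentropy', as in the earlier models, or to load the data with class_mode="sparse". The pure-Python sketch below (function names chosen for illustration, not the Keras implementation) shows that the two losses compute the same value once each gets the label format it expects.

```python
import math

def categorical_crossentropy(one_hot, probs):
    """Cross-entropy for a one-hot label such as [0, 1]."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def sparse_categorical_crossentropy(class_index, probs):
    """Cross-entropy for an integer label such as 1."""
    return -math.log(probs[class_index])

probs = [0.2, 0.8]                      # hypothetical softmax output for one image
one_hot_label = [0, 1]                  # format produced by class_mode="categorical"
sparse_label = one_hot_label.index(1)   # equivalent integer label

loss_one_hot = categorical_crossentropy(one_hot_label, probs)
loss_sparse = sparse_categorical_crossentropy(sparse_label, probs)

# Shape bookkeeping behind the error message.
batch_size, num_classes = 32, 2
flattened_one_hot_values = batch_size * num_classes

print(loss_one_hot, loss_sparse)
print(flattened_one_hot_values)
```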

TensorBoard: TensorFlow's visualization toolkit


link: https://www.tensorflow.org/tensorboard/get_started
TensorBoard provides the visualization and tooling needed for machine learning
experimentation:
• Tracking and visualizing metrics such as loss and accuracy

• Visualizing the model graph (ops and layers)

• Viewing histograms of weights, biases, or other tensors as they change over time

• Projecting embeddings to a lower dimensional space

• Displaying images, text, and audio

TRANSFER LEARNING
Transfer learning is a machine learning method where a model developed for a task is
reused as the starting point for a model on a second task.
It is a popular approach in deep learning, where pre-trained models are used as the starting
point for computer vision and natural language processing tasks. Developing neural network
models for these problems from scratch requires vast compute and time resources, and
pre-trained models provide large jumps in skill on related problems.
link: https://www.analyticsvidhya.com/blog/2020/08/top-4-pre-trained-models-for-image-classification-with-python-code/
link: https://keras.io/api/applications/

import pandas as pd
import numpy as np
import os
import tensorflow as tf
import cv2
import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from tensorflow.keras import layers
from keras.layers import Dense, Dropout, Flatten, Lambda
from keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import BatchNormalization
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from kerastuner.tuners import RandomSearch
import kerastuner as kt
import PIL
import theano
import glob

from tensorflow.keras.applications.resnet50 import ResNet50


from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions

df = pd.read_csv(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\labels.csv',dtype=str)
datagen=ImageDataGenerator(rescale=1./255.,validation_split=0.25)
data_train = datagen.flow_from_dataframe(dataframe= df,
    directory= r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images',
    x_col = "id",
    y_col = "label",
    subset = 'training',
    batch_size=32,
    target_size=(64, 64), # resize all images to a single size
    color_mode = 'rgb', # must be specified for color images
    seed=42,
    shuffle=True,
    class_mode="categorical")

data_test=datagen.flow_from_dataframe(
dataframe= df,
directory= r"C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images",
x_col="id",
y_col="label",
subset="validation",
batch_size=32,
target_size=(64, 64),
color_mode = 'rgb',
seed=42,
shuffle=True,
class_mode="categorical")

Found 1235 validated image filenames belonging to 2 classes.


Found 411 validated image filenames belonging to 2 classes.

base_model = ResNet50(input_shape=(64, 64, 3), weights='imagenet', include_top=False)

# Freeze the pre-trained layers so that only the new output layer is trained
for layer in base_model.layers:
    layer.trainable = False

x = layers.Flatten()(base_model.output)

# NOTE: a single sigmoid unit expects 0/1 labels (class_mode="binary"), while the
# generators above yield one-hot labels (class_mode="categorical")
x = layers.Dense(1, activation='sigmoid')(x)

model = tf.keras.models.Model(base_model.input, x)

model.compile(optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.0001),
              loss = 'binary_crossentropy', metrics = ['acc'])
STEP_SIZE_TRAIN = data_train.n//data_train.batch_size
STEP_SIZE_VALID = data_test.n//data_test.batch_size
vgghist = model.fit(x = data_train,steps_per_epoch=STEP_SIZE_TRAIN,
validation_data = data_test,validation_steps = STEP_SIZE_VALID,
epochs=10)

Epoch 1/10
38/38 [==============================] - 23s 437ms/step - loss: 0.7749
- acc: 0.5000 - val_loss: 0.7174 - val_acc: 0.5000
Epoch 2/10
38/38 [==============================] - 14s 376ms/step - loss: 0.7110
- acc: 0.5000 - val_loss: 0.6989 - val_acc: 0.5000
Epoch 3/10
38/38 [==============================] - 15s 383ms/step - loss: 0.6974
- acc: 0.5000 - val_loss: 0.6944 - val_acc: 0.5000
Epoch 4/10
38/38 [==============================] - 14s 381ms/step - loss: 0.6941
- acc: 0.5000 - val_loss: 0.6934 - val_acc: 0.5000
Epoch 5/10
38/38 [==============================] - 16s 414ms/step - loss: 0.6933
- acc: 0.5000 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 6/10
38/38 [==============================] - 14s 381ms/step - loss: 0.6932
- acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 7/10
38/38 [==============================] - 15s 383ms/step - loss: 0.6931
- acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 8/10
38/38 [==============================] - 15s 384ms/step - loss: 0.6931
- acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 9/10
38/38 [==============================] - 14s 378ms/step - loss: 0.6931
- acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 10/10
38/38 [==============================] - 14s 375ms/step - loss: 0.6931
- acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
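
The log above settles at a loss of 0.6931 and an accuracy of exactly 0.5000. One likely explanation (an interpretation, not something the notebook states) is the label format: the generators produce one-hot labels with class_mode="categorical", but the network ends in a single sigmoid unit. If the single prediction p is compared with both entries of a one-hot label, the best the model can do is p = 0.5, which gives a loss of ln 2 ≈ 0.6931, and exactly one of the two entries always matches. The pure-Python sketch below checks this arithmetic; the function names are illustrative and this is not the Keras code path.

```python
import math

def binary_crossentropy(target, p):
    """Binary cross-entropy for one target (0 or 1) and one predicted probability p."""
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def mean_loss_one_hot(p):
    """Mean loss when one prediction p is scored against both entries of a one-hot label."""
    return (binary_crossentropy(1, p) + binary_crossentropy(0, p)) / 2

def accuracy_one_hot(p):
    """Fraction of the two one-hot entries matched by the thresholded prediction."""
    rounded = 1 if p > 0.5 else 0
    return sum(rounded == target for target in (1, 0)) / 2

# Search a grid of predictions for the one with the lowest loss.
candidates = [i / 100 for i in range(1, 100)]
best_p = min(candidates, key=mean_loss_one_hot)

print(best_p, mean_loss_one_hot(best_p), math.log(2))
print([accuracy_one_hot(p) for p in (0.1, 0.5, 0.9)])
```

Under this reading, using class_mode="binary" with the sigmoid output, or a 2-unit softmax output with 'categorical_crossentropy', would give the model a target it can actually learn.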

imge = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\car.jpg',1)
width = 64
height = 64
dim = (width, height)
resize = cv2.resize(imge, dim, interpolation = cv2.INTER_AREA)
x_vale = np.array(resize) / 255
x_vale = x_vale.reshape(-1, 64, 64, 3)
pred = model.predict(x_vale)
print(pred)

[[0.49948993]]

RNN (Recurrent Neural Network)
RNNs work very well with sequence data, such as in NLP and time series analysis.
A recurrent neural network is a neural network that is specialized for processing a
sequence of data x(1), . . . , x(τ), with the time step index t ranging from 1 to τ. For
tasks that involve sequential inputs, such as speech and language, it is often better to use
RNNs. In an NLP problem, if you want to predict the next word in a sentence, it is important
to know the words before it. RNNs are called recurrent because they perform the same task
for every element of a sequence, with the output depending on the previous
computations. Another way to think about RNNs is that they have a “memory” which
captures information about what has been calculated so far.

The left side of the above diagram shows a notation of an RNN and on the right side an RNN
being unrolled (or unfolded) into a full network. By unrolling we mean that we write out
the network for the complete sequence. For example, if the sequence we care about is a
sentence of 3 words, the network would be unrolled into a 3-layer neural network, one
layer for each word.
Input: x(t) is taken as the input to the network at time step t. For example, x(1) could be a
one-hot vector corresponding to a word of a sentence.
Hidden state: h(t) represents the hidden state at time t and acts as the “memory” of the network.
h(t) is calculated from the current input and the previous time step’s hidden state:
h(t) = f(U x(t) + W h(t−1)). The function f is a non-linear transformation such as tanh or ReLU.
Weights: The RNN has input-to-hidden connections parameterized by a weight matrix U,
hidden-to-hidden recurrent connections parameterized by a weight matrix W, and
hidden-to-output connections parameterized by a weight matrix V. All of these weights (U, V, W)
are shared across time.
Output: o(t) is the output of the network at time step t. In the figure it is shown as a plain
arrow, but o(t) is often also passed through a non-linearity, especially when the network
contains further layers downstream.
Forward Pass
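
The forward pass described above can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions: f is tanh, the output is taken as o(t) = V h(t) with no extra non-linearity, there are no bias terms (matching the formula above), and the weight values and sizes are made up for the example. The helper names `matvec` and `rnn_forward` are not from any library.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def rnn_forward(inputs, U, W, V, h0):
    """Apply h(t) = tanh(U x(t) + W h(t-1)) and o(t) = V h(t) over a sequence."""
    h = h0
    hidden_states, outputs = [], []
    for x in inputs:
        pre_activation = [a + b for a, b in zip(matvec(U, x), matvec(W, h))]
        h = [math.tanh(z) for z in pre_activation]   # new hidden state ("memory")
        hidden_states.append(h)
        outputs.append(matvec(V, h))
    return hidden_states, outputs

# Hypothetical toy sizes: 3-word vocabulary (one-hot inputs), 2 hidden units, 3 outputs.
U = [[0.5, -0.3, 0.1],
     [0.2, 0.4, -0.6]]
W = [[0.1, 0.0],
     [0.0, 0.1]]
V = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, 0.5]]

# A "sentence" of 3 words, each a one-hot vector.
sentence = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

hidden_states, outputs = rnn_forward(sentence, U, W, V, h0=[0.0, 0.0])
for t, (h, o) in enumerate(zip(hidden_states, outputs), start=1):
    print(t, h, o)
```

The same U, W and V are reused at every time step, which is what "shared across time" means. Unrolling the loop for the 3-word sentence gives the 3-layer network described earlier.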

Problems with simple RNN


• Simple RNNs suffer from the problem of vanishing gradients, which hampers
learning of long data sequences. The gradients carry the information used in the RNN
parameter update; when the gradient becomes smaller and smaller, the
parameter updates become insignificant, which means no real learning is done.

• With vanishing gradients, the further back the error is propagated through the network,
the lower the gradient is and the harder it is to train the weights, which has a domino
effect on all of the further weights throughout the network. That was the main
roadblock to using Recurrent Neural Networks.
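
The shrinking gradient can be seen with a scalar RNN, h(t) = tanh(u·x(t) + w·h(t−1)). By the chain rule, the gradient of the last hidden state with respect to the first is a product of one factor w·(1 − h(t)²) per time step, and with |w| < 1 every factor is smaller than 1 in magnitude. The sketch below is an illustration with made-up values for w, u and the inputs, not a measurement from the notebook.

```python
import math

def gradient_through_time(w, u, inputs, h0=0.0):
    """Return d h(T) / d h(0) for the scalar RNN h(t) = tanh(u*x(t) + w*h(t-1))."""
    h = h0
    grad = 1.0
    for x in inputs:
        h = math.tanh(u * x + w * h)
        grad *= w * (1.0 - h * h)   # chain rule: tanh'(z) = 1 - tanh(z)^2
    return grad

w, u = 0.9, 1.0   # hypothetical recurrent and input weights
sequence_lengths = (1, 5, 10, 25, 50)

# Gradient of the final hidden state with respect to the initial one,
# for increasingly long sequences of a constant input.
gradients = {steps: gradient_through_time(w, u, [0.5] * steps)
             for steps in sequence_lengths}

for steps, g in gradients.items():
    print(steps, g)
```

The longer the sequence, the closer the gradient is to zero, so the earliest time steps barely influence the weight update.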
