Deep Learning Theory
Each neuron is made up of a cell body (the central mass of the cell) with a number of
connections coming off it: numerous dendrites (the cell's inputs—carrying information
toward the cell body) and a single axon (the cell's output—carrying information away).
Neurons are so tiny that you could pack about 100 of their cell bodies into a single
millimeter. (It's also worth noting, briefly in passing, that neurons make up only 10–50
percent of all the cells in the brain.)
Step 1:
The row vectors of the inputs and weights are x = [x₁, x₂, … , xₙ] and w = [w₁, w₂, … , wₙ]
respectively, and their dot product is given by
x · w = x₁w₁ + x₂w₂ + … + xₙwₙ
Step 2:
Add bias b to the summation of multiplied values and let’s call this z. The bias, also known as
the offset, is necessary in most cases to move the entire activation function to the left
or right to generate the required output values: z = x · w + b
Step 3:
Pass the value of z to a non-linear activation function. Activation functions are used to
introduce non-linearity into the output of the neurons; without them the neural network
would just be a linear function of its inputs. Moreover, they have a significant impact on
the learning speed of the neural network. Perceptrons use the binary step function as their
activation function. However, we shall use the sigmoid, also known as the logistic function,
as our activation function.
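The three steps above can be sketched in a few lines of plain Python. This is only a minimal illustration: the names `sigmoid` and `neuron_output` and the sample inputs, weights and bias are assumptions chosen for the example, not values from these notes.

```python
import math

def sigmoid(z):
    # Logistic function: maps any real z into the interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(x, w, b):
    # Step 1: dot product of the inputs and the weights
    dot = sum(xi * wi for xi, wi in zip(x, w))
    # Step 2: add the bias to get z
    z = dot + b
    # Step 3: pass z through the activation function
    return sigmoid(z)

x = [1.0, 2.0, 3.0]     # illustrative inputs
w = [0.5, -0.25, 0.1]   # illustrative weights
b = 0.2                 # illustrative bias
y = neuron_output(x, w, b)   # z = 0.3 + 0.2 = 0.5, so y = sigmoid(0.5)
```

Changing `b` shifts z by a constant before the sigmoid is applied, which is the left/right shift of the activation described above.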
Bias
This means that when calculating the output of a node, the inputs are multiplied by weights
and a bias value is added to the result. The bias value allows the activation function to be
shifted to the left or right, to better fit the data. Hence changes to the weights alter the
steepness of the sigmoid curve, whilst the bias offsets it, shifting the entire curve so it fits
better. Note also that the bias only influences the output values; it doesn’t interact with the
actual input data.
link: https://towardsdatascience.com/why-we-need-bias-in-neural-networks-db8f7e07cb98
Activation Function
It is simply the function you use to get the output of a node. It is also known as a transfer
function.
It is used to determine the output of a neural network, such as yes or no. It maps the
resulting values into a range such as 0 to 1 or -1 to 1, depending on the function.
Activation functions can be divided into 2 types: linear and non-linear.
1. Linear Activation Function
Equation: f(x) = x
Range: (-infinity, infinity)
Because the output is just the input, it adds no non-linearity, so it does not help the
network model the complexity of the data that is usually fed to neural networks.
SoftMax
Softmax is a very interesting activation function because it not only maps our output to a
[0,1] range but also maps each output in such a way that the total sum is 1. The output of
Softmax is therefore a probability distribution.
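A minimal sketch of softmax in plain Python, to make the "sums to 1" property concrete. The function name and the example scores are illustrative assumptions; subtracting the maximum score is a common numerical-stability trick and does not change the result.

```python
import math

def softmax(scores):
    # Subtract the maximum first so exp() cannot overflow; the output is unchanged
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    # Each output is in [0, 1] and together they sum to 1
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])   # illustrative raw scores for three classes
```

The largest input score always receives the largest probability, so taking the index of the maximum gives the predicted class.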
Optimizers
link: https://ruder.io/optimizing-gradient-descent/
link: https://heartbeat.fritz.ai/exploring-optimizers-in-machine-learning-7f18d94cd65b
Optimizers tie together the loss function and the model parameters by updating the model,
i.e. the weights and biases of each node, based on the output of the loss function.
Epoch:
One epoch is when the ENTIRE dataset is passed forward and backward through the neural
network exactly ONCE.
Iteration:
An iteration is one pass of a single batch of data through the algorithm. In the case of
neural networks, that means one forward pass and one backward pass. So, every time you
pass a batch of data through the NN, you have completed an iteration.
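The relation between epochs, batches and iterations can be checked with a little arithmetic. The sketch below assumes every sample is used once per epoch and the last batch may be smaller; the dataset size, batch size and function name are illustrative assumptions.

```python
import math

def iterations_per_epoch(num_samples, batch_size):
    # One iteration processes one batch; a final partial batch still counts
    return math.ceil(num_samples / batch_size)

num_samples = 1000   # illustrative dataset size
batch_size = 32      # illustrative batch size
epochs = 10

per_epoch = iterations_per_epoch(num_samples, batch_size)   # 32
total_iterations = per_epoch * epochs                        # 320
```

If a framework is told to drop the incomplete final batch (for example by using floor division for the number of steps), the count per epoch is one lower whenever the dataset size is not a multiple of the batch size.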
Gradient Descent
Well, of course we need to start off with the biggest star of our post: gradient descent.
Gradient descent is an iterative optimization algorithm. It uses the derivatives of the loss
function to find a minimum. Running the algorithm for many iterations and epochs helps it
reach the global minimum (or a point close to it).
• Gradient descent considers all data points (like a population: if we have 1000 data
points, it uses every point when calculating the error).
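As a minimal sketch of the update rule, the loop below runs gradient descent on a one-parameter loss f(w) = (w - 3)^2. The loss, the starting point and the learning rate are illustrative assumptions; a real network applies the same update to every weight, using the gradient computed over the data.

```python
def gradient(w):
    # Derivative of the illustrative loss f(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

w = 0.0               # starting point
learning_rate = 0.1   # step size
for step in range(100):
    # Move against the gradient, i.e. downhill on the loss surface
    w -= learning_rate * gradient(w)
```

After the loop, `w` is very close to 3, the point where the loss is smallest. Because this loss is a simple parabola it has a single minimum, so the starting point does not matter here.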
Adagrad Optimizer
Adaptive Gradient Algorithm (Adagrad) is an algorithm for gradient-based optimization. ...
It performs smaller updates for parameters associated with frequently occurring features.
As a result, it is well suited to sparse data (NLP or image recognition). Each parameter has
its own learning rate, which improves performance on problems with sparse gradients.
Adagrad uses a different learning rate for every parameter at every iteration;
the learning rate is recomputed at each iteration.
dense:
most of the features are non-zero
sparse:
most of the features are zero
• Convergence is faster and more reliable than simple SGD when the scaling of the
weights is unequal.
• It is not very sensitive to the size of the master step.
• One disadvantage is that alpha t (the accumulated sum of squared gradients) can become
very large as the number of iterations increases, which shrinks the learning rate toward zero.
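A minimal sketch of the Adagrad update, assuming "alpha t" denotes the running sum of squared gradients kept for each parameter. The function name, the base learning rate and the toy gradients are illustrative assumptions.

```python
import math

def adagrad_step(w, grad, accum, lr=0.1, eps=1e-8):
    # accum[i] is the running sum of squared gradients for parameter i
    for i in range(len(w)):
        accum[i] += grad[i] ** 2
        # Per-parameter learning rate: lr / sqrt(accum[i] + eps)
        w[i] -= lr / math.sqrt(accum[i] + eps) * grad[i]
    return w, accum

w = [0.0, 0.0]
accum = [0.0, 0.0]
for t in range(10):
    # Parameter 0 gets a gradient on every step (dense feature);
    # parameter 1 gets one only on the first step (sparse feature).
    grad = [1.0, 1.0 if t == 0 else 0.0]
    w, accum = adagrad_step(w, grad, accum)

# Effective learning rate of each parameter after 10 steps
effective_lr = [0.1 / math.sqrt(a + 1e-8) for a in accum]
```

The frequently updated parameter ends with a smaller effective learning rate than the rarely updated one. The same mechanism explains the disadvantage noted above: the accumulator only grows, so the effective learning rate can only shrink.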
AdaDelta
Adadelta is a more robust extension of Adagrad that adapts learning rates based on a
moving window of gradient updates, instead of accumulating all past gradients. Compared
to Adagrad, in the original version of Adadelta you don't have to set an initial learning rate.
RMSprop
RMSprop is a gradient-based optimization technique used in training neural networks. ...
It normalizes each gradient by a moving average of recent squared gradients. This
normalization balances the step size, decreasing the step for large gradients to avoid
exploding, and increasing the step for small gradients to avoid vanishing.
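The normalization can be sketched for a single parameter as follows. This is a simplified illustration rather than any library's implementation; the function name, the decay of 0.9 and the learning rate are assumptions.

```python
import math

def rmsprop_step(w, grad, avg_sq, lr=0.01, decay=0.9, eps=1e-8):
    # Exponential moving average of squared gradients
    avg_sq = decay * avg_sq + (1.0 - decay) * grad ** 2
    # Dividing by the root of that average normalizes the step size
    w = w - lr * grad / (math.sqrt(avg_sq) + eps)
    return w, avg_sq

# First step from w = 0 with a very large and a very small gradient
w_large, _ = rmsprop_step(0.0, 100.0, 0.0)
w_small, _ = rmsprop_step(0.0, 0.001, 0.0)
step_large = abs(w_large)
step_small = abs(w_small)
```

Although the two gradients differ by a factor of 100,000, the two steps come out almost the same size, which is the balancing effect described above.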
ADAM Optimizer
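• With stochastic and mini-batch gradient descent it takes longer to reach the global
minimum because the updates follow a zig-zag path. Adam combines momentum with
RMSprop-style scaling to smooth this path.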
Global minima & Local minima
A local minimum of a function is a point where the function value is smaller than at nearby
points, but possibly greater than at a distant point. A global minimum is a point where the
function value is smaller than at all other feasible points.
Some loss functions have both local minima and a global minimum.
Chain rule
The chain rule is used in backpropagation to find the derivative of the loss with respect to
each weight, which is what is needed to readjust the weights.
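As a small worked example, take a single sigmoid neuron with a squared-error loss: z = w*x + b, a = sigmoid(z), L = (a - y)^2. The chain rule gives dL/dw = dL/da * da/dz * dz/dw. The sketch below computes that product and compares it with a numerical estimate; the neuron, the loss and the numbers are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, x, b, y):
    return (sigmoid(w * x + b) - y) ** 2

def grad_w(w, x, b, y):
    a = sigmoid(w * x + b)
    dL_da = 2.0 * (a - y)   # derivative of the squared error w.r.t. the activation
    da_dz = a * (1.0 - a)   # derivative of the sigmoid w.r.t. z
    dz_dw = x               # derivative of z = w*x + b w.r.t. w
    return dL_da * da_dz * dz_dw   # chain rule: multiply the local derivatives

w, x, b, y = 0.5, 2.0, 0.1, 1.0
analytic = grad_w(w, x, b, y)
h = 1e-6
# Central finite difference as an independent check
numeric = (loss(w + h, x, b, y) - loss(w - h, x, b, y)) / (2 * h)
```

The two values agree to many decimal places. In a deeper network the same product simply has more factors, one set per layer.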
Vanishing Gradient Problem
In machine learning, the vanishing gradient problem is encountered when training artificial
neural networks with gradient-based learning methods and backpropagation.
The derivative of the sigmoid function always lies between 0 and 0.25. Because of that,
when the network has many layers, the gradient reaching the early layers is tiny and their
new weights are almost equal to the old weights. This is the cause of the vanishing gradient
problem and the main reason we do not use the sigmoid activation function in all layers.
It can be overcome with ReLU (rectified linear unit).
Certain activation functions, like the sigmoid function, squish a large input space into a
small output space between 0 and 1. Therefore, a large change in the input of the sigmoid
function will cause a small change in the output. Hence, the derivative becomes small.
However, when n hidden layers use an activation like the sigmoid function, n small
derivatives are multiplied together. Thus, the gradient decreases exponentially as we
propagate down to the initial layers. A small gradient means that the weights and biases of
the initial layers will not be updated effectively with each training session. Since these
initial layers are often crucial to recognizing the core elements of the input data, it can lead
to overall inaccuracy of the whole network.
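The exponential shrinkage can be seen numerically. The sketch below takes the largest value the sigmoid derivative can reach and raises it to the power n, which gives an upper bound on the product of n such derivatives. Weights are ignored for simplicity, which is an assumption of this illustration.

```python
import math

def sigmoid_derivative(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

# The derivative peaks at z = 0, where it equals 0.25
max_derivative = sigmoid_derivative(0.0)

# Upper bound on the product of n sigmoid derivatives, for several depths
bounds = {n: max_derivative ** n for n in (1, 5, 10, 20)}
```

With 10 sigmoid layers the bound is already below one millionth, so the earliest layers receive almost no update signal.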
Weight Initialization
No technique gives the best results in every case; we can only decide by experimenting.
1. Uniform distribution
2. Xavier/Glorot
a. Xavier normal
b. Xavier uniform
3. He init
a. He uniform
b. He normal
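Two of the initializers listed above can be sketched with the standard library alone. The formulas used are the commonly cited ones: Glorot (Xavier) uniform draws from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), and He normal draws from a normal distribution with standard deviation sqrt(2 / fan_in). The function names and layer sizes are illustrative assumptions, and library versions may differ in detail (for example by truncating the normal distribution).

```python
import math
import random

def glorot_uniform(fan_in, fan_out, rng):
    # Uniform in [-limit, limit], limit = sqrt(6 / (fan_in + fan_out))
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

def he_normal(fan_in, fan_out, rng):
    # Normal with mean 0 and standard deviation sqrt(2 / fan_in)
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)]
            for _ in range(fan_in)]

rng = random.Random(0)                  # fixed seed so the run is repeatable
w_glorot = glorot_uniform(64, 32, rng)  # illustrative layer: 64 inputs, 32 outputs
w_he = he_normal(64, 32, rng)
```

Both schemes scale the spread of the weights by the layer size, so that signals neither blow up nor die out as they pass through many layers.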
Types of Loss Function (also called cost function or error function)
1. Regression
Mean Squared Error (MSE)
Advantages:
1. It has the form of a quadratic equation, ax^2 + bx + c.
2. Plotting the quadratic equation gives a curve with only one global minimum and no
local minima, which suits gradient descent.
Disadvantage:
1. It is not robust to outliers.
Mean Absolute Error (MAE)
Advantage:
1. More robust to outliers than MSE.
Disadvantage:
1. Optimizing MAE is more difficult and takes more time, because its derivative is not
defined at zero.
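The difference in robustness between MSE and MAE shows up clearly on a toy example. The data below is made up for illustration: four targets, one set of predictions that are each off by 0.5, and a second set in which one prediction is off by 10.

```python
def mse(y_true, y_pred):
    # Mean of the squared errors
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    # Mean of the absolute errors
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.5, 2.5, 3.5]            # every prediction is off by 0.5
y_pred_outlier = [1.5, 2.5, 2.5, 14.0]   # last prediction is off by 10

mse_clean, mae_clean = mse(y_true, y_pred), mae(y_true, y_pred)
mse_out, mae_out = mse(y_true, y_pred_outlier), mae(y_true, y_pred_outlier)
```

A single bad prediction multiplies the MSE by roughly 100 (0.25 to 25.1875) but the MAE by less than 6 (0.5 to 2.875), because squaring magnifies large errors.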
2. Classification
https://towardsdatascience.com/cross-entropy-for-classification-d98e7f974451
https://keras.io/api/losses/
2.1 Cross Entropy:
Cross-entropy loss, or log loss, measures the performance of a classification model whose
output is a probability value between 0 and 1. Cross-entropy loss increases as the predicted
probability diverges from the actual label. Binary cross-entropy handles two-class problems;
categorical cross-entropy extends the same idea to more than two classes.
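A minimal sketch of binary cross-entropy, averaged over a batch. The function name, the clipping constant and the example probabilities are illustrative assumptions; clipping is only there to avoid taking the logarithm of zero.

```python
import math

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)   # clip so log(0) never occurs
        # Penalize low probability on the true class
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0, 1]
good = binary_cross_entropy(labels, [0.9, 0.1, 0.8])   # confident and correct
bad = binary_cross_entropy(labels, [0.2, 0.9, 0.3])    # confident and wrong
```

The loss is small when the predicted probabilities agree with the labels and grows quickly as they diverge. A prediction of exactly 0.5 always costs log(2), about 0.693.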
ANN
import pandas as pd

# Columns 3-12 are the features, column 13 is the target
data = pd.read_csv('bank.csv')
x = data.iloc[:, 3:13]
y = data.iloc[:, 13]
data.head()
EstimatedSalary Exited
0 101348.88 1
1 112542.58 0
2 113931.57 1
3 93826.63 0
4 79084.10 0
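geography = pd.get_dummies(x['Geography'], drop_first=True)  # one-hot encode, dropping the first category
geography.head()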
Germany Spain
0 0 0
1 0 1
2 0 0
3 0 0
4 0 1
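gender = pd.get_dummies(x['Gender'], drop_first=True)  # one-hot encode, dropping the first category
gender.head()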
Male
0 0
1 0
2 0
3 0
4 0
x = pd.concat([x,geography,gender],axis = 1)
x.head()
x = x.drop(['Geography','Gender'], axis = 1)
x.head()
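from sklearn.preprocessing import StandardScaler
sc = StandardScaler()  # fit on the training split only, then reuse it for the test split
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)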
compiling ANN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy',
metrics = ['accuracy'])
model = classifier.fit(x_train, y_train, validation_split=0.33, batch_size=10, epochs=100)
y_pred = classifier.predict(x_test)
y_pred = (y_pred > 0.5)
# accuracy
from sklearn.metrics import accuracy_score
score = accuracy_score(y_test, y_pred)  # scikit-learn's argument order is (y_true, y_pred)
score
0.8596666666666667
How to select hidden layers and hidden neurons in an ANN using Keras Tuner
link: https://www.tensorflow.org/tutorials/keras/keras_tuner
Hyperparameters
• How many hidden layers should we have?
• Learning rate
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow
from tensorflow import keras
from tensorflow.keras import layers
from kerastuner.tuners import RandomSearch
df = pd.read_csv('Real.csv')
df.head()
T TM Tm SLP H VV V VM PM 2.5
0 7.4 9.8 4.8 1017.6 93.0 0.5 4.3 9.4 219.720833
1 7.8 12.7 4.4 1018.5 87.0 0.6 4.4 11.1 182.187500
2 6.7 13.4 2.4 1019.4 82.0 0.6 4.8 11.1 154.037500
3 8.6 15.5 3.3 1018.7 72.0 0.8 8.1 20.6 223.208333
4 12.4 20.9 4.4 1017.3 61.0 1.3 8.7 22.2 200.645833
x = df.iloc[:,:-1]
y = df.iloc[:,-1]
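def model_builder(hp):
    # NOTE: this builder is the image-classification example from the linked tutorial
    # (28x28 inputs, 10 classes); adapt the input shape, output layer, loss and metrics
    # before using it with the tabular data loaded above.
    model = keras.Sequential()
    model.add(keras.layers.Flatten(input_shape=(28, 28)))
    # Tune the number of units in the first Dense layer
    # Choose an optimal value between 32-512
    hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
    model.add(keras.layers.Dense(units=hp_units, activation='relu'))
    model.add(keras.layers.Dense(10))
    # Tune the learning rate for the optimizer; the candidate values are an assumption
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    return model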
tuner = RandomSearch(model_builder,
                     # the objective must be a metric the model reports during validation
                     objective='val_mean_absolute_error',
                     max_trials=5,
                     # the keyword is executions_per_trial; misspelling it raises a TypeError
                     executions_per_trial=3,
                     directory='projects',
                     project_name='intro_to_kt')
cerebral cortex:
The cerebral cortex is the largest site of neural integration in the central nervous system. It
plays a key role in attention, perception, awareness, thought, memory, language, and
consciousness.
visual cortex:
The visual cortex of the brain is the area of the cerebral cortex that processes visual
information. It is located in the occipital lobe. Sensory input originating from the eyes
travels through the lateral geniculate nucleus in the thalamus and then reaches the visual
cortex.
It has different areas, V1, V2, V3, V4 and V5, and each plays a very important role.
• V1 is responsible for finding the edges in the image; its output then goes to V2, V3, V4 and V5.
Each area has a specific function in extracting information from the image. CNN layers are
modeled on this idea to process images.
What is convolution:
In mathematics (in particular, functional analysis), convolution is a mathematical operation
on two functions (f and g) that produces a third function (f ∗ g) that expresses how the shape
of one is modified by the other. The term convolution refers to both the result function and
to the process of computing it.
Ex: in a CNN we slide a filter (kernel) over the image and multiply element-wise to extract
information. The filter may be, for example, a vertical or horizontal edge filter.
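A small sketch of how a filter picks up an edge. The 6x6 image and the 3x3 vertical edge filter below are illustrative assumptions. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is cross-correlation.

```python
def conv2d_valid(image, kernel):
    # Slide the kernel over the image (no padding, stride 1) and sum the
    # element-wise products at each position
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    out = []
    for r in range(n_rows - k_rows + 1):
        row = []
        for c in range(n_cols - k_cols + 1):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(k_rows) for j in range(k_cols)))
        out.append(row)
    return out

# 6x6 image: bright left half (10), dark right half (0)
image = [[10, 10, 10, 0, 0, 0] for _ in range(6)]
# Vertical edge filter
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]
edges = conv2d_valid(image, kernel)   # every row is [0, 30, 30, 0]
```

The output is non-zero only where the window straddles the boundary between the bright and dark halves. Note also that the 6x6 input shrinks to a 4x4 output.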
use of padding:
Padding is simply a process of adding layers of zeros to our input images so as to avoid the
problems mentioned above. This prevents shrinking as, if p = number of layers of zeros
added to the border of the image, then our (n x n) image becomes (n + 2p) x (n + 2p) image
after padding.
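A short sketch of zero padding and its effect on the output size. The helper names are illustrative assumptions. The size formula used, (n + 2p - f) / s + 1 for an n x n input, f x f filter, padding p and stride s, is the standard one and is not stated in these notes.

```python
def pad_zeros(image, p):
    # Surround the image with p layers of zeros on every side
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)
    padded.extend([0] * width for _ in range(p))
    return padded

def conv_output_size(n, f, p=0, s=1):
    # Output side length for an n x n input, f x f filter, padding p, stride s
    return (n + 2 * p - f) // s + 1

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
padded = pad_zeros(image, 1)   # 3x3 becomes 5x5
```

With a 3x3 filter, `conv_output_size(6, 3)` gives 4 (the image shrinks), while `conv_output_size(6, 3, p=1)` gives 6 (the size is preserved).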
Max Pooling:
Max pooling is a sample-based discretization process. The objective is to down-sample an
input representation (image, hidden-layer output matrix, etc.), reducing its dimensionality
and allowing for assumptions to be made about features contained in the sub-regions
binned.
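Max pooling with a 2x2 window can be sketched as follows. The function name and the 4x4 feature map are illustrative assumptions, and the stride is taken to be equal to the pool size so the windows do not overlap.

```python
def max_pool2d(matrix, pool=2):
    # Non-overlapping pool x pool windows (stride equal to the pool size)
    out = []
    for r in range(0, len(matrix) - pool + 1, pool):
        row = []
        for c in range(0, len(matrix[0]) - pool + 1, pool):
            # Keep only the largest value in each window
            row.append(max(matrix[r + i][c + j]
                           for i in range(pool) for j in range(pool)))
        out.append(row)
    return out

feature_map = [[1, 3, 2, 1],
               [4, 6, 5, 0],
               [7, 2, 9, 8],
               [1, 0, 3, 4]]
pooled = max_pool2d(feature_map)   # [[6, 5], [7, 9]]
```

Each 2x2 block is reduced to its strongest response, halving the height and width of the feature map.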
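from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array

# Random augmentations applied to each generated image
datagen = ImageDataGenerator(rotation_range=40, width_shift_range=0.2,
                             height_shift_range=0.2, zoom_range=0.2,
                             channel_shift_range=0.2, horizontal_flip=True,
                             fill_mode='nearest')
img = load_img('hero1.png')  # load image
# img.show()  # see image
x = img_to_array(img)
x = x.reshape((1,) + x.shape)  # add a batch dimension
i = 0
# The generator loops forever, so break out once enough images have been saved
for batch in datagen.flow(x, batch_size=1, save_to_dir='Preview',
                          save_prefix='cat', save_format='jpeg'):
    i += 1
    if i > 20:
        break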
link: https://vijayabhaskar96.medium.com/tutorial-on-keras-flow-from-dataframe-1fd4493d237c
loading image data from different files
link: https://github.com/keras-team/keras-tuner
link: https://keras-team.github.io/keras-tune
# loading the images
# the label column must be str, list or tuple, hence dtype=str
df = pd.read_csv(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\labels.csv', dtype=str)
datagen = ImageDataGenerator(rescale=1./255., validation_split=0.25)
data_train = datagen.flow_from_dataframe(
    dataframe=df,
    directory=r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images',
    x_col="id",
    y_col="label",
    subset='training',
    batch_size=32,
    target_size=(128, 128),  # resize all images to a single size
    color_mode='rgb',  # needed for color images
    seed=42,
    shuffle=True,
    class_mode="categorical")
data_test=datagen.flow_from_dataframe(
dataframe= df,
directory= r"C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images",
x_col="id",
y_col="label",
subset="validation",
batch_size=32,
target_size=(128, 128),
color_mode = 'rgb',
seed=42,
shuffle=True,
class_mode="categorical")
Building Model
create model
The model type that we will be using is Sequential. Sequential is the easiest way to build a
model in Keras. It allows you to build a model layer by layer.
model = Sequential()
adding layers
We use the ‘add()’ function to add layers to our model.
• Our first 2 layers are Conv2D layers. These are convolution layers that will deal with
our input images, which are seen as 2-dimensional matrices.
• 64 in the first layer and 32 in the second layer are the number of filters (the
convolutional counterpart of nodes or neurons) in each layer. This number can be
adjusted to be higher or lower, depending on the size of the dataset. In our case, 64
and 32 work well, so we will stick with this for now.
• Kernel size is the size of the filter matrix for our convolution. So a kernel size of 3
means we will have a 3x3 filter matrix. Refer back to the introduction and the first
image for a refresher on this.
• Activation is the activation function for the layer. The activation function we will be
using for our first 2 layers is the ReLU, or Rectified Linear Unit. This activation
function works well in practice in neural networks.
• Our first layer also takes in an input shape. This is the shape of each input image:
28,28,1 for greyscale digit images (the 1 signifies greyscale), or 128,128,3 for the
128 x 128 RGB images loaded above.
• In between the Conv2D layers and the dense layer, there is a ‘Flatten’ layer. Flatten
serves as a connection between the convolution and dense layers.
• ‘Dense’ is the layer type we will use in for our output layer. Dense is a standard layer
type that is used in many cases for neural networks.
• The output layer has one node for each possible outcome: 10 for digits (0–9), or 2 for
the two classes in the vehicle data. The activation is ‘softmax’. Softmax makes the
output sum up to 1 so the output can be interpreted as probabilities. The model will
then make its prediction based on which option has the highest probability.
Epoch 1/10
38/38 [==============================] - 119s 3s/step - loss: 2.0871 -
accuracy: 0.5894 - val_loss: 0.5837 - val_accuracy: 0.7005
Epoch 2/10
38/38 [==============================] - 33s 858ms/step - loss: 0.4566
- accuracy: 0.7858 - val_loss: 0.6593 - val_accuracy: 0.7448
Epoch 3/10
38/38 [==============================] - 34s 908ms/step - loss: 0.2640
- accuracy: 0.9034 - val_loss: 0.5637 - val_accuracy: 0.7188
Epoch 4/10
38/38 [==============================] - 38s 989ms/step - loss: 0.1519
- accuracy: 0.9589 - val_loss: 0.6387 - val_accuracy: 0.7396
Epoch 5/10
38/38 [==============================] - 33s 864ms/step - loss: 0.0367
- accuracy: 0.9960 - val_loss: 0.7001 - val_accuracy: 0.7396
Epoch 6/10
38/38 [==============================] - 34s 903ms/step - loss: 0.0156
- accuracy: 0.9971 - val_loss: 0.7280 - val_accuracy: 0.7552
Epoch 7/10
38/38 [==============================] - 34s 887ms/step - loss: 0.0274
- accuracy: 0.9963 - val_loss: 0.8293 - val_accuracy: 0.7474
Epoch 8/10
38/38 [==============================] - 31s 812ms/step - loss: 0.0261
- accuracy: 0.9963 - val_loss: 0.7220 - val_accuracy: 0.7708
Epoch 9/10
38/38 [==============================] - 31s 806ms/step - loss: 0.0140
- accuracy: 0.9968 - val_loss: 0.7128 - val_accuracy: 0.7370
Epoch 10/10
38/38 [==============================] - 31s 810ms/step - loss: 0.0162
- accuracy: 0.9989 - val_loss: 0.6729 - val_accuracy: 0.7578
<tensorflow.python.keras.callbacks.History at 0x15435a28d60>
import cv2
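img = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\vehicle.jpg',1)
width = 128
height = 128
dim = (width, height)
resized = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
x_val = np.array(resized) / 255  # scale pixels to [0, 1] as in training
x_val = x_val.reshape(-1, 128, 128, 3)  # add a batch dimension
pred = model.predict(x_val)
print(pred)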
[[0.99537885 0.0046212 ]]
img1 = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\car.jpg',1)
width = 128
height = 128
dim = (width, height)
resized1 = cv2.resize(img1, dim, interpolation = cv2.INTER_AREA)
x_val1 = np.array(resized1) / 255  # scale pixels to [0, 1] as in training
x_val1 = x_val1.reshape(-1, 128, 128, 3)  # add a batch dimension
pred1 = model.predict(x_val1)
print(pred1)
[[0.9767471 0.02325287]]
img2 = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\car.jpg',1)
width = 128
height = 128
dim = (width, height)
resized2 = cv2.resize(img2, dim, interpolation = cv2.INTER_AREA)
x_val2 = np.array(resized2) / 255  # scale pixels to [0, 1] as in training
x_val2 = x_val2.reshape(-1, 128, 128, 3)  # add a batch dimension
pred2 = model.predict(x_val2)
print(pred2)
[[0.9767471 0.02325287]]
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 126, 126, 64) 1792
_________________________________________________________________
conv2d_1 (Conv2D) (None, 124, 124, 32) 18464
_________________________________________________________________
flatten (Flatten) (None, 492032) 0
_________________________________________________________________
dense (Dense) (None, 2) 984066
=================================================================
Total params: 1,004,322
Trainable params: 1,004,322
Non-trainable params: 0
_________________________________________________________________
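The numbers in the summary above can be reproduced by hand. The sketch below assumes a 128 x 128 x 3 input and 3x3 kernels without padding, which is what the output shapes in the summary imply; the helper names are illustrative.

```python
def conv2d_params(filters, kernel_size, in_channels):
    # Each filter has kernel_size * kernel_size * in_channels weights plus one bias
    return filters * (kernel_size * kernel_size * in_channels + 1)

def dense_params(units, inputs):
    # One weight per input per unit, plus one bias per unit
    return units * inputs + units

side = 128                         # input is 128 x 128 x 3
conv1 = conv2d_params(64, 3, 3)    # 1792
side = side - 3 + 1                # no padding: 126
conv2 = conv2d_params(32, 3, 64)   # 18464
side = side - 3 + 1                # 124
flat = side * side * 32            # 492032 values after Flatten
dense = dense_params(2, flat)      # 984066
total = conv1 + conv2 + dense      # 1004322
```

Almost all of the parameters sit in the final dense layer, because the flattened feature map is so large. Pooling layers between the convolutions would shrink it considerably.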
input_shape = (4, 28, 28, 3)  # (batch, height, width, channels)
x = tf.random.normal(input_shape)
x.shape
input_shape[1:]
(28, 28, 3)
model.add(Conv2D(16, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.25))
model.add(Conv2D(32, 3, activation='relu'))
model.add(Conv2D(64, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.25))
model.add(Flatten())
model.add(Dense(units=128, activation='relu'))
model.add(Dropout(rate=0.25))
model.add(Dense(2, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])
STEP_SIZE_TRAIN = data_train.n//data_train.batch_size
STEP_SIZE_VALID = data_test.n//data_test.batch_size
model.fit(x = data_train,steps_per_epoch=STEP_SIZE_TRAIN,
validation_data = data_test,validation_steps = STEP_SIZE_VALID,
epochs=10
)
Epoch 1/10
38/38 [==============================] - 26s 666ms/step - loss: 0.9023
- accuracy: 0.5405 - val_loss: 0.5957 - val_accuracy: 0.6198
Epoch 2/10
38/38 [==============================] - 25s 668ms/step - loss: 0.5373
- accuracy: 0.7355 - val_loss: 0.5600 - val_accuracy: 0.7422
Epoch 3/10
38/38 [==============================] - 25s 656ms/step - loss: 0.4629
- accuracy: 0.7889 - val_loss: 0.4603 - val_accuracy: 0.7865
Epoch 4/10
38/38 [==============================] - 27s 704ms/step - loss: 0.4224
- accuracy: 0.8205 - val_loss: 0.4932 - val_accuracy: 0.7682
Epoch 5/10
38/38 [==============================] - 26s 675ms/step - loss: 0.4203
- accuracy: 0.8062 - val_loss: 0.4751 - val_accuracy: 0.7865
Epoch 6/10
38/38 [==============================] - 26s 677ms/step - loss: 0.3727
- accuracy: 0.8394 - val_loss: 0.4425 - val_accuracy: 0.8047
Epoch 7/10
38/38 [==============================] - 28s 739ms/step - loss: 0.2902
- accuracy: 0.8834 - val_loss: 0.5390 - val_accuracy: 0.7995
Epoch 8/10
38/38 [==============================] - 27s 702ms/step - loss: 0.2439
- accuracy: 0.8974 - val_loss: 0.5148 - val_accuracy: 0.8021
Epoch 9/10
38/38 [==============================] - 26s 688ms/step - loss: 0.2069
- accuracy: 0.9188 - val_loss: 0.5248 - val_accuracy: 0.7995
Epoch 10/10
38/38 [==============================] - 26s 672ms/step - loss: 0.1792
- accuracy: 0.9332 - val_loss: 0.6022 - val_accuracy: 0.7865
<tensorflow.python.keras.callbacks.History at 0x1e7a1200f70>
img = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\train.jpg',1)
width = 128
height = 128
dim = (width, height)
resized = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
x_val = np.array(resized) / 255
x_val = x_val.reshape(-1, 128, 128, 3)
pred = model.predict(x_val)
print(pred)
[[0.4878648 0.5121352]]
building model
model = Sequential()
model.add(Conv2D(32,padding='same', kernel_size=3, activation='relu',
input_shape=(128,128,3)))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=3,padding='same', activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization())
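model.add(Flatten())
model.add(Dense(64, activation='relu'))
# Dropout goes before the output layer; applied after softmax (as in the run logged below)
# it zeroes class probabilities during training, consistent with the high loss in that log.
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))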
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
STEP_SIZE_TRAIN = data_train.n//data_train.batch_size
STEP_SIZE_VALID = data_test.n//data_test.batch_size
model.fit(x = data_train,steps_per_epoch=STEP_SIZE_TRAIN,
validation_data = data_test,validation_steps = STEP_SIZE_VALID,
epochs=10
)
Epoch 1/10
38/38 [==============================] - 22s 525ms/step - loss: 6.4985
- accuracy: 0.5817 - val_loss: 6.2706 - val_accuracy: 0.5911
Epoch 2/10
38/38 [==============================] - 20s 521ms/step - loss: 6.2590
- accuracy: 0.6425 - val_loss: 5.9911 - val_accuracy: 0.6094
Epoch 3/10
38/38 [==============================] - 20s 516ms/step - loss: 6.5250
- accuracy: 0.6183 - val_loss: 6.1508 - val_accuracy: 0.5990
Epoch 4/10
38/38 [==============================] - 19s 509ms/step - loss: 6.1739
- accuracy: 0.6448 - val_loss: 6.1508 - val_accuracy: 0.5990
Epoch 5/10
38/38 [==============================] - 20s 531ms/step - loss: 5.6002
- accuracy: 0.6697 - val_loss: 6.1908 - val_accuracy: 0.5964
Epoch 6/10
38/38 [==============================] - 20s 532ms/step - loss: 6.4007
- accuracy: 0.6061 - val_loss: 6.1982 - val_accuracy: 0.5885
Epoch 7/10
38/38 [==============================] - 19s 509ms/step - loss: 6.4062
- accuracy: 0.5876 - val_loss: 5.9796 - val_accuracy: 0.5938
Epoch 8/10
38/38 [==============================] - 21s 544ms/step - loss: 6.0466
- accuracy: 0.6506 - val_loss: 4.9934 - val_accuracy: 0.6562
Epoch 9/10
38/38 [==============================] - 19s 510ms/step - loss: 6.2585
- accuracy: 0.6384 - val_loss: 5.5318 - val_accuracy: 0.6276
Epoch 10/10
38/38 [==============================] - 20s 533ms/step - loss: 6.5785
- accuracy: 0.6166 - val_loss: 4.4664 - val_accuracy: 0.6953
<tensorflow.python.keras.callbacks.History at 0x1e7a16a28e0>
img = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\vehicle.jpg',1)
width = 128
height = 128
dim = (width, height)
resized = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
x_val = np.array(resized) / 255
x_val = x_val.reshape(-1, 128, 128, 3)
pred = model.predict(x_val)
print(pred)
[[1. 0.]]
imge = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\train.jpg',1)
width = 128
height = 128
dim = (width, height)
resize = cv2.resize(imge, dim, interpolation = cv2.INTER_AREA)
x_vale = np.array(resize) / 255
x_vale = x_vale.reshape(-1, 128, 128, 3)
pred = model.predict(x_vale)
print(pred)
[[1.0000000e+00 1.5728423e-38]]
df = pd.read_csv(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\labels.csv', dtype=str)
datagen = ImageDataGenerator(rescale=1./255., validation_split=0.25)
data_train = datagen.flow_from_dataframe(
    dataframe=df,
    directory=r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images',
    x_col="id",
    y_col="label",
    subset='training',
    batch_size=32,
    target_size=(64, 64),  # resize all images to a single size
    color_mode='rgb',  # needed for color images
    seed=42,
    shuffle=True,
    class_mode="categorical")
data_test=datagen.flow_from_dataframe(
dataframe= df,
directory= r"C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images",
x_col="id",
y_col="label",
subset="validation",
batch_size=32,
target_size=(64, 64),
color_mode = 'rgb',
seed=42,
shuffle=True,
class_mode="categorical")
model = Sequential()
model.add(Conv2D(16,padding='same', kernel_size=3, activation='relu',
input_shape=(64,64,3)) )
model.add(Conv2D(16, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.25))
model.add(Conv2D(32, 3, activation='relu'))
model.add(Conv2D(64, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.25))
model.add(Flatten())
model.add(Dense(units=128, activation='relu'))
model.add(Dropout(rate=0.25))
model.add(Dense(2, activation='softmax'))
model.compile(
    optimizer=keras.optimizers.Adam(1e-3),
    # The generators use class_mode="categorical" (one-hot labels), so the loss should be
    # categorical_crossentropy. sparse_categorical_crossentropy expects integer labels and
    # is the likely cause of the InvalidArgumentError this cell raised in model.fit.
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
STEP_SIZE_TRAIN = data_train.n//data_train.batch_size
STEP_SIZE_VALID = data_test.n//data_test.batch_size
model.fit(x = data_train, steps_per_epoch=STEP_SIZE_TRAIN,
          validation_data = data_test, validation_steps = STEP_SIZE_VALID,
          epochs=10
)
• Viewing histograms of weights, biases, or other tensors as they change over time
TRANSFER LEARNING
Transfer learning is a machine learning method where a model developed for a task is
reused as the starting point for a model on a second task.
It is a popular approach in deep learning where pre-trained models are used as the starting
point on computer vision and natural language processing tasks given the vast compute
and time resources required to develop neural network models on these problems and
from the huge jumps in skill that they provide on related problems.
link: https://www.analyticsvidhya.com/blog/2020/08/top-4-pre-trained-models-for-image-classification-with-python-code/
#LINK:https://keras.io/api/applications/
import glob
import os

import cv2
import numpy as np
import pandas as pd
import PIL
import theano
import tensorflow as tf
# Import everything from tensorflow.keras so that layers and models come from one package
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense, Dropout,
                                     Flatten, Lambda, MaxPooling2D)
from tensorflow.keras.models import Sequential, model_from_json
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import kerastuner as kt
from kerastuner.tuners import RandomSearch
# Labels file: the "id" column holds the image file names, the "label" column the classes
df = pd.read_csv(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\labels.csv', dtype=str)
# Rescale pixel values to [0, 1] and hold out 25% of the rows for validation
datagen = ImageDataGenerator(rescale=1./255., validation_split=0.25)
data_train = datagen.flow_from_dataframe(
    dataframe=df,
    directory=r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images',
    x_col="id",
    y_col="label",
    subset='training',
    batch_size=32,
    target_size=(64, 64),  # resize every image to the same size
    color_mode='rgb',      # must be specified for color images
    seed=42,
    shuffle=True,
    class_mode="categorical")
data_test = datagen.flow_from_dataframe(
    dataframe=df,
    directory=r"C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\images",
    x_col="id",
    y_col="label",
    subset="validation",
    batch_size=32,
    target_size=(64, 64),
    color_mode='rgb',
    seed=42,
    shuffle=True,
    class_mode="categorical")
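# base_model is assumed to be a pre-trained network defined earlier (see the Keras Applications link above)
x = layers.Flatten()(base_model.output)        # flatten the base model's feature maps
x = layers.Dense(1, activation='sigmoid')(x)   # single sigmoid output unit
model = tf.keras.models.Model(base_model.input, x)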
Epoch 1/10
38/38 [==============================] - 23s 437ms/step - loss: 0.7749
- acc: 0.5000 - val_loss: 0.7174 - val_acc: 0.5000
Epoch 2/10
38/38 [==============================] - 14s 376ms/step - loss: 0.7110
- acc: 0.5000 - val_loss: 0.6989 - val_acc: 0.5000
Epoch 3/10
38/38 [==============================] - 15s 383ms/step - loss: 0.6974
- acc: 0.5000 - val_loss: 0.6944 - val_acc: 0.5000
Epoch 4/10
38/38 [==============================] - 14s 381ms/step - loss: 0.6941
- acc: 0.5000 - val_loss: 0.6934 - val_acc: 0.5000
Epoch 5/10
38/38 [==============================] - 16s 414ms/step - loss: 0.6933
- acc: 0.5000 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 6/10
38/38 [==============================] - 14s 381ms/step - loss: 0.6932
- acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 7/10
38/38 [==============================] - 15s 383ms/step - loss: 0.6931
- acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 8/10
38/38 [==============================] - 15s 384ms/step - loss: 0.6931
- acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 9/10
38/38 [==============================] - 14s 378ms/step - loss: 0.6931
- acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 10/10
38/38 [==============================] - 14s 375ms/step - loss: 0.6931
- acc: 0.5000 - val_loss: 0.6931 - val_acc: 0.5000
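The log above settles at a loss of 0.6931 with accuracy fixed at 0.5000, which suggests the model is not learning. The small check below is only a sketch and assumes the model was compiled with binary cross-entropy (the compile call is not shown in this section). Under that assumption, 0.6931 is ln 2, the loss obtained when the network outputs 0.5 for every image. The helper name binary_cross_entropy is illustrative and not part of the notebook.

```python
import math

def binary_cross_entropy(y_true, p):
    """Binary cross-entropy for one example with label y_true and predicted probability p."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

p = 0.5  # a model that always answers "50/50"
loss_pos = binary_cross_entropy(1, p)  # loss on a positive example
loss_neg = binary_cross_entropy(0, p)  # loss on a negative example
print(round(loss_pos, 4), round(loss_neg, 4))  # 0.6931 0.6931
```

One thing that may be worth checking, as a possibility rather than a confirmed diagnosis: the generators use class_mode="categorical", which yields one-hot labels with two columns, while the model ends in Dense(1, activation='sigmoid'). Using class_mode="binary" with the single sigmoid unit, or a two-unit softmax layer with categorical labels, would make the labels and the output agree.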
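# Read the test image (flag 1 = color); OpenCV returns the channels in BGR order
imge = cv2.imread(r'C:\Users\tharu\OneDrive\Desktop\DataScience\Deep-Learning\emrgency vehicle data\train_SOaYf6m\car.jpg', 1)
# Convert to RGB so the channel order matches the images fed by the training generator
imge = cv2.cvtColor(imge, cv2.COLOR_BGR2RGB)
width = 64
height = 64
dim = (width, height)
resize = cv2.resize(imge, dim, interpolation=cv2.INTER_AREA)
x_vale = np.array(resize) / 255           # same rescaling as used for training
x_vale = x_vale.reshape(-1, 64, 64, 3)    # add the batch dimension
pred = model.predict(x_vale)
print(pred)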
[[0.49948993]]
RNN (Recurrent Neural Network)
RNNs work very well with sequence data, such as text (NLP) and time series.
A recurrent neural network is a neural network that is specialized for processing a
sequence of data x(1), . . . , x(τ), with the time step index t ranging from 1 to τ. For
tasks that involve sequential inputs, such as speech and language, it is often better to use
RNNs. In an NLP problem, if you want to predict the next word in a sentence, it is important
to know the words before it. RNNs are called recurrent because they perform the same task
for every element of a sequence, with the output depending on the previous
computations. Another way to think about RNNs is that they have a “memory” which
captures information about what has been calculated so far.
The left side of the above diagram shows a notation of an RNN and on the right side an RNN
being unrolled (or unfolded) into a full network. By unrolling we mean that we write out
the network for the complete sequence. For example, if the sequence we care about is a
sentence of 3 words, the network would be unrolled into a 3-layer neural network, one
layer for each word.
Input: x(t) is taken as the input to the network at time step t. For example, x(1) could be a
one-hot vector corresponding to a word of a sentence.
Hidden state: h(t) represents the hidden state at time t and acts as the “memory” of the
network. h(t) is calculated from the current input and the previous time step’s hidden
state: h(t) = f(U x(t) + W h(t−1)). The function f is a non-linear transformation such as
tanh or ReLU.
Weights: The RNN has input-to-hidden connections parameterized by a weight matrix U,
hidden-to-hidden recurrent connections parameterized by a weight matrix W, and
hidden-to-output connections parameterized by a weight matrix V. All of these weights
(U, V, W) are shared across time.
Output: o(t) is the output of the network at time step t. In the figure it is shown only as
an arrow after o(t), but it is often passed through a non-linearity as well, especially when
the network contains further layers downstream.
Forward Pass
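The definitions above can be sketched as a short NumPy loop. This is a minimal illustration, not a full implementation: it takes f to be tanh, assumes the output is o(t) = V h(t) with no bias terms, and uses small random matrices. The function name rnn_forward and all sizes are chosen for the example only.

```python
import numpy as np

def rnn_forward(xs, U, W, V, h0):
    """Run a vanilla RNN over a sequence; return all hidden states and outputs."""
    h = h0
    hidden_states, outputs = [], []
    for x in xs:                      # one iteration per time step t
        h = np.tanh(U @ x + W @ h)    # h(t) = f(U x(t) + W h(t-1)), with f = tanh
        o = V @ h                     # assumed output rule: o(t) = V h(t)
        hidden_states.append(h)
        outputs.append(o)
    return np.stack(hidden_states), np.stack(outputs)

rng = np.random.default_rng(0)
vocab_size, hidden_size, output_size = 5, 4, 5   # illustrative sizes
U = rng.normal(scale=0.1, size=(hidden_size, vocab_size))   # input-to-hidden
W = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden-to-hidden
V = rng.normal(scale=0.1, size=(output_size, hidden_size))  # hidden-to-output

# A "sentence" of 3 words, each encoded as a one-hot vector
word_ids = [0, 3, 1]
xs = np.eye(vocab_size)[word_ids]

hs, outs = rnn_forward(xs, U, W, V, h0=np.zeros(hidden_size))
print(hs.shape, outs.shape)   # (3, 4) (3, 5)
```

The same U, W and V are reused at every time step, which is what "shared across time" means. Unrolling the loop for the 3-word sentence gives the 3-layer network described earlier.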
• For the vanishing gradient problem, the further back through the network the error is
propagated, the smaller the gradient becomes and the harder it is to train the weights.
This has a domino effect on all of the earlier weights in the network, and it was the
main roadblock to using Recurrent Neural Networks.
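The shrinking gradient can be seen in a toy calculation. The sketch below assumes a scalar RNN h(t) = tanh(w h(t−1) + u x(t)) with hand-picked values w = 0.5 and u = 1.0, and multiplies the per-step derivatives w (1 − h(t)²) given by the chain rule. The function name gradient_through_time is illustrative.

```python
import math

def gradient_through_time(w, u, xs, h0=0.0):
    """Return |d h(t) / d h(0)| for each step t of a scalar RNN h(t) = tanh(w*h(t-1) + u*x(t))."""
    h, grad, grads = h0, 1.0, []
    for x in xs:
        h = math.tanh(w * h + u * x)
        grad *= w * (1.0 - h * h)   # chain rule: d tanh(a)/da = 1 - tanh(a)^2
        grads.append(abs(grad))
    return grads

grads = gradient_through_time(w=0.5, u=1.0, xs=[1.0] * 30)
for t in (1, 5, 10, 20, 30):
    print(f"t={t:2d}  |dh(t)/dh(0)| = {grads[t - 1]:.3e}")
```

Each factor here is at most 0.5 in magnitude, so the product shrinks at least geometrically with the number of time steps. In this toy setting, an input 30 steps back has almost no influence on the gradient.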
----------------------------------------------------------------------
-----
TypeError Traceback (most recent call
last)
<ipython-input-22-86a4cc37efd5> in <module>
----> 1 'srtt'-'dfdddf'