Creating and Training Custom Layers in TensorFlow 2 - by Arjun Sarkar - Towards Data Science
1. Previously we’ve seen how to create custom loss functions — Creating custom Loss
functions using TensorFlow 2
2. Next, I wrote about creating custom Activation Functions using Lambda layers —
Creating Custom Activation Functions with Lambda Layers in TensorFlow 2
This is the third part of the series, where we create custom Dense Layers and train them
in TensorFlow 2.
Introduction:
Lambda layers are simple layers in TensorFlow that can be used to create custom
activation functions. But Lambda layers have many limitations, especially when it comes
to training them. So the idea is to create custom layers that are trainable, using
the inheritable Keras Layer class in TensorFlow, with a special focus on Dense layers.
https://towardsdatascience.com/creating-and-training-custom-layers-in-tensorflow-2-6382292f48c2 1/11
8/23/2021 Creating and Training Custom Layers in TensorFlow 2 | by Arjun Sarkar | Towards Data Science
What is a Layer?
A layer is a class that receives input, combines it with its state (variables) in some
computation, and produces an output, as required by the neural network. Every model
architecture consists of multiple layers, whether built with the Sequential or the Functional API.
State — Mostly trainable values which are updated during ‘model.fit’. In a Dense layer,
the state consists of the weights and the bias, as shown in Figure 1. These values are
updated to give better results as the model trains. In some layers, the state can also
contain non-trainable values.
import tensorflow as tf
from tensorflow.keras.layers import Layer

class SimpleDense(Layer):
    def __init__(self, units=32):
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
            initial_value=w_init(shape=(input_shape[-1], self.units),
                                 dtype='float32'), trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
Explanation of the code above — The class is named SimpleDense. When we create a
custom layer, we have to inherit from Keras’s Layer class. This is done in the line ‘class
SimpleDense(Layer)’.
‘__init__’ is the first method in the class and initializes it. ‘__init__’ accepts
parameters and stores them as attributes that can be used within the class. Since the
class inherits from ‘Layer’, the parent class also needs to be initialized; this is done
using the ‘super’ keyword. ‘units’ is a local class variable, analogous to the number of
units in the Dense layer. The default value is set to 32, but it can always be changed
when the class is instantiated.
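The inheritance-plus-super pattern can be seen in miniature with plain Python classes (the names here are invented purely for illustration):

```python
class Base:
    def __init__(self):
        self.built = False

class Child(Base):
    def __init__(self, units=32):
        super(Child, self).__init__()   # initialize the parent class first
        self.units = units              # then store our own parameters

c = Child()           # uses the default units
d = Child(units=64)   # overridden at instantiation time
print(c.units, d.units, c.built)  # 32 64 False
```

SimpleDense follows exactly this shape: the ‘super’ call sets up everything the Keras Layer base class needs before the subclass stores its own configuration.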
‘build’ is the next method in the class. It is used to create the states. In the Dense
layer, the two states required are ‘w’ and ‘b’, for the weights and the biases. When the
Dense layer is created, we are not creating just one neuron of the network’s hidden layer,
but multiple neurons in one go (in this case, 32 neurons). Every neuron in the layer needs
to be initialized with some random weight and bias values, and TensorFlow contains many
built-in functions to initialize these values. For the weights, ‘random_normal_initializer’
is used here, which draws the initial values from a normal distribution.
For initializing the biases, TensorFlow’s ‘zeros_initializer’ function is used. This sets all
the initial bias values to zero. ‘self.b’ is a tensor whose size equals the number of units
(here 32), and each of these 32 bias terms starts at zero. It is also marked ‘trainable’,
so the bias terms will update as training starts. The name ‘bias’ is added to be able to
trace it later.
‘call’ is the last method and performs the computation. In this case, as it is a Dense layer,
it multiplies the inputs by the weights, adds the bias, and returns the output.
The ‘matmul’ operation is used because self.w and self.b are tensors, not single numerical
values.
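To make the role of ‘matmul’ concrete, here is the Dense computation written out in plain NumPy on made-up numbers (the shapes and values are illustrative only, not from the article):

```python
import numpy as np

# one sample with 2 features, a layer with 3 units (values made up)
x = np.array([[1.0, 2.0]])             # input, shape (1, 2)
w = np.array([[0.5, -1.0, 2.0],
              [1.0,  0.0, 0.5]])       # kernel, shape (2, 3): one column per unit
b = np.array([0.1, 0.2, 0.3])          # bias, shape (3,): one entry per unit

# the Dense layer's computation: matrix product plus broadcast bias
y = x @ w + b
print(y)  # [[ 2.6 -0.8  3.3]]
```

Each output column is one unit’s weighted sum of the inputs plus its bias, which is exactly what ‘tf.matmul(inputs, self.w) + self.b’ computes for a whole batch at once.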
my_dense = SimpleDense(units=1)
x = tf.ones((1, 1))
y = my_dense(x)
print(my_dense.variables)
Output:
Explanation of the code above — The first line creates a Dense layer containing just one
neuron (units=1). x (the input) is a tensor of shape (1, 1) holding the value 1. y =
my_dense(x) calls the layer on the input, which builds it and initializes its states.
‘.variables’ lets us look at the values initialized inside the layer (the weights and biases).
The output of ‘my_dense.variables’ is shown below the code block. It shows that there are
two variables in ‘simple_dense’, called ‘kernel’ and ‘bias’. The kernel ‘w’ is initialized
with the value 0.0038, drawn from a random normal distribution, and the bias ‘b’ is
initialized with the value 0. This is just the initial state of the layer. Once trained,
these values will change accordingly.
import numpy as np

# training data (assumed here: the classic y = 2x - 1 example, which
# matches the trained weight of about 2 and bias of about -1 reported below)
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

# use the Sequential API to build a model with our custom layer
my_layer = SimpleDense(units=1)
model = tf.keras.Sequential([my_layer])
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(xs, ys, epochs=500, verbose=0)

# perform inference
print(model.predict([10.0]))
print(my_layer.variables)
Output:
[[18.981567]]
Explanation of the code above — The code above is a very simple way to check whether
the custom layer works. The inputs and outputs are set, the model is compiled using the
custom layer, and it is trained for 500 epochs. What is important to see is that after
training the model, the values of the weights and biases have changed. The weight,
which was initially set to 0.0038, is now 1.9973, and the bias, which was initially set to
zero, is now -0.9917.
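The same fit can be sketched without Keras at all: a few lines of NumPy gradient descent on assumed y = 2x - 1 data (the article’s actual xs and ys are not shown) drive the weight toward 2 and the bias toward -1, just like ‘model.fit’ did:

```python
import numpy as np

# assumed training data following y = 2x - 1
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

w, b = 0.0, 0.0      # initial state, like a freshly built layer
lr = 0.01            # learning rate
for _ in range(5000):
    err = w * xs + b - ys
    # gradients of the mean squared error with respect to w and b
    w -= lr * 2.0 * np.mean(err * xs)
    b -= lr * 2.0 * np.mean(err)

print(round(w, 3), round(b, 3))  # ≈ 2.0 and -1.0
```

This is the whole story of ‘trainable=True’: any variable flagged that way gets nudged by exactly this kind of gradient update on every training step.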
How do we add an activation function to this layer? The answer is a simple tweak to the
‘__init__’ and the ‘call’ methods of the custom Dense layer.
class SimpleDense(Layer):
    def __init__(self, units=32, activation=None):
        super(SimpleDense, self).__init__()
        self.units = units
        # get the activation from the built-in activation layers in Keras
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
            initial_value=w_init(shape=(input_shape[-1], self.units),
                                 dtype='float32'), trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)
        super().build(input_shape)

    def call(self, inputs):
        return self.activation(tf.matmul(inputs, self.w) + self.b)
Explanation of the code above — Most of the code is identical to the code we used
before.
To add the activation, we specify in ‘__init__’ that we need an activation. Either a
string or an instance of an activation object can be passed in. It defaults to None, so
if no activation function is mentioned, no error is thrown. Next, we initialize the
activation function with ‘tf.keras.activations.get(activation)’.
The final edit is in the ‘call’ method, where the weights-times-inputs-plus-bias
computation is wrapped in self.activation. So now the return value is the computation
passed through the activation.
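The string-to-callable lookup that ‘tf.keras.activations.get’ performs can be mimicked with a tiny dictionary. This is a sketch, not the Keras internals, and the function name is invented; it also shows why mapping None to the identity function is convenient:

```python
import numpy as np

# a minimal stand-in for the Keras activation lookup (illustrative only)
def get_activation(spec):
    if callable(spec):          # an activation function passes straight through
        return spec
    table = {
        'relu': lambda x: np.maximum(0.0, x),
        'sigmoid': lambda x: 1.0 / (1.0 + np.exp(-x)),
        None: lambda x: x,      # no activation -> identity, so 'call' never breaks
    }
    return table[spec]

z = np.array([-2.0, 0.0, 3.0])
print(get_activation('relu')(z))   # [0. 0. 3.]
print(get_activation(None)(z))     # [-2.  0.  3.]
```

Because None resolves to the identity, the same ‘call’ method works whether or not an activation was requested.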
Complete code of the custom Dense layer with activation on the MNIST dataset:
class SimpleDense(Layer):
    def __init__(self, units=32, activation=None):
        super(SimpleDense, self).__init__()
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
            initial_value=w_init(shape=(input_shape[-1], self.units),
                                 dtype='float32'), trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
            initial_value=b_init(shape=(self.units,), dtype='float32'),
            trainable=True)
        super().build(input_shape)

    def call(self, inputs):
        return self.activation(tf.matmul(inputs, self.w) + self.b)

# load and normalize MNIST (standard preprocessing, restored here)
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    SimpleDense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# epoch count assumed; the article does not show the fit call
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
model.evaluate(x_test, y_test)
Training the model with our custom Dense layer and activation gives a training accuracy
of 97.8% and a validation accuracy of 97.7%.
Conclusion:
This is the way to create custom layers in TensorFlow. Even though we only saw the
working of a Dense layer, it can easily be replaced by any other layer, such as a
Quadratic layer, which performs the following computation:
y = activation(x² · a + x · b + c)
where a, b, and c are the trainable states and x is the input.
import tensorflow as tf
from tensorflow.keras.layers import Layer

class SimpleQuadratic(Layer):
    def __init__(self, units=32, activation=None):
        super(SimpleQuadratic, self).__init__()
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        a_init = tf.random_normal_initializer()
        a_init_val = a_init(shape=(input_shape[-1], self.units),
                            dtype='float32')
        self.a = tf.Variable(initial_value=a_init_val, trainable=True)
        b_init = tf.random_normal_initializer()
        b_init_val = b_init(shape=(input_shape[-1], self.units),
                            dtype='float32')
        self.b = tf.Variable(initial_value=b_init_val, trainable=True)
        c_init = tf.zeros_initializer()
        c_init_val = c_init(shape=(self.units,), dtype='float32')
        self.c = tf.Variable(initial_value=c_init_val, trainable=True)

    def call(self, inputs):
        x_squared = tf.math.square(inputs)
        x_squared_times_a = tf.matmul(x_squared, self.a)
        x_times_b = tf.matmul(inputs, self.b)
        x2a_plus_xb_plus_c = x_squared_times_a + x_times_b + self.c
        return self.activation(x2a_plus_xb_plus_c)
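The forward pass of this quadratic layer can be checked on paper with a small NumPy example (the shapes and numbers here are made up for illustration):

```python
import numpy as np

# one sample with 2 features, 3 units (illustrative values)
x = np.array([[1.0, 2.0]])
a = np.full((2, 3), 0.5)   # quadratic-term weights
b = np.full((2, 3), 1.0)   # linear-term weights
c = np.zeros(3)            # biases

# x²·a + x·b + c, mirroring the 'call' method's computation
y = (x ** 2) @ a + x @ b + c
print(y)  # [[5.5 5.5 5.5]]
```

Per unit: the squared inputs contribute 0.5·(1 + 4) = 2.5 and the linear terms contribute 1 + 2 = 3, giving 5.5 before any activation is applied.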
# load and normalize MNIST (standard preprocessing, restored here)
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    SimpleQuadratic(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# epoch count assumed; the article does not show the fit call
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
model.evaluate(x_test, y_test)
This Quadratic layer gives a validation accuracy of 97.8% on the MNIST dataset.
Thus, we see that we can implement our own layers, along with the desired activations, in
TensorFlow models to modify or perhaps even improve overall accuracy.