

Creating and Training Custom Layers in TensorFlow 2
Learning to create your own custom layers and training them in TensorFlow 2

Arjun Sarkar Jun 24 · 8 min read

1. Previously we’ve seen how to create custom loss functions — Creating custom Loss
functions using TensorFlow 2

2. Next, I wrote about creating custom Activation Functions using Lambda layers —
Creating Custom Activation Functions with Lambda Layers in TensorFlow 2

This is the third part of the series, where we create custom Dense Layers and train them
in TensorFlow 2.

Introduction:
Lambda layers are simple layers in TensorFlow that can be used to create custom activation functions. But Lambda layers have many limitations, especially when it comes to training. So the idea is to build custom layers that are trainable, by inheriting from the Keras Layer class in TensorFlow, with a special focus on Dense layers.

What is a Layer?
Figure 1. Layer — Dense Layer representation (Source: image created by Author)

A layer is a class that receives some input, holds some state, performs a computation, and produces an output, as required by the neural network. Every model architecture consists of multiple layers, whether it is built with the Sequential or the Functional API.

State: mostly trainable values that are updated during ‘model.fit’. In a Dense layer, the state consists of the weights and the bias, as shown in Figure 1. These values are updated as the model trains, to give better results. In some layers, the state can also contain non-trainable variables.
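For example, here is a minimal sketch (not part of the article's layer, names are illustrative) showing trainable and non-trainable state side by side:

import tensorflow as tf

# Trainable state: updated by the optimizer during model.fit
kernel = tf.Variable(tf.random.normal((4, 2)), trainable=True, name="kernel")

# Non-trainable state: lives in the layer but is never touched by gradients
# (the moving mean/variance in BatchNormalization are a real-world example)
steps_seen = tf.Variable(0, trainable=False, name="steps_seen")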

Computation: transforms a batch of input data into a batch of output data. This is the part of the layer where the calculation takes place. A Dense layer computes

Y = w*X + c, and returns Y,

where Y is the output, X is the input, w are the weights, and c is the bias.
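To make this concrete, here is a minimal sketch of the Dense computation on a toy batch, using plain TensorFlow ops (the shapes and values are illustrative):

import tensorflow as tf

X = tf.constant([[1.0, 2.0, 3.0]])   # a batch of 1 input with 3 features
w = tf.ones((3, 2))                  # weights mapping 3 features to 2 units
c = tf.zeros((2,))                   # one bias term per unit

Y = tf.matmul(X, w) + c              # the Dense computation
print(Y)                             # tf.Tensor([[6. 6.]], shape=(1, 2), dtype=float32)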

Creating a custom Dense Layer:


Now that we know what happens inside Dense layers, let’s see how we can create our
own Dense layer and use it in a model.



import tensorflow as tf
from tensorflow.keras.layers import Layer


class SimpleDense(Layer):

    def __init__(self, units=32):
        '''Initializes the instance attributes'''
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        '''Create the state of the layer (weights)'''
        # initialize the weights
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
                             initial_value=w_init(shape=(input_shape[-1], self.units),
                                                  dtype='float32'),
                             trainable=True)

        # initialize the biases
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
                             initial_value=b_init(shape=(self.units,), dtype='float32'),
                             trainable=True)

    def call(self, inputs):
        '''Defines the computation from inputs to outputs'''
        return tf.matmul(inputs, self.w) + self.b

Explanation of the code above: the class is named SimpleDense. When we create a custom layer, we have to inherit from Keras's Layer class. This is done in the line ‘class SimpleDense(Layer)’.

‘__init__’ is the first method in the class and initializes it: it accepts parameters and stores them as attributes that can be used within the class. Because the class inherits from ‘Layer’, the parent class also needs to be initialized, which is done with the ‘super’ call. ‘units’ is a local class variable, analogous to the number of units in a Dense layer. The default value is set to 32, but it can always be changed when the class is instantiated.

‘build’ is the next method in the class. It is used to create the states. In the Dense layer, the two states required are ‘w’ and ‘b’, the weights and biases. When the Dense layer is created, we are not creating just one neuron of the network's hidden layer but multiple neurons in one go (in this case, 32 neurons will be created). Every neuron in the layer needs to be initialized with some starting weight and bias values. TensorFlow contains many built-in functions to initialize these values.
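For instance, a quick sketch of a few of these built-in initializers (the Glorot initializer is shown purely as an extra illustration; only the first two are used in the layer above):

import tensorflow as tf

w_init = tf.random_normal_initializer()         # normal distribution (weights below)
b_init = tf.zeros_initializer()                 # all zeros (biases below)
g_init = tf.keras.initializers.GlorotUniform()  # another common choice

print(w_init(shape=(2, 2)))   # small random values
print(b_init(shape=(2,)))     # [0. 0.]
print(g_init(shape=(2, 2)))   # scaled uniform values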

For initializing the weights, we use TensorFlow's ‘random_normal_initializer’ function, which initializes the weights randomly from a normal distribution. ‘self.w’ holds the weight state as a tensor variable, initialized using ‘w_init’. The weight values are stored in ‘float32’ format. The variable is set as trainable, which means that after every run these initial weights will be updated according to the loss function and the optimizer. The name ‘kernel’ is added so that it can easily be traced later.

For initializing the biases, TensorFlow's ‘zeros_initializer’ function is used. It sets all the initial bias values to zero. ‘self.b’ is a tensor whose size matches the number of units (here 32), and each of these 32 bias terms is initially set to zero. It is also set as trainable, so the bias terms will update once training starts. The name ‘bias’ is added to be able to trace it later.

‘call’ is the last method and performs the computation. In this case, as it is a Dense layer, it multiplies the inputs by the weights, adds the bias, and returns the output. The ‘matmul’ operation is used because the inputs and ‘self.w’ are tensors, not single numerical values.

# declare an instance of the class
my_dense = SimpleDense(units=1)

# define an input and feed it into the layer
x = tf.ones((1, 1))
y = my_dense(x)

# parameters of the base Layer class like `variables` can be used
print(my_dense.variables)

Output:

[<tf.Variable 'simple_dense/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[0.00382898]], dtype=float32)>,
 <tf.Variable 'simple_dense/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>]

Explanation of the code above: the first line creates a Dense layer containing just one neuron (units=1). x (the input) is a tensor of shape (1, 1) with the value 1. Calling y = my_dense(x) builds the Dense layer and initializes its state. ‘.variables’ lets us look at the values initialized inside the layer (the weights and biases).

The output of ‘my_dense.variables’ is shown below the code block. It shows that ‘simple_dense’ holds two variables called ‘kernel’ and ‘bias’. The kernel ‘w’ is initialized with the value 0.0038, drawn from a random normal distribution, and the bias ‘b’ is initialized with the value 0. This is just the initial state of the layer; once trained, these values will change.
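Since ‘w’ and ‘b’ are ordinary attributes of the layer, the same state can also be inspected directly; a quick sketch:

# read each variable individually as a NumPy array
print(my_dense.w.numpy())   # e.g. [[0.00382898]]
print(my_dense.b.numpy())   # [0.]

# or fetch all weights at once via the base Layer API
kernel, bias = my_dense.get_weights()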

import numpy as np

# define the dataset
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

# use the Sequential API to build a model with our custom layer
my_layer = SimpleDense(units=1)
model = tf.keras.Sequential([my_layer])

# configure and train the model
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(xs, ys, epochs=500, verbose=0)

# perform inference
print(model.predict([10.0]))

# see the updated state of the variables
print(my_layer.variables)

Output:

[[18.981567]]

[<tf.Variable 'sequential/simple_dense_1/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[1.9973286]], dtype=float32)>,
 <tf.Variable 'sequential/simple_dense_1/bias:0' shape=(1,) dtype=float32, numpy=array([-0.99171764], dtype=float32)>]

Explanation of the code above: this is a very simple way to check that the custom layer works. The inputs and outputs are set (the data follows y = 2x - 1), the model is compiled with the custom layer, and it is finally trained for 500 epochs. What is important to see is that after training the model, the values of the weights and biases have changed. The weight that was initially 0.0038 is now 1.9973, and the bias that was initially zero is now -0.9917, close to the true values of 2 and -1.
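As a quick sanity check, we can reproduce the model's prediction by hand from the trained variables (a sketch using the names defined above):

# the data follows y = 2x - 1, so the learned parameters should be close to 2 and -1
w_learned = my_layer.variables[0].numpy()[0][0]   # ~1.9973
b_learned = my_layer.variables[1].numpy()[0]      # ~-0.9917

# reproduce the model's prediction for x = 10 by hand
print(w_learned * 10.0 + b_learned)               # ~18.98, close to 2*10 - 1 = 19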

Adding an Activation Function to the Custom Dense Layer:


Previously we created the custom Dense layer, but we did not add any activation to it. Of course, we could write the activation as a separate line in the model, or add it as a Lambda layer. But how do we implement the activation inside the same custom layer we created above?

The answer is a simple tweak to the ‘__init__’ and ‘call’ methods of the custom Dense layer.

class SimpleDense(Layer):

    # add an activation parameter
    def __init__(self, units=32, activation=None):
        super(SimpleDense, self).__init__()
        self.units = units

        # get the activation from Keras's built-in activations
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
                             initial_value=w_init(shape=(input_shape[-1], self.units),
                                                  dtype='float32'),
                             trainable=True)

        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
                             initial_value=b_init(shape=(self.units,), dtype='float32'),
                             trainable=True)

        super().build(input_shape)

    def call(self, inputs):
        # pass the computation through the activation
        return self.activation(tf.matmul(inputs, self.w) + self.b)

Explanation of the code above: most of the code is identical to the code we used before.

To add the activation, we specify in ‘__init__’ that we want an activation. Either a string or an instance of an activation object can be passed as this argument. It defaults to None, so if no activation function is specified, no error is thrown and the layer stays linear. Next, we resolve the activation function with ‘tf.keras.activations.get(activation)’.

The final edit is in the ‘call’ method, where the computation of the weights and biases is wrapped in ‘self.activation’. So the return value is now the computation passed through the activation.
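A quick sketch of what ‘tf.keras.activations.get’ does with each kind of argument (the values shown are illustrative):

import tensorflow as tf

relu = tf.keras.activations.get('relu')            # resolve from a string
also_relu = tf.keras.activations.get(tf.nn.relu)   # a callable passes through
linear = tf.keras.activations.get(None)            # None -> identity (linear)

x = tf.constant([-1.0, 2.0])
print(relu(x))     # [0. 2.]
print(linear(x))   # [-1. 2.]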

Complete code of the custom Dense layer with activation on the MNIST dataset:



import tensorflow as tf
from tensorflow.keras.layers import Layer


class SimpleDense(Layer):

    def __init__(self, units=32, activation=None):
        super(SimpleDense, self).__init__()
        self.units = units

        # get the activation from Keras's built-in activations
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
                             initial_value=w_init(shape=(input_shape[-1], self.units),
                                                  dtype='float32'),
                             trainable=True)

        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
                             initial_value=b_init(shape=(self.units,), dtype='float32'),
                             trainable=True)

        super().build(input_shape)

    def call(self, inputs):
        # pass the computation through the activation
        return self.activation(tf.matmul(inputs, self.w) + self.b)


# load and normalize the MNIST dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# build the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    # our custom Dense layer with activation
    SimpleDense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

# compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# fit and evaluate the model
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

Training the model with our custom Dense layer and activation gives a training accuracy
of 97.8% and a validation accuracy of 97.7%.

Conclusion:
This is the way to create custom layers in TensorFlow. Even though we only saw a Dense layer at work, it can easily be replaced by any other layer, such as a Quadratic layer, which does the following computation:

It has three state variables, a, b, and c.

Computation: Y = a*X² + b*X + c
Replacing the Dense Layer with a Quadratic layer:

import tensorflow as tf
from tensorflow.keras.layers import Layer


class SimpleQuadratic(Layer):

    def __init__(self, units=32, activation=None):
        '''Initializes the class and sets up the internal variables'''
        super(SimpleQuadratic, self).__init__()
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        '''Create the state of the layer (weights)'''
        a_init = tf.random_normal_initializer()
        a_init_val = a_init(shape=(input_shape[-1], self.units), dtype='float32')
        self.a = tf.Variable(initial_value=a_init_val, trainable=True)

        b_init = tf.random_normal_initializer()
        b_init_val = b_init(shape=(input_shape[-1], self.units), dtype='float32')
        self.b = tf.Variable(initial_value=b_init_val, trainable=True)

        c_init = tf.zeros_initializer()
        c_init_val = c_init(shape=(self.units,), dtype='float32')
        self.c = tf.Variable(initial_value=c_init_val, trainable=True)

    def call(self, inputs):
        '''Defines the computation from inputs to outputs'''
        x_squared = tf.math.square(inputs)
        x_squared_times_a = tf.matmul(x_squared, self.a)
        x_times_b = tf.matmul(inputs, self.b)
        x2a_plus_xb_plus_c = x_squared_times_a + x_times_b + self.c
        return self.activation(x2a_plus_xb_plus_c)

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    SimpleQuadratic(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

This Quadratic layer gives a validation accuracy of 97.8% on the MNIST dataset.

Thus, we can implement our own layers, along with the desired activations, in TensorFlow models, and potentially even improve overall accuracy.
