Image Generator - Drawing Cartoons with Generative Adversarial Networks

Generating Simpsons with DCGANs

Greg Surma


Feb 10 · 8 min read

In today’s article, we are going to implement a machine learning model that can generate an infinite number of alike image samples based on a given dataset. In order to do so, we are going to demystify Generative Adversarial Networks (GANs) and feed them with a dataset containing characters from ‘The Simpsons’. By the end of this article, you will be familiar with the basics behind GANs and you will be able to build a generative model on your own!

To get a better idea about GANs’ capabilities, take a look at the following example of the Homer Simpson evolution during the training process.


Fascinating, right?

Let’s dive into some theory to get a better understanding of how it actually works.

Generative Adversarial Networks (GANs)


Let’s start our GAN journey with defining a problem that we are going to solve.

We would like to provide a set of images as an input, and generate samples based on
them as an output.

Input Images -> GAN -> Output Samples

Given this problem definition, GANs fall into the Unsupervised Learning bucket because we are not going to feed the model with any expert knowledge (such as labels in a classification task).

The idea of generating samples based on a given dataset without any human supervision
sounds very promising.

Let’s find out how it is possible with GANs!


. . .

The underlying idea behind GAN is that it contains two neural networks that compete against each other in a zero-sum game framework, i.e. a generator and a discriminator.

Generator
The Generator takes random noise as an input and generates samples as an output. Its goal is to generate samples that will fool the Discriminator into thinking that it is seeing real images while it is actually seeing fakes. We can think of the Generator as a counterfeiter.

Discriminator
The Discriminator takes both real images from the input dataset and fake images from the Generator and outputs a verdict on whether a given image is legit or not. We can think of the Discriminator as a policeman trying to catch the bad guys while letting the good guys go free.

Minimax Representation
If we think once again about the Discriminator’s and the Generator’s goals, we can see that they are opposing each other: the Discriminator’s success is the Generator’s failure and vice versa. That is why we can represent the GAN framework as a Minimax game rather than as an optimization problem.
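For reference, the standard GAN value function (as introduced by Goodfellow et al., 2014) can be written as

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

where the Discriminator D tries to maximize the value function (classifying reals and fakes correctly) and the Generator G tries to minimize it.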


(source: http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture13.pdf)


GANs are designed to reach a Nash equilibrium at which each player cannot reduce their cost without changing the other players’ parameters.

For those of you who are familiar with Game Theory and the Minimax algorithm, this idea will seem more comprehensible. For those who are not, I recommend checking my previous article, which covers the Minimax basics.

Tic Tac Toe — Creating Unbeatable AI


Introduction to Minimax Algorithm

towardsdatascience.com

Data Flow and Backpropagation

While the Minimax representation of two adversarial networks competing with each other seems reasonable, we still don’t know how to make them improve themselves to ultimately transform random noise into a realistic-looking image.

From random noise to a realistic-looking image.

Let’s start with the Discriminator.

It gets both real images and fake ones and tries to tell whether they are legit or not. We, as the system designers, know whether they came from the dataset (reals) or from the Generator (fakes). We can use this information to label them accordingly and perform classic backpropagation, allowing the Discriminator to learn over time and get better at distinguishing images. If the Discriminator correctly classifies fakes as fakes and reals as reals, we can reward it with positive feedback in the form of a loss gradient. If it fails at its job, it gets negative feedback. This mechanism allows it to learn and get better.
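To make the labeling idea concrete, here is a minimal NumPy sketch (the names and numbers are illustrative only, not taken from the project’s codebase): reals are labeled 1, fakes are labeled 0, and binary cross-entropy provides the feedback signal.

import numpy as np

def binary_cross_entropy(predictions, labels, eps=1e-7):
    # Average binary cross-entropy over a batch of probabilities.
    predictions = np.clip(predictions, eps, 1 - eps)
    return -np.mean(labels * np.log(predictions) + (1 - labels) * np.log(1 - predictions))

# Hypothetical Discriminator outputs (probability of "real") for one batch.
d_out_on_reals = np.array([0.9, 0.8, 0.7])  # images taken from the dataset
d_out_on_fakes = np.array([0.3, 0.2, 0.4])  # images produced by the Generator

# We know the ground truth, so reals get label 1 and fakes get label 0.
d_loss = binary_cross_entropy(d_out_on_reals, np.ones(3)) + \
         binary_cross_entropy(d_out_on_fakes, np.zeros(3))
# d_loss is what would be backpropagated through the Discriminator's weights.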

. . .

Now let’s move on to the Generator.

It takes random noise as input and samples the output in order to fool the Discriminator into thinking it’s a real image. Once the Generator’s output goes through the Discriminator, we know the Discriminator’s verdict: whether it thinks it was a real image or a fake one. We can use this information to feed the Generator and perform backpropagation again. If the Discriminator identifies the Generator’s output as real, it means that the Generator did a good job and it should be rewarded. On the other hand, if the Discriminator recognizes that it was given a fake, it means that the Generator failed and it should be punished with negative feedback.
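Continuing the same toy NumPy sketch (again, purely illustrative values): the Generator is scored on the Discriminator’s verdict for its fakes, but against the “real” label.

import numpy as np

# Hypothetical Discriminator verdicts (probability of "real") on a batch of generated images.
d_out_on_fakes = np.array([0.3, 0.2, 0.4])

# The Generator wants D(G(z)) -> 1, so its loss is cross-entropy against the label 1.
g_loss = -np.mean(np.log(np.clip(d_out_on_fakes, 1e-7, 1.0)))

# Low g_loss: the fakes fooled the Discriminator (reward).
# High g_loss: the Generator gets "punished" through larger gradients.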

. . .

If you think about it for a while, you’ll realize that with the above approach we’ve tackled the Unsupervised Learning problem by combining Game Theory, Supervised Learning and a bit of Reinforcement Learning.

. . .

GAN data flow can be represented as in the following diagram.

(source: https://www.oreilly.com/ideas/deep-convolutional-generative-adversarial-networks-with-tensorflow)

And with some underlying math.


(source: https://medium.com/@jonathan_hui/gan-whats-generative-adversarial-networks-and-its-application-f39ed278ef09)
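In the usual notation (a standard formulation, not necessarily the exact rendering of the embedded image), the two losses being minimized are

L_D = -\mathbb{E}_{x \sim p_{data}}[\log D(x)] - \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]

L_G = -\mathbb{E}_{z \sim p_z}[\log D(G(z))]

where L_G is the non-saturating generator loss, which is also what the implementation below uses: the Generator’s fakes are scored against the “real” label.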

I hope you are not scared by the above equations; they will definitely get more comprehensible as we move on to the actual GAN implementation.

Image Generator (DCGAN)


As always, you can find the full codebase for the Image Generator project on GitHub.
Everything is contained in a single Jupyter notebook that you can run on a platform of
your choice. For more info about the dataset check simspons_dataset.txt. I encourage
you to check it and follow along.

gsurma/image_generator
DCGAN image generator 🖼. Contribute to gsurma/image_generator
development by creating an account on GitHub.
github.com

. . .

Since we are going to deal with image data, we have to find a way to represent it effectively. This can be achieved with Deep Convolutional Neural Networks, hence the name - DCGAN.

Model
In our project, we are going to use a well-tested model architecture proposed by Radford et al. (2015), which can be seen below.

You can find my TensorFlow implementation of this model here in the discriminator and
generator functions.

As you can see in the above visualization, the Generator and the Discriminator have almost the same architectures, but reflected. We won’t dive deeper into the CNN aspects of this topic, but if you are curious about the underlying details, feel free to check the following article.

Image Classifier - Cats🐱 vs Dogs🐶


Leveraging Convolutional Neural Networks (CNNs) and Google Colab’s
Free GPU
towardsdatascience.com
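To make the “reflected” structure more tangible, here is a heavily simplified TensorFlow 1.x sketch of a DCGAN-style generator for 128x128 RGB outputs. The layer sizes and names are illustrative assumptions, not a copy of the project’s generator function.

import tensorflow as tf  # TensorFlow 1.x

def generator_sketch(z, output_channels=3, is_train=True, reuse=False):
    # Project the noise vector to a small spatial volume, then repeatedly
    # upsample with strided transposed convolutions: 4 -> 8 -> 16 -> 32 -> 64 -> 128.
    with tf.variable_scope("generator", reuse=reuse):
        x = tf.layers.dense(z, 4 * 4 * 1024)
        x = tf.reshape(x, (-1, 4, 4, 1024))
        for filters in (512, 256, 128, 64):
            x = tf.layers.conv2d_transpose(x, filters, kernel_size=5, strides=2, padding="same")
            x = tf.layers.batch_normalization(x, training=is_train)
            x = tf.nn.relu(x)
        # Final upsampling step maps to RGB and squashes pixel values into [-1, 1].
        x = tf.layers.conv2d_transpose(x, output_channels, kernel_size=5, strides=2, padding="same")
        return tf.nn.tanh(x)

The Discriminator is essentially this pipeline mirrored: strided regular convolutions (typically with leaky ReLU activations) shrink the 128x128 image back down to a single real/fake logit.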

Loss Functions
In order for our Discriminator and Generator to learn over time, we need to provide loss
functions that will allow backpropagation to take place.

1   def model_loss(input_real, input_z, output_channel_dim):
2       g_model = generator(input_z, output_channel_dim, True)
3
4       noisy_input_real = input_real + tf.random_normal(shape=tf.shape(input_real),
5                                                        mean=0.0,
6                                                        stddev=random.uniform(0.0, 0.1),
7                                                        dtype=tf.float32)
8
9       d_model_real, d_logits_real = discriminator(noisy_input_real, reuse=False)
10      d_model_fake, d_logits_fake = discriminator(g_model, reuse=True)
11
12      d_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=d_logits_real,
13                                                                           labels=tf.ones_like(d_model_real) * 0.9))  # one-sided label smoothing
14      d_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=d_logits_fake,
15                                                                           labels=tf.zeros_like(d_model_fake)))
16      d_loss = tf.reduce_mean(0.5 * (d_loss_real + d_loss_fake))
17      g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=d_logits_fake,
18                                                                      labels=tf.ones_like(d_model_fake)))
19      return d_loss, g_loss


image_generator_losses.py hosted with ❤ by GitHub

While the above loss declarations are consistent with the theoretical explanations from the previous chapter, you may notice two extra things:

1. Gaussian noise added to the real input in line 4.

2. One-sided label smoothing for the real images recognized by the Discriminator in line 12.

You’ll notice that training GANs is notoriously hard because of the two loss functions (one for the Generator and one for the Discriminator), and getting a balance between them is key to good results.

Because it’s very common for the Discriminator to overpower the Generator, sometimes we need to weaken the Discriminator, and we are doing it with the above modifications. We’ll cover other techniques for achieving this balance later.

Optimizers
We are going to optimize our models with the following Adam optimizers.

def model_optimizers(d_loss, g_loss):
    t_vars = tf.trainable_variables()
    g_vars = [var for var in t_vars if var.name.startswith("generator")]
    d_vars = [var for var in t_vars if var.name.startswith("discriminator")]

    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    gen_updates = [op for op in update_ops if op.name.startswith('generator')]

    with tf.control_dependencies(gen_updates):
        d_train_opt = tf.train.AdamOptimizer(learning_rate=LR_D, beta1=BETA1).minimize(d_loss, var_list=d_vars)
        g_train_opt = tf.train.AdamOptimizer(learning_rate=LR_G, beta1=BETA1).minimize(g_loss, var_list=g_vars)
    return d_train_opt, g_train_opt
image_generator_optimizers.py hosted with ❤ by GitHub


Similarly to the declarations of the loss functions, we can also balance the Discriminator and the Generator with appropriate learning rates.

LR_D = 0.00004
LR_G = 0.0004
BETA1 = 0.5

As the above hyperparameters are very use-case specific, don’t hesitate to tweak them, but also remember that GANs are very sensitive to learning rate modifications, so tune them carefully.

Training
Finally, we can begin training.

1   def train(get_batches, data_shape, checkpoint_to_load=None):
2       input_images, input_z, lr_G, lr_D = model_inputs(data_shape[1:], NOISE_SIZE)
3       d_loss, g_loss = model_loss(input_images, input_z, data_shape[3])
4       d_opt, g_opt = model_optimizers(d_loss, g_loss)
5
6       with tf.Session() as sess:
7           sess.run(tf.global_variables_initializer())
8           epoch = 0
9           iteration = 0
10          d_losses = []
11          g_losses = []
12
13          for epoch in range(EPOCHS):
14              epoch += 1
15              start_time = time.time()
16
17              for batch_images in get_batches:
18                  iteration += 1
19                  batch_z = np.random.uniform(-1, 1, size=(BATCH_SIZE, NOISE_SIZE))
20                  _ = sess.run(d_opt, feed_dict={input_images: batch_images, input_z: batch_z, lr_D: LR_D})
21                  _ = sess.run(g_opt, feed_dict={input_images: batch_images, input_z: batch_z, lr_G: LR_G})
22                  d_losses.append(d_loss.eval({input_z: batch_z, input_images: batch_images}))
23                  g_losses.append(g_loss.eval({input_z: batch_z}))
24
25              summarize_epoch(epoch, time.time()-start_time, sess, d_losses, g_losses, input_z, data_shape)



image_generator_training.py hosted with ❤ by GitHub


The above function contains a standard machine learning training protocol. We are dividing our dataset into batches of a specific size and performing training for a given number of epochs.

The core training part is in lines 20–23, where we are training the Discriminator and the Generator. As with the loss functions and learning rates, this is another possible place to balance the Discriminator and the Generator. Some researchers have found that modifying the ratio between Discriminator and Generator training runs may benefit the results. In my case, a 1:1 ratio performed the best, but feel free to play with it as well.
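If you want to experiment with that ratio, the inner loop of the training snippet above can be adjusted along these lines (a sketch reusing the names from that snippet; D_STEPS_PER_G_STEP is a made-up knob, not a constant from the notebook):

D_STEPS_PER_G_STEP = 2  # e.g. train the Discriminator twice per Generator update

for batch_images in get_batches:
    iteration += 1
    batch_z = np.random.uniform(-1, 1, size=(BATCH_SIZE, NOISE_SIZE))

    # A ratio > 1 strengthens the Discriminator relative to the Generator.
    for _ in range(D_STEPS_PER_G_STEP):
        _ = sess.run(d_opt, feed_dict={input_images: batch_images, input_z: batch_z, lr_D: LR_D})

    _ = sess.run(g_opt, feed_dict={input_images: batch_images, input_z: batch_z, lr_G: LR_G})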

Moreover, I have used the following hyperparameters, but they are not set in stone, so don’t hesitate to modify them.

IMAGE_SIZE = 128
NOISE_SIZE = 100
BATCH_SIZE = 64
EPOCHS = 300

It’s very important to regularly monitor the model’s loss functions and its performance. I recommend doing it every epoch, like in the code snippet above. Let’s see some samples that were generated during training.

We can clearly see that our model gets better and learns how to generate more real-
looking Simpsons.
Let’s focus on the main character, the man of the house, Homer Simpson.

Epoch 0 Random noise

Epoch 5 Yellow color

Epoch 15 Head shape

Epoch 50 Brown beard

Epoch 100 Mouth

Epoch 200 Eye balls

Epoch 250 Head shape

Epoch 300 Slightly smiling Homer 🙂

Homer Simpson evolving over time

Final Results
Ultimately, after 300 epochs of training that took about 8 hours on an NVIDIA P100 (Google Cloud), we can see that our artificially generated Simpsons actually started looking like the real ones! Take a look at the following cherry-picked samples.


As expected, there were some funny-looking malformed faces as well.

What’s next?
While GAN image generation proved to be very successful, it’s not the only possible application of Generative Adversarial Networks. For example, take a look at the following Image-to-Image translation with CycleGAN.

(source: https://junyanz.github.io/CycleGAN/)


Amazing, right?

I encourage you to dive deeper into the GANs field as there is still more to explore!

. . .

Don’t forget to check the project’s GitHub page.

gsurma/image_generator
DCGAN image generator 🖼. Contribute to gsurma/image_generator
development by creating an account on GitHub.
github.com

. . .

Questions? Comments? Feel free to leave your feedback in the comments section or
contact me directly at https://gsurma.github.io.

And don’t forget to 👏 if you enjoyed this article 🙂.


Machine Learning Artificial Intelligence Data Science Programming Technology
