
SUMMARY

Progressive Growing of GANs for Improved Quality and
Variation

GAN – Generative Adversarial Network


The current most prominent approaches for producing novel
samples from high-dimensional data distributions are:
• Autoregressive models (e.g., PixelCNN) – produce sharp images but are
slow to sample and lack a latent representation, which limits their
applicability
• Variational Autoencoders (VAEs) – produce blurry images due to
restrictions in the model
• Generative Adversarial Networks (GANs) – produce sharp images, but only
at fairly small resolutions

GAN:
➢ Typically, a GAN consists of two networks: a generator and a
discriminator. The generator produces a sample (e.g., an image) from a
latent code and is usually the network of main interest; the
discriminator acts as an adaptive loss function that is discarded once
the generator has been trained.
➢ The distance between the training distribution and the generated
distribution can be measured with many formulations; this work primarily
uses the improved Wasserstein loss (WGAN-GP) and also experiments with
the least-squares (LSGAN) loss (a minimal WGAN-GP sketch is given after
this list).
➢ The contributions are evaluated on the CELEBA, LSUN, and CIFAR10
datasets. A higher-quality version of the CELEBA dataset was created to
allow experimentation with output resolutions up to 1024×1024 pixels.
➢ The key training methodology is to start with low-resolution images and
then progressively increase the resolution by adding layers to the
networks. This stabilizes training sufficiently to reliably synthesize
megapixel-scale images using both the WGAN-GP and the LSGAN loss.
➢ Training starts with both networks at a low spatial resolution of 4×4
pixels; layers are then added incrementally, with G (generator) and D
(discriminator) trained at each resolution, until a resolution of
1024×1024 is reached. New layers are faded in smoothly rather than
switched on abruptly (see the fade-in sketch after this list).
➢ Since GANs have a tendency to capture only a subset of the variation
found in the training data, "minibatch discrimination" is used as a
solution: a simplified minibatch standard deviation layer is added
toward the end of the discriminator to improve variation (sketched after
this list).
➢ An alternative solution could be the "repelling regularizer".
➢ Normalization in the generator and discriminator is done in two ways:
"equalized learning rate" and "pixelwise feature vector normalization"
in the generator (both sketched after this list).
➢ Local image patches are drawn from Laplacian pyramid representations of
generated and target images, starting at a low-pass resolution of 16×16
pixels, and their statistical similarity is analysed.
➢ Sliced Wasserstein distance (SWD) and multi-scale structural similarity
(MS-SSIM) are used to evaluate the importance of individual
contributions, building on top of a previous state-of-the-art loss
function (WGAN-GP) and training configuration in an unsupervised setting
using the CELEBA and LSUN BEDROOM datasets at 128×128 pixel resolution
(an SWD sketch is given after this list).
➢ On the CELEBA dataset the method produces high-quality 1024×1024 pixel
images. For CIFAR10, the previous best inception scores were 7.90 for
unsupervised and 8.87 for label-conditioned setups; by avoiding the
"ghosts" that appear between classes in the unsupervised setting, this
method reaches an inception score of 8.80 unsupervised.
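
A minimal sketch of the improved Wasserstein (WGAN-GP) critic loss
mentioned above, in PyTorch; the function name is mine, and it assumes D
maps a batch of images to one score per sample. The gradient-penalty
weight of 10 follows the WGAN-GP formulation.

import torch

def wgan_gp_critic_loss(D, real, fake, gp_weight=10.0):
    # Wasserstein term: the critic should score real samples high
    # and generated samples low.
    loss = D(fake).mean() - D(real).mean()
    # Gradient penalty: push the critic's gradient norm toward 1 along
    # straight lines between real and fake samples.
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).detach().requires_grad_(True)
    grad, = torch.autograd.grad(D(interp).sum(), interp, create_graph=True)
    penalty = ((grad.flatten(1).norm(2, dim=1) - 1.0) ** 2).mean()
    return loss + gp_weight * penalty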
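
The smooth fade-in of new layers during progressive growing can be
sketched as follows (generator side). The module names are illustrative:
to_rgb_old and to_rgb_new stand for the 1×1 "toRGB" projections at the
old and new resolution, and alpha ramps from 0 to 1 while the new
resolution is being trained.

import torch.nn.functional as F

def generator_fade_in(x, new_block, to_rgb_old, to_rgb_new, alpha):
    # Double the spatial resolution of the incoming feature maps.
    up = F.interpolate(x, scale_factor=2, mode="nearest")
    old_img = to_rgb_old(up)               # bypass the new block entirely
    new_img = to_rgb_new(new_block(up))    # route through the new block
    # Blend: at alpha=0 the new block has no effect, at alpha=1 it is
    # fully active.
    return (1.0 - alpha) * old_img + alpha * new_img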
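
A sketch of the minibatch standard deviation layer added near the end of
the discriminator (the class name is mine):

import torch
import torch.nn as nn

class MinibatchStdDev(nn.Module):
    def forward(self, x):                  # x: (N, C, H, W)
        # Std-dev of every feature over the minibatch, averaged into a
        # single scalar that summarizes batch-level variation.
        std = x.std(dim=0).mean()
        # Replicate the scalar as one extra constant feature map.
        extra = std.view(1, 1, 1, 1).expand(x.size(0), 1, x.size(2), x.size(3))
        return torch.cat([x, extra], dim=1)  # (N, C + 1, H, W)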
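
The two normalization techniques can be sketched as follows; the class
names are mine. Pixelwise feature vector normalization rescales the
feature vector at each spatial position to unit average magnitude, and
the equalized learning rate stores weights drawn from N(0, 1) and
rescales them at runtime by the per-layer He constant.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelNorm(nn.Module):
    # Used after convolutions in the generator.
    def forward(self, x, eps=1e-8):
        return x * torch.rsqrt(x.pow(2).mean(dim=1, keepdim=True) + eps)

class EqualizedConv2d(nn.Module):
    # Scaling weights at runtime (rather than at init) keeps the
    # effective learning rate similar across layers under Adam.
    def __init__(self, in_ch, out_ch, kernel_size, padding=0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, kernel_size, kernel_size))
        self.bias = nn.Parameter(torch.zeros(out_ch))
        self.scale = (2.0 / (in_ch * kernel_size ** 2)) ** 0.5  # He constant
        self.padding = padding

    def forward(self, x):
        return F.conv2d(x, self.weight * self.scale, self.bias,
                        padding=self.padding)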
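
A minimal sketch of the sliced Wasserstein distance used as an evaluation
metric. It assumes a and b are equally sized (n_points, dim) tensors of
patch descriptors; in the paper these descriptors come from local patches
of each Laplacian pyramid level.

import torch

def sliced_wasserstein(a, b, n_proj=512):
    # Random unit directions in descriptor space.
    dirs = torch.randn(a.size(1), n_proj)
    dirs = dirs / dirs.norm(dim=0, keepdim=True)
    # In 1-D, the Wasserstein distance reduces to comparing sorted
    # projections; average the L1 differences over all directions.
    proj_a, _ = (a @ dirs).sort(dim=0)
    proj_b, _ = (b @ dirs).sort(dim=0)
    return (proj_a - proj_b).abs().mean()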
