Informatics Engineering, Universitas 17 Agustus 1945 Surabaya; Jl. Semolowaru No. 45 Surabaya; (031) 5931800; e-mail: bagushardiansyah@untag-sby.ac.id, elvianto.evh@untag-sby.ac.id

Submitted: 17/01/2022
Revised: 27/01/2022
Accepted: 25/02/2022
Published: 26/03/2022

Abstract

We propose an Enhanced Face Image Generative Adversarial Network (EFGAN). Single image super-resolution (SISR) with convolutional networks often struggles to recover refined texture at large upscaling factors. Our approach is evaluated on mean squared error (MSE), validation peak signal-to-noise ratio (PSNR), and the Structural Similarity Index (SSIM); a high PSNR alone, however, does not capture fine detail. A Generative Adversarial Network (GAN) loss function optimizes the super-resolution (SR) model, and the generator network is built with a skip-connection architecture to improve the distribution of features.
1. Introduction
2. Research Method
Single image SR methods can be divided into three categories: interpolation, reconstruction, and example learning. Example-learning methods (Hardiansyah, 2021) have seen explosive development; we focus on example-based algorithms because of their better performance.
Recently, GANs have become one of the most common methods for SR (Christian et al., 2016). The discriminator network of GAN methods pushes generated HR images toward sharper details than other models (Emily et al., 2015). Furthermore, to reconstruct detailed images with refined texture, (Christian et al., 2016) presented a deep residual model. Its perceptual loss is adversarial and based on high-frequency feature maps of the VGG network (Karen et al., 2015).
SISR aims to predict the HR output I^HR from the LR input image I^LR. In the general SR approach, I^LR is obtained by downsampling the corresponding I^HR. (Phillip et al., 2016) applied the conditional generative adversarial network approach (Mehdi et al., 2014) to various pixel-level tasks. Mapping I^LR to I^HR is treated as a conditional enhancement task: the generator is conditioned on I^LR to generate I^HR. We then propose EFGAN to optimize the parameter space of the networks in our model.
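The degradation model above, where the LR input is obtained by downsampling its HR counterpart, can be sketched as follows. The 4 × 4 box-average downsampler and the crop size are illustrative assumptions, not the exact degradation used in the paper.

```python
import numpy as np

def downsample_x4(hr: np.ndarray) -> np.ndarray:
    """Produce an LR image by 4x4 box averaging (illustrative degradation).

    hr: (H, W, C) array with H and W divisible by 4.
    """
    h, w, c = hr.shape
    assert h % 4 == 0 and w % 4 == 0, "crop to a multiple of 4 first"
    return hr.reshape(h // 4, 4, w // 4, 4, c).mean(axis=(1, 3))

# An HR patch and its corresponding LR input for the generator.
# CelebA images are 178x218; here we crop to multiples of 4.
hr = np.random.rand(216, 176, 3)
lr = downsample_x4(hr)
print(lr.shape)  # (54, 44, 3)
```

Any standard degradation (bicubic, Gaussian blur plus subsampling) could replace the box average without changing the training setup.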
2.1. Network Architecture
The generator G : R^Nx → R^Ny is fully convolutional and generates an HR image corresponding to the LR input. Here N_x = H × W × C is the dimensionality of x, where H, W, and C are the height, width, and channels of the pixel matrix. Features from different layers are connected by skip connections, and 4 × 4 convolution kernels reduce the feature-map dimensionality. The generator network G shown in Figure 1 consists of downsampling and upsampling convolutional layers with factor 4; G maps size 178 × 218 × 3 (input) → 178 × 218 × 3 (output).

The discriminator D : R^Nry → R^Nry, where N_ry = H × W × 2C, takes the concatenation of the SR image generated by G and the corresponding original HR image. The architecture of D is similar to G in Figure 1; the two essential parts shared by G and D are the downscaling and upscaling layers.
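As a sanity check on these sizes, the sketch below assumes 4 × 4 kernels with stride 2 and padding 1 (our reading of the 4 × 4 reduction layers). Under that assumption, one strided-convolution/transposed-convolution pair maps 178 × 218 back to itself, which is where skip connections between matching encoder and decoder sizes would attach.

```python
def conv_out(n: int, k: int = 4, s: int = 2, p: int = 1) -> int:
    """Spatial size after a strided convolution."""
    return (n + 2 * p - k) // s + 1

def deconv_out(n: int, k: int = 4, s: int = 2, p: int = 1) -> int:
    """Spatial size after the matching transposed convolution."""
    return (n - 1) * s - 2 * p + k

h, w = 218, 178                          # CelebA input size used by G
eh, ew = conv_out(h), conv_out(w)        # encoder: downsample by 2
dh, dw = deconv_out(eh), deconv_out(ew)  # decoder: upsample back
print((eh, ew), (dh, dw))  # (109, 89) (218, 178)
```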
2.2. Learning Loss Function
GAN training feeds data to the generator G, which learns to map the input x to a fake sample G(x), while the discriminator D judges whether a sample comes from the real or the fake distribution. We match the generated output to the target directly at the level of matrix elements: the L1 norm measures the loss between the generative model's output G(z) and the corresponding target x. Motivated by previous work (David et al., 2017), the L1-norm GAN loss is employed to optimize the generator and discriminator models, following equations 1–4.
y = G(x; θ_G)                                  (1)
L_D = L_Dr − L_Df,            for θ_D          (2)
L_G = ||G(z) − x||_1,         for θ_G          (3)
L_D = L_Dr − k_t L_Df,
k_{t+1} = k_t + λ_k (γ L_Dr − L_G)             (4)
These equations differ from the standard formulation in two important ways: (1) the input to the generator is an LR face image, not a random vector sample, since generating an HR face in our approach requires conditioning on the face image itself; (2) we use the matrix L1 norm as the pixel loss function of the generator, as shown in equation 3.
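The update in equation 4 is a proportional-control step on the equilibrium term k_t, as in BEGAN (David et al., 2017). The sketch below is a minimal illustration with toy loss values; λ_k = 0.001, γ = 0.5, and the clipping of k_t to [0, 1] follow the BEGAN paper, not measurements from our model.

```python
import numpy as np

def update_k(k, loss_real, loss_gen, lambda_k=0.001, gamma=0.5):
    """One proportional-control step for the equilibrium term k_t (eq. 4).

    loss_real = L_Dr (discriminator reconstruction loss on real images),
    loss_gen  = L_G  (generator loss).  lambda_k and gamma are the
    learning rate and diversity ratio from BEGAN (Berthelot et al., 2017).
    """
    k = k + lambda_k * (gamma * loss_real - loss_gen)
    return float(np.clip(k, 0.0, 1.0))  # k_t is kept in [0, 1]

k = 0.0
# Toy loss trajectory: both losses shrink as training progresses.
for loss_real, loss_gen in [(1.0, 0.3), (0.8, 0.2), (0.6, 0.1)]:
    k = update_k(k, loss_real, loss_gen)
    d_loss = loss_real - k * loss_gen  # L_D = L_Dr - k_t * L_Df (eq. 4, with
                                       # loss_gen standing in for L_Df here)
print(round(k, 6))  # 0.0006
```

When γ·L_Dr exceeds L_G, k_t grows and the discriminator penalizes fake reconstructions more heavily, balancing the two networks.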
2.3. Generative Adversarial Network (GAN)
GAN was introduced by (Goodfellow, 2014) as a machine-learning method that constructs images that look original to human vision, with realistic and natural texture. The basic concept of GAN is to train two networks: the generator network produces a face image, and the discriminator network attempts to distinguish images generated by the generator from original images. The discriminator uses convolution layers with 3 × 3 kernels and 512 feature maps, whose output feeds two dense blocks and a sigmoid that produces the classification prediction.
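The classification head described above can be sketched as follows; the pooled 512-feature vector, the 64-unit hidden layer, and the random placeholder weights are illustrative assumptions, not the trained discriminator.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical head: 512 pooled conv features -> dense + ReLU ->
# dense + sigmoid real/fake probability.  Weights are random
# placeholders, not trained parameters.
features = rng.normal(size=512)           # feature vector from the conv stack
w1, b1 = rng.normal(size=(512, 64)) * 0.01, np.zeros(64)
w2, b2 = rng.normal(size=(64, 1)) * 0.01, np.zeros(1)

hidden = np.maximum(features @ w1 + b1, 0.0)  # first dense block + ReLU
p_real = sigmoid(hidden @ w2 + b2)[0]         # second dense block + sigmoid
print(0.0 < p_real < 1.0)  # True: the output is a probability
```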
3. Results and Analysis
3.1. Experiment
We trained with a learning rate of 0.001 on the CelebA face dataset (Ziwei et al., 2015), using an NVIDIA T4 GPU for the experiments, and evaluated the results both qualitatively and quantitatively.

The CelebA dataset contains more than 200,000 images annotated with 40 attributes, with large pose variations and diverse backgrounds. We trained our proposed model on CelebA images at their original size of 178 × 218.
3.2. Result
In this section, the CelebA dataset is used for training and testing to generate SR images. After qualitative and quantitative validation, we report detailed PSNR/SSIM values for the generated HR images. Our proposed EFGAN method achieves significant qualitative and quantitative results on the validated PSNR/SSIM values, showing that EFGAN generates the best 4× face images regardless of facial expression, pose, and other factors.
Table 1. Quantitative Comparisons On The CelebA Dataset
Dataset Name Size Scale Training Set PSNR SSIM
000001 178x218 x4 CelebA 20.92 0.6137
000038 178x218 x4 CelebA 24.40 0.7760
025315 178x218 x4 CelebA 25.64 0.7901
133459 178x218 x4 CelebA 27.47 0.8038
188371 178x218 x4 CelebA 27.89 0.8595
Source: Research Result
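The PSNR and SSIM values in Table 1 follow standard formulas; a minimal sketch is below. Note that the SSIM here uses a single global window, whereas the usual metric averages an 11 × 11 sliding Gaussian window, so it is a simplification for illustration.

```python
import numpy as np

def psnr(ref, est, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref.astype(np.float64) - est.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def ssim_global(ref, est, peak=255.0):
    """Single-window (global) SSIM; a simplification of the usual
    sliding-window metric (Wang et al.'s constants c1, c2)."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    x, y = ref.astype(np.float64), est.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

hr = np.zeros((8, 8))
sr = np.full((8, 8), 10.0)     # constant error of 10 gray levels
print(round(psnr(hr, sr), 2))  # 28.13
print(ssim_global(hr, hr))     # 1.0 for identical images
```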
4. Conclusion
We fed the dataset to the EFGAN network to generate HR images of size 178 × 218, developing an evolved network that generates an HR image directly (e.g., from 32 × 32). The results display better performance on the face-image SR task, and our proposed face-image SR architecture shows sharp detail consistent with the PSNR/SSIM validation.

PIKSEL status is accredited by the Directorate General of Research Strengthening and Development No. 225/E/KPT/2022 with Indonesian Scientific Index (SINTA) journal-level of S3, starting from Volume 10 (1) 2022 to Volume 14 (2) 2026.
Author Contributions
Bagus Hardiansyah proposed the topic; Bagus Hardiansyah and Elvianto Dwi
Hartono conceived models and designed the experiments; Bagus Hardiansyah and
Elvianto Dwi Hartono conceived the optimization algorithms. Bagus Hardiansyah and
Elvianto Dwi Hartono analyzed the result.
Conflicts of Interest
The authors declare no conflict of interest.
References
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A.
Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in neural
information processing systems, 2014, pp. 2672–2680.
J. Jiang, J. Ma, C. Chen, X. Jiang, and Z. Wang. Noise robust face image super-
resolution through smooth sparse representation. IEEE Transactions on
Cybernetics, PP(99):1–12, 2016.
Junyu Wu, Shengyong Ding, Wei Xu, and Hongyang Chao. Deep joint face
hallucination and recognition. arXiv preprint arXiv:1611.08091, 2016.
Shizhan Zhu, Sifei Liu, Chen Change Loy, and Xiaoou Tang. Deep cascaded bi-
network for face hallucination. In European Conference on Computer Vision,
pages 614–630. Springer, 2016.
David Berthelot, Tom Schumm, and Luke Metz. BEGAN: boundary equilibrium
generative adversarial networks. arXiv preprint arXiv:1703.10717, 2017.
Hardiansyah, B., Lu, Y. Single image super-resolution via multiple linear mapping
anchored neighborhood regression. Multimed Tools Appl 80, 28713–28730
(2021).
Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew P. Aitken,
Alykhan Tejani, Johannes Totz, Zehan Wang, and Wenzhe Shi. Photo-realistic
single image super-resolution using a generative adversarial network. arXiv
preprint arXiv:1609.04802, 2016.
Emily L Denton, Soumith Chintala, Rob Fergus, et al. Deep generative image models
using a laplacian pyramid of adversarial networks. In Advances in neural
information processing systems, pages 1486–1494, 2015.
Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-
scale image recognition. International Conference on Learning
Representations(ICLR), 2015.
Junjun Jiang, Chen Chen, Jiayi Ma, Zheng Wang, Zhongyuan Wang, and Ruimin Hu.
Srlsp: A face image super-resolution algorithm using smooth regression with
local structure prior. IEEE Transactions on Multimedia, 19(1):27–40, 2017.
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. Image-to-image
translation with conditional adversarial networks. CoRR, abs/1611.07004,
2016.
Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets. arXiv
preprint arXiv:1411.1784, 2014.
Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein GAN. arXiv preprint
arXiv:1701.07875, 2017.
Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes
in the wild. In Proceedings of the IEEE International Conference on Computer
Vision, pages 3730–3738, 2015.
W. Shi et al., "Real-Time Single Image and Video Super-Resolution Using an Efficient
Sub-Pixel Convolutional Neural Network," in IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), 2016.