
GRD Journals- Global Research and Development Journal for Engineering | Volume 6 | Issue 6 | May 2021

ISSN- 2455-5703

Image Based Virtual Try on Network


Murale C
Assistant Professor
Department of Information Technology
Coimbatore Institute of Technology, Coimbatore, Tamil Nadu, India

Mohammed Marzuk Ali S P
Student
Department of Information Technology
Coimbatore Institute of Technology, Coimbatore, Tamil Nadu, India

Nikesh C
Student
Department of Information Technology
Coimbatore Institute of Technology, Coimbatore, Tamil Nadu, India

Sridhar S
Student
Department of Information Technology
Coimbatore Institute of Technology, Coimbatore, Tamil Nadu, India

Abstract
Image-based garment transfer systems aim to transfer specified garments from a shop listing onto arbitrary users. However, existing works do not allow users to try on various fashion articles (e.g., tops, pants, or both) selectively according to their needs. In this paper, we propose an Image-based Virtual Try-On Network (I-VTON) that allows the user to try on arbitrary garments from an image in a selective manner. To achieve this flexibility, we reformulate virtual try-on as an image inpainting task. First, the textures of the garment and the user are extracted separately to form a coarse result. In this phase, users can decide which garments they wish to try on via an interactive texture control mechanism. Second, the missing regions in the coarse result are recovered via a Texture Inpainting Network (TIN). We introduce a triplet training strategy to ensure the naturalness of the final result. Qualitative and quantitative experimental results demonstrate that I-VTON outperforms state-of-the-art methods on both garment details and user identity. They also confirm that our approach can flexibly transfer garments in a selective manner.
Keywords- Machine Learning; Virtual Try-On Network

I. INTRODUCTION
The modern world relies on online shopping, and recent years have witnessed increasing demand for buying fashion items online. However, existing online garment shopping services offer limited satisfaction because they do not provide a virtual try-on facility. Allowing customers to virtually try on clothes would not only enhance their shopping experience, transforming the way people shop for clothes, but would also reduce costs for retailers. Image-based garment transfer systems are therefore proposed to swap the desired clothes in a selective manner.
Image-based virtual try-on aims at transferring a target clothing image onto a reference person and has become a hot topic in recent years. Previous works usually focus on preserving the character of the clothing image (e.g., texture, logo, embroidery) while warping it to an arbitrary human pose. However, it remains a big challenge to generate photo-realistic try-on images when large occlusions and complex human poses are present in the reference image. To address this issue, we propose a novel virtual try-on network, namely the Adaptive Content Generating and Preserving Network (ACGPN). In particular, ACGPN first predicts the semantic layout of the reference image that will be changed after try-on, and then determines whether its image content needs to be generated or preserved according to the predicted semantic layout, leading to photo-realistic try-on results with rich clothing details.

II. RELATED WORK

A. Generative Adversarial Networks


Image synthesis and manipulation have benefited greatly from the use of Generative Adversarial Networks (GANs). A GAN is made up of two parts: a generator and a discriminator. The generator learns to produce realistic images in order to deceive the discriminator, while the discriminator learns to distinguish the synthesized images from real ones. Because of this powerful generative capability, GANs are widely used for tasks such as style transfer, image inpainting, and image editing; this wide range of applications demonstrates their dominance in image synthesis.
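To make the setup concrete, the following minimal PyTorch sketch pairs a small convolutional generator with a discriminator; the layer sizes and the 32x192-independent 32x32 output resolution are illustrative assumptions, not the architecture used in this work.

    import torch
    import torch.nn as nn

    # Minimal DCGAN-style generator/discriminator pair for 32x32 RGB images.
    # All layer sizes are illustrative assumptions, not the I-VTON/ACGPN design.
    class Generator(nn.Module):
        def __init__(self, z_dim=100):
            super().__init__()
            self.net = nn.Sequential(
                nn.ConvTranspose2d(z_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),
                nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
                nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
                nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),  # -> (B, 3, 32, 32)
            )

        def forward(self, z):  # z: (B, z_dim)
            return self.net(z.view(z.size(0), -1, 1, 1))

    class Discriminator(nn.Module):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2, True),
                nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, True),
                nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2, True),
                nn.Conv2d(256, 1, 4, 1, 0),  # single real/fake logit
            )

        def forward(self, x):
            return self.net(x).view(-1)

    # Adversarial objectives: D separates real from fake, G tries to fool D.
    bce = nn.BCEWithLogitsLoss()
    G, D = Generator(), Discriminator()
    fake = G(torch.randn(8, 100))
    real = torch.randn(8, 3, 32, 32)  # placeholder; real images come from a dataset
    d_loss = bce(D(real), torch.ones(8)) + bce(D(fake.detach()), torch.zeros(8))
    g_loss = bce(D(fake), torch.ones(8))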




B. Fashion Analysis and Synthesis


Due to their significant potential in real-world applications, fashion-related tasks have recently received a lot of attention. Most current research focuses on clothing compatibility and matching learning, clothing landmark recognition, and fashion image analysis. Virtual try-on is among the most difficult tasks in fashion research.

C. Virtual Try-on
Even before the revival of deep learning, virtual try-on was a common subject [51, 7, 40, 13]. Virtual try-on has received a lot of attention in recent years due to advances in deep neural networks, and it has great potential in many real-world applications. Existing deep learning-based virtual try-on methods can be divided into two categories: 3D model-based approaches and 2D image-based approaches, with the latter further divided by whether or not the pose is preserved. A multi-pose guided image-based virtual try-on network is presented by Dong et al. Many current try-on approaches, like our ACGPN, concentrate on the challenge of maintaining pose and identity. To generate a clothed person, methods like VITON and CP-VTON use a coarse human shape and a pose map as input. To synthesize a clothed person, methods like SwapGAN, SwapNet, and VTNFP use semantic segmentation as input. A summary of some representative strategies is presented in Table 1. VITON uses a Thin-Plate Spline (TPS) based warping process to deform the in-shop clothes first, then uses a composition mask to map the texture onto the refined result. CP-VTON has a similar structure to VITON, but instead of using image descriptors it uses a neural network to learn the transformation parameters of the TPS warping, resulting in more precise alignment. Since CP-VTON and VITON concentrate only on the clothing, the bottom clothes and pose details are coarse and fuzzy. VTNFP addresses this problem by concatenating high-level features derived from body parts and bottom clothing, yielding better results than CP-VTON and VITON. However, since it lacks the semantic layout of the reference image, fuzzy body parts and artifacts still abound in its results.

III. IMAGE BASED VIRTUAL TRY ON


The system is made up of five modules:
1) Input Module
2) Semantic Segmentation Module
3) Pose Detection Module
4) Cloth Warping Module
5) Content Fusion Module

A. Input Module
The input module reads data from the VTON dataset and rescales all training and test images to the same resolution of 256x192 pixels. All packages required for the PyTorch environment are imported here. The rescaled input is then passed to the semantic segmentation module.
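A minimal sketch of such an input pipeline is given below; the directory layout (image/ and cloth/ subfolders) and file naming are hypothetical and should be adapted to the actual dataset.

    import os
    from PIL import Image
    from torch.utils.data import Dataset
    import torchvision.transforms as T

    class VTONDataset(Dataset):
        """Loads person/cloth image pairs and rescales them to 256x192.

        The image/ and cloth/ subfolder layout is a hypothetical example.
        """
        def __init__(self, root, mode="train"):
            self.person_dir = os.path.join(root, mode, "image")
            self.cloth_dir = os.path.join(root, mode, "cloth")
            self.names = sorted(os.listdir(self.person_dir))
            # Resize to height 256, width 192 and normalize to [-1, 1].
            self.transform = T.Compose([
                T.Resize((256, 192)),
                T.ToTensor(),
                T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
            ])

        def __len__(self):
            return len(self.names)

        def __getitem__(self, idx):
            name = self.names[idx]
            person = Image.open(os.path.join(self.person_dir, name)).convert("RGB")
            cloth = Image.open(os.path.join(self.cloth_dir, name)).convert("RGB")
            return self.transform(person), self.transform(cloth)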

B. Semantic Segmentation Module


The algorithm employed in this module is PSPNet, whose constituent classifiers capture the context of the whole image. The two-stage Semantic Generation Module (SGM) functions as the core element for correctly understanding body-part and garment layouts during virtual try-on and for guiding the adaptive preservation of image content during composition (a minimal code sketch follows the steps below).
– In the first stage, the fitting mask generation module synthesizes the masks of the body parts M_ω^S (ω ∈ {h: head, a: arms, b: bottom clothes}), which helps to adaptively preserve body parts rather than coarse features in the subsequent steps.
– In the second stage, the resulting masks are combined to generate the synthesized mask of the garments.
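Since PSPNet is not bundled with torchvision, the sketch below uses torchvision's DeepLabV3 purely as a stand-in to show how per-pixel body-part masks could be extracted; the class indices are assumptions, and a real try-on system would use a parser trained on human-parsing data.

    import torch
    from torchvision.models.segmentation import deeplabv3_resnet50

    # Stand-in for PSPNet, used only to illustrate mask extraction.
    # (Newer torchvision versions use a weights= argument instead of pretrained=.)
    model = deeplabv3_resnet50(pretrained=True).eval()

    # Hypothetical class indices for a human-parsing label set.
    PART_IDS = {"head": 1, "arms": 2, "bottom": 3}

    @torch.no_grad()
    def part_masks(image):  # image: (1, 3, 256, 192), normalized
        logits = model(image)["out"]   # (1, C, 256, 192) per-class scores
        labels = logits.argmax(dim=1)  # per-pixel class map
        return {name: (labels == cid).float() for name, cid in PART_IDS.items()}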

Fig. 1: Semantic Segmentation


C. Pose Detection Module


The algorithm used in this module is PoseNet, which is a pre-trained model. Pose estimation is a computer vision technique used to predict the position/pose of body parts or the joint positions of a person. Pose estimation happens in two phases (see the sketch after this list):
– First, an RGB image is fed as input through a CNN.
– Then, a single-pose or multi-pose model is applied to decode the pose, the keypoint positions, and their confidence scores.
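As a PyTorch stand-in for PoseNet, the sketch below uses torchvision's Keypoint R-CNN, which likewise returns body keypoints (the 17 COCO joints) with per-keypoint confidence scores.

    import torch
    from torchvision.models.detection import keypointrcnn_resnet50_fpn

    # Keypoint R-CNN predicts 17 COCO keypoints (nose, eyes, shoulders,
    # elbows, wrists, hips, knees, ankles, ...), used here in place of PoseNet.
    model = keypointrcnn_resnet50_fpn(pretrained=True).eval()

    @torch.no_grad()
    def estimate_pose(image):  # image: (3, H, W) float tensor in [0, 1]
        out = model([image])[0]
        if len(out["keypoints"]) == 0:
            return None
        # Take the highest-scoring person detection.
        kpts = out["keypoints"][0]           # (17, 3): x, y, visibility
        scores = out["keypoints_scores"][0]  # per-keypoint confidence
        return kpts[:, :2], scores

    # Usage: points, conf = estimate_pose(torch.rand(3, 256, 192))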

Fig. 2: Plotting Predicted Points in the image

D. Cloth Warping Module


Clothes deformation aims to fit the garment to the shape of the target clothing region with a visually natural deformation that agrees with the estimated human pose. Training with a Spatial Transformation Network (STN) alone cannot ensure a precise transformation, especially when handling hard cases (i.e., garments with complex textures and rich colors). Post-processing the result of the Thin-Plate Spline (TPS) warp can fully restore the character of the target clothes.
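The sketch below illustrates the spatial-transformation idea with a plain affine warp via grid_sample; the actual module regresses the parameters of the richer TPS transformation with a network rather than specifying them by hand.

    import torch
    import torch.nn.functional as F

    def warp_cloth(cloth, theta):
        """Warp an in-shop cloth image with a 2x3 affine matrix.

        cloth: (B, 3, 256, 192); theta: (B, 2, 3). A full TPS warp replaces
        the affine grid with one computed from learned control-point offsets.
        """
        grid = F.affine_grid(theta, cloth.size(), align_corners=False)
        return F.grid_sample(cloth, grid, align_corners=False)

    # Example: shrink the cloth to 90% scale to roughly fit a torso region.
    cloth = torch.rand(1, 3, 256, 192)
    theta = torch.tensor([[[0.9, 0.0, 0.0], [0.0, 0.9, 0.0]]])
    warped = warp_cloth(cloth, theta)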

E. Content Fusion Module


The Content Fusion Module (CFM) first produces the composited body part mask using the original clothing mask, the synthesized clothing mask, the body part mask, and the synthesized body part mask. It then exploits a fusion network to generate the try-on images, utilizing information from the cloth warping result (i.e., the target cloth mask) and the body part images produced by the previous modules.
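A minimal sketch of this composition and fusion step follows; the union/difference compositing rule used here is an assumed simplification of the full CFM, which inpaints the remaining regions with a fusion network instead of simple pasting.

    import torch

    def composite_body_mask(orig_cloth_mask, syn_cloth_mask, body_mask, syn_body_mask):
        """Compose a body-part mask for fusion (illustrative rule only).

        Pixels newly exposed when the original cloth is replaced are taken
        from the synthesized body mask; existing body parts are preserved.
        """
        exposed = orig_cloth_mask * (1 - syn_cloth_mask)
        return torch.clamp(body_mask + syn_body_mask * exposed, 0, 1)

    def fuse(warped_cloth, body_image, cloth_mask):
        # Paste the warped cloth where the target cloth mask is on; the
        # full CFM generates the remaining content with a fusion network.
        return warped_cloth * cloth_mask + body_image * (1 - cloth_mask)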

Fig. 3: Content Fusion


IV. RESULT AND DISCUSSION


We use the Structural Similarity index (SSIM) to assess the similarity between synthesized and ground-truth images, and the Inception Score (IS) to assess the visual quality of the synthesized images. Higher scores on both metrics mean that the results are of higher quality.
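For reference, SSIM between a synthesized image and its ground truth can be computed with scikit-image (0.19+ for the channel_axis argument); this is only a sketch of the metric, not a restatement of the exact evaluation protocol.

    import numpy as np
    from skimage.metrics import structural_similarity

    def ssim_score(synth, gt):
        """synth, gt: HxWx3 uint8 arrays. Higher is better (max 1.0)."""
        return structural_similarity(synth, gt, channel_axis=2)

    # Identical images score 1.0:
    img = (np.random.rand(256, 192, 3) * 255).astype(np.uint8)
    print(ssim_score(img, img))  # -> 1.0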
ACGPN, on the other hand, does a much better job of maintaining both the character of the clothing and the details of the body parts. It prevents logo distortion and achieves character preservation, making the warping process more stable in preserving texture and embroidery, thanks to the proposed second-order spatial transformation constraint in the CWM.

V. CONCLUSION
The Image-based Virtual Try-On Network (I-VTON) enables a user to try on various garments in a selective manner. This work proposes a novel Adaptive Content Generating and Preserving Network, which aims at producing photo-realistic try-on results while preserving both the character of the garments and the details of the human identity (pose, body parts, bottom clothes). With the aid of an interactive texture control mechanism, our technique flexibly swaps the selected garments without training an additional network. The effectiveness of the image samples and the skin loss is also validated. Quantitative and qualitative experiments show that our I-VTON outperforms the baselines on both the garment details and the user identity.
Future scope lies in the availability of large datasets: since the geography is vast and diverse, continent-wide datasets may emerge that could train the model toward universal garment virtual try-on. Beyond the fashion domain, deployment in entertainment and gaming would benefit the gaming industry. The algorithm can also be integrated with GANs, so that GAN-generated outfits can be tried on chosen people, and vice versa, for marketing and advertisements. In future work, we also aim to find a more robust texture extraction approach to reduce the failure cases, and we would extend our method to swap more fashion items such as shoes and scarves.

REFERENCES
[1] P. Isola, J.-Y. Zhu, T. Zhou and A. A. Efros, "Image-to-image translation with conditional adversarial networks", Proc. IEEE Conf. Comput. Vis. Pattern
Recognit., pp. 1125-1134, Jul. 2017.
[2] W. Xian, P. Sangkloy, V. Agrawal, A. Raj, J. Lu, C. Fang, et al., "TextureGAN: Controlling deep image synthesis with texture patches", Proc. IEEE Conf.
Comput. Vis. Pattern Recognit., pp. 8456-8465, Jun. 2018.
[3] H. Zhang, I. Goodfellow, D. Metaxas and A. Odena, "Self-attention generative adversarial networks", arXiv:1805.08318, 2018, [online] Available:
https://arxiv.org/abs/1805.08318.
[4] A. Raj, P. Sangkloy, H. Chang, J. Hays, D. Ceylan and J. Lu, "SwapNet: Image based garment transfer", Proc. Eur. Conf. Comput. Vis, pp. 679-695, 2018.
[5] X. Han, Z. Wu, Z. Wu, R. Yu and L. S. Davis, "VITON: An image-based virtual try-on network", Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp.
7543-7552, Jun. 2018.
[6] B. Wang, H. Zheng, X. Liang, Y. Chen, L. Lin and M. Yang, "Toward characteristic-preserving image-based virtual try-on network", Proc. Eur. Conf.
Comput. Vis., pp. 589-604, 2018.
[7] R. A. Güler, N. Neverova and I. Kokkinos, "Densepose: Dense human pose estimation in the wild", Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp.
7297-7306, Jun. 2018.
[8] J.-Y. Zhu, T. Park, P. Isola and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks", Proc. IEEE Int. Conf.
Comput. Vis., pp. 2223-2232, Oct. 2017.
[9] S. Honda, "VITON-GAN: Virtual try-on image generator trained with adversarial loss", Proc. 40th Annu. Conf. Eur. Assoc. Comput. Graphics (Eurographics),
pp. 1-2, May 2019.
[10] K. He, X. Zhang, S. Ren and J. Sun, "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification", Proc. IEEE Int. Conf.
Comput. Vis. (ICCV), pp. 1026-1034, Dec. 2015.
[11] M. Omran, C. Lassner, G. Pons-Moll, P. Gehler and B. Schiele, "Neural body fitting: Unifying deep learning and model based human pose and shape
estimation", Proc. Int. Conf. 3D Vis. (3DV), pp. 484-494, Sep. 2018.
