Original Feature Preserving Virtual Try-on Network Based on Receptive Field Block

Zhaoyang WANG, Ran TAO, Hailun LU

Journal of Donghua University (English Edition) ›› 2024, Vol. 41 ›› Issue (01): 28-36. DOI: 10.19884/j.1672-5220.202211004
Special Topic: Artificial Intelligence on Fashion and Textiles


Abstract

Computer vision-based virtual try-on (VITON) technology warps try-on clothing according to the features of a model image and composites it into that image to replace the originally worn clothing. Current VITON methods face two main challenges: insufficient preservation of the original features of the model image, such as the head, the lower body, and the background, and poor matching between the warped try-on clothing and the model image. To solve these two problems, an original feature preserving virtual try-on network (OFP-VTON) is proposed, which consists of semantic segmentation map generation, try-on clothing warping, and try-on image synthesis. In the try-on clothing warping phase, the network learns the mapping by which the clothing worn in the model image is warped, so as to better constrain the warping of the try-on clothing. In the try-on image synthesis phase, the original features of the model image are extracted and preserved, and a receptive field block (RFB) is introduced to preserve the features of the try-on clothing as much as possible. Qualitative and quantitative experiments on the publicly available VITON dataset show that the proposed OFP-VTON preserves original features better and that the warped try-on clothing matches the model image more closely than in the baseline method.
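The paper's implementation is not reproduced here, so the sketch below is only a minimal PyTorch-style illustration of a generic receptive field block in the spirit of Liu and Huang's RFB design: parallel convolution branches with increasing dilation rates enlarge the effective receptive field, and their outputs are fused by a 1×1 convolution over a shortcut connection. The class name, branch count, channel widths, and dilation rates are illustrative assumptions, not the exact OFP-VTON configuration.

```python
import torch
import torch.nn as nn


class ReceptiveFieldBlock(nn.Module):
    """Minimal multi-branch receptive field block (sketch after Liu and Huang's RFB).

    Each branch pairs small convolutions with a dilated 3x3 convolution, so the
    branches see the input at different effective receptive-field sizes; their
    outputs are concatenated, fused by a 1x1 convolution, and added to a shortcut.
    Branch widths and dilation rates here are illustrative assumptions only.
    """

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        mid = out_ch // 4
        # Branch 1: 1x1 reduction, then 3x3 convolution with dilation 1.
        self.branch1 = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, dilation=1),
        )
        # Branch 2: 1x1 then 3x3, followed by a 3x3 convolution with dilation 3.
        self.branch2 = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=3, dilation=3),
        )
        # Branch 3: 5x5 receptive field (two stacked 3x3s), then dilation 5.
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=5, dilation=5),
        )
        # Branch 4: plain 1x1 convolution keeps fine local detail.
        self.branch4 = nn.Conv2d(in_ch, mid, 1)
        # 1x1 fusion of the concatenated branches, plus a projected shortcut.
        self.fuse = nn.Conv2d(4 * mid, out_ch, 1)
        self.shortcut = nn.Conv2d(in_ch, out_ch, 1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.cat(
            [self.branch1(x), self.branch2(x), self.branch3(x), self.branch4(x)],
            dim=1,
        )
        return self.relu(self.fuse(feats) + self.shortcut(x))


if __name__ == "__main__":
    # Sanity check: a 256-channel feature map keeps its spatial size.
    rfb = ReceptiveFieldBlock(256, 256)
    y = rfb(torch.randn(1, 256, 32, 32))
    print(y.shape)  # torch.Size([1, 256, 32, 32])
```

Because the dilated branches aggregate context at several scales within a single layer, a block of this kind can widen the receptive field of the synthesis network without extra downsampling, which is consistent with the abstract's stated goal of retaining try-on clothing features.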

Keywords

virtual try-on (VITON) / deep learning / receptive field block (RFB) / original feature preserving

Cite this article

Zhaoyang WANG, Ran TAO, Hailun LU. Original Feature Preserving Virtual Try-on Network Based on Receptive Field Block. Journal of Donghua University (English Edition), 2024, 41(01): 28-36. https://doi.org/10.19884/j.1672-5220.202211004

Funding
National Key Research and Development Program of China (2020YFB1707700); Fundamental Research Funds for the Central Universities, China (20D111201)