Unsupervised image-to-image translation by semantics consistency and self-attention

Zhibin Zhang, Wanli Xue, Guokai Fu

Optoelectronics Letters, 2022, 18(3): 175-180. DOI: 10.1007/s11801-022-0165-3

Abstract

Unsupervised image-to-image translation is a challenging task in computer vision. The goal is to learn a mapping between two domains without corresponding image pairs. Many previous works focused only on image-level translation and ignored the processing of image features, which leads to a loss of semantics in the output, such as changes to the background of the generated image or only partial transformation of the foreground. In this work, we propose an image-to-image translation method based on generative adversarial nets (GANs). We use an autoencoder structure to extract image features in the generator and add a semantic consistency loss on the extracted features to maintain the semantic consistency of the generated image. A self-attention mechanism at the end of the generator captures long-distance dependencies in the image and, by expanding the convolutional receptive field, enhances the quality of the generated image. Quantitative experiments show that our method significantly outperforms previous works; in particular, on images with a distinct foreground, our model shows an impressive improvement.
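To make the two mechanisms described above concrete, the following is a minimal PyTorch sketch (the framework and all identifiers are our assumption; the paper's code is not reproduced here). The SelfAttention module follows the standard SAGAN-style formulation of self-attention placed at the end of a generator, and semantic_consistency_loss illustrates one plausible reading of the semantic consistency term: an L1 penalty between encoder features of the input and of its translation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    # SAGAN-style self-attention: every spatial position attends to every
    # other, giving the generator long-distance dependencies beyond the
    # convolutional receptive field.
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key   = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, h*w, c//8)
        k = self.key(x).flatten(2)                     # (b, c//8, h*w)
        attn = torch.softmax(q @ k, dim=-1)            # (b, h*w, h*w)
        v = self.value(x).flatten(2)                   # (b, c, h*w)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                    # residual connection

def semantic_consistency_loss(encoder, x, x_translated):
    # Hypothetical formulation: keep the encoder features of the translated
    # image close (L1) to those of the source image, so the background and
    # untargeted content are preserved across translation.
    return F.l1_loss(encoder(x_translated), encoder(x).detach())

# Shape check: the attention block preserves the feature-map shape.
block = SelfAttention(64)
features = torch.randn(2, 64, 32, 32)
assert block(features).shape == features.shape

The zero-initialized gamma lets the generator start as a purely convolutional network and gradually blend in attention as training progresses, a common stabilizing choice in self-attention GANs.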

Cite this article

Zhibin Zhang, Wanli Xue, Guokai Fu. Unsupervised image-to-image translation by semantics consistency and self-attention. Optoelectronics Letters, 2022, 18(3): 175-180. DOI: 10.1007/s11801-022-0165-3


