Referring image segmentation with attention guided cross modal fusion for semantic oriented languages

Qianli ZHOU; Rong WANG; Haimiao HU; Quange TAN; Wenjin ZHANG

doi:10.1007/s11704-022-1136-3

Front. Comput. Sci., 2022, Vol. 16, Issue 6: 166342. DOI: 10.1007/s11704-022-1136-3

Artificial Intelligence

LETTER




