Referring image segmentation with attention guided cross modal fusion for semantic oriented languages

Qianli ZHOU, Rong WANG, Haimiao HU, Quange TAN, Wenjin ZHANG

Front. Comput. Sci., 2022, Vol. 16, Issue 6: 166342. DOI: 10.1007/s11704-022-1136-3
Artificial Intelligence
LETTER

Cite this article

Qianli ZHOU, Rong WANG, Haimiao HU, Quange TAN, Wenjin ZHANG. Referring image segmentation with attention guided cross modal fusion for semantic oriented languages. Front. Comput. Sci., 2022, 16(6): 166342 https://doi.org/10.1007/s11704-022-1136-3

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant No. 62076246).

Supporting Information

The supporting information is available online at journal.hep.com.cn and link.springer.com.

RIGHTS & PERMISSIONS

2022 Higher Education Press