Label distribution learning for scene text detection

Haoyu MA, Ningning LU, Junjun MEI, Tao GUAN, Yu ZHANG, Xin GENG

Front. Comput. Sci., 2023, Vol. 17, Issue (6): 176339

DOI: 10.1007/s11704-022-1446-5
Excellent Young Computer Scientists Forum
RESEARCH ARTICLE

Abstract

Recently, segmentation-based scene text detection has drawn wide research interest due to its flexibility in describing scene text instances of arbitrary shapes, such as curved text. However, existing methods usually need complex post-processing stages to handle ambiguous labels, i.e., the labels of pixels near the text boundary, which may belong to either the text or the background. In this paper, we present a framework for segmentation-based scene text detection that learns from ambiguous labels. We use label distribution learning to handle the label ambiguity of text annotations, which achieves good performance without an additional post-processing stage. Experiments on benchmark datasets demonstrate that our method produces better results than state-of-the-art methods for segmentation-based scene text detection.
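
To make the idea of learning from ambiguous boundary labels concrete, the following is a minimal sketch and not the method described in the paper. It assumes that each pixel comes with a signed distance to the annotated text boundary (positive inside the text), that a sigmoid of that distance is an acceptable way to spread label mass between "text" and "background", and that a per-pixel KL divergence is used as the training loss. The function names, the `sigma` parameter, and the example predictions are all hypothetical.

```python
import numpy as np

def soft_label_distribution(signed_dist, sigma=2.0):
    """Turn signed boundary distances into per-pixel label distributions.

    Assumption (not from the paper): a sigmoid of distance / sigma gives
    P(text); pixels far inside the text approach 1, pixels far outside
    approach 0, and pixels on the boundary get 0.5.
    """
    signed_dist = np.asarray(signed_dist, dtype=np.float64)
    p_text = 1.0 / (1.0 + np.exp(-signed_dist / sigma))
    # Last axis holds [P(text), P(background)].
    return np.stack([p_text, 1.0 - p_text], axis=-1)

def kl_divergence(target, pred, eps=1e-12):
    """Per-pixel KL(target || pred) between two label distributions."""
    target = np.clip(target, eps, 1.0)
    pred = np.clip(pred, eps, 1.0)
    return np.sum(target * np.log(target / pred), axis=-1)

# Five hypothetical pixels along a line crossing the text boundary.
signed_dist = np.array([-6.0, -1.0, 0.0, 1.0, 6.0])
target = soft_label_distribution(signed_dist)

# Hypothetical network outputs for the same pixels.
pred = np.array([[0.1, 0.9], [0.4, 0.6], [0.5, 0.5], [0.6, 0.4], [0.9, 0.1]])
loss = kl_divergence(target, pred)
```

Under these assumptions, a boundary pixel is no longer forced to a hard 0 or 1 label, so a prediction of 0.5 at the boundary incurs no loss, while confident pixels far from the boundary are still pushed toward a single class.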

Keywords

scene text detection / multi-task learning / label distribution learning

Cite this article

Haoyu MA, Ningning LU, Junjun MEI, Tao GUAN, Yu ZHANG, Xin GENG. Label distribution learning for scene text detection. Front. Comput. Sci., 2023, 17(6): 176339 DOI:10.1007/s11704-022-1446-5

RIGHTS & PERMISSIONS

Higher Education Press


Supplementary files

FCS-21446-OF-HM_suppl_1
