A CCD based machine vision system for real-time text detection

Shihua ZHAO, Lipeng SUN, Gang LI, Yun LIU, Binbing LIU

Front. Optoelectron. ›› 2020, Vol. 13 ›› Issue (4) : 418-424. DOI: 10.1007/s12200-019-0854-0
RESEARCH ARTICLE


Abstract

Text detection and recognition is an active topic in computer vision and is regarded as a further development of traditional optical character recognition (OCR) technology. With the rapid development of machine vision systems and the wide application of deep learning algorithms, text recognition has achieved excellent performance. In contrast, detecting text blocks in complex natural scenes remains a challenging task. Many advanced natural scene text detection algorithms have been proposed, but most of them run slowly because of the complexity of the detection pipeline and therefore cannot be applied to industrial scenes. In this paper, we propose a CCD based machine vision system for real-time text detection in invoice images. To improve the speed of the system, we apply optimizations to three aspects: the optical system, the hardware architecture, and the deep learning algorithm. The experimental data confirm that these optimizations significantly improve the running speed of the machine vision system and allow it to meet the real-time text detection requirements of industrial scenarios.

Keywords

machine vision / text detection / optical character recognition (OCR) / deep learning

Cite this article

Shihua ZHAO, Lipeng SUN, Gang LI, Yun LIU, Binbing LIU. A CCD based machine vision system for real-time text detection. Front. Optoelectron., 2020, 13(4): 418‒424 https://doi.org/10.1007/s12200-019-0854-0


RIGHTS & PERMISSIONS

2019 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature