Tandem hidden Markov models using deep belief networks for offline handwriting recognition
Partha Pratim ROY, Guoqiang ZHONG, Mohamed CHERIET
Unconstrained offline handwriting recognition is a challenging task in document analysis and pattern recognition. In recent years, much effort has been made to integrate multi-layer perceptrons (MLPs) into hidden Markov models (HMMs), in either a hybrid or a tandem fashion, to better exploit the supervisory information hidden in document images. However, due to the weak learnability of MLPs, the learnt features are not necessarily optimal for the subsequent recognition task. In this paper, we propose a tandem approach based on a deep architecture for unconstrained offline handwriting recognition. In the proposed model, deep belief networks (DBNs) learn compact representations of the sequential data, while HMMs perform (sub-)word recognition. We evaluate the proposed model on two publicly available datasets, RIMES and IFN/ENIT, which contain Latin-script and Arabic-script handwriting respectively, and on one dataset that we collected ourselves for Devanagari (an Indian script). Extensive experiments show the advantage of the proposed model, especially over MLP-HMM tandem approaches.
Handwriting recognition / Hidden Markov models / Deep learning / Deep belief networks / Tandem approach
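The feature-learning half of the tandem idea in the abstract can be illustrated with a minimal sketch. It assumes a deep belief network built by greedily stacking Bernoulli restricted Boltzmann machines, each trained with one-step contrastive divergence (CD-1), and it uses toy binary frames in place of real sliding-window image features. The class and function names (`RBM`, `train_dbn`, `tandem_features`), the layer sizes, the learning rate, and the data are all illustrative assumptions, not the authors' configuration; the HMM stage that would consume the resulting feature vectors is not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Bernoulli restricted Boltzmann machine trained with CD-1 (illustrative)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # Small random weights; biases start at zero.
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b))]

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden probabilities given the data.
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        # Negative phase: one Gibbs step back to the visibles and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(self.b)):
            for j in range(len(self.c)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(len(self.c)):
            self.c[j] += lr * (h0[j] - h1[j])


def train_dbn(frames, layer_sizes, epochs=20, seed=0):
    """Greedy layer-wise pretraining: each RBM is trained on the layer below."""
    rng = random.Random(seed)
    layers, data = [], frames
    for n_hidden in layer_sizes:
        rbm = RBM(len(data[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in data:
                rbm.cd1_step(v)
        layers.append(rbm)
        # Hidden probabilities become the training data for the next layer.
        data = [rbm.hidden_probs(v) for v in data]
    return layers


def tandem_features(layers, frame):
    """Propagate one frame through the stack to get a compact feature vector."""
    for rbm in layers:
        frame = rbm.hidden_probs(frame)
    return frame


# Toy "frames": 6-dimensional binary vectors standing in for window features.
frames = [
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
]
layers = train_dbn(frames, layer_sizes=[4, 2])
features = [tandem_features(layers, f) for f in frames]
```

In a tandem system of this kind, the sequence in `features` would replace or augment the raw frame features as the observation sequence on which the HMMs are trained and decoded. Any supervised fine-tuning of the network is omitted from this sketch.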