Abstract
People who have difficulty communicating verbally often depend on sign language, which most people cannot understand, so interacting with them is challenging. A Sign Language Recognition (SLR) system takes a signed expression from a hearing- or speech-impaired person and outputs it as text or voice for a hearing person. Existing studies on Sign Language Recognition have drawbacks, such as the lack of large datasets and of datasets covering a range of backgrounds, skin tones, and ages. This research focuses on Sign Language Recognition with the aim of overcoming these limitations. Most importantly, we train our proposed Convolutional Neural Network (CNN) model, “ConvNeural”, on our own datasets. Additionally, we develop our own datasets, “BdSL_OPSA22_STATIC1” and “BdSL_OPSA22_STATIC2”, both of which have ambiguous backgrounds. The two datasets contain images of Bangla alphabet and numeral signs, totalling 24,615 and 8,437 images, respectively. The “ConvNeural” model outperforms the pre-trained models, with accuracies of 98.38% on “BdSL_OPSA22_STATIC1” and 92.78% on “BdSL_OPSA22_STATIC2”. On the “BdSL_OPSA22_STATIC1” dataset, we obtain precision, recall, F1-score, sensitivity, and specificity of 96%, 95%, 95%, 99.31%, and 95.78%, respectively. On the “BdSL_OPSA22_STATIC2” dataset, we achieve precision, recall, F1-score, sensitivity, and specificity of 90%, 88%, 88%, 100%, and 100%, respectively.
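For reference, the metrics reported above are the standard quantities derived from a confusion matrix. The short Python sketch below illustrates the formulas on made-up counts for a single sign class treated one-vs-rest; the numbers are purely hypothetical and are not taken from the paper's experiments.

# Hypothetical one-vs-rest confusion-matrix counts for a single sign class;
# the values are invented for illustration and do not come from the paper.
TP, FP, FN, TN = 90, 5, 10, 95

precision = TP / (TP + FP)                      # correct positives among predicted positives
recall = TP / (TP + FN)                         # also called sensitivity (true-positive rate)
f1_score = 2 * precision * recall / (precision + recall)
specificity = TN / (TN + FP)                    # true-negative rate

print(f"precision={precision:.2f} recall={recall:.2f} "
      f"f1={f1_score:.2f} specificity={specificity:.2f}")

In a multi-class setting such as Bangla alphabet and numeral signs, these quantities are computed per class and then averaged, which is how single precision, recall, and F1 figures like those reported above are usually obtained.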
Keywords
ConvNeural / Sign language / CNN / Static / Feature extraction / Convolution2D / Fully connected layer / Dropout
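The keywords above name the main building blocks of the proposed “ConvNeural” network (Convolution2D, fully connected layer, dropout). As an illustration only, the minimal Keras-style sketch below shows how such components typically compose into a small CNN classifier for static sign images; the filter counts, kernel sizes, input resolution, and number of output classes are assumptions made for the example, not the authors' exact configuration.

# Illustrative CNN classifier for static sign images (not the authors' exact "ConvNeural" model).
from tensorflow.keras import layers, models

NUM_CLASSES = 46  # assumed number of Bangla alphabet + numeral sign classes

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 3)),  # convolution for feature extraction
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),             # fully connected layer
    layers.Dropout(0.5),                              # dropout for regularization
    layers.Dense(NUM_CLASSES, activation="softmax"),  # class probabilities
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

Such a model would then be fitted on the labelled sign-image datasets and evaluated with the metrics listed in the abstract.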
Cite this article
Muhammad Aminur Rahaman, Kabiratun Ummi Oyshe, Prothoma Khan Chowdhury, Tanoy Debnath, Anichur Rahman, Md. Saikat Islam Khan.
Computer vision-based six layered ConvNeural network to recognize sign language for both numeral and alphabet signs.
Biomimetic Intelligence and Robotics, 2024, 4(1): 100141. DOI: 10.1016/j.birob.2023.100141
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
We would especially like to thank the 14 participants who gave us permission to use their pictures and data in our dataset. They voluntarily agreed to take part in our research by reading and signing our ethical statement and allowing us to record their images. While the consent papers were being signed, we made sure to inform all participants about the publication and about the availability of the data to other researchers. We used our own devices to acquire the photographs that are included in our dataset and shown in this paper as examples.