Decoding Marathi emotions: Enhanced speech emotion recognition through deep belief network-support vector machine integration

Varsha Nilesh Gaikwad, Rahul Kumar Budania

International Journal of Systematic Innovation ›› 2025, Vol. 9 ›› Issue (4): 71-83. DOI: 10.6977/IJoSI.202508_9(4).0006

Abstract

Speech emotion recognition in Marathi poses considerable hurdles due to the language’s distinct grammatical and emotional characteristics. This paper presents a robust methodology for classifying emotions in Marathi speech using advanced signal processing, feature extraction, and machine learning techniques. The method entails collecting diverse Marathi speech samples and applying pre-processing steps such as pre-emphasis and voice activity detection to improve signal quality. Speech signals are segmented with a Hamming window to reduce frame-boundary discontinuities, and features such as Mel-frequency cepstral coefficients, pitch, intensity, and spectral properties are extracted. For classification, an attentive deep belief network is paired with a support vector machine; attention mechanisms and batch normalization improve performance and reduce overfitting. The proposed approach surpasses existing models, achieving 98% accuracy, 98% F1-score, 99% specificity, 99% sensitivity, 98% precision, and 98% recall.
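The abstract describes a front end of pre-emphasis, voice activity detection, Hamming-windowed framing, and MFCC/pitch/intensity features, followed by an attentive DBN-SVM classifier. The sketch below, assuming a Python environment with librosa, NumPy, and scikit-learn, illustrates that front end on a single utterance; the frame sizes, energy threshold, and the plain RBF-SVM used in place of the attentive DBN-SVM back end are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def extract_features(path, sr=16000, frame_len=400, hop_len=160):
    """Pre-emphasis, energy-based VAD, Hamming-windowed MFCCs, pitch, and intensity."""
    y, sr = librosa.load(path, sr=sr)

    # Pre-emphasis flattens the spectral tilt before framing.
    y = librosa.effects.preemphasis(y, coef=0.97)

    # Crude energy-based voice activity detection: mark near-silent frames.
    frames = librosa.util.frame(y, frame_length=frame_len, hop_length=hop_len)
    energy = (frames ** 2).mean(axis=0)
    voiced = energy > 0.1 * energy.mean()

    # MFCCs computed on Hamming-windowed frames to reduce edge discontinuities.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=frame_len,
                                hop_length=hop_len, window="hamming")

    # Frame-level pitch (YIN) and intensity (RMS energy).
    f0 = librosa.yin(y, fmin=50, fmax=400, sr=sr,
                     frame_length=1024, hop_length=hop_len)
    rms = librosa.feature.rms(y=y, frame_length=frame_len, hop_length=hop_len)[0]

    # Align frame counts, keep voiced frames, and pool statistics per utterance.
    n = min(mfcc.shape[1], len(f0), len(rms), len(voiced))
    feats = np.vstack([mfcc[:, :n], f0[:n], rms[:n]])[:, voiced[:n]]
    return np.concatenate([feats.mean(axis=1), feats.std(axis=1)])


# Classification stage: the paper pairs an attentive deep belief network with an
# SVM; a plain RBF-SVM on the pooled features stands in for that back end here.
# train_paths / train_labels are hypothetical lists of labelled Marathi utterances.
# clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
# clf.fit(np.stack([extract_features(p) for p in train_paths]), train_labels)
```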

Keywords

Speech Emotion Recognition / Voice Activity Detection / Mel-Frequency Cepstral Coefficient / Deep Belief Network / Support Vector Machine

Cite this article

Varsha Nilesh Gaikwad, Rahul Kumar Budania. Decoding Marathi emotions: Enhanced speech emotion recognition through deep belief network-support vector machine integration. International Journal of Systematic Innovation, 2025, 9(4): 71-83. DOI: 10.6977/IJoSI.202508_9(4).0006
