Abstract
Objective: This review evaluates the worldwide use of artificial intelligence (AI) for the diagnosis and treatment of voice disorders.
Methods: An electronic search was completed in Embase, PubMed, Ovid MEDLINE, Scopus, Google Scholar, and Web of Science. Studies in English from 2019 to 2024 evaluating the use of AI in the detection and management of voice disorders were included. Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines were followed.
Results: Eighty-one studies were identified. Thirty-three studies were selected and screened for quality assessment. Of these, 16 studies used AI to distinguish normal from pathological voice. The convolutional neural network (CNN) was the most frequently employed machine learning algorithm.
Conclusion: This review revealed significant worldwide interest in using AI to detect voice disorders. Gaps included the use of limited, inconsistent data sets, lack of validation, and an emphasis on detection rather than treatment of the voice disorder. These are areas of opportunity for AI techniques to improve diagnostic accuracy.
Keywords
artificial intelligence / dysphonia / machine learning
Cite this article
Amna Suleman, Amy L. Rutt.
Global utilization of artificial intelligence in the diagnosis and management of voice disorders over the past five years.
Eye & ENT Research, 2025, 2(2): 88-95 DOI:10.1002/eer3.70006
RIGHTS & PERMISSIONS
The Author(s). Eye & ENT Research published by John Wiley & Sons Australia, Ltd on behalf of Higher Education Press.