Methodology for testing and monitoring artificial intelligence-based software for medical diagnostics
Yuri A. Vasiliev , Anton V. Vlazimirsky , Olga V. Omelyanskaya , Kirill M. Arzamasov , Sergey F. Chetverikov , Denis A. Rumyantsev , Maria A. Zelenova
Digital Diagnostics. 2023;4(3):252–267.
BACKGROUND: Global investment in companies developing artificial intelligence (AI)-based software for medical diagnostics reached $80 million in 2016, rose to $152 million in 2017, and is expected to continue growing. Although software manufacturers must comply with existing clinical, bioethical, legal, and methodological frameworks and standards, there are no uniform national or international standards and protocols for testing and monitoring AI-based software.
AIM: To develop a universal methodology for testing and monitoring AI-based software for medical diagnostics, improving its quality and facilitating its integration into practical healthcare.
MATERIALS AND METHODS: The study comprised an analytical stage, in which a literature review was conducted using the PubMed and eLibrary databases, and a practical stage, in which the developed methodology was piloted within an experiment on the use of innovative computer-vision technologies for medical image analysis and their subsequent application in the Moscow healthcare system.
RESULTS: A methodology for testing and monitoring AI-based software for medical diagnostics was developed to improve software quality and support its introduction into practical healthcare. The methodology consists of seven stages: self-testing, functional testing, calibration testing, technological monitoring, clinical monitoring, feedback, and refinement.
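The testing stages described above ultimately reduce to comparing software outputs against a reference test set and checking diagnostic-accuracy metrics against preset acceptance criteria. A minimal sketch of such a check is shown below; the function name, labels, and thresholds are illustrative assumptions, not part of the published methodology.

```python
# Illustrative sketch of a functional-testing check: comparing binary AI
# software outputs against ground-truth labels (1 = pathology present)
# and computing basic diagnostic-accuracy metrics.

def diagnostic_metrics(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity

if __name__ == "__main__":
    # Hypothetical acceptance criterion: both metrics must meet a preset floor.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
    sens, spec = diagnostic_metrics(y_true, y_pred)
    passed = sens >= 0.7 and spec >= 0.7
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} passed={passed}")
```

In a real deployment the same comparison would be repeated during the monitoring stages on fresh clinical samples, so that a drop below the acceptance floor triggers the feedback and refinement steps of the cycle.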
CONCLUSION: Distinctive features of the methodology include cyclical monitoring and refinement stages that drive continuous quality improvement, detailed requirements for software outputs, and the participation of physicians in software evaluation. The methodology enables developers to achieve and demonstrate meaningful results across various areas, and it empowers users to make informed, confident choices among software products that have passed an independent and comprehensive quality check.
software / artificial intelligence / radiology / diagnostic imaging / methodology / quality control