Reference medical datasets (MosMedData) for independent external evaluation of algorithms based on artificial intelligence in diagnostics
Nikolay A. Pavlov, Anna E. Andreychenko, Anton V. Vladzymyrskyy, Anush A. Revazyan, Yury S. Kirpichev, Sergey P. Morozov
Digital Diagnostics. 2021, Vol. 2, Issue 1: 49–66.
The article describes a novel approach to creating annotated medical datasets for testing artificial intelligence-based diagnostic solutions. Four stages of dataset formation are described: planning, selection of initial data, marking and verification, and documentation. Examples of datasets created with these methods are also presented. The technique is scalable and versatile, and it can be applied to other areas of medicine and healthcare that are being automated and developed using artificial intelligence and big data technologies.
artificial intelligence / medical data / dataset / marking / computer-assisted learning / big data / verification