Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study

Li-peng Xing, Gang Liu, Hao-chen Zhang, Lei Wang, Shan Zhu, Man Du La Hua Bao, Yan-ni Wang, Chao Chen, Zhi Wang, Xin-yu Liu, Shuai Zhang, Qiang Yang

Orthopaedic Surgery, 2025, Vol. 17, Issue 1: 233-243. DOI: 10.1111/os.14280
RESEARCH ARTICLE

Abstract

Objective: The Modic changes (MCs) classification system is the most widely used method in magnetic resonance imaging (MRI) for characterizing subchondral vertebral marrow changes. However, because it is semiquantitative, it is highly sensitive to variations in MRI. In 2021, the authors of this classification system proposed a quantitative and reliable MC grading method, but automated tools to grade MCs are still lacking. This study developed convolutional neural networks (CNNs) and investigated their performance in detecting and grading MCs based on their maximum vertical extent. To verify performance, we tested the CNNs’ generalization, compared the CNNs’ performance with that of junior doctors, and assessed the consistency of junior doctors after AI assistance.

Methods: MRIs of 139 patients with MCs were retrospectively analyzed and annotated by a spine surgeon. MRIs from 109 of these patients were acquired using Philips scanners from June 2020 to June 2021, constituting Dataset 1. The remaining 30 patients had MRIs obtained from both Philips and United Imaging scanners from June 2022 to March 2023, forming Dataset 2. YOLOv8 and YOLOv5 models were developed in PyCharm using Python and the PyTorch deep learning framework; data augmentation and transfer learning were applied to improve model generalization. Model performance was compared using precision, recall, F1 score, and mean average precision at an intersection-over-union threshold of 0.5 (mAP50). Generalizability was tested on Dataset 2, where the model was also compared with a junior doctor. Post hoc, the junior doctor graded Dataset 2 with CNN assistance. In addition, the region of interest was displayed using class activation mapping heat maps.
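
The detection metrics named above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes boxes in (x_min, y_min, x_max, y_max) form, a greedy one-to-one matching rule, and hypothetical grade labels such as "A" and "B". It counts a prediction as a true positive when its label matches an annotation and their intersection over union (IoU) is at least 0.5, then derives precision, recall, and F1. A full mAP50 computation would additionally rank predictions by confidence and average precision over recall levels, which is omitted here.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format
Detection = Tuple[Box, str]              # (box, grade label); labels are hypothetical


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def detection_scores(
    preds: List[Detection], truths: List[Detection], iou_thr: float = 0.5
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using greedy one-to-one matching.

    A prediction is a true positive if it shares a label with an unmatched
    annotation and overlaps it with IoU >= iou_thr (an assumed matching rule).
    """
    matched = set()
    tp = 0
    for p_box, p_label in preds:
        best_j, best_iou = -1, iou_thr
        for j, (t_box, t_label) in enumerate(truths):
            if j in matched or t_label != p_label:
                continue
            overlap = iou(p_box, t_box)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching annotation
    fn = len(truths) - tp  # annotations that no prediction matched
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example with made-up boxes: one correct detection, one false alarm, one miss.
truths = [((0, 0, 10, 10), "A"), ((20, 20, 30, 30), "B")]
preds = [((1, 1, 10, 10), "A"), ((50, 50, 60, 60), "B")]
precision, recall, f1 = detection_scores(preds, truths)
```

In this toy case the first prediction overlaps its annotation with IoU 0.81, so it counts as the single true positive, while the second prediction is a false positive and the second annotation is a miss.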

Results: On the unseen test set, the YOLOv8 and YOLOv5 models achieved precision of 81.60% and 61.59%, recall of 80.90% and 67.16%, mAP50 of 84.40% and 68.88%, and F1 scores of 0.81 and 0.60, respectively. On Dataset 2, YOLOv8 and the junior doctor achieved precision of 95.1% and 72.5% and recall of 68.3% and 60.6%, respectively. In the AI-assisted experiment, agreement between the junior doctor and the senior spine surgeon improved significantly, with Cohen’s kappa rising from 0.368 to 0.681.
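
For readers unfamiliar with Cohen’s kappa, the following Python sketch shows how the statistic is computed from two raters' labels. It assumes the unweighted form of kappa (the abstract does not state which variant was used), and the grade labels and ratings are invented for illustration, not taken from the study.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades from a junior doctor and a senior surgeon on four lesions.
junior = ["A", "A", "B", "B"]
senior = ["A", "B", "B", "B"]
kappa = cohens_kappa(junior, senior)
```

Here the raters agree on three of four items (observed agreement 0.75) against a chance agreement of 0.5, giving a kappa of 0.5.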

Conclusions: YOLOv8 was significantly superior to YOLOv5 in detecting and grading MCs. YOLOv8 also outperformed junior doctors, and it can enhance their capabilities and improve the reliability of diagnoses.

Keywords

deep learning / endplate osteochondritis / magnetic resonance imaging / Modic changes

Cite this article

Li-peng Xing, Gang Liu, Hao-chen Zhang, Lei Wang, Shan Zhu, Man Du La Hua Bao, Yan-ni Wang, Chao Chen, Zhi Wang, Xin-yu Liu, Shuai Zhang, Qiang Yang. Evaluating CNN Architectures for the Automated Detection and Grading of Modic Changes in MRI: A Comparative Study. Orthopaedic Surgery, 2025, 17(1): 233-243. DOI: 10.1111/os.14280

RIGHTS & PERMISSIONS

2024 The Author(s). Orthopaedic Surgery published by Tianjin Hospital and John Wiley & Sons Australia, Ltd.
