Objective This study aimed to develop a few-shot learning model for lung nodule detection in CT images by leveraging visual open-set object detection.
Methods The Lung Nodule Analysis 2016 (LUNA16) public dataset was used for validation and was split into training and testing sets at an 8:2 ratio. Classical You Only Look Once (YOLO) models of three sizes (n, m, x) were trained on the training set. Transfer learning experiments were then conducted with mainstream open-set object detection models derived from Detection Transformer (DETR) with Improved DeNoising AnchOr Boxes (DINO), namely Grounding DINO and Open-Vocabulary DINO (OV-DINO), as well as with our proposed few-shot learning model, across a range of shot sizes. Finally, all trained models were compared on the test set.
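As an illustration of the 8:2 split described above, the following is a minimal sketch, not the authors' procedure. It assumes the split is made at the scan level with a fixed random seed; the function name `split_scans`, the seed, and the scan identifiers are hypothetical.

```python
import random


def split_scans(scan_ids, train_fraction=0.8, seed=0):
    """Shuffle scan IDs deterministically and split them into train/test lists.

    Assumption: the split is done per scan, so that slices from one scan
    never appear in both sets. The source does not state this detail.
    """
    ids = sorted(scan_ids)            # sort first so the result depends only on the seed
    rng = random.Random(seed)         # local generator, leaves global random state untouched
    rng.shuffle(ids)
    cut = int(round(len(ids) * train_fraction))
    return ids[:cut], ids[cut:]


# Hypothetical scan identifiers, used only for illustration.
scan_ids = [f"scan_{i:03d}" for i in range(10)]
train_ids, test_ids = split_scans(scan_ids)
print(len(train_ids), len(test_ids))
```

A few-shot subset for a given shot size could be drawn from `train_ids` in the same seeded way, which keeps the test set fixed across all shot sizes.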
Results After training on LUNA16, the precision, recall, and mean average precision (mAP) of the three YOLO model sizes did not differ significantly, peaking at 82.8%, 73.1%, and 77.4%, respectively. OV-DINO achieved significantly higher recall than YOLO but showed no clear advantage in precision or mAP. Using only one-fifth of the training samples and one-tenth of the training epochs, our proposed model outperformed both YOLO and OV-DINO, improving precision, recall, and mAP by 6.6%, 9.3%, and 6.9%, respectively, to final values of 89.4%, 96.2%, and 87.7%.
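For readers unfamiliar with how detection precision and recall are typically obtained, the following is a minimal sketch under stated assumptions. It assumes axis-aligned boxes in `(x1, y1, x2, y2)` form, an IoU threshold of 0.5, and greedy matching in order of confidence. None of these details are given in the source, and the boxes below are made-up values.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """predictions: list of (box, score); ground_truth: list of boxes.

    Each ground-truth box can be matched by at most one prediction;
    predictions are considered from highest to lowest score.
    """
    matched = set()
    tp = 0
    for box, _score in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= iou_threshold:
            matched.add(best_j)
            tp += 1
    fp = len(predictions) - tp   # predictions with no matching nodule
    fn = len(ground_truth) - tp  # nodules that no prediction matched
    precision = tp / (tp + fp) if predictions else 0.0
    recall = tp / (tp + fn) if ground_truth else 0.0
    return precision, recall


# Made-up boxes: one prediction overlaps a nodule, one does not.
ground_truth = [(10, 10, 20, 20), (40, 40, 50, 50)]
predictions = [((11, 11, 21, 21), 0.9), ((100, 100, 110, 110), 0.8)]
precision, recall = precision_recall(predictions, ground_truth)
print(precision, recall)
```

mAP additionally averages precision over the recall levels reached as the confidence threshold is varied, which this sketch does not cover.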
Conclusion The proposed few-shot learning model demonstrates stronger scene transfer capabilities, requiring fewer samples and training epochs, and can effectively improve the accuracy of lung nodule detection.
Funding
Natural Science Foundation of Beijing Municipality (7222320)
Capital Health Research and Development of Special Fund (2022–2-6081)
Scientific Research Fund of Aerospace Center Hospital (YN202301)
Aerospace Medical Health Science and Technology Research Projects (2021YK09)
RIGHTS & PERMISSIONS
The Author(s), under exclusive licence to the Huazhong University of Science and Technology