Improved multi-scale feature fusion for infrared small target detection based on YOLOv8

Shangsi DING; Guiqin YANG; Bingkun GAN

doi:10.62756/jmsi.1674-8042.2026018

Journal of Measurement Science and Instrumentation, 2026, Vol. 17, Issue 2: 208-218.

Special topic on advanced visual measurement and intelligent detection technology

Abstract

To address the small pixel footprint of targets and the cluttered backgrounds that hamper small target detection in infrared scenes, a detection model based on YOLOv8 with multi-scale feature extraction is proposed. First, all downsampling convolutions in the network are replaced with the Haar wavelet downsampling (HWD) module, which better preserves the fine-grained details of infrared imagery during downsampling. Second, the spatial pyramid pooling-fast (SPPF) module is improved by introducing separable convolutions, which expand the receptive field in both the horizontal and vertical directions and capture spatial information more comprehensively. Third, a novel C2f_CDWR module is designed that uses dilated convolutions with different dilation rates to extract features adaptively across multiple receptive fields, improving detection of objects of different sizes. Finally, to improve localization accuracy, the original CIoU loss in YOLOv8 is replaced with Inner-SIoU, which improves bounding box regression accuracy and strengthens the model’s ability to detect small infrared targets. Experiments on the HIT-UAV dataset show that the improved YOLOv8 model achieves a precision of 90.5%, a recall of 75.9%, and a mean average precision of 85.7%, significantly outperforming the baseline YOLOv8 model and other benchmark models on infrared target detection.
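The motivation for wavelet-based downsampling can be illustrated with a minimal sketch. The code below is not the authors' HWD module; it assumes a plain single-level 2D Haar decomposition in NumPy, and the function names `haar_downsample` and `haar_upsample` are hypothetical. It halves the spatial resolution while moving every input value into four sub-bands, so the step is invertible, whereas plain strided sampling can skip a one-pixel target entirely.

```python
import numpy as np


def haar_downsample(x):
    """Single-level 2D Haar decomposition of an (H, W) array with even H and W.

    Returns an array of shape (4, H/2, W/2): one approximation band followed
    by three detail bands. Sub-band naming conventions vary between libraries,
    so the bands are described here by what they compute.
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0   # local average (scaled)
    d_cols = (a - b + c - d) / 2.0   # difference between left and right columns
    d_rows = (a + b - c - d) / 2.0   # difference between top and bottom rows
    d_diag = (a - b - c + d) / 2.0   # diagonal difference
    return np.stack([approx, d_cols, d_rows, d_diag])


def haar_upsample(bands):
    """Invert haar_downsample, recovering the original (H, W) array."""
    approx, d_cols, d_rows, d_diag = bands
    h, w = approx.shape
    x = np.empty((2 * h, 2 * w), dtype=approx.dtype)
    x[0::2, 0::2] = (approx + d_cols + d_rows + d_diag) / 2.0
    x[0::2, 1::2] = (approx - d_cols + d_rows - d_diag) / 2.0
    x[1::2, 0::2] = (approx + d_cols - d_rows - d_diag) / 2.0
    x[1::2, 1::2] = (approx - d_cols - d_rows + d_diag) / 2.0
    return x


# Illustrative input (an assumption, not data from the paper):
# a 4x4 frame containing a single hot pixel at an odd row and column.
img = np.zeros((4, 4))
img[1, 1] = 1.0

strided = img[::2, ::2]          # stride-2 sampling never visits pixel (1, 1)
bands = haar_downsample(img)     # the hot pixel contributes to all four bands
restored = haar_upsample(bands)  # exact reconstruction of the input
```

In a detection network the four sub-bands would typically be stacked along the channel axis and passed to a learned convolution; that part is omitted here because the abstract does not specify it.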

Keywords

infrared image / small object detection / multi-scale feature extraction / dilated convolution / YOLOv8

Cite this article

Shangsi DING, Guiqin YANG, Bingkun GAN. Improved multi-scale feature fusion for infrared small target detection based on YOLOv8. Journal of Measurement Science and Instrumentation, 2026, 17(2): 208-218. DOI: 10.62756/jmsi.1674-8042.2026018

Acknowledgement

This work was supported by the National Natural Science Foundation of China (No. 62361034).

Declaration of conflicting interests

The authors have no conflicts of interest related to this publication.
