LA-YOLO: Location Refinement and Adjacent Feature Fusion-Based Infrared Small Target Detection

Shijie Liu, Chenqi Luo, Kang Yan, Feiwei Qin, Ruiquan Ge, Yong Peng, Jie Huang, Nenggan Zheng, Yongquan Zhang, Changmiao Wang

CAAI Transactions on Intelligence Technology, 2025, 10(6): 1893-1903. DOI: 10.1049/cit2.70070

ORIGINAL RESEARCH


Abstract

In the field of infrared small target detection (ISTD), the ability to detect targets in dim environments is critical, as it improves target recognition at night and in harsh weather. The blurry contours, small size and sparse distribution of infrared small targets make them difficult to identify against cluttered backgrounds, and existing methods fall short of the requirements for detecting and categorising such targets. To address these challenges and improve the precision of small object detection and classification, this paper introduces location refinement and adjacent feature fusion YOLO (LA-YOLO), which enhances feature extraction by integrating a multi-head self-attention mechanism (MSA). We also improve the feature fusion strategy to merge adjacent features, enhancing information utilisation in the path aggregation network (PAN). Finally, we introduce supervision on the target centre points in the detection network. Empirical results on publicly available datasets show that LA-YOLO achieves an average precision (AP) of 92.46% on IST-A and a mean average precision (mAP) of 84.82% on FLIR, surpassing contemporary state-of-the-art detectors while striking a balance between precision and speed. LA-YOLO thus emerges as a viable and effective solution for ISTD, making a substantial contribution to the progression of infrared imagery analysis. The code is available at https://github.com/liusjo/LA-YOLO.
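The abstract names three components: MSA-based feature refinement, adjacent feature fusion in the PAN neck, and centre-point supervision. As an illustration only, the minimal PyTorch sketch below shows one way the first two could be realised; the module names, channel sizes and fusion layout are assumptions, not the authors' implementation, which is available at the linked repository.

```python
# Illustrative sketch only: MSA refinement of a feature map and fusion of two
# adjacent PAN levels. All names and shapes here are hypothetical.
import torch
import torch.nn as nn


class MSABlock(nn.Module):
    """Multi-head self-attention over the spatial positions of a feature map."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)   # (B, H*W, C)
        q = self.norm(tokens)
        attended, _ = self.attn(q, q, q)        # self-attention across positions
        tokens = tokens + attended              # residual connection
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class AdjacentFusion(nn.Module):
    """Fuse a PAN level with its upsampled lower-resolution neighbour."""

    def __init__(self, channels: int):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.reduce = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, fine: torch.Tensor, coarse: torch.Tensor) -> torch.Tensor:
        # Concatenate the adjacent levels along channels, then project back.
        return self.reduce(torch.cat([fine, self.up(coarse)], dim=1))


if __name__ == "__main__":
    p3 = torch.randn(1, 256, 80, 80)   # finer PAN level
    p4 = torch.randn(1, 256, 40, 40)   # adjacent coarser level
    fused = AdjacentFusion(256)(MSABlock(256)(p3), p4)
    print(fused.shape)                 # torch.Size([1, 256, 80, 80])
```

In a full network, such a fusion step would be repeated along the PAN's top-down and bottom-up paths, and the centre-point supervision mentioned in the abstract would enter as an additional regression term in the detection loss.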

Keywords

artificial intelligence / image analysis / image recognition / infrared imaging

Cite this article

Shijie Liu, Chenqi Luo, Kang Yan, Feiwei Qin, Ruiquan Ge, Yong Peng, Jie Huang, Nenggan Zheng, Yongquan Zhang, Changmiao Wang. LA-YOLO: Location Refinement and Adjacent Feature Fusion-Based Infrared Small Target Detection. CAAI Transactions on Intelligence Technology, 2025, 10(6): 1893-1903. DOI: 10.1049/cit2.70070



Funding

Guangdong Basic and Applied Basic Research Foundation (2025A1515011617)

Guangdong Basic and Applied Basic Research Foundation (2022A1515110570)

Fundamental Research Funds for the Provincial Universities of Zhejiang (GK259909299001-006)

Innovation Teams of Youth Innovation in Science and Technology of High Education Institutions of Shandong Province (2021KJ088)

Anhui Provincial Joint Construction Key Laboratory of Intelligent Education Equipment and Technology (IEET202401)

Aeronautical Science Foundation of China (2022Z0710T5001)
