Improved remote sensing image target detection based on YOLOv7

Shuanglong Xu; Zhihong Chen; Haiwei Zhang; Lifang Xue; Huijun Su

doi:10.1007/s11801-024-3063-z

Optoelectronics Letters, 2024, Vol. 20, Issue 4: 234-242. DOI: 10.1007/s11801-024-3063-z

Abstract

Remote sensing images are captured from high altitude, looking down, and contain complex spatial scenes with many target types. Target detection in large-scale remote sensing images is hampered by small target sizes and densely packed targets. This paper proposes an improved remote sensing image detection model based on you only look once version 7 (YOLOv7). First, a small-scale detection layer is added to reacquire tracking frames, improving the network's ability to recognize small-scale targets. Then, Bottleneck Transformers are fused into the backbone to make full use of the convolutional neural network (CNN)+Transformer architecture and enhance the network's feature extraction ability. Next, the convolutional block attention module (CBAM) is added to the head to improve the model's ability to detect small-scale targets. Finally, the non-maximum suppression (NMS) of the YOLOv7 algorithm is replaced with distance intersection over union non-maximum suppression (DIOU-NMS) to improve the network's detection of overlapping targets. The method is tested on the NWPU-VHR10 and DOTA1.0 datasets. The results show that it improves the detection rate of small-scale targets in remote sensing images and effectively addresses the problem of high overlap, and that the accuracy of the improved model is 6.3% and 4.2% higher, respectively, than that of the standard YOLOv7 algorithm.
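To make the last step concrete, the following is a minimal, generic sketch of DIOU-NMS in plain Python. It is not the authors' implementation. It assumes axis-aligned boxes given as (x1, y1, x2, y2) and uses the standard definition DIoU = IoU - d²/c², where d is the distance between the two box centres and c is the diagonal of the smallest box enclosing both. The function names `diou` and `diou_nms` and the default threshold of 0.5 are hypothetical choices made for illustration.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def diou(a: Box, b: Box) -> float:
    """Return IoU minus the normalised squared centre distance (DIoU)."""
    # Intersection rectangle (zero area if the boxes do not overlap).
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centres.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - d2 / c2 if c2 > 0 else iou


def diou_nms(boxes: Sequence[Box], scores: Sequence[float],
             threshold: float = 0.5) -> List[int]:
    """Greedy NMS that suppresses a box only if its DIoU with a kept,
    higher-scoring box exceeds the threshold. Returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order
                 if diou(boxes[best], boxes[i]) <= threshold]
    return keep
```

Because the centre-distance penalty lowers the score of boxes whose centres are far apart, two overlapping detections of neighbouring objects are less likely to be merged than under plain IoU-based NMS. For example, with boxes (0, 0, 10, 4) and (4, 0, 14, 4) and a threshold of 0.4, the IoU is about 0.43 (plain NMS would suppress the second box), while the DIoU is about 0.35, so both boxes are kept.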

Cite this article

Shuanglong Xu, Zhihong Chen, Haiwei Zhang, Lifang Xue, Huijun Su. Improved remote sensing image target detection based on YOLOv7. Optoelectronics Letters, 2024, 20(4): 234‒242 https://doi.org/10.1007/s11801-024-3063-z
