DS-YOLO: A dense small object detection algorithm based on inverted bottleneck and multi-scale fusion network

Hongyu Zhang, Guoliang Li, Dapeng Wan, Ziyue Wang, Jinshun Dong, Shoujun Lin, Lixia Deng, Haiying Liu

Biomimetic Intelligence and Robotics, 2024, Vol. 4, Issue (4): 100190-100190.

DOI: 10.1016/j.birob.2024.100190
Research Article


Abstract

In security applications, intelligent surveillance tasks often involve large numbers of dense, small objects with severe mutual occlusion, which makes detection particularly challenging. To address this, the paper proposes Dense and Small YOLO (DS-YOLO), a dense small object detection algorithm based on YOLOv8s. First, to strengthen the backbone network's ability to extract features of dense small objects, the paper proposes a lightweight backbone. The improved C2fUIB module makes the model lighter and expands the receptive field, which captures richer contextual information and reduces the impact of occlusion on detection accuracy. Second, to strengthen the model's feature fusion capability, a multi-scale feature fusion network, Light-weight Full Scale PAFPN (LFS-PAFPN), is introduced in combination with the DO-C2f module. The new module reduces the miss rate for dense small objects while preserving accuracy on large objects. Finally, to minimize the loss of dense-object features as they propagate through the network, a dynamic upsampling module, DySample, is adopted. DS-YOLO was trained and tested on the CrowdHuman and VisDrone2019 datasets, which contain large numbers of densely packed pedestrians, vehicles, and other objects. The experiments show that DS-YOLO has advantages in dense small object detection tasks: compared with YOLOv8s, Recall and mAP@0.5 increase by 4.9% and 4.2%, respectively, on CrowdHuman, and by 4.6% and 5% on VisDrone2019. At the same time, DS-YOLO does not add substantial computational overhead, so hardware requirements remain low.
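As a rough illustration of what a Recall figure at an IoU threshold of 0.5 measures (the same threshold that mAP@0.5 refers to), the sketch below counts how many ground-truth boxes are matched by at least one prediction. It is not the paper's evaluation code: the function names `iou` and `recall_at_iou`, the `(x1, y1, x2, y2)` box format, and the greedy score-ordered matching are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(predictions, ground_truth, threshold=0.5):
    """Fraction of ground-truth boxes matched by a prediction with IoU >= threshold.

    predictions: list of (score, box); ground_truth: list of boxes.
    Predictions are visited in descending score order (an assumed convention),
    and each ground-truth box can be matched at most once.
    """
    if not ground_truth:
        return 0.0
    matched = set()
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_idx, best_iou = None, threshold
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap >= best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is not None:
            matched.add(best_idx)
    return len(matched) / len(ground_truth)
```

In dense scenes, heavily occluded objects tend to be the ground-truth boxes left unmatched, so a higher value of this ratio corresponds to a lower miss rate.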

Keywords

Dense objects detection / LFS-PAFPN / DOConv / C2fUIB / YOLOv8

Cite this article

Hongyu Zhang, Guoliang Li, Dapeng Wan, Ziyue Wang, Jinshun Dong, Shoujun Lin, Lixia Deng, Haiying Liu. DS-YOLO: A dense small object detection algorithm based on inverted bottleneck and multi-scale fusion network. Biomimetic Intelligence and Robotics, 2024, 4(4): 100190-100190. DOI: 10.1016/j.birob.2024.100190


CRediT authorship contribution statement

Hongyu Zhang: Writing - review & editing, Writing - original draft, Visualization, Validation, Software, Project administration, Methodology, Investigation, Data curation, Conceptualization. Guoliang Li: Writing - review & editing, Supervision, Resources, Investigation, Funding acquisition, Conceptualization. Dapeng Wan: Writing - review & editing, Validation, Supervision, Investigation. Ziyue Wang: Supervision, Methodology, Formal analysis, Data curation. Jinshun Dong: Writing - review & editing, Validation, Investigation. Shoujun Lin: Resources, Investigation. Lixia Deng: Writing - review & editing, Supervision, Methodology, Funding acquisition, Conceptualization. Haiying Liu: Supervision, Project administration, Funding acquisition, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported by the Innovation Ability Enhancement Project of Shandong Province Science and Technology Small and Medium Enterprises (2023TSCG0159 and 2022TSGC2175) and the Peiyou Fund of Qilu University of Technology (Shandong Academy of Sciences) (2023PY006).

