D-DeepOCSORT: multi-object tracking algorithm based on LiDAR and monocular camera

Xiao He , Shuai Ren , Chang Liu , Lei Shi

Optoelectronics Letters ›› 2026, Vol. 22 ›› Issue (2): 118-123. DOI: 10.1007/s11801-026-4213-2

Abstract

To enhance the tracking stability of DeepOCSORT, this paper proposes a novel multi-object tracking (MOT) method based on multi-sensor data fusion. Specifically, we build on DeepOCSORT and additionally integrate target velocity information measured directly by light detection and ranging (LiDAR). This velocity information is introduced from three perspectives. First, during data association, a penalty term is constructed from the differences between target velocities, encouraging matches with consistent velocities. Second, the LiDAR velocity is used to initialize, and then to update online, the velocity state within the tracker, making tracking predictions more stable. Third, the degree of dependence on the velocity information is controlled by adjusting the process noise covariance matrix. Evaluation results on the KITTI dataset demonstrate that, compared with the original DeepOCSORT, the proposed multi-source heterogeneous information fusion method significantly enhances tracking performance, with maximum improvements of 3.35, 3.26, and 3.71 on the higher order tracking accuracy (HOTA), multi-object tracking accuracy (MOTA), and identification F1 score (IDF1) metrics, respectively. This study provides an effective approach to building a more stable and accurate MOT system.
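The first of the three mechanisms above, a velocity-consistency penalty added to the data-association cost, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `velocity_penalty_cost`, the penalty `weight`, and the toy IoU costs and velocities are all hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def velocity_penalty_cost(iou_cost, track_vel, det_vel, weight=0.2):
    """Add a velocity-consistency penalty to a base association cost.

    iou_cost  : (T, D) base cost, e.g. 1 - IoU between tracks and detections
    track_vel : (T, 2) per-track velocity estimates (vx, vy)
    det_vel   : (D, 2) LiDAR-measured velocities for each detection
    weight    : hypothetical scale balancing the two terms
    """
    # Pairwise Euclidean velocity difference for every track/detection pair
    diff = np.linalg.norm(track_vel[:, None, :] - det_vel[None, :, :], axis=2)
    return iou_cost + weight * diff


# Toy example: two tracks, two detections; IoU alone prefers the diagonal,
# but the LiDAR velocities are swapped, so the penalty flips the matching.
iou_cost = np.array([[0.1, 0.4],
                     [0.4, 0.1]])
track_vel = np.array([[1.0, 0.0], [0.0, 1.0]])
det_vel = np.array([[0.0, 1.0], [1.0, 0.0]])

cost = velocity_penalty_cost(iou_cost, track_vel, det_vel, weight=0.5)
rows, cols = linear_sum_assignment(cost)  # Hungarian assignment
```

Here the velocity term dominates the small IoU preference, so the assignment pairs each track with the detection moving the way the track is moving, which is exactly the "matches with consistent velocities" behavior the penalty is meant to encourage.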


Cite this article

Xiao He, Shuai Ren, Chang Liu, Lei Shi. D-DeepOCSORT: multi-object tracking algorithm based on LiDAR and monocular camera. Optoelectronics Letters, 2026, 22(2): 118-123. DOI: 10.1007/s11801-026-4213-2



RIGHTS & PERMISSIONS

Tianjin University of Technology
