AHAT: adaptive hybrid association strategy for multi-object tracking in complex motion scenes
Keyu Wang, Yajie Yang, Xiuxian Li
Autonomous Intelligent Systems, 2026, Vol. 6, Issue 1: 7
Multi-object tracking (MOT) has long been a challenging task in computer vision, particularly in complex scenes with intricate motion patterns and frequent occlusions. Existing approaches often face significant hurdles in maintaining consistent and accurate trajectories under such demanding conditions. The integration of motion and appearance cues has proven beneficial, yet most methods rely on static fusion strategies that fail to adapt to dynamic scene variations. In this paper, we propose Adaptive Hybrid Association Tracking (AHAT), a novel framework designed to address the limitations of traditional MOT methods. AHAT employs a two-stage dynamic feature selection mechanism. The first stage combines motion and appearance features to achieve high-precision matching for high-scoring detection boxes, while the second stage utilizes a dynamic threshold for simple matching against low-scoring detection boxes. This approach effectively reduces trajectory fragmentation and ID switches, improving tracking robustness in crowded and dynamic environments. Notably, AHAT achieves a 5% improvement in HOTA in scenarios with low detection confidence or high motion complexity and reduces identity switches by over 10%. These results highlight AHAT’s effectiveness in practical applications, especially in video surveillance and robotics where high accuracy and real-time performance are crucial. The modular design of AHAT allows for seamless integration into existing tracking frameworks, offering a simple yet effective solution.
Adaptive hybrid association / Data association / Multi-object tracking / Visual tracking
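The abstract's two-stage association can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name `two_stage_associate`, the fusion weight `alpha`, the fixed gates, and the use of IoU as the motion cue and cosine similarity as the appearance cue are all assumptions standing in for the paper's actual dynamic-threshold design.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def cosine(u, v):
    """Cosine similarity between two appearance embeddings."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def two_stage_associate(tracks, detections, score_thresh=0.6,
                        alpha=0.5, fused_gate=0.7, low_iou_gate=0.3):
    """tracks/detections: dicts with 'box' and 'feat'; detections also carry 'score'.
    Returns a list of (track_index, detection_index) matches."""
    high = [i for i, d in enumerate(detections) if d['score'] >= score_thresh]
    low = [i for i, d in enumerate(detections) if d['score'] < score_thresh]
    matches, unmatched = [], list(range(len(tracks)))

    # Stage 1: fuse motion (IoU) and appearance (cosine) cues for
    # high-scoring detections, then solve the assignment optimally.
    if high and unmatched:
        cost = np.array([[1.0 - (alpha * iou(tracks[t]['box'], detections[d]['box'])
                                 + (1 - alpha) * cosine(tracks[t]['feat'],
                                                        detections[d]['feat']))
                          for d in high] for t in unmatched])
        for r, c in zip(*linear_sum_assignment(cost)):
            if cost[r, c] < fused_gate:  # reject weak fused matches
                matches.append((unmatched[r], high[c]))
        matched = {t for t, _ in matches}
        unmatched = [t for t in unmatched if t not in matched]

    # Stage 2: simple IoU-only matching of remaining tracks against
    # low-scoring detections (a fixed gate stands in for the paper's
    # dynamic threshold).
    if low and unmatched:
        cost = np.array([[1.0 - iou(tracks[t]['box'], detections[d]['box'])
                          for d in low] for t in unmatched])
        for r, c in zip(*linear_sum_assignment(cost)):
            if 1.0 - cost[r, c] >= low_iou_gate:
                matches.append((unmatched[r], low[c]))
    return matches
```

Reserving the fused motion-appearance cost for confident detections while recovering occluded targets from low-scoring boxes is what reduces trajectory fragmentation in this style of tracker; the gating constants here are placeholders for tuned or adaptive values.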