Tiny neural networks for multi-object tracking in a modular Kalman framework

Christian Holz, Christian Bader, Markus Enzweiler, Matthias Drüppel

Autonomous Intelligent Systems, 2026, Vol. 6, Issue 1: 6

DOI: 10.1007/s43684-026-00127-2
Original Article

Abstract

We present a modular, production-ready approach that integrates compact neural networks (NNs) into a Kalman-filter-based Multi-Object Tracking (MOT) pipeline. We design three tiny, task-specific networks to retain the modularity, interpretability, and real-time suitability required for embedded automotive driver assistance systems:

1. SPENT (Single-Prediction Network): predicts per-track states and replaces the heuristic motion models used by the Kalman Filter (KF).

2. SANT (Single-Association Network): assigns a single incoming sensor object to existing tracks without relying on heuristic distance and association metrics.

3. MANTa (Multi-Association Network): jointly associates multiple sensor objects with multiple tracks in a single step.
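The roles listed above can be pictured as replaceable slots in a tracking step. The following is a minimal sketch, not the authors' interface: the function names (`cv_predict`, `nearest_track`, `tracker_step`), the `[x, y, vx, vy]` state layout, and the gate value are all assumptions made for illustration. It shows only the heuristic baselines (a constant-velocity motion model and gated nearest-neighbor association) that learned modules with the same signatures could replace. Covariance propagation and the Kalman measurement update are omitted.

```python
import math

def cv_predict(state, dt):
    """Heuristic baseline for the prediction slot: constant velocity.
    state is assumed to be [x, y, vx, vy]."""
    x, y, vx, vy = state
    return [x + vx * dt, y + vy * dt, vx, vy]

def nearest_track(predicted, detection, gate=2.0):
    """Heuristic baseline for the single-association slot: index of the
    closest predicted track within the gate, or -1 if none qualifies."""
    best, best_dist = -1, gate
    for i, (px, py, _, _) in enumerate(predicted):
        dist = math.hypot(px - detection[0], py - detection[1])
        if dist <= best_dist:
            best, best_dist = i, dist
    return best

def tracker_step(tracks, detections, dt,
                 predict_fn=cv_predict, associate_fn=nearest_track):
    """One cycle: predict every track, then associate each detection.
    A learned predictor or associator would be passed in via the
    predict_fn / associate_fn arguments (hypothetical plug-in points)."""
    predicted = [predict_fn(t, dt) for t in tracks]
    assignments = [associate_fn(predicted, d) for d in detections]
    return predicted, assignments

# Two tracks, three detections; the last detection matches no track.
tracks = [[0.0, 0.0, 1.0, 0.0], [10.0, 5.0, 0.0, -1.0]]
detections = [(10.1, 4.0), (1.0, 0.1), (50.0, 50.0)]
predicted, assignments = tracker_step(tracks, detections, dt=1.0)
print(assignments)  # [1, 0, -1]
```

Because each detection is associated independently here, two detections could end up on the same track. Resolving such conflicts is what a joint association step is for, and it would need a different slot signature that sees all detections at once.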

Each module has fewer than 50k trainable parameters. All three run in real time, are trained from tracking data, and expose modular interfaces that integrate with standard Kalman-filter state updates and track management, which makes them drop-in compatible with many existing trackers. Each network can be trained and evaluated independently of the others, which preserves modularity. On the KITTI tracking benchmark, SPENT reduces prediction RMSE by more than 50% compared to a standard Kalman filter, while SANT and MANTa achieve up to 95% assignment accuracy. These results demonstrate that small, task-specific neural modules can substantially improve tracking accuracy and robustness without sacrificing modularity, interpretability, or the real-time constraints required for automotive deployment.
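To give a feel for the 50k-parameter budget, the arithmetic below counts the trainable parameters of a small recurrent model. The architecture is purely hypothetical (one LSTM layer with 4 inputs and 64 hidden units, followed by a dense layer with 4 outputs) and is not taken from the paper. The count assumes one bias vector per gate; some frameworks store two, which raises the total slightly.

```python
def lstm_params(input_size, hidden_size):
    """Four gates, each with input weights, recurrent weights,
    and a single bias vector (assumed convention)."""
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)

def dense_params(in_features, out_features):
    """Fully connected layer: weight matrix plus bias vector."""
    return in_features * out_features + out_features

# Hypothetical sizes: 4 state inputs, 64 hidden units, 4 outputs.
total = lstm_params(4, 64) + dense_params(64, 4)
print(total)           # 17924
print(total < 50_000)  # True
```

Under these assumed sizes the model uses roughly a third of the stated budget, which suggests that a single modest recurrent layer fits comfortably within it.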

Keywords

Multi-object tracking / Recurrent neural networks / Kalman filter / Real-time embedded systems / Tiny neural networks / Data-driven methods

Cite this article

Christian Holz, Christian Bader, Markus Enzweiler, Matthias Drüppel. Tiny neural networks for multi-object tracking in a modular Kalman framework. Autonomous Intelligent Systems, 2026, 6(1): 6. DOI: 10.1007/s43684-026-00127-2



RIGHTS & PERMISSIONS

The Author(s)
