Tiny neural networks for multi-object tracking in a modular Kalman framework
Christian Holz, Christian Bader, Markus Enzweiler, Matthias Drüppel
Autonomous Intelligent Systems, 2026, Vol. 6, Issue 1: 6
We present a modular, production-ready approach that integrates compact Neural Networks (NNs) into a Kalman-filter-based Multi-Object Tracking (MOT) pipeline. We design three tiny task-specific networks to retain modularity, interpretability, and real-time suitability for embedded Automotive Driver Assistance Systems:
1. SPENT (Single-Prediction Network): predicts per-track states and replaces the heuristic motion models used by the Kalman Filter (KF).
2. SANT (Single-Association Network): assigns a single incoming sensor object to existing tracks, without relying on heuristic distance and association metrics.
3. MANTa (Multi-Association Network): jointly associates multiple sensor objects to multiple tracks in a single step.
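To make the modular idea concrete, the following minimal sketch shows a tracking step in which the prediction and association stages are swappable functions. It is an illustration under stated assumptions, not the authors' implementation: the state is reduced to a 1D position with a scalar Kalman update, and the names `Track`, `constant_velocity`, `nearest_neighbor`, and `step`, as well as the noise values `q` and `r` and the gate width, are hypothetical. The two heuristic functions mark the slots that a learned predictor (such as SPENT) or a learned single-object associator (such as SANT) would occupy.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Track:
    pos: float  # position estimate
    vel: float  # velocity, held fixed in this simplified sketch
    var: float  # variance of the position estimate


# Hypothetical interfaces for the two swappable stages.
Predictor = Callable[[Track, float], float]
Associator = Callable[[float, List[float]], Optional[int]]


def constant_velocity(track: Track, dt: float) -> float:
    """Heuristic motion model: the slot a learned predictor would replace."""
    return track.pos + track.vel * dt


def nearest_neighbor(z: float, predicted: List[float], gate: float = 2.0) -> Optional[int]:
    """Heuristic distance-based association: the slot a learned associator would replace."""
    if not predicted:
        return None
    best = min(range(len(predicted)), key=lambda i: abs(z - predicted[i]))
    return best if abs(z - predicted[best]) <= gate else None


def step(
    tracks: List[Track],
    z: float,
    dt: float,
    predict: Predictor = constant_velocity,
    associate: Associator = nearest_neighbor,
    q: float = 0.1,  # assumed process noise
    r: float = 0.5,  # assumed measurement noise
) -> Optional[int]:
    """One cycle: predict all tracks, associate one measurement, Kalman-update the match."""
    for t in tracks:
        t.pos = predict(t, dt)
        t.var += q
    idx = associate(z, [t.pos for t in tracks])
    if idx is not None:
        t = tracks[idx]
        k = t.var / (t.var + r)   # scalar Kalman gain
        t.pos += k * (z - t.pos)  # correct with the innovation
        t.var *= 1.0 - k
    return idx


tracks = [Track(pos=0.0, vel=1.0, var=1.0), Track(pos=10.0, vel=-1.0, var=1.0)]
matched = step(tracks, z=1.2, dt=1.0)
print(matched, tracks[0].pos)
```

Because `step` only depends on the two function signatures, either stage can be exchanged without touching the Kalman update, which is the property the modular design aims to preserve. A joint multi-object associator (the role described for MANTa) would need a different interface that takes all measurements at once; that is not shown here.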
Keywords: Multi-object tracking / Recurrent neural networks / Kalman filter / Real-time embedded systems / Tiny neural networks / Data-driven methods