Real-time visual tracking using complementary kernel support correlation filters
Zhenyang SU, Jing LI, Jun CHANG, Bo DU, Yafu XIAO
Despite the demonstrated success of SVM-based trackers, there is still room to improve their performance when two factors are considered carefully. First, the tradeoff between sampling and budgeting samples strongly affects both tracking accuracy and efficiency. Second, how effectively different types of features are fused to learn a robust target representation plays a key role in tracking accuracy. In this paper, we propose a novel SVM-based tracking method that addresses the first factor by exploiting the circulant structure of the samples and the second through a multi-kernel learning mechanism. Specifically, we formulate an SVM classification model for visual tracking that incorporates two types of kernels whose matrices are circulant, taking full advantage of the complementary traits of color and HOG features to learn a robust target representation. Moreover, the SVM model has a closed-form solution for both the classifier weights and the kernel weights, and both can be computed efficiently via fast Fourier transforms (FFTs). Extensive evaluations on the OTB100 and VOT2016 visual tracking benchmarks demonstrate that the proposed method performs favorably against various state-of-the-art trackers while running at 50 fps on a single CPU.
visual tracking / SVM / correlation filter / multi-kernel learning
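The abstract relies on the fact that circulant kernel matrices can be handled with FFTs. The sketch below illustrates only that general property, not the paper's SVM solver: a weighted sum of two circulant matrices is itself circulant, so multiplying it by a vector reduces to one circular convolution. The names `k_color`, `k_hog`, `d_color`, and `d_hog` are hypothetical stand-ins for the first columns of the two kernel matrices and their kernel weights.

```python
import numpy as np

def circulant(c):
    """Dense circulant matrix with first column c: C[i, j] = c[(i - j) % n]."""
    n = len(c)
    return np.stack([np.roll(c, j) for j in range(n)], axis=1)

def circulant_matvec_fft(c, v):
    """Compute circulant(c) @ v in O(n log n) as a circular convolution."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(v)))

def fused_matvec(k_color, k_hog, d_color, d_hog, v):
    """Apply the weighted sum of two circulant kernel matrices to v.

    Assumption for illustration: each kernel matrix is fully described by
    its first column, and the fusion is a plain weighted sum.
    """
    k = d_color * k_color + d_hog * k_hog  # first column of the fused matrix
    return circulant_matvec_fft(k, v)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 8
    k_color, k_hog, v = rng.normal(size=(3, n))
    dense = (0.3 * circulant(k_color) + 0.7 * circulant(k_hog)) @ v
    fast = fused_matvec(k_color, k_hog, 0.3, 0.7, v)
    print(np.allclose(dense, fast))
```

The dense product costs O(n^2) per multiplication, while the FFT route costs O(n log n), which is the kind of saving that makes real-time speeds plausible for trackers built on circulant sample structures.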