Frontiers of Computer Science

Front. Comput. Sci.    2018, Vol. 12 Issue (6) : 1160-1172     https://doi.org/10.1007/s11704-017-6281-8
RESEARCH ARTICLE
Robust feature learning for online discriminative tracking without large-scale pre-training
Jun ZHANG, Bineng ZHONG, Pengfei WANG, Cheng WANG, Jixiang DU
Department of Computer Science and Technology, Huaqiao University, Xiamen 361021, China
Abstract

Owing to the inherent lack of training data in visual tracking, recent work on deep learning-based trackers has focused on learning a generic representation offline from large-scale training data and transferring the pre-trained feature representation to a tracking task. Offline pre-training is time-consuming, and the learned generic representation may be either less discriminative for tracking specific objects or overfitted to typical tracking datasets. In this paper, we propose an online discriminative tracking method based on robust feature learning without large-scale pre-training. Specifically, we first design a PCA filter bank-based convolutional neural network (CNN) architecture to learn robust features online with a few positive and negative samples in the high-dimensional feature space. Then, we use a simple soft-thresholding method to produce sparse features that are more robust to target appearance variations. Moreover, we increase the reliability of our tracker using edge information generated from edge box proposals during the process of visual tracking. Finally, effective visual tracking results are achieved by systematically combining the tracking information and edge box-based scores in a particle filtering framework. Extensive results on the widely used online tracking benchmark (OTB-50) with 50 videos validate the robustness and effectiveness of the proposed tracker without large-scale pre-training.
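The soft-thresholding step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard shrinkage operator, sign(x) * max(|x| - t, 0), applied element-wise to filter responses. The abstract does not state the exact operator or threshold used in the paper, so the function name `soft_threshold`, the sample responses, and the threshold value 0.25 are all hypothetical.

```python
def soft_threshold(features, threshold):
    """Shrink each value toward zero by `threshold`.

    Values whose magnitude does not exceed `threshold` become 0.0,
    which is what makes the output sparse. This is the standard
    shrinkage operator, assumed here; the paper may differ in detail.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    out = []
    for x in features:
        magnitude = abs(x) - threshold
        if magnitude <= 0:
            out.append(0.0)  # weak response: suppressed entirely
        else:
            out.append(magnitude if x > 0 else -magnitude)  # keep the sign
    return out


# Hypothetical filter responses and threshold, chosen for illustration only.
responses = [1.0, -0.125, 0.25, -0.75, 0.0]
sparse = soft_threshold(responses, 0.25)
print(sparse)  # → [0.75, 0.0, 0.0, -0.5, 0.0]
```

Weak responses are zeroed while strong ones keep their sign and lose only a constant offset, so small appearance fluctuations are less likely to change the resulting feature vector.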

Keywords: visual tracking, convolutional neural networks, PCA, Edge Box
Corresponding Authors: Bineng ZHONG   
Just Accepted Date: 29 September 2017   Online First Date: 06 July 2018    Issue Date: 04 December 2018
 Cite this article:   
Jun ZHANG,Bineng ZHONG,Pengfei WANG, et al. Robust feature learning for online discriminative tracking without large-scale pre-training[J]. Front. Comput. Sci., 2018, 12(6): 1160-1172.
 URL:  
http://journal.hep.com.cn/fcs/EN/10.1007/s11704-017-6281-8
http://journal.hep.com.cn/fcs/EN/Y2018/V12/I6/1160