Transferring priors from virtual data for crowd counting in real world
Xiaoheng JIANG, Hao LIU, Li ZHANG, Geyang LI, Mingliang XU, Pei LV, Bing ZHOU
Front. Comput. Sci., 2022, Vol. 16, Issue (3): 163314
In recent years, crowd counting has drawn increasing attention owing to its wide range of applications in computer vision. Most existing methods train networks on datasets with few labeled images and are therefore prone to over-fitting. Furthermore, these datasets usually provide only manually labeled head-center positions, an annotation that carries limited information. In this paper, we propose to exploit virtual synthetic crowd scenes to improve the performance of the counting network in the real world. Since people masks are easy to obtain in a synthetic dataset, we first learn to distinguish people from the background via a segmentation network trained on the synthetic data. We then transfer the learned segmentation priors from the synthetic data to real-world data. Finally, we train a density estimation network on real-world data using the obtained people masks. Experiments on two crowd counting datasets demonstrate the effectiveness of the proposed method.
crowd counting / synthetic data / virtual-real combination / people segmentation / density estimation
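As a rough illustration of the quantities mentioned in the abstract, the sketch below builds a density map from head-center points and combines it with a binary people mask. It is a minimal sketch under stated assumptions, not the method of the paper: the Gaussian kernel, the `sigma` value, the element-wise masking, and all function names (`density_map`, `apply_mask`, `count`) are hypothetical choices made for illustration, and the abstract does not specify how the masks are actually used.

```python
import math


def density_map(points, height, width, sigma=1.5):
    """Build a density map from head-center (row, col) points.

    Assumption: each head is spread with a Gaussian that is normalized
    over the image, so every annotated head contributes a mass of 1.
    """
    dmap = [[0.0] * width for _ in range(height)]
    for r0, c0 in points:
        # Unnormalized Gaussian weights centered on the head position.
        weights = [
            [math.exp(-((r - r0) ** 2 + (c - c0) ** 2) / (2 * sigma ** 2))
             for c in range(width)]
            for r in range(height)
        ]
        total = sum(map(sum, weights))
        for r in range(height):
            for c in range(width):
                dmap[r][c] += weights[r][c] / total
    return dmap


def apply_mask(dmap, mask):
    """Suppress density outside the people mask (hypothetical usage)."""
    return [[d * m for d, m in zip(drow, mrow)]
            for drow, mrow in zip(dmap, mask)]


def count(dmap):
    """The estimated crowd count is the sum over the density map."""
    return sum(map(sum, dmap))


# Toy example: two annotated heads in an 8 x 10 image.
heads = [(2, 2), (5, 6)]
dmap = density_map(heads, height=8, width=10)

# Hypothetical people mask that keeps only the left half of the image.
left_half = [[1.0 if c < 5 else 0.0 for c in range(10)] for _ in range(8)]
masked = apply_mask(dmap, left_half)

print(round(count(dmap), 6), count(masked) < count(dmap))
```

Because each kernel is normalized over the image, the unmasked map sums to the number of annotated heads, which is the usual reason a density map can serve as a counting target. Masking can only remove mass, so a mask that wrongly excludes a person lowers the count.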
Higher Education Press