Crowd counting via learning perspective for multi-scale multi-view Web images

Chong SHANG, Haizhou AI, Yi YANG

Front. Comput. Sci., 2019, Vol. 13, Issue 3: 579–587. DOI: 10.1007/s11704-017-6598-3
RESEARCH ARTICLE


Abstract

Estimating the number of people in Web images remains a challenging problem owing to perspective variation, differing views, and diverse backgrounds. Existing deep learning models still struggle in scenarios where a person appears either extremely large or extremely small. In this paper, we propose a novel perspective-aware architecture to estimate the number of people in a crowd in Web images. Specifically, we use a two-stage framework: we first learn a policy network to infer the perspective of the target scene, which outputs a scale label for the subsequent perspective normalization. Next, given the aligned inputs, we further adjust the scale-specific counting network to regress the final count. Experiments on challenging datasets demonstrate that our approach can handle large perspective variation and achieves state-of-the-art results.
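
The two-stage flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the three scale labels, the resize factors in `SCALE_FACTORS`, the nearest-neighbour resizing, and the function names are all assumptions made for the example, and the policy and counting networks are passed in as plain callables.

```python
import numpy as np

# Hypothetical mapping from scale label to resize factor (an assumption):
# 0 = people appear small (upsample), 1 = medium, 2 = people appear large (downsample).
SCALE_FACTORS = {0: 2.0, 1: 1.0, 2: 0.5}


def infer_scale_label(image, policy):
    """Stage 1: the policy returns one score per scale label; take the argmax."""
    scores = policy(image)
    return int(np.argmax(scores))


def normalize_perspective(image, factor):
    """Resize with nearest-neighbour sampling so people appear at a similar size."""
    h, w = image.shape[:2]
    new_h = max(1, int(round(h * factor)))
    new_w = max(1, int(round(w * factor)))
    # Map each output row/column back to its source index, clamped to the image.
    rows = np.minimum((np.arange(new_h) / factor).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / factor).astype(int), w - 1)
    return image[rows][:, cols]


def count_people(image, policy, counters):
    """Stage 2: run the counter that matches the inferred scale on the aligned input."""
    label = infer_scale_label(image, policy)
    aligned = normalize_perspective(image, SCALE_FACTORS[label])
    return label, float(counters[label](aligned))
```

In this sketch, `policy` stands in for the policy network and `counters` for a dictionary of scale-specific counting networks keyed by scale label; any trained models exposing the same call shape could be substituted.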

Keywords

crowd counting / Web images / perspective inference

Cite this article

Chong SHANG, Haizhou AI, Yi YANG. Crowd counting via learning perspective for multi-scale multi-view Web images. Front. Comput. Sci., 2019, 13(3): 579–587. https://doi.org/10.1007/s11704-017-6598-3

RIGHTS & PERMISSIONS

2018 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature