Dynamic road crime risk prediction with urban open data
Binbin ZHOU, Longbiao CHEN, Fangxun ZHOU, Shijian LI, Sha ZHAO, Gang PAN
Crime risk prediction supports urban safety and citizens’ quality of life. However, existing crime studies focus on coarse-grained prediction and usually fail to capture the dynamics of urban crime. The key challenge is data sparsity: 1) not all crimes are recorded, and 2) crimes usually occur with low frequency. In this paper, we propose a framework that predicts fine-grained, dynamic crime risks on each road using heterogeneous urban data. First, we propose a cross-aggregation soft-impute (CASI) method to address possibly unreported crimes. Then, we use a novel crime risk measurement to capture crime dynamics from the perspective of influence propagation, taking into account both time-varying and location-varying risk propagation. Based on the dynamically calculated crime risks, we design contextual features (i.e., POI distributions, taxi mobility, and demographic features) from various urban data sources, and propose a zero-inflated negative binomial regression (ZINBR) model to predict future crime risks on roads. Experiments on real-world data from New York City show that our framework predicts road crime risks accurately and outperforms the baseline methods.
crime prediction / road crime risk / urban computing / data sparsity / zero-inflated negative binomial regression
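To make the zero-inflated negative binomial idea concrete, the following is a minimal sketch of the standard ZINB probability mass function, not the ZINBR model of this paper. It assumes the common mean/dispersion parameterization (variance = mu + alpha * mu^2) and a constant zero-inflation probability `pi`; the function names and parameter values are hypothetical and chosen only for illustration. In a regression setting, `mu` and `pi` would typically be linked to features such as those listed in the abstract, which this sketch does not attempt.

```python
import math


def nb_pmf(k, mu, alpha):
    """Negative binomial pmf with mean mu > 0 and dispersion alpha > 0.

    Assumed parameterization: variance = mu + alpha * mu**2.
    """
    r = 1.0 / alpha          # "size" parameter
    p = r / (r + mu)         # success probability
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def zinb_pmf(k, mu, alpha, pi):
    """Zero-inflated NB: with probability pi the count is a structural zero,
    otherwise it is drawn from the negative binomial above."""
    base = (1.0 - pi) * nb_pmf(k, mu, alpha)
    return pi + base if k == 0 else base


def zinb_mean(mu, pi):
    """Expected count under the zero-inflated model."""
    return (1.0 - pi) * mu


if __name__ == "__main__":
    # Hypothetical parameters for a low-frequency count process.
    mu, alpha, pi = 2.0, 0.5, 0.3
    for k in range(4):
        print(k, round(zinb_pmf(k, mu, alpha, pi), 4))
```

The extra mass at zero is what lets such a model accommodate road segments that rarely or never record a crime, while the dispersion parameter absorbs variance beyond what a Poisson model allows.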