Human behavior clustering for anomaly detection
Xudong ZHU, Zhijing LIU
This paper addresses the problem of modeling human behavior patterns captured in surveillance videos for online normal behavior recognition and anomaly detection. A novel framework is developed for automatic behavior modeling and online anomaly detection without the need for manual labeling of the training data set. The framework consists of the following key components. 1) A compact and effective behavior representation method is developed based on spatial-temporal interest point detection. 2) The natural grouping of behavior patterns is determined through a novel clustering algorithm, the topic hidden Markov model (THMM), which builds on the existing hidden Markov model (HMM) and latent Dirichlet allocation (LDA) and overcomes their current limitations in accuracy, robustness, and computational efficiency. The new model is a four-level hierarchical Bayesian model in which each video is modeled as a Markov chain of behavior patterns, where each behavior pattern is a distribution over some segments of the video. Each of these segments can in turn be modeled as a mixture of actions, where each action is a distribution over spatial-temporal words. 3) An online anomaly measure is introduced to detect abnormal behavior, while normal behavior is recognized from visual evidence accumulated at runtime using the likelihood ratio test (LRT) method. Experimental results on noisy and sparse data sets collected from a real surveillance scenario demonstrate the effectiveness and robustness of the approach.
computer vision / unsupervised anomaly detection / Bayesian topic models / hidden Markov model (HMM) / spatiotemporal interest points
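The hierarchy described in the abstract (video, behavior patterns, actions, spatial-temporal words) can be illustrated with a small generative sketch. This is not the authors' THMM implementation: the function name `sample_video` and the parameters `transition`, `action_mix`, and `word_dist` are hypothetical, the initial behavior is assumed to be drawn uniformly, and the action mixture is tied directly to the current behavior pattern rather than drawn per segment from a Dirichlet prior as a full LDA-style model would do.

```python
import random


def sample_video(n_segments, words_per_segment, transition, action_mix, word_dist, rng):
    """Sample a synthetic video as a list of (behavior, words) segments.

    transition[b]  : weights for the next behavior given behavior b (Markov chain)
    action_mix[b]  : weights over actions within a segment of behavior b
    word_dist[a]   : weights over spatial-temporal word indices for action a
    All three tables are illustrative assumptions, not values from the paper.
    """
    n_behaviors = len(transition)
    n_actions = len(word_dist)
    # Assumption: uniform initial behavior pattern.
    behavior = rng.randrange(n_behaviors)
    video = []
    for _ in range(n_segments):
        words = []
        for _ in range(words_per_segment):
            # Pick an action from the mixture attached to the current behavior.
            action = rng.choices(range(n_actions), weights=action_mix[behavior])[0]
            # Pick a spatial-temporal word from that action's word distribution.
            vocab = range(len(word_dist[action]))
            words.append(rng.choices(vocab, weights=word_dist[action])[0])
        video.append((behavior, words))
        # Move to the next behavior pattern along the Markov chain.
        behavior = rng.choices(range(n_behaviors), weights=transition[behavior])[0]
    return video


if __name__ == "__main__":
    rng = random.Random(0)
    transition = [[0.9, 0.1], [0.2, 0.8]]
    action_mix = [[0.7, 0.3], [0.1, 0.9]]
    word_dist = [[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
    for behavior, words in sample_video(4, 5, transition, action_mix, word_dist, rng):
        print(behavior, words)
```

Fitting such a model to real video would run in the opposite direction, inferring the behavior and action assignments from observed words; the sketch only shows how the four levels nest.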