Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
dingxq@tsinghua.edu.cn
Received: 2011-04-28; Accepted: 2011-06-27; Published: 2012-06-05; Issue Date: 2012-06-05
Abstract
Recovering a three-dimensional (3D) human pose sequence from an arbitrary view is very difficult, owing to the loss of depth information and to self-occlusion. In this paper, a view-independent 3D-key-pose set is selected from 3D action samples for the purpose of representing and recognizing those same actions from a single camera or a few cameras, without any restriction on the relative orientations between cameras and subjects. First, the 3D-key-pose set is selected from the 3D human joint sequences of the 3D training action samples, which are built from multiple viewpoints. Second, the 3D key pose sequence that matches the observation sequence best is selected from the 3D-key-pose set to represent the observation sequence of arbitrary view. The 3D key pose sequence contains many discriminative, view-independent key poses but cannot accurately describe the pose of every frame in the observation sequence. For this reason, the pose and the dynamics of an action are modeled separately in this paper. Exemplar-based embedding and the probability of unique key poses are applied to model the pose property. A complementary dynamic feature is extracted to model actions that share the same poses but have different dynamics. Finally, these action models are fused to recognize the observation sequence from a single camera or a few cameras. The effectiveness of the proposed approach is demonstrated with experiments on the IXMAS dataset.
Video-based human action recognition has received increasing attention over the past decades, driven by the requirements of intelligent video surveillance. With growing awareness of security issues, motion analysis is becoming increasingly important in surveillance systems, and action recognition is essential for understanding what a person is doing. Fixing the camera viewpoint and considering only one view of the action is not suitable for practical surveillance scenarios, where an action can be observed from different viewing angles [1]. Current intelligent surveillance systems therefore urgently need non-invasive and view-independent methods for human action analysis.
Varying views, self-occlusion, and the loss of depth information make video-based action recognition very challenging. The same pose can produce many different observations, and different poses can produce the same observation [2]. Therefore, “view invariance” has become very important in action recognition. Several types of view-independent action recognition approaches have been proposed. They can be classified into monocular methods [1,3], multiview methods [4,5], and three-dimensional (3D) methods [6,7]. Monocular methods recognize actions from monocular image sequences; they attempt to obtain viewpoint-invariant pose quantities through several corresponding points. These corresponding points, such as human joints, are obtained manually or by a marker-based motion capture system, whereas many applications require the corresponding points to be obtained automatically. Multiview methods learn action models in multiple viewpoints and fuse the models at the classification level; the difficulty of these methods lies in the fusion of multiple models. In the real world, subjects perform actions in 3D space, and actions are presented as 3D pose sequences, yet cameras can only capture two-dimensional (2D) image sequences in the projection plane. In this paper, the image sequence captured by a camera is called a 2D image action sequence, and the action sequence in 3D space is called a 3D action sequence. 3D recognition methods use 3D reconstruction to obtain the 3D action sequence. In 3D space, action sequences can be observed in the same manner as in the real world, so the viewpoint and self-occlusion problems can be easily solved. The difficult parts of 3D methods are the hardware setup and robust 3D reconstruction: multiple video streams must be captured simultaneously from multiple static calibrated cameras, and a 3D reconstruction technique is then used to obtain volume data. However, it is currently difficult to obtain robust volume data in surveillance and content-analysis scenarios. A newer kind of method [8-10] avoids 3D reconstruction during testing: 3D action models are learned from 3D action sequences in the training phase, and actions are recognized in the testing phase with any camera configuration, from a single camera to multiple cameras and from any viewpoint. Using a 3D-key-pose set, this paper also proposes a view-independent framework that uses 3D data in training only and works with 2D image sequences during testing, so as to recognize actions from arbitrary views. It differs from the existing algorithms in that a 3D key pose sequence is first obtained using the 3D-key-pose set to represent the 2D image action sequence, and 3D action models are then learned with the 3D key pose sequence.
Effective representation is very important for action recognition. Johansson [11] reported that actions can be recognized even with a point-based human model. Human joints contain rich and fine pose information for action recognition. Traditionally, human joints are obtained manually or by a motion capture system that requires markers to be attached to the body; motion capture systems have two major drawbacks, being obtrusive and expensive [2]. Recently, a markerless pose recovery approach was proposed to extract 3D human joints from 3D reconstructed action sequences [12]. Unfortunately, it is very difficult to automatically extract human joints from a 2D image action sequence, owing to complex poses, self-occlusion, and varying viewpoints. In this paper, a new approach is proposed to automatically obtain the 3D human joints of a 2D image action sequence using the 3D-key-pose set. First, the 3D-key-pose set, which contains key poses of all actions, is selected from the 3D human joint sequences of the 3D action samples in the training set. Each element of the 3D-key-pose set consists of one 3D point set of the subject and its corresponding 3D human joint set. Second, the 3D key pose sequence that matches the observation sequence best is selected from the 3D-key-pose set to represent the 2D image action sequence of arbitrary view. The 3D key pose sequence contains many discriminative key poses but cannot accurately describe the 3D pose of every frame in the 2D image action sequence. For this reason, the pose and the dynamics of an action are modeled separately. Exemplar-based embedding and the probability of unique key poses are applied to model the pose property of the 2D image action sequence. A complementary dynamic feature is extracted to model actions that share the same poses but have different dynamics. Finally, these action models are fused to recognize the observation sequence from a single camera or a few cameras.
The flow chart of the proposed approach is shown in Fig. 1. The framework includes three parts: selecting the 3D-key-pose set from 3D action samples, obtaining the 3D key pose sequence that represents a 2D image action sequence using the 3D-key-pose set, and modeling and recognizing the action sequence.
3D-key-pose set. The 3D-key-pose set is selected from the 3D human joint sequences of the 3D training action samples. The 3D action samples are obtained from multiple video streams by a 3D reconstruction technique. First, multiple video streams are simultaneously captured from multiple static calibrated cameras. Second, foreground/background segmentation is performed on each stream through a background subtraction method. Third, a volumetric representation sequence is created with the visual hull method [6]. Then, a markerless pose recovery method is adopted to obtain the 3D human joint sequence from the reconstructed volume data [12]. In 3D space, action sequences can be observed in the same manner as in the real world, and the viewpoint and self-occlusion problems can be easily solved. Finally, elements of the 3D-key-pose set are selected one by one from the 3D human joint sequences according to the recognition performance on the training set. In this paper, the 3D-key-pose set is used as a bridge linking the 2D image action sequence with the 3D human joint sequence.
Representation of the 2D image action sequence (3D key pose sequence). A 3D pose sequence is a view-independent and effective action representation. Unfortunately, it is difficult to extract a 3D human pose sequence from a 2D image action sequence because of self-occlusion and the loss of depth information. In fact, humans are able to recognize many actions from several key poses or even a single pose, as shown in Figs. 2 and 3. In this paper, the key pose sequence of the original continuous sequence is extracted to represent the action sequence: the 3D key pose sequence that matches the observation sequence best is selected from the 3D-key-pose set to represent the 2D image action sequence of arbitrary view.
Recognition of the 2D image action sequence. The 3D key pose sequence contains many discriminative key poses but cannot accurately describe the 3D pose of every frame in the 2D image action sequence. For this reason, the exemplar-based embedding method and the probability of unique key poses, which do not model any dynamic information, are applied to model the key pose information of the 2D image action sequence. A complementary dynamic feature is extracted from the 3D key pose sequence to model actions that share the same poses but have different dynamics. Finally, these action models are fused to recognize the observation sequence from a single camera or a few cameras.
3D volume reconstruction and pose recovery from 3D action sequences are not discussed in detail in this paper; literature on these topics can be found in Refs. [6,12,13]. In addition, the temporal segmentation, which splits a continuous sequence of motions into elementary meaningful sequences, is performed manually. The focus of this paper is the representation and recognition of 2D image action sequences using the 3D-key-pose set.
Section 2 reviews the state of the art in view-independent action recognition. Section 3 describes action representation: the 3D-key-pose set selection approach and the extraction of the 3D key pose sequence using the 3D-key-pose set. Section 4 describes action recognition: the learning of the action models and the design of the classifier. Section 5 presents the experimental results. Section 6 concludes the paper.
Related work
To allow actions to be learned and recognized using different camera configurations, action descriptions must exhibit some view invariance [8]. Several view-independent action recognition approaches have been proposed.
The fundamental matrix between two views is a common tool for achieving view invariance in monocular camera configurations. Reference [14] presented an approach for view-invariant human action recognition whose input is projected 2D human joint positions. Actions were modeled with canonical body poses (consisting of five ordered joints, two view-invariant values, and a threshold for matching purposes) and trajectories in a 2D invariance space, called invariance space trajectories. Gritai et al. [15] presented an invariant analysis of human actions that uses anthropometry to constrain matching. In their study, a point-light-display-like representation was used, in which a pose was presented through a set of 3D human joints, and nonlinear time warping was used so that similar actions performed at different rates could be accurately matched. Yilmaz and Shah [1] proposed a method for action recognition in the presence of camera motion: at each time instant, in addition to the camera motion, a different view of the action is observed. Their method is based on the epipolar geometry between any two views; however, instead of relating two static views using the standard fundamental matrix, they modeled the motions of independently moving cameras in the equations governing the epipolar geometry and derived a new relation referred to as the “temporal fundamental matrix”. In the above approaches, an action is represented as a set of human joints. However, tracking human body joints in a monocular action sequence in an unconstrained environment is quite complex. At present, monocular action recognition methods generally assume that human joint tracking has already been performed and that a set of joint trajectories is given.
A multiple-camera configuration is an intuitive solution to viewpoint changes; the key issue is how to fuse the information from multiple cameras. There are two types of fusion methods: learning multiple action models from 2D image action sequences captured in multiple views [4,5,16,17], or learning a 3D action model from 3D action sequences [6,12]. For the first type, many cameras need to be set up to capture action sequences from multiple views, and virtual cameras are often used to reduce hardware cost. For example, Souvenir and Babbs [4] animated the 3D visual hull reconstructed from five cameras and projected the silhouette onto 64 evenly spaced virtual cameras located around the vertical axis of the subject, obtaining a larger set of surfaces from various viewpoints for training. For the second type, about five cameras are enough to simultaneously capture multiple-view action sequences, and a 3D reconstruction technique is then used to obtain the 3D action sequence from them. Weinland et al. [6] used a viewpoint-free motion history volume (MHV), extracted from 3D reconstructed volume data, to represent an action. Fourier magnitudes and cylindrical coordinates were then used to express the motion templates in a way invariant to position and to rotations around the z-axis, and actions were recognized by a classifier combining PCA and the Mahalanobis distance. Gu et al. [12] represented 3D actions with 3D human joint sequences recovered from 3D reconstructed volume data. First, a markerless pose recovery method was adopted to automatically capture the 3D human joint and pose parameter sequences from volume data. Second, multiple configuration features and movement features were extracted from the recovered sequences. A hidden Markov model (HMM) and an exemplar-based HMM (EHMM) were then used to model the movement features and the configuration features, respectively, and actions were classified by a hierarchical classifier. In 3D space, the viewpoint and self-occlusion problems can be easily solved; the difficult parts of 3D methods are the hardware setup and robust 3D reconstruction.
In surveillance and content-analysis scenarios, it is currently difficult to obtain robust volume data of subjects. A newer fusion strategy therefore avoids 3D reconstruction during testing: in training, 3D action models are learned from 3D action sequences, while in testing, actions can be observed with any camera configuration, from a single camera to multiple cameras and from any viewpoint [8]. Weinland et al. [8] presented an action recognition method for arbitrary viewpoints using 3D exemplars. They used an EHMM to model actions with 3D volume data; 3D reconstruction was not required during the recognition phase, as the learned 3D exemplars were instead used to produce 2D image information that was compared with the 2D observations. Yan et al. [9] presented an approach using a four-dimensional (4D) (x, y, z, t) action feature model (4D-AFM) for recognizing actions from arbitrary views. The 4D-AFM encoded the shape and motion of subjects observed from multiple views. The modeling process started by reconstructing the 3D visual hulls of subjects at each time instant. Spatio-temporal action features were then computed in each view by analyzing the differential geometric properties of the spatio-temporal volumes (3D STVs) generated by concatenating the subject's silhouettes over the course of the action (x, y, t). These features were mapped to the sequence of 3D visual hulls over time (4D) to build the initial 4D-AFM, and actions were recognized based on the scores of matching action features from the input videos to the model points of the 4D-AFMs, exploiting the pairwise interactions of features. Both of these methods adopt an appearance-based representation. Human-pose-based representation is another option, and 3D human pose recovery is considered a fundamental step in view-invariant human action recognition. However, inferring 3D poses from a single view is usually very difficult and slow, because the parameters to be estimated are often ambiguous under perspective projection [18]. The example-based action recognition system proposed by Lv and Nevatia [18] did not explicitly infer the 3D pose at each frame; instead, from existing action models, it searched for a series of actions that best matched the input sequence. In their approach, each action was modeled as a series of synthetic 2D human poses rendered from a wide range of viewpoints using POSER, and the constraints on transitions between synthetic poses were represented by a graph model called Action Net. Given the input, silhouette matching between the input frames and the key poses was performed first using an enhanced Pyramid Match Kernel algorithm, and the best-matched sequence of actions was then tracked using the Viterbi algorithm.
The approach proposed in this paper also avoids 3D reconstruction during testing. Unlike the above algorithms, it first obtains the 3D key pose sequence using the 3D-key-pose set to represent the 2D image action sequence from an arbitrary view. Second, the 3D pose feature and the dynamic feature extracted from the 3D key pose sequence are modeled separately. Finally, the multiple action models are fused to recognize 2D image action sequences from arbitrary views.
Action representation
A 3D pose sequence is a view-invariant action representation. However, inferring the 3D pose of each frame from an arbitrary-view action sequence is difficult. Key poses of actions are very discriminative: humans are able to recognize many actions from several key poses, or even one key pose, in the action sequence. Using the 3D-key-pose set selected from the 3D training actions, the 3D key pose sequence of a 2D image action sequence is extracted as a view-invariant action representation.
3D-key-pose set selection
The 3D-key-pose set is selected from the 3D pose sequences of the 3D training actions. First, multiple video streams are simultaneously captured from multiple static calibrated cameras. Then, a 3D reconstruction technique is used to obtain the 3D action sequence. Finally, a human-model-based markerless pose recovery method is adopted to automatically capture the 3D human joint and pose parameter sequences from the volume data. When 3D pose recovery is performed on the 3D action sequence, the viewpoint and self-occlusion problems can be easily solved. The detailed pose recovery algorithm for 3D actions can be found in Ref. [12].
Let $E = \{e_1, e_2, \ldots, e_K\}$ denote the 3D-key-pose set, which includes $K$ elements. The $k$th 3D key pose is $e_k = (V_k, J_k)$, where $V_k$ is the 3D point set of the subject and $J_k$ is the corresponding set of 15 3D human joints, as shown in Fig. 4. The orientation, position, and body height have been normalized.
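For concreteness, the sketch below shows one possible way to store an element of the 3D-key-pose set, assuming the point set and joints are kept as NumPy arrays; the class and field names are illustrative and not taken from the original implementation.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class KeyPose3D:
    """One element e_k = (V_k, J_k) of the 3D-key-pose set E."""
    points: np.ndarray  # (N, 3) normalized 3D point set V_k of the subject
    joints: np.ndarray  # (15, 3) normalized 3D human joints J_k
    label: int          # action class the key pose was taken from

# The 3D-key-pose set is then simply a list of KeyPose3D instances.
key_pose_set: list[KeyPose3D] = []
```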
There are two types of algorithms for selecting the 3D-key-pose set from the 3D pose sequences. One is classifier-independent, such as clustering. The other is classifier-dependent, selecting 3D key poses from the 3D action samples according to the recognition performance of a classifier. The second type is adopted in this paper: 3D key poses are selected one by one from the 3D action sequences according to the recognition rate on the training set. Let $F = \{f_1, f_2, \ldots, f_N\}$ be the set of all frames in the training action sequences, where each $f_i$ includes the 3D point set of the subject and the corresponding 3D human joints. The feature used to classify a 3D action sequence is its 3D human joint sequence. The exemplar-based embedding algorithm [19] is used to model and recognize the 3D action sequences; its details are explained in the next section. Table 1 shows the steps of selecting the 3D-key-pose set.
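A minimal sketch of this classifier-dependent, one-by-one selection is given below. Table 1 is not reproduced here, so the stopping criterion and the evaluation routine `recognition_rate` (which is assumed to train and evaluate the exemplar-based embedding classifier on the training set for a given subset of candidate key poses) are assumptions; the code only illustrates greedy forward selection driven by the training recognition rate.

```python
def select_key_pose_set(num_candidates, recognition_rate, max_poses):
    """Greedy forward selection of 3D key poses.

    Repeatedly add the candidate frame whose inclusion yields the highest
    recognition rate on the training set.  `recognition_rate(indices)` is an
    assumed callback returning the training recognition rate when only the
    candidate frames listed in `indices` are used as key poses.
    """
    selected, remaining = [], set(range(num_candidates))
    for _ in range(max_poses):
        best_rate, best_idx = -1.0, None
        for idx in remaining:
            rate = recognition_rate(selected + [idx])
            if rate > best_rate:
                best_rate, best_idx = rate, idx
        if best_idx is None:
            break
        selected.append(best_idx)
        remaining.discard(best_idx)
    return selected
```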
In one experiment, the first 24 key poses selected with this algorithm are shown in Fig. 5. These 3D key poses contain not only discriminative poses but also common poses shared by several classes.
3D key pose sequence extraction
The 2D image action sequence is captured from a static camera, and foreground/background segmentation is then performed through a background subtraction method. The 3D key poses that are most similar to the silhouette images are chosen from the 3D-key-pose set to form the 3D key pose sequence of the 2D image action sequence. A sketch of 3D key pose sequence extraction from the 3D-key-pose set is shown in Fig. 6.
Let $Y = \{y_1, y_2, \ldots, y_T\}$ be the binary silhouette sequence of the 2D image action sequence. Let $X = \{x_1, x_2, \ldots, x_T\}$ denote the 3D key pose sequence chosen from the 3D-key-pose set $E$, where $x_t$ is obtained as follows:

$$x_t = \arg\min_{e_k \in E} d(y_t, e_k), \quad (1)$$

where $d(y_t, e_k)$ is the distance between the 2D silhouette image $y_t$ and the 3D key pose $e_k$. To calculate this distance, the 3D key poses are projected onto several viewpoints, as shown in Fig. 7. The projection plane of the camera is assumed to be approximately perpendicular to the ground; therefore, the projection viewpoints only change around the vertical axis. Let $p_{k,m}$ ($m = 1, 2, \ldots, M$) be the $m$th binary projection image of 3D key pose $e_k$. Then, the distance between silhouette image $y_t$ and 3D key pose $e_k$ is defined as the minimum, over the $M$ viewpoints, of a silhouette image distance $d_s$:

$$d(y_t, e_k) = \min_{1 \le m \le M} d_s(y_t, p_{k,m}). \quad (2)$$
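A short sketch of Eqs. (1)-(2), assuming each key pose already stores its $M$ binary projection images and that `silhouette_distance` implements the image distance discussed in the following subsections:

```python
import numpy as np

def match_key_pose(silhouette, projections_per_pose, silhouette_distance):
    """Return the index of the 3D key pose whose best-matching projection
    (over the M viewpoints) is closest to the observed silhouette.

    projections_per_pose[k] is a list of the M binary projection images p_{k,m}.
    """
    best_k, best_d = None, np.inf
    for k, projections in enumerate(projections_per_pose):
        d = min(silhouette_distance(silhouette, p) for p in projections)  # Eq. (2)
        if d < best_d:
            best_k, best_d = k, d
    return best_k  # Eq. (1): arg min over the key-pose set

def extract_key_pose_sequence(silhouettes, projections_per_pose, silhouette_distance):
    """Map every frame of the binary silhouette sequence Y to a key pose index."""
    return [match_key_pose(y, projections_per_pose, silhouette_distance)
            for y in silhouettes]
```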
The distance measurement between a silhouette image and a 3D key pose should be subject-free and viewpoint-free. In the following, several factors are discussed in detail: normalization of the silhouette image, the choice of distance measurement, confidence weights, and fusion of multiple viewpoints.
Normalization of silhouette image
Size normalization of the silhouette image is used to adapt to changes in the subjects' sizes. Let $S = \{(u_n, v_n)\}$ be the set of silhouette points, where $(u_n, v_n)$ are the image coordinates of the $n$th silhouette point. The body height of the subject is normalized to 60 pixels, and the silhouette image is normalized to a fixed size. The normalization process is shown in Fig. 8, and the detailed procedure is listed in Table 2, where the zoom parameter is obtained from the first frame of the silhouette sequence; the subject is asked to assume a “standing upright” pose in the first frame.
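One possible implementation of this height normalization is sketched below. Table 2 is not reproduced, so the target canvas size and the way the zoom factor is frozen from the first frame are assumptions; only the 60-pixel body height comes from the text.

```python
import numpy as np
import cv2  # OpenCV, assumed available for image resizing

TARGET_HEIGHT = 60  # body height in pixels after normalization (from the text)

def height_zoom_factor(first_silhouette: np.ndarray) -> float:
    """Zoom factor computed on the first ('standing upright') frame."""
    rows = np.where(first_silhouette.any(axis=1))[0]
    body_height = rows.max() - rows.min() + 1
    return TARGET_HEIGHT / float(body_height)

def normalize_silhouette(silhouette: np.ndarray, zoom: float,
                         out_size: tuple = (64, 64)) -> np.ndarray:
    """Rescale the silhouette by the fixed zoom factor and paste it, centred,
    into a fixed-size canvas (out_size is an illustrative choice)."""
    h, w = silhouette.shape
    resized = cv2.resize(silhouette.astype(np.uint8),
                         (max(1, int(w * zoom)), max(1, int(h * zoom))),
                         interpolation=cv2.INTER_NEAREST)
    canvas = np.zeros(out_size, dtype=np.uint8)
    rows, cols = np.where(resized > 0)
    if rows.size:
        # centre the silhouette's bounding box in the canvas
        crop = resized[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
        ch, cw = min(crop.shape[0], out_size[0]), min(crop.shape[1], out_size[1])
        r0, c0 = (out_size[0] - ch) // 2, (out_size[1] - cw) // 2
        canvas[r0:r0 + ch, c0:c0 + cw] = crop[:ch, :cw]
    return canvas
```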
Normalization according to the subject's height cannot adapt to variations in body type (fat or thin). Therefore, distance measurements should be chosen that are insensitive to body type and sensitive to pose change.
Distance measurement
There are two types of distance measurements between two binary images $I_1$ and $I_2$. The first compares the images directly, such as the squared Euclidean distance $d_E(I_1, I_2) = \lVert I_1 - I_2 \rVert^2$, computed between the vector representations of the binary silhouette images; this distance is simply the number of pixels with different values in the two images. The second measures the distance between the shapes denoted by the silhouette point sets, such as the Chamfer distance. Let $S$ and $T$ be the two silhouette point sets of two binary images. Then, the Chamfer distance from $S$ to $T$ is defined as

$$d_{\mathrm{cham}}(S, T) = \frac{1}{|S|} \sum_{s \in S} \min_{t \in T} \lVert s - t \rVert. \quad (3)$$

Moreover, the undirected Chamfer distance is defined as

$$d_{\mathrm{ucham}}(S, T) = d_{\mathrm{cham}}(S, T) + d_{\mathrm{cham}}(T, S). \quad (4)$$

Compared with the squared Euclidean distance, the undirected Chamfer distance considers not only which points differ between the test point set and the reference point set but also the spatial relationship of these differing points. The distance used in this paper is the undirected Chamfer distance of silhouette images, as in formula (4).
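The directed and undirected Chamfer distances of formulas (3)-(4) can be computed directly from the two silhouette point sets; a small sketch using SciPy's k-d tree for the nearest-neighbour queries is shown below (in practice only contour points would typically be used).

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer(S: np.ndarray, T: np.ndarray) -> float:
    """Directed Chamfer distance (formula (3)): mean distance from each point
    of S to its nearest neighbour in T."""
    dists, _ = cKDTree(T).query(S)
    return float(dists.mean())

def undirected_chamfer(S: np.ndarray, T: np.ndarray) -> float:
    """Undirected Chamfer distance (formula (4)): sum of both directions."""
    return chamfer(S, T) + chamfer(T, S)

def silhouette_points(binary_image: np.ndarray) -> np.ndarray:
    """(row, col) coordinates of the foreground pixels of a binary silhouette."""
    return np.argwhere(binary_image > 0).astype(float)
```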
The undirected Chamfer distance of silhouette images is sensitive to pose change and can tolerate some change in body type. Figure 9 shows the undirected Chamfer distances between the normal “stand” pose (body type parameter 1.0) and other “stand” poses (body type parameter from 0.5 to 2.0), and between the normal “stand” pose and other poses. Most of the distances between the same “stand” pose in different body types are smaller than those between different poses. The undirected Chamfer distance, being insensitive to body type change and sensitive to pose change, helps the method adapt to different subjects.
Confidence weight
Owing to self-occlusion, the projection images of a 3D key pose at different viewpoints differ in their ability to discriminate this 3D pose. At some viewpoints the movement of the body parts can be observed completely, while at other viewpoints it cannot. As shown in Fig. 7, the lifted arm is perfectly observed in some projection images but only partly observed, or not at all, in others because of self-occlusion. To reduce the effect of self-occlusion, confidence weights are adopted in this paper: for each projection image of every 3D key pose, a confidence weight is calculated that represents the ability of that projection to discriminate the 3D key pose from other poses.
Let $w_{k,m}$ denote the confidence weight of the projection image $p_{k,m}$, the $m$th binary projection image of 3D key pose $e_k$. At viewpoints where the movement of the body parts can be observed, the projection image differs more from those of the other poses. The confidence weights are therefore calculated from the distances between projection images; the detailed procedure is listed in Table 3.
At viewpoints where the movement of the body parts can be observed, the confidence weights are larger; at viewpoints where the movement of the body parts is self-occluded, the confidence weights are smaller. Figure 10 shows the confidence weights of two 3D key poses, “stand” and “point”. The projection images of “stand” at different viewpoints are similar, so its confidence weights differ little. For the 3D key pose “point”, the confidence weights at viewpoints where the movement of the body parts can be observed are larger than those at the other viewpoints.
The distance between silhouette image $y_t$ and 3D key pose $e_k$, $d(y_t, e_k)$, is then redefined so that the confidence weights $w_{k,m}$ of the projection images are incorporated into the per-viewpoint distances of formula (2).
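The exact weight computation (Table 3) and the exact weighted distance are not reproduced in this text, so the sketch below is only one plausible realization: a projection image that is far, on average, from the projections of all other key poses is treated as more discriminative, and confident viewpoints are favoured when taking the minimum. Both choices are assumptions, not the paper's formulas.

```python
import numpy as np

def confidence_weights(projections_per_pose, silhouette_distance):
    """Illustrative confidence weights w_{k,m}: average distance of projection
    p_{k,m} to the projection images of all *other* key poses, normalized per
    key pose.  (Stand-in for the procedure of Table 3.)"""
    weights = []
    for k, projections in enumerate(projections_per_pose):
        others = [p for j, projs in enumerate(projections_per_pose)
                  if j != k for p in projs]
        w_k = np.array([np.mean([silhouette_distance(p, q) for q in others])
                        for p in projections])
        weights.append(w_k / w_k.sum())
    return weights

def weighted_pose_distance(silhouette, projections, w_k, silhouette_distance):
    """Confidence-weighted variant of Eq. (2); dividing by the weight so that
    confident viewpoints dominate the minimum is an assumed heuristic."""
    dists = np.array([silhouette_distance(silhouette, p) for p in projections])
    return float(np.min(dists / (w_k + 1e-8)))
```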
Fusion of multiple cameras
Observations from multiple cameras can easily be incorporated. Assuming multiple-view observations $\{Y^{(1)}, Y^{(2)}, \ldots, Y^{(R)}\}$, where $R$ is the number of cameras and $Y^{(r)} = \{y^{(r)}_1, \ldots, y^{(r)}_T\}$ denotes the $r$th binary silhouette sequence, the 3D key pose sequence of these observations is $X = \{x_1, \ldots, x_T\}$, where $x_t$ is obtained by minimizing, over the 3D-key-pose set, the combined distance between the key pose and the silhouettes of all $R$ cameras at time $t$. The 3D key pose sequence is a view-independent and effective action representation. In the next section, fusion of multiple action models is proposed to recognize action sequences represented by 3D key pose sequences.
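A small sketch of the multi-camera selection, assuming the per-camera distances are simply summed (the exact fusion formula is not shown in this text); `pose_distance` can be Eq. (2) or its confidence-weighted variant sketched above.

```python
import numpy as np

def match_key_pose_multiview(silhouettes_t, projections_per_pose, pose_distance):
    """Choose the key pose minimizing the combined distance to the silhouettes
    observed at time t by all R cameras.

    silhouettes_t: list of the R binary silhouettes of the same frame.
    pose_distance(silhouette, projections) -> float
    """
    totals = [sum(pose_distance(y_r, projections) for y_r in silhouettes_t)
              for projections in projections_per_pose]
    return int(np.argmin(totals))  # index of the selected key pose x_t
```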
Action recognition
The 3D key pose sequence contains many discriminative key poses but does not accurately describe the 3D pose of every frame in the 2D image action sequence. For this reason, the pose and dynamic properties of the action sequence are modeled separately, as shown in Fig. 11. Exemplar-based embedding and the probability of unique key poses, which do not model dynamic information, are applied to model the pose information of the 2D image action sequence. Moreover, a complementary dynamic feature is extracted to model actions that share the same poses but have different dynamics, such as “sit down” and “get up”.
Exemplar-based embedding
Weinland et al. [19] introduced the exemplar-based embedding method to model single fixed-view action sequences; its recognition rates equaled or exceeded those of state-of-the-art approaches. Exemplar-based embedding is based on simple matching of exemplars to image sequences and does not account for dynamics. This paper uses exemplar-based embedding to model the pose property of the 2D image action sequence represented by its 3D key pose sequence.
The 3D key pose sequence $X$ is selected from the 3D-key-pose set to represent the 2D image action sequence $Y$. Every element of $X$ contains two 3D datasets: the 3D human joint set $J_k$ and the 3D point set of the subject $V_k$. Let $\{j_1, j_2, \ldots, j_T\}$ denote the 3D human joint sequence of the 3D key pose sequence. Let $Z = \{z_1, z_2, \ldots, z_P\}$ be the exemplar set. Because all samples of 3D key pose sequences are selected from the 3D-key-pose set, the elements of the exemplar set are directly chosen from the 3D-key-pose set, that is, from the joint sets $\{J_1, \ldots, J_K\}$. The exemplar-based embedding starts by computing, for each exemplar, the minimum distance to the frames in the sequence:

$$\phi_p = \min_{1 \le t \le T} d(z_p, j_t), \quad p = 1, 2, \ldots, P,$$

where $d$ is the Euclidean distance between the primitives considered. Then, the feature vector is obtained by concatenating all the minimum distances:

$$\Phi = [\phi_1, \phi_2, \ldots, \phi_P]^{\top}.$$

The process of feature vector extraction is shown in Fig. 12.
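The embedding reduces each variable-length joint sequence to a fixed-length vector; a minimal sketch, assuming the 15 joints of each frame and each exemplar are flattened into 45-dimensional vectors:

```python
import numpy as np

def embed_sequence(joint_sequence: np.ndarray, exemplars: np.ndarray) -> np.ndarray:
    """Exemplar-based embedding: for every exemplar z_p, take the minimum
    Euclidean distance to the frames of the 3D joint sequence, and concatenate
    the P minima into one feature vector Phi (cf. Fig. 12).

    joint_sequence: (T, 45) array, each row a flattened 15x3 joint set.
    exemplars:      (P, 45) array of exemplar joint sets.
    """
    # pairwise distances between exemplars and frames: shape (P, T)
    diffs = exemplars[:, None, :] - joint_sequence[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.min(axis=1)  # (P,) feature vector Phi
```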
After feature vectors are extracted from the 3D human joint sequences, each action class is modeled through a single Gaussian distribution $p(\Phi \mid c) = \mathcal{N}(\Phi; \mu_c, \Sigma_c)$, where $c$ is the action label, and the model parameters $(\mu_c, \Sigma_c)$ are learned through maximum a posteriori estimation from the training samples.
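A sketch of the per-class Gaussian model is shown below; plain sample estimates with a small diagonal regularizer are used as a stand-in for the MAP estimation described in the text.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_class_gaussians(features: np.ndarray, labels: np.ndarray, reg: float = 1e-3):
    """Fit one Gaussian N(mu_c, Sigma_c) per action class to the embedding
    feature vectors (rows of `features`)."""
    models = {}
    for c in np.unique(labels):
        Xc = features[labels == c]
        mu = Xc.mean(axis=0)
        sigma = np.cov(Xc, rowvar=False) + reg * np.eye(Xc.shape[1])
        models[c] = (mu, sigma)
    return models

def gaussian_likelihood(phi: np.ndarray, model) -> float:
    """Likelihood of one embedding feature vector under a class Gaussian."""
    mu, sigma = model
    return multivariate_normal(mean=mu, cov=sigma).pdf(phi)
```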
Probability of unique key poses
Some key poses in the 3D-key-pose set identify an action class on their own; we call these unique key poses, as shown in Fig. 13. Other key poses in the 3D-key-pose set are shared by more than one action class and cannot classify actions by themselves; these are called common key poses, as shown in Fig. 14.
Let $U_c$ denote the set of unique key poses belonging to the $c$th action class, and let $n_c$ be the number of frames of the 3D key pose sequence $X$ occupied by poses from $U_c$. The occupancy probability of the unique key poses of the $c$th action class is defined as

$$P_c = \frac{n_c}{T},$$

where $T$ is the length of the action sequence. The occupancy probability of unique key poses is closely related to action recognition; it exploits the pose property further and helps to improve recognition performance.
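This quantity is just a per-class frame count normalized by the sequence length; a short sketch, assuming the key pose sequence is given as a list of key-pose indices and the unique key poses per class are given as sets of indices:

```python
def occupancy_probability(key_pose_sequence, unique_poses_per_class):
    """P_c = (frames of X occupied by unique key poses of class c) / T."""
    T = len(key_pose_sequence)
    return {c: sum(1 for k in key_pose_sequence if k in unique_poses) / T
            for c, unique_poses in unique_poses_per_class.items()}
```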
Complementary dynamic feature
Exemplar-based embedding and the probability of unique key poses model only the pose property of an action sequence. However, not all actions can be discriminated without dynamics. A typical example is an action and its reversal, such as “sit down” and “get up”: without taking temporal ordering into account, it is very difficult to discriminate between them. To recognize such actions, a complementary dynamic feature is required in addition to the pose features. In the experimental dataset of this paper, the actions “sit down”, “get up”, and “pick up” cannot be well discriminated by pose features but can be discriminated by the body height sequence. Therefore, the body height sequence $H = \{h_1, h_2, \ldots, h_T\}$ is extracted as the complementary dynamic feature and is modeled through a hidden Markov model (HMM) [20], denoted by $\lambda_c$. For other datasets, the complementary dynamic feature should be selected according to the characteristics of the action and its reversal.
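A minimal sketch of training one HMM per class on the body height sequences, assuming the hmmlearn package (any HMM implementation would do); the number of hidden states is an illustrative choice.

```python
import numpy as np
from hmmlearn import hmm  # assumed dependency

def fit_height_hmms(height_sequences, labels, n_states: int = 4):
    """Train one Gaussian-emission HMM lambda_c per action class on the
    body height sequences H."""
    models = {}
    for c in set(labels):
        seqs = [h for h, l in zip(height_sequences, labels) if l == c]
        X = np.concatenate([np.asarray(h).reshape(-1, 1) for h in seqs])
        lengths = [len(h) for h in seqs]
        model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag")
        model.fit(X, lengths)
        models[c] = model
    return models

def height_log_likelihood(model, height_sequence) -> float:
    """Log-likelihood of a body height sequence under one class HMM."""
    return model.score(np.asarray(height_sequence).reshape(-1, 1))
```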
Classification by fusing multiple models
Two pose models and one dynamic model are learned: the Gaussian parameters $(\mu_c, \Sigma_c)$ of the exemplar-based embedding, the occupancy probability $P_c$ of the unique key poses, and the HMM $\lambda_c$ of the body height sequence. These action models are fused to recognize actions.
A testing action sequence is classified with a two-layer classifier that fuses the multiple models. The first layer is a MAP classifier that fuses the two pose models, i.e., the Gaussian model of the embedding feature $\Phi$ and the occupancy probability $P_c$ of the unique key poses. If the decision of the first layer belongs to the body-height-changing actions, the action sequence is further recognized by a second MAP classifier using the body height feature, selecting the class whose HMM maximizes the likelihood $P(H \mid \lambda_c)$.
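The sketch below wires the three models together, reusing the helpers sketched above. The way the two pose scores are combined in the first layer (a simple product here) is an assumption, and the set of body-height-changing actions is dataset-specific.

```python
import numpy as np

HEIGHT_CHANGING = {"sit down", "get up", "pick up"}  # dataset-specific, from the text

def classify(phi, key_pose_sequence, height_sequence,
             gaussians, unique_poses_per_class, height_hmms, eps: float = 1e-6):
    """Two-layer classifier.  Layer 1 fuses the Gaussian likelihood of the
    embedding feature with the occupancy probability of unique key poses
    (combined here by a product, as an assumption).  Layer 2 re-decides among
    the height-changing actions using the body-height HMMs."""
    T = len(key_pose_sequence)
    scores = {}
    for c, model in gaussians.items():
        p_pose = gaussian_likelihood(phi, model)
        occ = sum(1 for k in key_pose_sequence
                  if k in unique_poses_per_class[c]) / T
        scores[c] = p_pose * (occ + eps)
    decision = max(scores, key=scores.get)

    if decision in HEIGHT_CHANGING:  # layer 2: dynamics of the body height
        decision = max(HEIGHT_CHANGING,
                       key=lambda c: height_log_likelihood(height_hmms[c],
                                                           height_sequence))
    return decision
```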
Experimental results and analysis
Dataset
This paper presents results on the IXMAS dataset. (The multiple-video data used here are from INRIA Rhône-Alpes' multiple-camera platform Grimage and the PERCEPTION research group; the database is available at https://charibdis.inrialpes.fr.) The dataset provides raw videos, silhouettes, and 3D volume sequences. The actions are “check watch”, “cross arm”, “scratch head”, “sit down”, “get up”, “turn around”, “walk in a circle”, “wave hand”, “punch”, “kick”, and “pick up”, as shown in Fig. 15. There are 12 subjects with different body types, as shown in Fig. 16, and each subject performs each action three times. The image resolution is 390 × 291 pixels, and the volume of interest is divided into 64 × 64 × 64 voxels. To demonstrate view invariance, the subjects freely change their orientations.
The acquisition was achieved using five standard FireWire cameras. One of the cameras has an overlooking viewpoint, whose projection plane is not perpendicular to the ground; therefore, this camera is not considered in the following experiments. Images captured simultaneously from the other four cameras are shown in Fig. 17.
A leave-one-out cross-validation protocol is used: eleven of the 12 subjects in the dataset are used for action model learning and 3D-key-pose set selection, and the remaining subject is used for evaluation. This procedure is repeated 12 times by permuting the test subject.
3D-key-pose set selection
Elements of the 3D-key-pose set are selected one by one from the 3D action samples according to the recognition performance, and exemplar-based embedding is used to model and recognize the 3D action sequences. Figure 18 shows the average recognition rate versus the number of 3D key poses. The recognition rates on the training set and the testing set tend to stable values as the number of key poses increases. The number of 3D key poses is set to 480 in our experiments; the corresponding recognition rates on the training set and the testing set are 100% and 93%, respectively.
3D key pose sequence
Using the 3D-key-pose set, the 3D key pose sequence is obtained to represent the 2D image action sequence Y.
3D key pose sequence with single camera
Extracting 3D poses from a single-view 2D image sequence is very important for the many applications in which only one camera is available.
Figure 19 shows examples of 3D key poses extracted from single-view action sequences in one experiment. One frame of every action class is shown. In each sub-image, the top line shows the result for cam 3 and the bottom line shows the result for cam 4; each line shows the silhouette of the 2D action, the 3D point set of the selected 3D key pose, and the 3D human joints of the selected 3D key pose, respectively. The 3D key poses of the same frame obtained from cam 3 and cam 4 are always different. The pitch angle of cam 3 is larger than that of cam 4, whereas our algorithm assumes that the projection plane of the camera is perpendicular to the ground. As a result, there are many mistakes in the 3D key poses obtained from cam 3, and the results for cam 4 are better than those for cam 3.
Owing to self-occlusion or a large pitch angle, the 3D key pose obtained from a single camera may be selected wrongly, as shown in Fig. 20(a). The 3D key pose of the same frame obtained by fusing four cameras is shown in Fig. 20(b). Fusion of multiple cameras thus improves the performance of the algorithm; the fusion results are examined in detail in the next subsection.
3D key pose sequence with multiple cameras
Fusion of multiple cameras reduces the effect of self-occlusion and adapts to changes in the orientation of the subject. The 3D key pose sequences obtained from multiple cameras are shown in Fig. 21. One frame of every action class is shown. In each sub-image, the top line shows the silhouettes of the 2D action from the four cameras, and the bottom line shows the 3D point set and the 3D human joints of the selected 3D key pose, respectively. Although the moving body parts are not observed in some cameras because of self-occlusion (such as the third and fourth cameras in Fig. 21(c)), the correct 3D key pose can still be selected through fusion of multiple cameras. In addition, the fusion of multiple cameras also reduces the effect of noise (e.g., Fig. 21(b)).
Effect of confidence weight
Figure 22 compares the results for the same “stand” pose with and without confidence weights. Because the projection images of the 3D “wave” pose at some viewpoints are similar to the “stand” silhouettes, the “wave” pose is wrongly selected for the “stand” silhouette when no confidence weights are used; with confidence weights, the correct 3D “stand” pose is selected. The confidence weights reduce the confusion among projection images of 3D key poses caused by self-occlusion and improve the performance of the algorithm.
Recognition rates
The recognition rates per camera are given in Fig. 23. The average recognition rates of cam 1, cam 2, cam 3, and cam 4 are 78%, 78%, 67%, and 78%, respectively. Cam 3 scores worst: its pitch angle is larger than those of the other cameras, and the pitch angle has a clear impact on recognition performance.
For 2D image action recognition with a single camera and arbitrary subject orientations, the moving body parts may be self-occluded. Self-occlusion has an important impact on recognition performance and explains why the single-camera recognition rates are not as good as expected. In the next experiment, several cameras are used in conjunction to test camera combinations; the recognition rates of the camera combinations are listed in Table 4.
A comparison between the approach of this study and previous approaches is given in Table 5. The first three algorithms listed in the table deal with actions at a fixed single viewpoint using one camera and reach very high recognition rates. For the other approaches listed, subjects can change their orientations freely. The approaches whose camera count is marked “3D” are 3D action recognition methods; their recognition rates are a little lower than those obtained with a fixed subject orientation. The remaining approaches recognize arbitrary-view actions, and their recognition rates are lower than those obtained with a fixed subject orientation or with 3D actions. It is difficult to compare all of these approaches directly, since the databases and environments differ; however, the results give a general overview of action recognition approaches. Compared with the cited work, the proposed approach yields better results, and when observations captured from multiple cameras are fused, its recognition performance is comparable with that of 3D action recognition methods.
Three algorithms in Table 5 learn 3D models for recognizing action sequences from a single camera or a few cameras and use the same dataset (IXMAS) as our approach; a detailed comparison is given in Table 6. Weinland et al. [8] modeled actions using 3D occupancy grids in an exemplar-based HMM to recognize those same actions from a single or a few cameras; learned 3D exemplars were used to produce 2D image information that was compared with the observations (2D silhouette sequences). Yan et al. [9] used a 4D (x, y, z, t) action feature model (4D-AFM) that encodes the shape and motion of actors for recognizing actions from arbitrary views; actions were recognized based on the scores of matching spatio-temporal action features from the input videos to the model points of the 4D-AFMs, exploiting the pairwise interactions of features. Liu et al. [10] used two types of features: a quantized vocabulary of local spatio-temporal (ST) volumes (or cuboids) and a quantized vocabulary of spin-images, which captures the shape deformation of the actor by considering actions as 3D objects (x, y, t). All three algorithms adopt an appearance-based action representation, whereas this paper uses a human-pose-based representation: the 2D image action sequence is first represented by a 3D key pose sequence, and the pose and dynamic properties of the 3D key pose sequence are then modeled separately and fused to recognize actions. Compared with these methods, the approach proposed in this paper yields better results.
Conclusion
This paper presents a new framework for representing and recognizing actions captured from arbitrary views. The main contributions are a view-independent action representation based on the 3D-key-pose set and the fusion of multiple models for recognizing arbitrary-view observations. Recovering 3D pose from an arbitrary-view action sequence is very difficult, owing to the loss of depth information and to self-occlusion; in this paper, the 3D-key-pose set is used to obtain the 3D key pose sequence of an arbitrary-view action sequence. Exemplar-based embedding and the probability of unique key poses are applied to model the pose property of the 3D key pose sequence. Moreover, a complementary dynamic feature is extracted from the 3D key pose sequence to model actions that share the same poses but have different dynamics. Finally, these action models are fused to recognize action sequences from a single camera or a few cameras. Experimental results show that the proposed method is effective; in addition, when observations from multiple cameras are fused, the recognition performance of the proposed algorithm is comparable with that of 3D action recognition methods.
However, a near-perfect silhouette is needed for matching with the 3D key poses, and clean silhouettes are hard to obtain in some real environments. Future work includes improving silhouette extraction and recognizing actions from arbitrary viewpoints in real, complex environments.
References
[1] Yilmaz A, Shah M. Matching actions in presence of camera motion. Computer Vision and Image Understanding, 2006, 104(2-3): 221-231
[2] Poppe R. Vision-based human motion analysis: An overview. Computer Vision and Image Understanding, 2007, 108(1-2): 4-18
[3] Shen Y, Ashraf N, Foroosh H. Action recognition based on homography constraints. In: Proceedings of International Conference on Pattern Recognition. 2008, 1-4
[4] Souvenir R, Babbs J. Learning the viewpoint manifold for action recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2008, 1-7
[5] Ahmad M, Lee S. HMM-based human action recognition using multiview image sequences. In: Proceedings of International Conference on Pattern Recognition. 2006, 1: 263-266
[6] Weinland D, Ronfard R, Boyer E. Free viewpoint action recognition using motion history volumes. Computer Vision and Image Understanding, 2006, 104(2): 249-257
[7] Lv F, Nevatia R. Recognition and segmentation of 3-D human action using HMM and multi-class AdaBoost. In: Proceedings of European Conference on Computer Vision. 2006, 359-372
[8] Weinland D, Boyer E, Ronfard R. Action recognition from arbitrary views using 3D exemplars. In: Proceedings of IEEE International Conference on Computer Vision. 2007, 1-7
[9] Yan P, Khan S M, Shah M. Learning 4D action feature models for arbitrary view action recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2008, 1-7
[10] Liu J, Ali S, Shah M. Recognizing human actions using multiple features. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2008, 1-8
[11] Johansson G. Visual motion perception. Scientific American, 1975, 232(6): 76-88
[12] Gu J, Ding X, Wang S, Wu Y. Action and gait recognition from recovered 3-D human joints. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2010, 40(4): 1021-1033
[13] Cheung K M G, Baker S, Kanade T. Shape-from-silhouette of articulated objects and its use for human body kinematics estimation and motion capture. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2003, 1: 77-84
[14] Parameswaran V, Chellappa R. View invariance for human action recognition. International Journal of Computer Vision, 2006, 66(1): 83-101
[15] Gritai A, Sheikh Y, Shah M. On the use of anthropometry in the invariant analysis of human actions. In: Proceedings of International Conference on Pattern Recognition. 2004, 2: 923-926
[16] Ahmad M, Lee S W. Human action recognition using shape and CLG-motion flow from multi-view image sequences. Pattern Recognition, 2008, 41(7): 2237-2252
[17] Natarajan P, Nevatia R. View and scale invariant action recognition using multiview shape-flow models. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2008, 1-8
[18] Lv F, Nevatia R. Single view human action recognition using key pose matching and Viterbi path searching. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2007, 1-8
[19] Weinland D, Boyer E. Action recognition using exemplar-based embedding. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2008, 1-7
[20] Rabiner L R. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989, 77(2): 257-286
[21] Wang L, Suter D. Informative shape representations for human action recognition. In: Proceedings of International Conference on Pattern Recognition. 2006, 1266-1269
[22] Gorelick L, Blank M, Shechtman E, Irani M, Basri R. Actions as space-time shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(12): 2247-2253
[23] Davis J W, Bobick A F. The representation and recognition of human movement using temporal templates. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 1997, 928-934
RIGHTS & PERMISSIONS
Higher Education Press and Springer-Verlag Berlin Heidelberg