Robust human motion prediction via integration of spatial and temporal cues
Shaobo Zhang, Sheng Liu, Fei Gao, Yuan Feng
Optoelectronics Letters, 2025, Vol. 21, Issue (8): 499-506.
Research on human motion prediction has made significant progress, driven by its importance to a wide range of artificial intelligence applications. However, effectively capturing spatio-temporal features for smoother and more precise prediction remains a challenge. To address this issue, a robust human motion prediction method via integration of spatial and temporal cues (RISTC) is proposed. The method captures the spatio-temporal correlations of the observed human pose sequence using a spatio-temporal mixed feature extractor (MFE). Within the multi-layer MFEs, channel-graph united attention blocks extract augmented spatial features of the human poses along the channel and spatial dimensions. In addition, multi-scale temporal blocks are designed to capture complicated and highly dynamic temporal information. Experiments on the Human3.6M and Carnegie Mellon University motion capture (CMU Mocap) datasets show that the proposed network achieves higher prediction accuracy than state-of-the-art methods.
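The abstract gives no architectural details, so the following is only a rough NumPy sketch of how a layer combining the two named ideas might look, not the RISTC implementation. Everything in it is an assumption made for illustration: the function names `channel_graph_attention` and `multi_scale_temporal`, the sigmoid channel gate, the dot-product attention restricted to skeleton neighbours, and the fixed moving-average kernels, which stand in for the learned temporal convolutions a real network would use.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_graph_attention(x, adjacency):
    """Hypothetical spatial block. x: (T, J, C) pose features;
    adjacency: (J, J) skeleton mask that includes self-loops."""
    # Channel attention (assumed form): gate each channel by a sigmoid
    # of its mean activation over all frames and joints.
    gate = 1.0 / (1.0 + np.exp(-x.mean(axis=(0, 1))))                 # (C,)
    x = x * gate
    # Graph attention (assumed form): each joint attends only to its
    # skeleton neighbours, weighted by scaled dot-product similarity.
    scores = np.einsum("tjc,tkc->tjk", x, x) / np.sqrt(x.shape[-1])   # (T, J, J)
    scores = np.where(adjacency > 0, scores, -np.inf)
    weights = softmax(scores, axis=-1)
    return np.einsum("tjk,tkc->tjc", weights, x)

def multi_scale_temporal(x, kernel_sizes=(1, 3, 5)):
    """Hypothetical temporal block: average of moving-average filters of
    several odd widths along the time axis (stand-in for learned kernels)."""
    T = x.shape[0]
    out = np.zeros_like(x, dtype=float)
    for k in kernel_sizes:
        pad = k // 2
        padded = np.pad(x, ((pad, pad), (0, 0), (0, 0)), mode="edge")
        out += sum(padded[i:i + T] for i in range(k)) / k
    return out / len(kernel_sizes)

# Toy input: 10 frames, 5 joints arranged in a chain, 3 coordinates per joint.
rng = np.random.default_rng(0)
T, J, C = 10, 5, 3
poses = rng.normal(size=(T, J, C))
adjacency = np.eye(J) + np.eye(J, k=1) + np.eye(J, k=-1)
features = multi_scale_temporal(channel_graph_attention(poses, adjacency))
print(features.shape)  # (10, 5, 3)
```

The sketch keeps the input shape unchanged so that several such layers could be stacked, which is one plausible reading of "multi-layer MFEs"; how the actual blocks are wired together is not stated in the abstract.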
Tianjin University of Technology