Unseen head pose prediction using dense multivariate label distribution<FootNote> Project supported by the National Key Scientific Instrument and Equipment Development Project of China (No. 2013YQ49087903) and the National Natural Science Foundation of China (No. 61202160) </FootNote>
Gao-li SANG, Hu CHEN, Ge HUANG, Qi-jun ZHAO
Accurate head pose estimates are useful for many face-related tasks, such as face recognition, gaze estimation, and emotion analysis. Most existing methods can estimate only head poses that are included in the training data (i.e., previously seen head poses). Some regression-based methods have been proposed to predict head poses that are not seen in the training data. However, they focus on estimating continuous head pose angles and do not systematically evaluate performance on unseen head poses. In this paper, we use a dense multivariate label distribution (MLD) to represent the pose angle of a face image. By incorporating both seen and unseen pose angles into the MLD, the head pose predictor can estimate unseen head poses with an accuracy comparable to that achieved on seen head poses. On the Pointing’04 database, the mean absolute errors for yaw and pitch are 4.01° and 2.13°, respectively. In addition, experiments on the CAS-PEAL and CMU Multi-PIE databases show that the proposed dense MLD-based head pose estimation method achieves state-of-the-art performance compared with some existing methods.
Head pose estimation / Dense multivariate label distribution / Sampling intervals / Inconsistent labels
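The idea of a dense label distribution can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the distribution is an isotropic Gaussian centred on the labelled (yaw, pitch) pair and discretized on a regular angle grid, and the function names, the 5° grid step, and the 15° standard deviation are all hypothetical choices made for illustration. Because the grid is finer than the labelled poses, angles that never appear as training labels still receive non-zero probability.

```python
import math


def dense_mld(yaw, pitch, step=5.0, sigma=15.0, lo=-90.0, hi=90.0):
    """Return {(yaw_j, pitch_k): probability} on a dense angle grid.

    Assumption: an isotropic 2D Gaussian around the labelled pose,
    normalized so that the probabilities sum to 1.
    """
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n)]
    weights = {
        (y, p): math.exp(-((y - yaw) ** 2 + (p - pitch) ** 2) / (2.0 * sigma ** 2))
        for y in grid
        for p in grid
    }
    total = sum(weights.values())
    return {angles: w / total for angles, w in weights.items()}


def predict_pose(dist):
    """Pick the (yaw, pitch) grid point with the highest probability."""
    return max(dist, key=dist.get)


def mean_absolute_error(predicted, truth):
    """Mean absolute error between two equal-length lists of angles."""
    return sum(abs(a - b) for a, b in zip(predicted, truth)) / len(predicted)


# A face labelled with yaw = 30 degrees and pitch = -15 degrees.
dist = dense_mld(yaw=30.0, pitch=-15.0)
```

In this sketch, `dist[(35.0, -15.0)]` is positive even though 35° is not the labelled yaw, which is the property that lets a predictor trained on such distributions output in-between poses. `mean_absolute_error` is the per-angle metric reported in the abstract.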