Extracting hand articulations from monocular depth images using curvature scale space descriptors

Shao-fan WANG; Chun LI; De-hui KONG; Bao-cai YIN

doi:10.1631/FITEE.1500126


Front. Inform. Technol. Electron. Eng, 2016, Vol. 17, Issue 1: 41-54. DOI: 10.1631/FITEE.1500126

Original Article


Shao-fan WANG¹ ,
Chun LI¹ ,
De-hui KONG¹ ,
Bao-cai YIN²,¹,³


Abstract

We propose a framework for detecting hand articulations from a monocular depth image using curvature scale space (CSS) descriptors. We extract the hand contour from the input depth image and locate the fingertips and finger-valleys as the local extrema of a modified CSS map of the contour. We then recover any undetected fingertips from the local changes in depth at points in the interior of the contour. Compared with traditional appearance-based approaches that use either angle detectors or convex hull detectors, the modified CSS descriptor extracts fingertips and finger-valleys more precisely because it is more robust to noisy or corrupted data; moreover, the local extrema of depth recover the fingertips of bent fingers well, whereas traditional appearance-based approaches hardly work without matching hand models. Experimental results show that our method captures hand articulations more precisely than three state-of-the-art appearance-based approaches.
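The contour step described in the abstract can be illustrated with a minimal sketch. It is not the authors' modified CSS map, whose details the abstract does not give; it computes the standard curvature of a closed contour smoothed at a single Gaussian scale and reports the local extrema of that curvature. All function names, the synthetic star-shaped contour, and the values of `sigma` and `threshold` are assumptions made for illustration. The sketch also assumes that contour points are roughly evenly spaced and ordered counterclockwise in a y-up coordinate system, so that positive curvature marks convex points (fingertip candidates) and negative curvature marks concave points (finger-valley candidates).

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights covering about +/- 3 sigma samples."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_closed(values, sigma):
    """Circular convolution of one coordinate of a closed contour with a Gaussian."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    return [
        sum(kernel[j] * values[(i + j - radius) % n] for j in range(len(kernel)))
        for i in range(n)
    ]

def curvature(xs, ys, sigma):
    """Signed curvature of the closed contour (xs, ys) after smoothing at scale sigma."""
    X = smooth_closed(xs, sigma)
    Y = smooth_closed(ys, sigma)
    n = len(X)
    kappa = []
    for i in range(n):
        # Central differences with wrap-around (index -1 is the last point).
        dx = (X[(i + 1) % n] - X[i - 1]) / 2.0
        dy = (Y[(i + 1) % n] - Y[i - 1]) / 2.0
        ddx = X[(i + 1) % n] - 2.0 * X[i] + X[i - 1]
        ddy = Y[(i + 1) % n] - 2.0 * Y[i] + Y[i - 1]
        denom = (dx * dx + dy * dy) ** 1.5
        kappa.append((dx * ddy - dy * ddx) / denom if denom > 1e-12 else 0.0)
    return kappa

def curvature_extrema(kappa, threshold):
    """Return (index, sign) for circular local maxima of |kappa| at or above threshold."""
    n = len(kappa)
    peaks = []
    for i in range(n):
        a = abs(kappa[i])
        if a >= threshold and a > abs(kappa[i - 1]) and a >= abs(kappa[(i + 1) % n]):
            peaks.append((i, 1 if kappa[i] > 0 else -1))
    return peaks

def star_contour(n, lobes=5, amplitude=0.3):
    """Synthetic stand-in for a hand contour: r(t) = 1 + amplitude * cos(lobes * t)."""
    xs, ys = [], []
    for i in range(n):
        t = 2.0 * math.pi * i / n
        r = 1.0 + amplitude * math.cos(lobes * t)
        xs.append(r * math.cos(t))
        ys.append(r * math.sin(t))
    return xs, ys

# Example: positive-sign peaks are tip candidates, negative-sign peaks are valley candidates.
xs, ys = star_contour(500)
peaks = curvature_extrema(curvature(xs, ys, sigma=2.0), threshold=2.0)
tips = [i for i, sign in peaks if sign > 0]
valleys = [i for i, sign in peaks if sign < 0]
```

A full CSS map would repeat the curvature computation over a range of `sigma` values and track how the curvature features evolve across scales. For a contour traced in image coordinates (y pointing down) or in clockwise order, the sign convention above is reversed.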

Keywords

Curvature scale space (CSS) / Hand articulation / Convex hull / Hand contour


Shao-fan WANG, Chun LI, De-hui KONG, Bao-cai YIN. Extracting hand articulations from monocular depth images using curvature scale space descriptors. Front. Inform. Technol. Electron. Eng, 2016, 17(1): 41‒54. https://doi.org/10.1631/FITEE.1500126

