VIPLFaceNet: an open source deep face recognition SDK
Xin LIU, Meina KAN, Wanglong WU, Shiguang SHAN, Xilin CHEN
Robust face representation is imperative to highly accurate face recognition. In this work, we propose an open source face recognition method with deep representation, named VIPLFaceNet, which is a 10-layer deep convolutional neural network with seven convolutional layers and three fully-connected layers. Compared with the well-known AlexNet, VIPLFaceNet requires only 20% of the training time and 60% of the testing time, yet achieves a 40% drop in error rate on the real-world face recognition benchmark LFW. VIPLFaceNet achieves 98.60% mean accuracy on LFW using a single network. An open-source C++ SDK based on VIPLFaceNet is released under the BSD license. The SDK takes about 150 ms to process one face image in a single thread on an i7 desktop CPU. VIPLFaceNet provides a state-of-the-art starting point for both academic and industrial face recognition applications.
deep learning / face recognition / open source / VIPLFaceNet
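The figures quoted in the abstract can be related to one another with a little arithmetic. The sketch below is illustrative only: it assumes that the "40% drop in error rate" is a relative reduction with respect to AlexNet, so the AlexNet error rate it prints is derived from that assumption rather than reported in the abstract. The function names are hypothetical helpers, not part of the SDK.

```python
def implied_baseline_error(new_accuracy_pct, relative_drop):
    """Baseline error rate (in percent) implied by a relative error reduction.

    Assumption: new_error = baseline_error * (1 - relative_drop).
    """
    new_error = 100.0 - new_accuracy_pct
    return new_error / (1.0 - relative_drop)


def relative_error_drop(baseline_accuracy_pct, new_accuracy_pct):
    """Fractional reduction in error rate between two accuracies (in percent)."""
    baseline_error = 100.0 - baseline_accuracy_pct
    new_error = 100.0 - new_accuracy_pct
    return (baseline_error - new_error) / baseline_error


def faces_per_second(ms_per_face):
    """Single-thread throughput implied by a per-image latency in milliseconds."""
    return 1000.0 / ms_per_face


# Figures taken from the abstract: 98.60% accuracy, 40% error drop, 150 ms per face.
viplfacenet_error = 100.0 - 98.60
baseline_error = implied_baseline_error(98.60, 0.40)  # derived, not reported
throughput = faces_per_second(150.0)

print(f"VIPLFaceNet error rate: {viplfacenet_error:.2f}%")
print(f"Implied baseline error rate: {baseline_error:.2f}%")
print(f"Approximate throughput: {throughput:.1f} faces/s")
```

Under these assumptions, a 1.40% error rate on LFW corresponds to a baseline error rate of roughly 2.33%, and 150 ms per image corresponds to between six and seven faces per second in a single thread.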