Secure and efficient implementation of facial emotion detection for smart patient monitoring system
Kh Shahriya Zaman, Md Mamun Bin Ibne Reaz
Background: Machine learning has enabled the automatic detection of facial expressions, which is particularly beneficial for smart monitoring and for understanding the mental state of medical and psychological patients. However, most algorithms that attain high emotion classification accuracy demand extensive computational resources, which means they must run either on bulky, inefficient devices or on cloud servers that process the sensor data remotely. Transferring raw images to cloud servers for facial emotion recognition (FER) carries a persistent risk of privacy invasion, data misuse, and data manipulation. One possible solution to this problem is to minimize the movement of such private data.
Methods: In this research, we propose an efficient implementation of a convolutional neural network (CNN) based algorithm for on-device FER on a low-power field programmable gate array (FPGA) platform. This is done by encoding the CNN weights as approximated signed digits, which reduces the number of partial sums that must be computed for multiply-accumulate (MAC) operations. This is advantageous for portable devices that lack full-fledged, resource-intensive hardware multipliers.
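To illustrate the idea, the sketch below recodes an integer weight into canonical signed-digit (CSD) form, where each digit is −1, 0, or +1 and no two adjacent digits are nonzero, so a multiplication becomes a short sequence of shifts and adds/subtracts. This is a generic CSD recoding, not the authors' exact approximation scheme; the function names (`to_csd`, `csd_mac`) are illustrative. The "approximation" step in the paper would further truncate the recoded weight to its most significant nonzero digits, which this sketch does not show.

```python
def to_csd(n):
    """Recode integer n into canonical signed-digit form.

    Returns a list of (digit, bit_position) pairs for the nonzero
    digits only, least-significant position first. Digits are +1 or -1,
    and no two adjacent bit positions are both nonzero, so the number
    of partial sums in a multiplication is minimized.
    """
    digits = []
    pos = 0
    while n != 0:
        if n & 1:
            # Pick +1 when n % 4 == 1 and -1 when n % 4 == 3,
            # which makes the remaining value divisible by 2.
            d = 2 - (n & 3)
            n -= d
            digits.append((d, pos))
        n >>= 1
        pos += 1
    return digits


def csd_mac(acc, x, csd_weight):
    """Accumulate acc += x * w using only shifts and adds/subtracts,
    where w is given in CSD form (as returned by to_csd)."""
    for d, pos in csd_weight:
        acc += (x << pos) if d > 0 else -(x << pos)
    return acc
```

For example, the weight 7 (binary 111, three partial sums) recodes to 8 − 1 (two signed digits), so each MAC needs one fewer shift-add on hardware without a full multiplier.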
Results: We applied our approximation method to MobileNet-v2 and ResNet18 models pretrained on the FER2013 dataset. Our implementations and simulations reduce the FPGA resource requirement by at least 22% compared to models with integer weights, with negligible loss in classification accuracy.
Conclusions: The outcome of this research will help in the development of secure and low-power systems for FER and other biomedical applications. The approximation methods used in this research can also be extended to other image-based biomedical research fields.
With the advent of the Internet of Things and Big Data analysis, it is possible to create an intelligent ecosystem for continuous patient monitoring with minimal human interaction. However, the machine learning algorithms associated with such smart healthcare ecosystems are usually resource-intensive and require communicating private data to cloud servers. In this article, we propose an approximation method for neural networks that can be implemented efficiently on FPGAs and therefore enables on-device FER. We implemented several CNN models on the FPGA for FER; our approximation method reduces FPGA resource usage by at least 22%, which is advantageous for embedded systems.
facial expression detection / emotion recognition / FPGA implementation / convolutional neural network / signed digit approximation