Facial Pose Adaptation: Intersective Attention U-Net for Facial Expression Recognition

Surya Pratap Yadav; Shailendra Kumar Shrivastava

doi:10.1007/s11518-025-5678-4

Journal of Systems Science and Systems Engineering: 1-38. DOI: 10.1007/s11518-025-5678-4


Abstract

The proposed Intersective Attention U-Net is a modified U-Net architecture that incorporates a Contextual Attention Branch (CAB) and a Pose Normalization Branch (PNB) to improve Facial Expression Recognition (FER) under pose variations, occlusions, and subtle emotional cues. The CAB employs multi-head attention to focus selectively on expressive facial regions while suppressing occluded or irrelevant areas. In parallel, the PNB uses an Adaptive Spatial Transformer Network (ASTN) to align facial features dynamically, making the model robust to pose variations. Unlike traditional FER methods, which struggle with misalignment and occlusions, the proposed fusion mechanism enriches the feature representation, enabling the model to capture fine-grained emotional expressions with higher precision. In extensive evaluations on the CK+, RAF-DB, and UTKFace datasets, the approach achieves 99.26%, 99.34%, and 99.43% accuracy, respectively, surpassing state-of-the-art FER techniques. The proposed framework offers a robust and adaptive solution for real-world FER applications.
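The two branches described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The function names are hypothetical, the attention uses identity projections in place of learned query/key/value weights, the affine matrix `theta` is supplied by hand rather than predicted by a localization network, and fusing the branches by concatenation is an assumption, since the abstract does not say how they are combined.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multi_head_attention(tokens, num_heads):
    """Scaled dot-product self-attention over region feature vectors.

    tokens: list of N vectors of length D, with D divisible by num_heads.
    Assumption: identity Q/K/V projections; a trained model learns these.
    """
    dim = len(tokens[0])
    head_dim = dim // num_heads
    out = [[] for _ in tokens]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        heads = [t[lo:hi] for t in tokens]  # this head's slice of each token
        for i, q in enumerate(heads):
            scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(head_dim)
                      for k in heads]
            weights = softmax(scores)
            mixed = [sum(w * v[j] for w, v in zip(weights, heads))
                     for j in range(head_dim)]
            out[i].extend(mixed)  # concatenate head outputs per token
    return out

def affine_warp(image, theta):
    """Spatial-transformer style resampling with bilinear interpolation.

    image: H x W list of lists (H, W >= 2).
    theta: 2x3 affine matrix mapping normalized output coordinates in
    [-1, 1] to normalized input coordinates. Samples that fall outside
    the image contribute zero.
    """
    h, w = len(image), len(image[0])

    def pixel(r, c):
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    warped = []
    for r in range(h):
        y = 2.0 * r / (h - 1) - 1.0
        row = []
        for c in range(w):
            x = 2.0 * c / (w - 1) - 1.0
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            # Convert the sampled location back to pixel coordinates.
            fc = (xs + 1.0) * (w - 1) / 2.0
            fr = (ys + 1.0) * (h - 1) / 2.0
            c0, r0 = math.floor(fc), math.floor(fr)
            dc, dr = fc - c0, fr - r0
            row.append(pixel(r0, c0) * (1 - dr) * (1 - dc)
                       + pixel(r0, c0 + 1) * (1 - dr) * dc
                       + pixel(r0 + 1, c0) * dr * (1 - dc)
                       + pixel(r0 + 1, c0 + 1) * dr * dc)
        warped.append(row)
    return warped

def fuse(attended, aligned):
    """Assumed fusion: concatenate the flattened outputs of both branches."""
    flat_attended = [v for token in attended for v in token]
    flat_aligned = [v for row in aligned for v in row]
    return flat_attended + flat_aligned
```

For example, `affine_warp(img, [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])` mirrors a feature map horizontally, a crude stand-in for the pose correction an ASTN would learn, and `fuse(multi_head_attention(tokens, 2), aligned)` produces one combined feature vector.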

Keywords

Face emotion recognition / intersective attention / occlusion / pose variation / subtle cues

Cite this article


Surya Pratap Yadav, Shailendra Kumar Shrivastava. Facial Pose Adaptation: Intersective Attention U-Net for Facial Expression Recognition. Journal of Systems Science and Systems Engineering: 1-38. DOI: 10.1007/s11518-025-5678-4

