Adaptive detection of encrypted malware traffic via fully convolutional masked autoencoders

Jizhe JIA, Meng SHEN, Qingjun YUAN, Yong LIU, Jing WANG, Jian KONG, Liang HUANG, Haotian HE, Liehuang ZHU

Front. Comput. Sci., 2026, Vol. 20, Issue 4: 2004804

DOI: 10.1007/s11704-025-41273-9
Information Security
RESEARCH ARTICLE


Abstract

Network traffic encryption is widely adopted to protect data confidentiality and prevent privacy leakage during transmission. However, malware often exploits the same encryption techniques to conceal malicious activity. Recent research has demonstrated the effectiveness of malware traffic detection methods based on machine learning and deep learning. These methods, however, depend on a sufficient amount of labeled training data being readily available, which limits their ability to transfer to the detection of new malware.

In this paper, we propose Malcom, an adaptive method based on fully convolutional masked autoencoders for detecting malware traffic hidden in encrypted traffic. We first propose a novel traffic representation, the Header-Payload Matrix (HPM), to extract discriminative features that differentiate malware traffic from benign traffic. We then develop a hierarchical ConvNeXt traffic encoder and a lightweight ConvNeXt traffic decoder to learn high-level features from a large amount of unlabeled data. The masked autoencoder framework allows the model to adapt to new malware by fine-tuning on only a small amount of labeled data. We conduct extensive experiments on real-world datasets to evaluate Malcom. The results demonstrate that Malcom outperforms state-of-the-art (SOTA) methods in two typical scenarios. In particular, in the few-shot learning scenario, Malcom achieves an average F1 score of 97.35%, an improvement of 8.24% over the SOTA method, when fine-tuned with only 10 samples per malware type.
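To make the masked-autoencoder idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not specify the HPM layout, the patch size, or the mask ratio, so the one-row-per-packet layout, the names `build_matrix`, `mask_patches`, and `masked_mse`, and all default parameters are illustrative assumptions. The ConvNeXt encoder and decoder are omitted; the sketch only shows how a byte matrix could be masked and how a reconstruction error restricted to the hidden patches could be computed.

```python
import random


def fit(data, length):
    """Truncate or zero-pad a byte string to exactly `length` integers."""
    values = list(data[:length])
    return values + [0] * (length - len(values))


def build_matrix(packets, n_packets=4, header_len=8, payload_len=8):
    """Assumed layout (hypothetical): one row per packet, header bytes then payload bytes."""
    rows = [fit(h, header_len) + fit(p, payload_len) for h, p in packets[:n_packets]]
    empty = [0] * (header_len + payload_len)
    # Pad short flows with all-zero rows so every matrix has the same shape.
    return rows + [list(empty) for _ in range(n_packets - len(rows))]


def mask_patches(matrix, patch=4, ratio=0.5, seed=0):
    """Zero out a random subset of non-overlapping patch x patch blocks."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    blocks = [(r, c) for r in range(0, n_rows, patch) for c in range(0, n_cols, patch)]
    rng = random.Random(seed)
    hidden = set(rng.sample(blocks, round(ratio * len(blocks))))
    masked = [row[:] for row in matrix]
    for r0, c0 in hidden:
        for r in range(r0, min(r0 + patch, n_rows)):
            for c in range(c0, min(c0 + patch, n_cols)):
                masked[r][c] = 0
    return masked, hidden


def masked_mse(original, reconstruction, hidden, patch=4):
    """Mean squared error computed over the hidden cells only."""
    n_rows, n_cols = len(original), len(original[0])
    errors = [
        (original[r][c] - reconstruction[r][c]) ** 2
        for r0, c0 in hidden
        for r in range(r0, min(r0 + patch, n_rows))
        for c in range(c0, min(c0 + patch, n_cols))
    ]
    return sum(errors) / len(errors)


# Toy flow of two packets given as (header bytes, payload bytes); values are made up.
flow = [(bytes(range(1, 9)), b"\x17\x03\x03abc"), (bytes(range(9, 17)), b"")]
matrix = build_matrix(flow)
masked, hidden = mask_patches(matrix)
```

In a pre-training loop of this kind, a model would receive `masked` and be trained to minimise `masked_mse` between `matrix` and its output, which requires no labels. Fine-tuning on a few labeled flows would then reuse the trained encoder with a classification head.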

Keywords

malware traffic detection / encrypted traffic analysis / self-supervised learning / masked autoencoder

Cite this article

Jizhe JIA, Meng SHEN, Qingjun YUAN, Yong LIU, Jing WANG, Jian KONG, Liang HUANG, Haotian HE, Liehuang ZHU. Adaptive detection of encrypted malware traffic via fully convolutional masked autoencoders. Front. Comput. Sci., 2026, 20(4): 2004804. DOI: 10.1007/s11704-025-41273-9

RIGHTS & PERMISSIONS

Higher Education Press
