A survey of binary code representation technology

Taiyan WANG, Qingsong XIE, Lu YU, Zulie PAN, Min ZHANG

Front. Inform. Technol. Electron. Eng, 2025, Vol. 26, Issue 5: 671-694.

DOI: 10.1631/FITEE.2400088
Review


Abstract

Binary analysis is a foundational technology that supports numerous applications in software engineering and security research. As software grows in scale and its architecture becomes more complex, binary analysis faces new challenges. To break through existing bottlenecks, researchers have applied artificial intelligence (AI) to the understanding and analysis of binary code. The core problem is binary code representation, i.e., how to use intelligent methods to generate representation vectors that carry semantic information about binary code and can be applied to multiple downstream binary analysis tasks. In this paper, we provide a comprehensive survey of recent advances in binary code representation technology and present the workflow of existing research in two parts: binary code feature selection methods and binary code feature embedding methods. The feature selection part covers two topics: first, the abstract definition and classification of features are systematically explained; second, the process of constructing specific representations of features is introduced in detail. In the feature embedding part, the embedding methods are classified into four categories according to the intelligent semantic understanding models they use, namely how they employ text-embedding models and graph-embedding models. Finally, we summarize the overall development of existing research and discuss potential research directions for binary code representation technology.

Keywords

Binary analysis / Binary code representation / Binary code feature selection / Binary code feature embedding

Cite this article

Taiyan WANG, Qingsong XIE, Lu YU, Zulie PAN, Min ZHANG. A survey of binary code representation technology. Front. Inform. Technol. Electron. Eng, 2025, 26(5): 671-694. DOI: 10.1631/FITEE.2400088


RIGHTS & PERMISSIONS

Zhejiang University Press
