Large language model-enhanced probabilistic modeling for effective static analysis alarms

Xinlong PAN; Jianhua LI; Zhihong ZHOU; Gaolei LI; Xiuzhen CHEN; Jin MA; Jun WU; Quanhai ZHANG

doi:10.1631/FITEE.2500038

Front. Inform. Technol. Electron. Eng, 2025, Vol. 26, Issue 10: 1926-1941. DOI: 10.1631/FITEE.2500038

Research Article


Xinlong PAN¹,², Jianhua LI¹,², Zhihong ZHOU¹,², Gaolei LI¹,², Xiuzhen CHEN¹,², Jin MA¹,², Jun WU¹,², Quanhai ZHANG¹,²


Abstract

Alarm handling is a significant challenge in static analysis, and probabilistic models and alarm prioritization are essential methods for addressing it. These models prioritize alarms based on user feedback, which reduces the burden of manually inspecting alarms. However, they are often limited in efficiency and suffer from problems such as false generalization. Learning-based approaches have shown promise, but they typically incur high training costs and are constrained by the predefined structures of existing models. Moreover, the integration of large language models (LLMs) into static analysis has yet to reach its full potential and often yields low accuracy in vulnerability identification. To tackle these challenges, we introduce BinLLM, a novel framework that harnesses the generalization capabilities of LLMs to enhance alarm probability models through rule learning. Our approach integrates LLM-derived abstract rules into the probabilistic model, using alarm paths and critical statements from static analysis. This integration enhances the model’s reasoning capabilities, improving its effectiveness in prioritizing genuine bugs while mitigating false generalization. We evaluated BinLLM on a suite of C programs and observed reductions of 40.1% and 9.4% in the number of checks required for alarm verification compared with two state-of-the-art baselines, Bingo and BayeSmith, respectively, underscoring the potential of combining LLMs with static analysis to improve alarm management.

Keywords

Static analysis / Bayesian inference / Large language models (LLMs) / Alarm ranking

Cite this article


Xinlong PAN, Jianhua LI, Zhihong ZHOU, Gaolei LI, Xiuzhen CHEN, Jin MA, Jun WU, Quanhai ZHANG. Large language model-enhanced probabilistic modeling for effective static analysis alarms. Front. Inform. Technol. Electron. Eng, 2025, 26(10): 1926-1941. DOI: 10.1631/FITEE.2500038