A survey on large language model-based alpha mining
Junjie ZHANG , Shuoling LIU , Tongzhe ZHANG , Yuchen SHI
Front. Inform. Technol. Electron. Eng ›› 2025, Vol. 26 ›› Issue (10) : 1809 -1821.
A survey on large language model-based alpha mining
Alpha mining, which refers to the systematic discovery of data-driven signals predictive of future cross-sectional returns, is a central task in quantitative research. Recent progress in large language models (LLMs) has sparked interest in LLM-based alpha mining frameworks, which offer a promising middle ground between human-guided and fully automated alpha mining approaches and deliver both speed and semantic depth. This study presents a structured review of emerging LLM-based alpha mining systems from an agentic perspective, and analyzes the functional roles of LLMs, ranging from miners and evaluators to interactive assistants. Despite early progress, key challenges remain, including simplified performance evaluation, limited numerical understanding, lack of diversity and originality, weak exploration dynamics, temporal data leakage, and black-box risks and compliance challenges. Accordingly, we outline future directions, including improving reasoning alignment, expanding to new data modalities, rethinking evaluation protocols, and integrating LLMs into more general-purpose quantitative systems. Our analysis suggests that LLM is a scalable interface for amplifying both domain expertise and algorithmic rigor, as it amplifies domain expertise by transforming qualitative hypotheses into testable factors and enhances algorithmic rigor for rapid backtesting and semantic reasoning. The result is a complementary paradigm, where intuition, automation, and language-based reasoning converge to redefine the future of quantitative research.
Alpha mining / Quantitative investment / Large language models (LLMs) / LLM agents / Fintech
Zhejiang University Press
/
| 〈 |
|
〉 |