A Reinforcement Learning-Based Decision-Making Framework for Complex Industry Process
Yufei Zhang , Enjie Ma , Jie Hua , Zhongyuan Wang
In complex process industries, raw material prices and compositions can fluctuate beyond historical ranges, challenging traditional, expert-driven decision-making that relies heavily on past experience. This paper presents a novel reinforcement learning (RL) framework to address this issue and achieve dynamic, plant-wide economic optimization. We first develop a hybrid model that serves as a realistic simulation environment, effectively overcoming the limitations of sparse or out-of-distribution historical data and enabling safe policy exploration. We then formulate the plant’s operational optimization as a high-dimensional, continuous-action decision problem. To solve this, we propose the Asynchronous Three-Delay Deep Deterministic (A3D3) policy gradient algorithm, which offers greater adaptability to industrial settings. A3D3 improves training stability through asynchronous delayed updates and enhances the robustness of industrial optimization processes by incorporating noise learning and expert knowledge guidance. The proposed method was validated in an industrial alumina refinery. The operational strategy derived by A3D3 significantly outperformed the plan devised by scheduling experts, achieving a 3,167-ton increase in production, a 2% reduction in cost, and a 7.6% improvement in profitability. Comparative experiments further demonstrated that A3D3 converges faster and delivers higher economic benefits than classical reinforcement learning algorithms, while ablation studies validated the unique contributions of its core components.
Complex process industry / Production operation strategy / Hybrid model / Maximizing profit / Reinforcement learning / Alumina industry
Higher Education Press 2026