Research on water-light complementary optimal scheduling based on deep reinforcement learning algorithm

Xianfeng HUANG; Chaoyue RAN; Wen ZHOU; Xu LI

doi:10.13928/j.cnki.wrahe.2025.04.019

Water Resources and Hydropower Engineering, 2025, Vol. 56, Issue 4: 235-247.

Abstract

[Objective] Photovoltaic output in water-light complementary optimal scheduling is characterized by volatility, randomness, and intermittency. The solution space of the scheduling problem is typically high-dimensional, complex, and continuous, and the problem involves a variety of continuous control decisions. [Methods] The Deep Deterministic Policy Gradient (DDPG) algorithm, a deep reinforcement learning algorithm, is well suited to problems with continuous, complex solution spaces. The water-light complementary problem was therefore modeled as a reinforcement learning problem. Based on the water-light complementary mechanism, the concepts of “demand for adjustment” and “capacity for adjustment” were used to define the environment, actions, reward function, and penalty function, and the DDPG algorithm was then used for optimization. The applicability and effectiveness of the model were assessed by comparing its optimization results with those obtained using the original DDPG algorithm alone and those obtained using the genetic algorithm. Taking the large-scale water-light complementary base in the upper reaches of the Lancang River as an example, three cascade hydropower station configuration schemes and three representative hydrological years were set up for a case study. [Results] The analysis indicated that: (1) The DDPG algorithm ran faster. After the mechanisms of “demand for adjustment” and “capacity for adjustment” were considered, photovoltaic power consumption reached 12.993 billion kWh, the highest among the three models. (2) The lower the water inflow, the stronger the photovoltaic consumption capacity. When the installed capacity of the cascade hydropower stations was 8.62 million kW, the photovoltaic consumption capacity increased by only 1% from the normal year to the dry year; at this point, the complementary capacity of the hydropower system could be maximized. (3) The photovoltaic consumption capacity was relatively low during the wet season, and the photovoltaic consumption rates of the three schemes were 77.43%, 79.85%, and 89.39%, respectively. [Conclusion] The deep reinforcement learning algorithm converges rapidly in water-light complementary optimal scheduling. Integrating the mechanisms of “demand for adjustment” and “capacity for adjustment” into the reinforcement learning model can significantly enhance photovoltaic consumption efficiency, achieve better resource utilization, and effectively improve both the operating efficiency of the water-light complementary system and its photovoltaic power consumption capacity. The method shows promising results in the capacity configuration and operation scheduling of clean energy bases, providing a theoretical and practical foundation for the future expansion and application of clean energy systems.
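
As a rough illustration of why DDPG suits continuous control decisions such as those described above, the sketch below shows three generic building blocks of the algorithm: the critic's temporal-difference target, the soft (Polyak) update of target-network parameters, and bounded exploration around a deterministic action. It is not the authors' implementation. The function names, the normalized action range [0, 1], and all hyperparameter values are assumptions made for illustration, and Gaussian exploration noise is used here in place of the Ornstein-Uhlenbeck noise of the original DDPG formulation. The environment, reward function, and penalty function built from “demand for adjustment” and “capacity for adjustment” are not specified in the abstract and are therefore not modeled.

```python
import random


def td_target(reward, next_q, done, gamma=0.99):
    """Critic regression target: y = r + gamma * Q'(s', mu'(s')).

    next_q is the target critic's value at the next state and the target
    actor's action; the bootstrap term is dropped on terminal steps.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: theta_target <- tau * theta + (1 - tau) * theta_target."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]


def explore(action, sigma, low=0.0, high=1.0, rng=random):
    """Add Gaussian noise to a deterministic action and clip it to its bounds.

    Assumption: each action component is a normalized continuous decision
    in [low, high], for example a scaled output setting.
    """
    return [min(high, max(low, a + rng.gauss(0.0, sigma))) for a in action]


if __name__ == "__main__":
    rng = random.Random(0)
    print(td_target(reward=1.0, next_q=2.0, done=False, gamma=0.5))
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1))
    print(explore([0.2, 0.9], sigma=0.1, rng=rng))
```

Because the actor outputs a continuous action directly, no discretization of the decision variables is needed, which is the property that makes this family of methods a natural fit for high-dimensional continuous scheduling problems.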

Keywords

water-light complementary / reinforcement learning / DDPG / optimal scheduling / influencing factors / hydropower stations

Cite this article

Xianfeng HUANG, Chaoyue RAN, Wen ZHOU, Xu LI. Research on water-light complementary optimal scheduling based on deep reinforcement learning algorithm. Water Resources and Hydropower Engineering, 2025, 56(4): 235-247. DOI: 10.13928/j.cnki.wrahe.2025.04.019