Adaptive multi-agent reinforcement learning for dynamic pricing and distributed energy management in virtual power plant networks

Jian-Dong Yao , Wen-Bin Hao , Zhi-Gao Meng , Bo Xie , Jian-Hua Chen , Jia-Qi Wei

Journal of Electronic Science and Technology ›› 2025, Vol. 23 ›› Issue (1): 100290 · DOI: 10.1016/j.jnlest.2024.100290


Abstract

This paper presents a novel approach to dynamic pricing and distributed energy management in virtual power plant (VPP) networks using multi-agent reinforcement learning (MARL). As the energy landscape evolves towards greater decentralization and renewable integration, traditional optimization methods struggle to address the inherent complexities and uncertainties of such systems. Our proposed MARL framework enables adaptive, decentralized decision-making for both the distribution system operator and individual VPPs, optimizing economic efficiency while maintaining grid stability. We formulate the problem as a Markov decision process and develop a custom MARL algorithm that leverages actor-critic architectures and experience replay. Extensive simulations across diverse scenarios demonstrate that our approach consistently outperforms baseline methods, including Stackelberg game models and model predictive control, achieving an 18.73% reduction in costs and a 22.46% increase in VPP profits. The MARL framework shows particular strength in scenarios with high renewable energy penetration, where it improves system performance by 11.95% compared with traditional methods. Furthermore, our approach demonstrates superior adaptability to unexpected events and prediction errors, highlighting its potential for real-world implementation.
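The abstract's core ingredients (a Markov decision process per agent, an actor-critic learner, and experience replay) can be illustrated with a minimal sketch. The snippet below is not the authors' implementation: it assumes PyTorch, and the class names (VPPAgent, ReplayBuffer), state/action dimensions, and hyperparameters are hypothetical stand-ins. Target networks, the coordination between VPP agents and the distribution system operator, and the dynamic-pricing layer described in the paper are omitted for brevity.

    # Minimal sketch, not the authors' code: one actor-critic agent with an
    # experience-replay buffer, mirroring the ingredients named in the abstract.
    # PyTorch is assumed; all names, dimensions, and hyperparameters are illustrative.
    import random
    from collections import deque

    import torch
    import torch.nn as nn
    import torch.optim as optim


    class ReplayBuffer:
        """Fixed-size store of (state, action, reward, next_state) transitions."""
        def __init__(self, capacity=10_000):
            self.buffer = deque(maxlen=capacity)

        def push(self, s, a, r, s_next):
            self.buffer.append((s, a, r, s_next))

        def sample(self, batch_size):
            s, a, r, s_next = zip(*random.sample(self.buffer, batch_size))
            return (torch.stack(s), torch.stack(a),
                    torch.tensor(r, dtype=torch.float32).unsqueeze(1),
                    torch.stack(s_next))

        def __len__(self):
            return len(self.buffer)


    class VPPAgent:
        """Hypothetical VPP agent: the actor maps a local state (e.g. renewable
        forecast, load, price signal) to a continuous dispatch/bid action; the
        critic scores state-action pairs."""
        def __init__(self, state_dim=8, action_dim=2, gamma=0.99):
            self.gamma = gamma
            self.actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                       nn.Linear(64, action_dim), nn.Tanh())
            self.critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                                        nn.Linear(64, 1))
            self.actor_opt = optim.Adam(self.actor.parameters(), lr=1e-4)
            self.critic_opt = optim.Adam(self.critic.parameters(), lr=1e-3)
            self.buffer = ReplayBuffer()

        def act(self, state, noise_std=0.1):
            # Exploration noise on a deterministic policy; actions stay in [-1, 1].
            with torch.no_grad():
                a = self.actor(state)
                return (a + noise_std * torch.randn_like(a)).clamp(-1.0, 1.0)

        def update(self, batch_size=64):
            if len(self.buffer) < batch_size:
                return
            s, a, r, s_next = self.buffer.sample(batch_size)
            # Critic: regress towards a one-step TD target (no target networks here).
            with torch.no_grad():
                q_next = self.critic(torch.cat([s_next, self.actor(s_next)], dim=1))
                target = r + self.gamma * q_next
            critic_loss = nn.functional.mse_loss(
                self.critic(torch.cat([s, a], dim=1)), target)
            self.critic_opt.zero_grad()
            critic_loss.backward()
            self.critic_opt.step()
            # Actor: ascend the critic's estimate of the actor's own actions.
            actor_loss = -self.critic(torch.cat([s, self.actor(s)], dim=1)).mean()
            self.actor_opt.zero_grad()
            actor_loss.backward()
            self.actor_opt.step()

In the paper's setting, each VPP and the distribution system operator would presumably run an agent of this kind, with the operator's pricing action feeding into the states observed by the VPP agents.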

Keywords

Distributed energy management / Dynamic pricing / Multi-agent reinforcement learning / Renewable energy integration / Virtual power plants

Cite this article

Jian-Dong Yao, Wen-Bin Hao, Zhi-Gao Meng, Bo Xie, Jian-Hua Chen, Jia-Qi Wei. Adaptive multi-agent reinforcement learning for dynamic pricing and distributed energy management in virtual power plant networks. Journal of Electronic Science and Technology, 2025, 23(1): 100290. DOI: 10.1016/j.jnlest.2024.100290


CRediT authorship contribution statement

Jian-Dong Yao: Conceptualization, Formal analysis, Funding acquisition, Methodology, Writing – review & editing. Wen-Bin Hao: Data curation, Investigation, Resources, Writing – original draft. Zhi-Gao Meng: Formal analysis, Project administration, Writing – original draft. Bo Xie: Validation, Writing – original draft. Jian-Hua Chen: Software, Visualization, Writing – review & editing. Jia-Qi Wei: Software, Visualization, Writing – review & editing.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Jian-Dong Yao, Wen-Bin Hao, Zhi-Gao Meng, Bo Xie, Jian-Hua Chen, and Jia-Qi Wei are currently employed by State Grid Sichuan Electric Power Company Chengdu Power Supply Company, Chengdu, China, which also funds the research project.

Acknowledgement

This work was financially supported by the Science and Technology Project of State Grid Sichuan Electric Power Company Chengdu Power Supply Company under Grant No. 521904240005.

