Joint Optimization of Passenger Flow Control and Train Skip-Stopping for Overcrowded Metro Lines: A Multi-Agent Reinforcement Learning Approach
Siqiao Xing , Jianyuan Guo , Yong Qin , Limin Jia , Yueyue Wang , Shuning Jiang
Urban Rail Transit: 1-29.
During rush hours, the capacity of metro systems in megacities is insufficient to meet travel demand, resulting in oversaturation and high safety risks on station platforms, especially at transfer stations. This paper addresses the problem through the joint optimization of passenger flow control and train skip-stopping, aiming to alleviate passenger overloads while maintaining travel efficiency. To make the model more realistic, the stochastic characteristics of passengers are considered, including the probability distributions of passenger arrival times and of inbound and transfer walking times. To provide a high-quality solution to this heavily constrained model, three cooperative agents (governing passenger inflow, transfer flows, and the train skip-stopping mode) are built on an improved Double Deep Q-Network (IDDQN) to form a multi-agent reinforcement learning solution. Empirical validation on Beijing Metro Line 13 and the Changping Line demonstrates that the proposed multi-agent framework eliminates 100% of over-limit passenger flow while reducing the average passenger waiting time. It also significantly reduces the impact of the stochastic characteristics and accelerates convergence.
Urban rail transit / Multi-agent reinforcement learning / Train skip-stopping mode / Passenger flow control / Joint optimization
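The IDDQN agents described in the abstract build on the Double Deep Q-Network idea: the online network selects the next action while a separate target network evaluates it, which reduces the overestimation bias of standard Q-learning. A minimal sketch of that target computation is shown below; the function name and batch layout are illustrative, not taken from the paper.

```python
import numpy as np

def double_dqn_targets(rewards, next_q_online, next_q_target, gamma=0.99, dones=None):
    """Compute Double DQN regression targets for a batch of transitions.

    rewards:       shape (B,)   immediate rewards
    next_q_online: shape (B, A) online-network Q-values at the next state
    next_q_target: shape (B, A) target-network Q-values at the next state
    dones:         shape (B,)   True where the episode terminated
    """
    if dones is None:
        dones = np.zeros_like(rewards, dtype=bool)
    # Action selection by the online network ...
    best_actions = np.argmax(next_q_online, axis=1)
    # ... but action evaluation by the target network (the Double DQN decoupling)
    next_values = next_q_target[np.arange(len(rewards)), best_actions]
    # Terminal transitions contribute only the immediate reward
    return rewards + gamma * next_values * (~dones)

# Toy batch of 2 transitions with 3 actions
rewards = np.array([1.0, 0.5])
next_q_online = np.array([[0.2, 0.9, 0.1], [0.4, 0.3, 0.8]])
next_q_target = np.array([[0.3, 0.7, 0.2], [0.5, 0.1, 0.6]])
targets = double_dqn_targets(rewards, next_q_online, next_q_target,
                             gamma=0.9, dones=np.array([False, True]))
# → [1.63, 0.5]: online argmax picks action 1, target net values it at 0.7
```

In the multi-agent setting sketched in the abstract, each of the three agents would maintain its own pair of online and target networks over its own action space (inflow rates, transfer rates, or skip/stop decisions), with the shared simulation providing rewards.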