Condition-based maintenance via Markov decision processes: A review

Xiujie ZHAO, Piao CHEN, Loon Ching TANG

Front. Eng, 2025, Vol. 12, Issue (2): 330-342. DOI: 10.1007/s42524-024-4130-7
Industrial Engineering and Intelligent Manufacturing
REVIEW ARTICLE


Abstract

Optimizing condition-based maintenance (CBM) has become more challenging as monitoring technologies advance rapidly. Traditional CBM research has relied mainly on theory-driven approaches, which have produced several effective and widely applicable maintenance models. However, when the system reliability model becomes complex, such methods may lead to intractable cost models. The Markov decision process (MDP), a classic framework for sequential decision making, has drawn increasing attention for CBM optimization because of its tractability and its practical applicability across different problems. This paper reviews research that optimizes CBM policies using MDPs, with a focus on mathematical modeling and on optimization methods that enable action. The review is organized around several key components that are subject to similar mathematical modeling constraints, including system complexity, the availability of system conditions, and the diverse criteria of decision-makers. Growing interest has also been directed at CBM optimization for systems with increasing numbers of components and sensors. The review then turns to joint optimization problems involving CBM. Finally, as an important extension of traditional MDPs, reinforcement learning (RL) based methods for optimizing CBM policies are reviewed. The paper provides background for researchers and practitioners working in reliability and maintenance management and discusses possible directions for future research.

Keywords

reliability / degradation modeling / dynamic programming / reinforcement learning / sequential decision problems.

Cite this article

Xiujie ZHAO, Piao CHEN, Loon Ching TANG. Condition-based maintenance via Markov decision processes: A review. Front. Eng, 2025, 12(2): 330-342. DOI: 10.1007/s42524-024-4130-7

RIGHTS & PERMISSIONS

Higher Education Press
