PDF
Abstract
Piecewise-Deterministic Markov processes (PDMPs) are often used to model abrupt changes in the global environment or capabilities of a controlled system. This is typically done by considering a set of “operating modes” (each with its own system dynamics and performance metrics) and assuming that the mode can switch stochastically, while the system state evolves. Such models have a broad range of applications in engineering, economics, manufacturing, robotics, and biological sciences. Here, we introduce and analyze an “occasionally observed” version of mode-switching PDMPs. We show how such systems can be controlled optimally if the planner is not alerted to mode switches as they occur but may instead have access to infrequent mode observations. We first develop a general framework for handling this through dynamic programming on a higher dimensional mode-belief space. While quite general, this method is rarely practical due to the curse of dimensionality. We then discuss assumptions that allow for solving the same problem much more efficiently,with the computational costs growing linearly (rather than exponentially) with the number of modes. We use this approach to derive Hamilton-Jacobi-Bellman (HJB) PDEs and quasi-variational inequalities encoding the optimal behavior for a variety of planning horizons (fixed, infinite, indefinite, and random) and mode-observation schemes (at fixed times or on demand). We discuss the computational challenges associated with each version and illustrate the resulting methods on test problems from surveillance-evading path planning. We also include an example based on robotic navigation: a Mars rover that minimizes the expected time to target while accounting for the possibility of unobserved/incremental damages and dynamics-altering breakdowns.
Keywords
Piecewise-deterministic & switching processes
/
Partial observability
/
Stochastic optimal control
/
Hamilton-Jacobi-Bellman (HJB) PDEs
/
Path planning
/
49K45
/
49L20
/
60G35
/
49M25
/
60J25
Cite this article
Download citation ▾
Marissa Gee, Alexander Vladimirsky.
Occasionally Observed Piecewise-Deterministic Markov Processes.
Communications on Applied Mathematics and Computation 1-37 DOI:10.1007/s42967-025-00500-7
| [1] |
AkellaR, KumarP. Optimal control of production rate in a failure prone manufacturing system. IEEE Trans. Autom. Control, 1986, 31(2): 116-126.
|
| [2] |
AströmKJ. Optimal control of Markov decision processes with incomplete state estimation. J. Math. Anal. Appl., 1965, 10(1): 174-205.
|
| [3] |
Bardi, Martino, Capuzzo-Dolcetta, I.: Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations. Springer, New York (2008)
|
| [4] |
BarlesG, SouganidisPE. Convergence of approximation schemes for fully nonlinear second order equations. Asymptot. Anal., 1991, 4(3): 271-283
|
| [5] |
Basich, C., Peterson, J., Zilberstein, S.: Planning with intermittent state observability: knowing when to act blind. In: International Conference on Intelligent Robots and Systems (IROS), pp. 11657–11664. IEEE/RSJ (2022)
|
| [6] |
BäuerleN, LangeD. Optimal control of partially observable piecewise deterministic Markov processes. SIAM J. Control. Optim., 2018, 56(2): 1441-1462.
|
| [7] |
BenaïmM, LobryC. Lotka-Volterra with randomly fluctuating environments or “how switching between beneficial environments can make survival harder”. Ann. Appl. Probab., 2016, 26(6): 3754-3785.
|
| [8] |
BrandejskyA, de SaportaB, DufourF. Optimal stopping for partially observed piecewise-deterministic Markov processes. Stochas. Process. Appl., 2013, 123(8): 3201-3238.
|
| [9] |
BrowneCB, PowleyE, WhitehouseD, LucasSM, CowlingPI, RohlfshagenP, TavenerS, PerezD, SamothrakisS, ColtonS. A survey of Monte Carlo tree search methods. IEEE Trans. Comput. Intell. AI Games, 2012, 4(1): 1-43.
|
| [10] |
CarteeE, FarahA, NellisA, Van HookJ, VladimirskyA. Quantifying and managing uncertainty in piecewise-deterministic Markov processes. SIAM/ASA J. Uncertain. Quant., 2023, 11(3): 814-847.
|
| [11] |
CarteeE, VladimirskyA. Control-theoretic models of environmental crime. SIAM J. Appl. Math., 2020, 80(3): 1441-1466.
|
| [12] |
ChaconA, VladimirskyA. Fast two-scale methods for Eikonal equations. SIAM J. Sci. Comput., 2012, 34(2): A547-A578.
|
| [13] |
Christensen, P.R., Engle, E., Anwar, S., Dickenshied, S., Noss, D., Gorelick, N., Weiss-Malik, M.: JMARS — a planetary GIS. In: AGU Fall Meeting Abstracts, pp. IN22A–06 (2009)
|
| [14] |
CostaM. A piecewise deterministic model for a prey-predator community. Ann. Appl. Probab., 2016, 26(6): 3491-3530.
|
| [15] |
DavisM. Piecewise-deterministic Markov processes: a general class of non-diffusion stochastic models. J. R. Stat. Soc.: Ser. B (Methodol.), 1984, 46(3): 353-388.
|
| [16] |
Davis, M.H.A., Farid, M..: Piecewise-Deterministic Processes and Viscosity Solutions, pp. 249–268. Birkhäuser Boston, Boston, MA (1999)
|
| [17] |
Falcone, M., Ferretti, R.: Semi-Lagrangian Approximation Schemes for Linear and Hamilton-Jacobi Equations, volume 133. SIAM, Philadelphia (2014)
|
| [18] |
Gee, M., Gonzalez-Granda, N., Joshi, S., Rudrapatna, N., Somalwar, A., Ellner, S.P., Vladimirsky, A.: Navigating the Landscape of Fear. bioRxiv (2024)
|
| [19] |
GeeM, VladimirskyA. Optimal path-planning with random breakdowns. IEEE Control Syst. Lett., 2022, 6: 1658-1663.
|
| [20] |
HaurieA, MoresinoF. A stochastic control model of economic growth with environmental disaster prevention. Automatica, 2006, 42(8): 1417-1428.
|
| [21] |
HeningA, StricklerE. On a predator-prey system with random switching that never converges to its equilibrium. SIAM J. Math. Anal., 2019, 51(5): 3625-3640.
|
| [22] |
KaelblingLP, LittmanML, CassandraAR. Planning and acting in partially observable stochastic domains. Artif. Intell., 1998, 101(1): 99-134.
|
| [23] |
Kalman, R.E.: A new approach to linear filtering and prediction problems. J. Basic Eng. 82(1), 35–45 (1960)
|
| [24] |
KurniawatiH. Partially observable Markov decision processes and robotics. Ann. Rev. Control Robot. Autonom. Syst., 2022, 5(1): 253-277.
|
| [25] |
KushnerHJ, DupuisPGNumerical Methods for Stochastic Control Problems in Continuous Time, 1992, New York. Academic Press. .
|
| [26] |
Littman, M.L., Cassandra, A.R., Kaelbling, L.: Learning policies for partially observable environments: scaling up. In: Machine Learning Proceedings, pp. 362–370. Elsevier, Amsterdam (1995)
|
| [27] |
ObermanAM. Convergent difference schemes for degenerate elliptic and parabolic equations: Hamilton-Jacobi equations and free boundary problems. SIAM J. Numer. Anal., 2006, 44(2): 879-895.
|
| [28] |
Potter, S.F., Cameron, M.K.: Ordered line integral methods for solving the Eikonal equation. J. Sci. Comput. 81(3), 2010–2050 (2019)
|
| [29] |
RandsSA. Leaving safety to visit a feeding site: is it optimal to hesitate while exposed?. R. Soc. Open Sci., 2017, 41. 160910
|
| [30] |
SchälM. On piecewise deterministic Markov control processes: control of jumps and of risk processes in insurance. Insur. Math. Econ., 1998, 22(1): 75-91.
|
| [31] |
SethianJA. A fast marching level set method for monotonically advancing fronts. Proc. Natl. Acad. Sci., 1996, 93(4): 1591-1595.
|
| [32] |
SilverD, VenessJ. Monte-Carlo planning in large POMDPs. Adv. Neural. Inf. Process. Syst., 2010, 23: 2164-2172
|
| [33] |
TranK, YinG. Stochastic competitive Lotka-Volterra ecosystems under partial observation: feedback controls for permanence and extinction. J. Franklin Inst., 2014, 351(8): 4039-4064.
|
| [34] |
TsaiY-HR, ChengL-T, OsherS, ZhaoH-K. Fast sweeping algorithms for a class of Hamilton-Jacobi equations. SIAM J. Numer. Anal., 2003, 41(2): 673-694.
|
| [35] |
Vladimirsky, A., Zheng, C.: A fast implicit method for time-dependent Hamilton-Jacobi PDEs. arXiv:1306.3506 (2013)
|
| [36] |
YuL, ZhangQ, YinG. Asset allocation for regime-switching market models under partial observation. Dyn. Syst. Appl., 2014, 23(1): 39-62
|
Funding
Directorate for Mathematical and Physical Sciences(1645643 and 2111522)
Air Force Office of Scientific Research(FA9550-22-1-0528)
RIGHTS & PERMISSIONS
Shanghai University
Just Accepted
This article has successfully passed peer review and final editorial review, and will soon enter typesetting, proofreading and other publishing processes. The currently displayed version is the accepted final manuscript. The officially published version will be updated with format, DOI and citation information upon launch. We recommend that you pay attention to subsequent journal notifications and preferentially cite the officially published version. Thank you for your support and cooperation.