An average-value-at-risk criterion for Markov decision processes with unbounded costs
Qiuli LIU, Wai-Ki CHING, Junyu ZHANG, Hongchu WANG
We study Markov decision processes under the average-value-at-risk criterion. The state space and the action space are Borel spaces, the costs are allowed to be unbounded from above, and the discount factors are state-action dependent. Under suitable conditions, we establish the existence of optimal deterministic stationary policies. Furthermore, we apply our main results to a cash-balance model.
Markov decision processes / average-value-at-risk (AVaR) / state-action dependent discount factors / optimal policy
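For readers unfamiliar with the AVaR criterion, the following is a minimal sketch of how AVaR can be estimated from a finite sample of costs. It assumes the standard Rockafellar-Uryasev representation, AVaR_tau(X) = min over s of { s + E[(X - s)^+] / (1 - tau) }, which may differ in notation from the definition used in the paper itself. The function name `empirical_avar` and the sample data are hypothetical and serve only as illustration.

```python
from typing import Iterable


def empirical_avar(costs: Iterable[float], tau: float) -> float:
    """Estimate AVaR at level tau from a sample of costs.

    Assumption: uses the Rockafellar-Uryasev representation
        AVaR_tau(X) = min_s { s + E[(X - s)^+] / (1 - tau) },
    with the expectation replaced by the sample average.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    xs = list(costs)
    if not xs:
        raise ValueError("costs must be non-empty")
    n = len(xs)

    def objective(s: float) -> float:
        # s plus the scaled average excess of the costs over s
        excess = sum(max(x - s, 0.0) for x in xs)
        return s + excess / ((1.0 - tau) * n)

    # The objective is convex and piecewise linear in s with kinks at the
    # sample points, so its minimum is attained at one of them.
    return min(objective(s) for s in xs)


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0]  # hypothetical total discounted costs
    print(empirical_avar(sample, 0.5))
```

With four equally likely costs and tau = 0.5, the estimate equals the average of the two largest costs, which illustrates that AVaR penalizes the upper tail of the cost distribution rather than its mean.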