Efficient algorithms for Risk-Sensitive Markov Decision Processes with limited budget

Abstract

We tackle the problem of finding optimal policies for Markov Decision Processes (MDPs) that minimize the probability of the cumulative cost exceeding a given budget. This task falls under the umbrella of Risk-Sensitive Markov Decision Processes, which optimize a non-additive, non-linear function of the cumulative cost that incorporates the user's attitude towards risk. Current algorithms for solving this task, for any budget less than or equal to a user-defined budget, scale poorly when the support of the cost function is large, since they operate in an augmented state space that enumerates all possible remaining budgets. To circumvent this issue, we develop (i) an improved version of the Topological Value Iteration with Dynamic Programming algorithm (tvi-dp), and (ii) the first symbolic dynamic programming algorithm for this class of problems, called rs-spudd, which exploits conditional independence in the transition function of the augmented state space. The proposed algorithms improve efficiency by pruning irrelevant states and terminating early, without sacrificing optimality. Empirical results show that rs-spudd is able to solve problems up to 10³ times larger than tvi-dp.
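For concreteness, the budget-augmented formulation the abstract alludes to can be sketched as follows. This is a minimal sketch of the standard construction for goal-directed MDPs with strictly positive costs; the symbols V, P, C, and θ are illustrative and not necessarily the paper's notation. Given an initial state and budget θ, one seeks a policy minimizing the probability that the accumulated cost exceeds θ, which leads to a recursion over augmented states (s, θ):

% Hedged sketch: V*(s, theta) is the minimal probability of exceeding
% the remaining budget theta when acting from state s.
\[
V^{*}(s,\theta) =
\begin{cases}
1 & \text{if } \theta < 0,\\[2pt]
0 & \text{if } s \text{ is a goal state and } \theta \ge 0,\\[2pt]
\displaystyle \min_{a}\sum_{s'} P(s' \mid s, a)\, V^{*}\!\bigl(s',\, \theta - C(s,a)\bigr) & \text{otherwise.}
\end{cases}
\]

The recursion makes explicit why naive algorithms scale poorly: the augmented space contains one copy of the state space per reachable remaining budget, so its size grows with the support of the cost function.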

Publication
International Journal of Approximate Reasoning