WIAS Preprint No. 2977 (2022)

Risk-sensitive partially observable Markov decision processes as fully observable multivariate utility optimization problems



Authors

  • Afsardeir, Arsham
  • Kapetanis, Andreas
  • Laschos, Vaios
    ORCID: 0000-0001-8721-5335
  • Obermayer, Klaus

2020 Mathematics Subject Classification

  • 93E20

Keywords

  • Markov decision processes, partial observability, risk sensitivity, utility function, sums of exponentials

DOI

10.20347/WIAS.PREPRINT.2977

Abstract

We provide a new algorithm for solving risk-sensitive partially observable Markov decision processes when the risk is modeled by a utility function and both the state space and the space of observations are finite. The algorithm is based on the observation that the change of measure and the subsequent introduction of the information space, which are used for exponential utility functions, can be extended to sums of exponentials if one introduces an extra vector parameter that tracks the expected accumulated cost corresponding to each exponential. Since every increasing function can be approximated by sums of exponentials on finite intervals, the method can be applied to essentially any utility function, with its complexity depending on the number
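
The approximation claim in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' algorithm: it fixes a small set of exponential rates and fits the weights by ordinary least squares on a finite interval. The function names, the chosen rates, the example utility, and the interval [-1, 1] are all assumptions made for illustration; the preprint may use a different fitting procedure and may impose constraints (for example on the signs of the weights) that this sketch ignores.

```python
import numpy as np

def fit_sum_of_exponentials(utility, lo, hi, rates, n_grid=200):
    """Least-squares weights w with sum_i w[i] * exp(rates[i] * x) ~ utility(x) on [lo, hi].

    The rates are fixed in advance (an assumption of this sketch), so the
    fit is a linear least-squares problem in the weights only.
    """
    x = np.linspace(lo, hi, n_grid)
    basis = np.exp(np.outer(x, rates))          # shape (n_grid, len(rates))
    weights, *_ = np.linalg.lstsq(basis, utility(x), rcond=None)
    return weights

def evaluate(weights, rates, x):
    """Evaluate the fitted sum of exponentials at the points x."""
    return np.exp(np.outer(np.atleast_1d(x), rates)) @ weights

# Hypothetical setup: five fixed rates and an increasing utility on [-1, 1].
rates = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
utility = lambda x: x + 0.1 * x**3

weights = fit_sum_of_exponentials(utility, -1.0, 1.0, rates)

# Worst-case absolute error on a separate evaluation grid.
grid = np.linspace(-1.0, 1.0, 50)
max_err = np.max(np.abs(evaluate(weights, rates, grid) - utility(grid)))
print("weights:", weights)
print("max abs error on [-1, 1]:", max_err)
```

In this sketch the number of exponentials is the length of `rates`; adding more rates generally tightens the fit on the interval at the cost of a longer weight vector.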
