TY - GEN
T1 - On occupation measures for total-reward MDPs
AU - Denardo, Eric V.
AU - Feinberg, Eugene A.
AU - Rothblum, Uriel G.
PY - 2008
Y1 - 2008
N2 - This paper is based on our recent contribution [3], which studies Markov Decision Processes (MDPs) with Borel state and action spaces and with expected total rewards. The initial state distribution is fixed. According to [3], for a given randomized stationary policy, its occupation measure can be expressed as a convex combination of occupation measures for simpler policies. If this is possible for a given policy, we say that the policy can be split. In particular, we are interested in splitting a randomized stationary policy into (nonrandomized) stationary policies or into randomized stationary policies that are nonrandomized on a given subset of states. Though [3] studies Borel-state MDPs with expected total rewards, some of its results are new for discounted MDPs with finite state and action spaces. This paper focuses on these results.
AB - This paper is based on our recent contribution [3], which studies Markov Decision Processes (MDPs) with Borel state and action spaces and with expected total rewards. The initial state distribution is fixed. According to [3], for a given randomized stationary policy, its occupation measure can be expressed as a convex combination of occupation measures for simpler policies. If this is possible for a given policy, we say that the policy can be split. In particular, we are interested in splitting a randomized stationary policy into (nonrandomized) stationary policies or into randomized stationary policies that are nonrandomized on a given subset of states. Though [3] studies Borel-state MDPs with expected total rewards, some of its results are new for discounted MDPs with finite state and action spaces. This paper focuses on these results.
UR - https://www.scopus.com/pages/publications/62949182216
U2 - 10.1109/CDC.2008.4739426
DO - 10.1109/CDC.2008.4739426
M3 - Conference contribution
AN - SCOPUS:62949182216
SN - 9781424431243
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 4460
EP - 4465
BT - Proceedings of the 47th IEEE Conference on Decision and Control, CDC 2008
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 47th IEEE Conference on Decision and Control, CDC 2008
Y2 - 9 December 2008 through 11 December 2008
ER -