
On occupation measures for total-reward MDPs

  • Yale University
  • Technion-Israel Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

3 Scopus citations

Abstract

This paper is based on our recent contribution [3], which studies Markov Decision Processes (MDPs) with Borel state and action spaces and with expected total rewards. The initial state distribution is fixed. According to [3], the occupation measure of a given randomized stationary policy can, under certain conditions, be represented as a convex combination of occupation measures for simpler policies. If this is possible for a given policy, we say that the policy can be split. In particular, we are interested in splitting a randomized stationary policy into (nonrandomized) stationary policies or into randomized stationary policies that are nonrandomized on a given subset of states. Though [3] studies Borel-state MDPs with expected total rewards, some of its results are new for discounted MDPs with finite state and action sets. This paper focuses on these results.
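
To make the notion of splitting concrete, the following minimal sketch works through a toy discounted MDP with two states and two actions. It is not taken from the paper: the transition probabilities, the discount factor, the initial distribution, and the name `occupation_measure` are all hypothetical choices made for illustration. The policy randomizes only in state 0, and the sketch checks, in exact rational arithmetic, that its occupation measure is a convex combination of the occupation measures of the two nonrandomized stationary policies that differ only in state 0.

```python
from fractions import Fraction as F

# Hypothetical toy model (not from the paper): 2 states, 2 actions.
GAMMA = F(1, 2)        # discount factor
MU = (F(1), F(0))      # fixed initial state distribution
# P[s][a] is the distribution of the next state after action a in state s.
P = {
    0: {0: (F(1, 2), F(1, 2)), 1: (F(0), F(1))},
    1: {0: (F(1), F(0)), 1: (F(0), F(1))},
}


def occupation_measure(policy):
    """Discounted occupation measure of a stationary policy.

    policy[s][a] is the probability of choosing action a in state s.
    Returns x[(s, a)], the expected discounted number of visits to (s, a).
    """
    # Transition matrix induced by the policy.
    p_pi = [[sum(policy[s][a] * P[s][a][t] for a in (0, 1)) for t in (0, 1)]
            for s in (0, 1)]
    # Solve nu (I - GAMMA * P_pi) = MU in closed form for the 2x2 case.
    a, b = 1 - GAMMA * p_pi[0][0], -GAMMA * p_pi[0][1]
    c, d = -GAMMA * p_pi[1][0], 1 - GAMMA * p_pi[1][1]
    det = a * d - b * c
    nu = ((MU[0] * d - MU[1] * c) / det, (MU[1] * a - MU[0] * b) / det)
    return {(s, act): nu[s] * policy[s][act] for s in (0, 1) for act in (0, 1)}


# Two nonrandomized stationary policies that differ only in state 0.
phi0 = [[F(1), F(0)], [F(1), F(0)]]
phi1 = [[F(0), F(1)], [F(1), F(0)]]
# A randomized stationary policy that mixes the two actions in state 0.
pi = [[F(1, 2), F(1, 2)], [F(1), F(0)]]

x_pi = occupation_measure(pi)
x0 = occupation_measure(phi0)
x1 = occupation_measure(phi1)

# Only phi0 uses the pair (0, 0), so that coordinate determines the weight.
alpha = x_pi[(0, 0)] / x0[(0, 0)]
is_split = all(x_pi[k] == alpha * x0[k] + (1 - alpha) * x1[k] for k in x_pi)
print(alpha, is_split)  # prints: 5/11 True
```

In this toy instance the mixing weight is 5/11, while the policy picks action 0 in state 0 with probability 1/2, so the weights of the convex combination need not coincide with the randomization probabilities.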

Original language: English
Title of host publication: Proceedings of the 47th IEEE Conference on Decision and Control, CDC 2008
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4460-4465
Number of pages: 6
ISBN (Print): 9781424431243
State: Published - 2008
Event: 47th IEEE Conference on Decision and Control, CDC 2008 - Cancun, Mexico
Duration: Dec 9 2008 to Dec 11 2008

Publication series

Name: Proceedings of the IEEE Conference on Decision and Control
ISSN (Print): 0743-1546
ISSN (Electronic): 2576-2370

Conference

Conference: 47th IEEE Conference on Decision and Control, CDC 2008
Country/Territory: Mexico
City: Cancun
Period: 12/9/08 to 12/11/08
