Abstract
This article studies convergence properties of optimal values and actions for discounted and average-cost Markov decision processes (MDPs) with weakly continuous transition probabilities and applies these properties to the stochastic periodic-review inventory control problem with backorders, positive setup costs, and convex holding/backordering costs. The following results are established for MDPs with possibly non-compact action sets and unbounded cost functions: (i) convergence of value iterations to optimal values for discounted problems with possibly non-zero terminal costs, (ii) convergence of optimal finite-horizon actions to optimal infinite-horizon actions for total discounted costs, as the time horizon tends to infinity, and (iii) convergence of optimal discounted-cost actions to optimal average-cost actions for infinite-horizon problems, as the discount factor tends to 1. Applied to the setup-cost inventory control problem, the general results on MDPs imply the optimality of (s, S) policies and convergence properties of optimal thresholds. In particular, this article analyzes the setup-cost inventory control problem without two assumptions often used in the literature: (a) the demand is either discrete or continuous or (b) the backordering cost is higher than the cost of backordered inventory if the amount of backordered inventory is large.
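As a rough illustration of result (i) and of the (s, S) structure mentioned in the abstract, the following minimal sketch runs discounted value iteration on a small, truncated, integer-valued inventory model. It is not the article's method or setting: the finite state range `lo..hi`, the demand distribution, the cost parameters, and the names `value_iteration` and `thresholds` are all assumptions made only for this example.

```python
def value_iteration(K=5.0, c=1.0, hold=1.0, back=4.0, alpha=0.9,
                    demand_pmf=(0.1, 0.2, 0.3, 0.2, 0.2),
                    lo=-20, hi=30, tol=1e-9, max_iter=10_000):
    """Discounted value iteration for a toy setup-cost inventory model.

    All parameters are hypothetical: K is the setup cost, c the unit ordering
    cost, hold/back the per-unit holding/backordering costs, alpha the
    discount factor, and demand_pmf[d] the probability of demand d.
    Inventory levels are truncated to the integers lo..hi.
    """
    states = range(lo, hi + 1)
    v = {x: 0.0 for x in states}  # zero terminal cost

    def stage(z):
        # Convex holding/backordering cost of the end-of-period level z.
        return hold * max(z, 0) + back * max(-z, 0)

    policy = {}
    for _ in range(max_iter):
        # G[y]: expected one-period cost plus discounted continuation value
        # when the inventory level after ordering is y.
        G = {y: sum(p * (stage(y - d) + alpha * v[max(y - d, lo)])
                    for d, p in enumerate(demand_pmf))
             for y in states}
        H = {y: c * y + G[y] for y in states}

        new_v, policy = {}, {}
        for x in states:
            best_cost, best_y = G[x], x  # option 1: do not order
            if x < hi:
                # option 2: pay K and order up to the best level above x
                y_star = min(range(x + 1, hi + 1), key=H.__getitem__)
                order_cost = K + H[y_star] - c * x
                if order_cost < best_cost:
                    best_cost, best_y = order_cost, y_star
            new_v[x], policy[x] = best_cost, best_y

        diff = max(abs(new_v[x] - v[x]) for x in states)
        v = new_v
        if diff < tol:
            break
    return v, policy


def thresholds(policy):
    """Read off (s, S): s is the largest level at which an order is placed,
    S is the order-up-to level used there."""
    s = max(x for x, y in policy.items() if y > x)
    return s, policy[s]


if __name__ == "__main__":
    v, policy = value_iteration()
    print("(s, S) =", thresholds(policy))
```

With these toy parameters one would expect the computed policy to order up to a single level S whenever the inventory is at or below a threshold s, and not to order otherwise. The truncation to a finite grid is a numerical convenience only; the article itself treats possibly non-compact action sets and unbounded costs.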
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 619-637 |
| Number of pages | 19 |
| Journal | Naval Research Logistics |
| Volume | 65 |
| Issue number | 8 |
| State | Published - Dec 2018 |
Keywords
- average cost per unit time
- inventory control
- Markov decision process
- optimal policy
- optimality inequality
Fingerprint
Dive into the research topics of 'On the convergence of optimal actions for Markov decision processes and the optimality of (s, S) inventory policies'. Together they form a unique fingerprint.