
A Q-learning algorithm for Markov decision processes with continuous state spaces

  • Jiaqiao Hu
  • Xiangyu Yang
  • Jian Qiang Hu
  • Yijie Peng
  • Shandong University
  • Fudan University
  • Peking University

Research output: Contribution to journal, article, peer-reviewed

4 Scopus citations

Abstract

We propose an online algorithm for solving a class of continuous-state Markov decision processes. The algorithm combines classical Q-learning with an asynchronous averaging procedure, which allows Q-function estimates at sampled state–action pairs to be adaptively updated based on observations collected along a single sample trajectory. These estimates are then used to iteratively construct an interpolation-based function approximator of the Q-function. We prove the convergence of the algorithm and provide numerical results to illustrate its performance.
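
The ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy one-dimensional MDP, the uniform grid of sampled states, the nearest-grid-point update rule, the 1/n step size, the uniformly random behavior policy, and the name `q_learning_interp` are all assumptions made for illustration. The sketch only shows the general pattern of updating Q-estimates at sampled state–action pairs along a single trajectory and interpolating between them to evaluate the Q-function at arbitrary states.

```python
import numpy as np

def q_learning_interp(n_steps=20000, n_grid=11, gamma=0.9, seed=0):
    """Toy Q-learning on the state space [0, 1] with an interpolated Q-function.

    Hypothetical setup (not from the paper): two actions shift the state by
    -0.1 or +0.1 plus small Gaussian noise, and the reward penalizes the
    squared distance of the next state from 0.5.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, n_grid)      # sampled states (assumed uniform grid)
    actions = np.array([-0.1, 0.1])
    Q = np.zeros((n_grid, len(actions)))      # Q-estimates at grid points
    visits = np.zeros_like(Q, dtype=int)      # per-pair visit counters

    def q_interp(s):
        # Piecewise-linear interpolation of each action's Q-values at state s.
        return np.array([np.interp(s, grid, Q[:, a]) for a in range(len(actions))])

    s = rng.uniform()
    for _ in range(n_steps):
        a = rng.integers(len(actions))        # uniformly random exploration
        s_next = float(np.clip(s + actions[a] + 0.02 * rng.normal(), 0.0, 1.0))
        r = -(s_next - 0.5) ** 2

        # Asynchronous update: only the grid point nearest the visited state
        # is changed, with a 1/n step size that averages the observed targets.
        i = int(np.argmin(np.abs(grid - s)))
        visits[i, a] += 1
        target = r + gamma * q_interp(s_next).max()
        Q[i, a] += (target - Q[i, a]) / visits[i, a]
        s = s_next

    return grid, Q, q_interp


grid, Q, q_interp = q_learning_interp()
greedy_low = int(np.argmax(q_interp(0.1)))    # greedy action index at s = 0.1
greedy_high = int(np.argmax(q_interp(0.9)))   # greedy action index at s = 0.9
```

In this toy problem the reward favors states near 0.5, so one would expect the greedy action to move toward the center from either side. The convergence guarantees stated in the abstract apply to the authors' algorithm under their assumptions, not to this sketch.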

Original language: English
Article number: 105782
Journal: Systems and Control Letters
Volume: 187
State: Published - May 2024

Keywords

  • Markov processes
  • Optimization algorithms
  • Statistical learning
  • Stochastic optimal control
