Abstract
In a work cell served by material-handling gantries, gantry movements constrain the cell's production. Because real-time gantry scheduling and material flow are tightly coupled, modeling the gantry work cell is challenging. In this paper, we formulate real-time gantry scheduling as a reinforcement learning problem and solve it with the Q-learning algorithm. The definition of the reward function is instrumental in building the learning model. To study the learning performance of Q-learning, we run simulation experiments with five reward functions, each based on a different understanding of the production system. The experiments show that learning performance varies with the reward function, and that only the reward reflecting a better understanding of the system outperforms the others. The results further validate the effectiveness and practicality of the theories and conclusions drawn from systematic analyses of the gantry work cell.
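To make the role of the reward function concrete, the following is a minimal sketch of tabular Q-learning with a pluggable reward, not the model used in the paper. The two-machine toy cell, the state encoding (buffer levels), the action set (which machine the gantry loads), and every name here (`step`, `reward_running`, `reward_constant`, `q_learning`, `CAP`, `ACTIONS`) are assumptions made purely for illustration.

```python
import random
from collections import defaultdict

CAP = 3            # hypothetical buffer capacity per machine
ACTIONS = [0, 1]   # hypothetical actions: index of the machine the gantry loads


def step(state, action):
    """Toy dynamics: load 2 parts onto one machine, then each machine consumes 1."""
    loaded = list(state)
    loaded[action] = min(CAP, loaded[action] + 2)
    return tuple(max(0, b - 1) for b in loaded)


def reward_running(state, action, next_state):
    """Reward = number of machines that are not starved during this step."""
    return sum(1 for i, b in enumerate(state) if b > 0 or i == action)


def reward_constant(state, action, next_state):
    """An uninformative reward: every gantry move is worth the same."""
    return 1


def q_learning(step_fn, reward_fn, actions, start_state, episodes=200,
               horizon=50, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q[(state, action)], implicitly 0.0
    for _ in range(episodes):
        state = start_state
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            next_state = step_fn(state, action)
            reward = reward_fn(state, action, next_state)
            best_next = max(q[(next_state, a)] for a in actions)
            # Standard Q-learning update toward the one-step target.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = next_state
    return q


if __name__ == "__main__":
    for fn in (reward_running, reward_constant):
        q = q_learning(step, fn, ACTIONS, start_state=(0, 3))
        print(fn.__name__, {a: round(q[((0, 3), a)], 2) for a in ACTIONS})
```

Only `reward_fn` changes between the two runs, so any difference in the learned Q-values comes from the reward definition alone. Under these toy assumptions, the constant reward gives the learner no signal about machine starvation, whereas `reward_running` ties the reward to production.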
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 1-8 |
| Number of pages | 8 |
| Journal | Journal of Manufacturing Systems |
| Volume | 50 |
| State | Published - Jan 2019 |
Keywords
- Gantry scheduling
- Q-learning
- Reinforcement learning
- Reward function
Title: Simulation study on reward function of reinforcement learning in gantry work cell scheduling