Abstract
We consider the problem of multi-product dynamic pricing with demand learning and propose a nonparametric online learning algorithm based on the simultaneous perturbation stochastic approximation (SPSA) method. The algorithm uses only two price experimentations at each iteration, regardless of problem dimension, and could be especially efficient for solving high-dimensional problems. Under moderate conditions, we prove that the price estimates converge in mean-squared error (MSE) to the optimal price. Furthermore, we show that by suitably choosing input parameters, our algorithm achieves an expected cumulative regret of order O(√T) over T periods, which is the best possible growth rate in the literature. The exact constants in the rate can be identified explicitly. We investigate the extensions of the algorithm to application scenarios characterized by non-stationary demand and inventory constraints. Simulation experiments reveal that our algorithm is effective for a range of test problems and performs favorably compared with a recently proposed alternative method for high-dimensional problems.
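The two-experiment idea can be illustrated with a minimal sketch of generic SPSA applied to revenue maximization. This is not the paper's algorithm or its parameter choices: the linear demand model, the noise level, the gain sequences `a / k` and `c / k**0.25`, the price bounds, and all function names are assumptions made only for this illustration. Each iteration perturbs all prices at once along a random ±1 direction, observes revenue at the two perturbed price vectors, and uses the difference as a gradient estimate.

```python
import random

# Hypothetical two-product linear demand d(p) = A + B p + noise (an assumption,
# used only so the sketch has something to learn; the method itself is nonparametric).
A = [10.0, 8.0]
B = [[-2.0, 0.5], [0.3, -1.5]]

def revenue(prices, rng):
    """Noisy observed revenue p . d(p) from one price experiment."""
    total = 0.0
    for i, p_i in enumerate(prices):
        d_i = A[i] + sum(B[i][j] * prices[j] for j in range(len(prices)))
        d_i += rng.gauss(0.0, 0.1)  # demand noise (assumed Gaussian)
        total += p_i * d_i
    return total

def spsa_pricing(p0, n_iters, rng, a=0.3, c=0.5, lo=0.5, hi=10.0):
    """Generic SPSA ascent on revenue with two price experiments per iteration."""
    p = list(p0)
    for k in range(1, n_iters + 1):
        a_k = a / k            # step size (assumed schedule)
        c_k = c / k ** 0.25    # perturbation size (assumed schedule)
        # One random +/-1 direction perturbs every price simultaneously.
        delta = [rng.choice((-1.0, 1.0)) for _ in p]
        p_plus = [p_i + c_k * d_i for p_i, d_i in zip(p, delta)]
        p_minus = [p_i - c_k * d_i for p_i, d_i in zip(p, delta)]
        # Exactly two revenue observations, whatever the number of products.
        diff = revenue(p_plus, rng) - revenue(p_minus, rng)
        grad = [diff / (2.0 * c_k * d_i) for d_i in delta]
        # Gradient ascent step, projected back onto the price box [lo, hi].
        p = [min(hi, max(lo, p_i + a_k * g_i)) for p_i, g_i in zip(p, grad)]
    return p

p_hat = spsa_pricing([1.0, 1.0], 5000, random.Random(0))
print(p_hat)
```

For this assumed demand model the revenue is a concave quadratic, so the maximizer solves A + (B + Bᵀ)p = 0, which gives roughly (3.20, 3.52); the estimate can be compared against that value. The cost per iteration stays at two experiments as the number of products grows, which is the property the abstract highlights for high-dimensional problems.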
| Original language | English |
|---|---|
| Pages (from-to) | 191-205 |
| Number of pages | 15 |
| Journal | European Journal of Operational Research |
| Volume | 319 |
| Issue number | 1 |
| State | Published - Nov 16 2024 |
Keywords
- Decision analysis
- Dynamic pricing with demand learning
- Online learning
- Revenue management
- Simultaneous perturbation stochastic approximation (SPSA)
Fingerprint
Dive into the research topics of 'Nonparametric multi-product dynamic pricing with demand learning via simultaneous price perturbation'. Together they form a unique fingerprint.