Skip to main navigation Skip to search Skip to main content

Decentralized Stochastic Multi-Player Multi-Armed Walking Bandits

  • State University of New York Binghamton University

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

2 Scopus citations

Abstract

Multi-player multi-armed bandit is an increasingly relevant decision-making problem, motivated by applications to cognitive radio systems. Most research for this problem focuses exclusively on the settings that players have full access to all arms and receive no reward when pulling the same arm. Hence all players solve the same bandit problem with the goal of maximizing their cumulative reward. However, these settings neglect several important factors in many real-world applications, where players have limited access to a dynamic local subset of arms (i.e., an arm could sometimes be “walking” and not accessible to the player). To this end, this paper proposes a multi-player multi-armed walking bandits model, aiming to address aforementioned modeling issues. The goal now is to maximize the reward, however, players can only pull arms from the local subset and only collect a full reward if no other players pull the same arm. We adopt Upper Confidence Bound (UCB) to deal with the exploration-exploitation tradeoff and employ distributed optimization techniques to properly handle collisions. By carefully integrating these two techniques, we propose a decentralized algorithm with near-optimal guarantee on the regret, and can be easily implemented to obtain competitive empirical performance.

Original languageEnglish
Title of host publicationAAAI-23 Technical Tracks 9
EditorsBrian Williams, Yiling Chen, Jennifer Neville
PublisherAAAI Press
Pages10528-10536
Number of pages9
ISBN (Electronic)9781577358800
DOIs
StatePublished - Jun 27 2023
Event37th AAAI Conference on Artificial Intelligence, AAAI 2023 - Washington, United States
Duration: Feb 7 2023Feb 14 2023

Publication series

NameProceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023
Volume37

Conference

Conference37th AAAI Conference on Artificial Intelligence, AAAI 2023
Country/TerritoryUnited States
CityWashington
Period02/7/2302/14/23

Fingerprint

Dive into the research topics of 'Decentralized Stochastic Multi-Player Multi-Armed Walking Bandits'. Together they form a unique fingerprint.

Cite this