TY - GEN
T1 - Federated Bandit
T2 - 2021 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2021
AU - Zhu, Zhaowei
AU - Zhu, Jingxuan
AU - Liu, Ji
AU - Liu, Yang
N1 - Publisher Copyright:
© 2021 Owner/Author.
PY - 2021/5/31
Y1 - 2021/5/31
N2 - We study Federated Bandit, a decentralized Multi-Armed Bandit (MAB) problem with a set of N agents, who can only communicate their local data with neighbors described by a connected graph G. Each agent makes a sequence of decisions on selecting an arm from M candidates, yet they only have access to local and potentially biased feedback/evaluation of the true reward for each action taken. Learning only locally will lead agents to sub-optimal actions while converging to a no-regret strategy requires a collection of distributed data. Motivated by the proposal of federated learning, we aim for a solution with which agents will never share their local observations with a central entity, and will be allowed to only share a private copy of his/her own information with their neighbors. We first propose a decentralized bandit algorithm GossipUCB, which is a coupling of variants of both the classical gossiping algorithm and the celebrated Upper Confidence Bound (UCB) bandit algorithm. We show that GossipUCB successfully adapts local bandit learning into a global gossiping process for sharing information among connected agents, and achieves guaranteed regret at the order of O(max(poly(N,M) log T, poly(N,M) logλ2-1 N)) for all N agents, where λ2(0,1) is the second largest eigenvalue of the expected gossip matrix, which is a function of G. We then propose FedUCB, a differentially private version of GossipUCB, in which the agents preserve ϵ-differential privacy of their local data while achieving O(max poly(N,M)/ϵ log2.5 T, poly(N,M) (logλ2-1 N + log T)) regret.
AB - We study Federated Bandit, a decentralized Multi-Armed Bandit (MAB) problem with a set of N agents, who can only communicate their local data with neighbors described by a connected graph G. Each agent makes a sequence of decisions on selecting an arm from M candidates, yet they only have access to local and potentially biased feedback/evaluation of the true reward for each action taken. Learning only locally will lead agents to sub-optimal actions while converging to a no-regret strategy requires a collection of distributed data. Motivated by the proposal of federated learning, we aim for a solution with which agents will never share their local observations with a central entity, and will be allowed to only share a private copy of his/her own information with their neighbors. We first propose a decentralized bandit algorithm GossipUCB, which is a coupling of variants of both the classical gossiping algorithm and the celebrated Upper Confidence Bound (UCB) bandit algorithm. We show that GossipUCB successfully adapts local bandit learning into a global gossiping process for sharing information among connected agents, and achieves guaranteed regret at the order of O(max(poly(N,M) log T, poly(N,M) logλ2-1 N)) for all N agents, where λ2(0,1) is the second largest eigenvalue of the expected gossip matrix, which is a function of G. We then propose FedUCB, a differentially private version of GossipUCB, in which the agents preserve ϵ-differential privacy of their local data while achieving O(max poly(N,M)/ϵ log2.5 T, poly(N,M) (logλ2-1 N + log T)) regret.
KW - decentralized multi-armed bandit
KW - differential privacy
KW - federated learning
KW - heterogeneous rewards
UR - https://www.scopus.com/pages/publications/85104919836
U2 - 10.1145/3410220.3453919
DO - 10.1145/3410220.3453919
M3 - Conference contribution
AN - SCOPUS:85104919836
T3 - SIGMETRICS 2021 - Abstract Proceedings of the 2021 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems
SP - 3
EP - 4
BT - SIGMETRICS 2021 - Abstract Proceedings of the 2021 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems
PB - Association for Computing Machinery, Inc
Y2 - 14 June 2021 through 18 June 2021
ER -