arxiv:2301.10500

Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning

Published on Jan 25, 2023

Abstract

We propose Banker Online Mirror Descent (Banker-OMD), a novel framework generalizing the classical Online Mirror Descent (OMD) technique in the online learning literature. The Banker-OMD framework almost completely decouples feedback delay handling from task-specific OMD algorithm design, thus facilitating the design of new algorithms capable of handling feedback delays efficiently and robustly. Specifically, it offers a general methodology for achieving $\tilde{\mathcal O}(\sqrt{T} + \sqrt{D})$-style regret bounds in online bandit learning tasks with delayed feedback, where $T$ is the number of rounds and $D$ is the total feedback delay. We demonstrate the power of Banker-OMD by applying it to two important bandit learning scenarios with delayed feedback: delayed scale-free adversarial Multi-Armed Bandits (MAB) and delayed adversarial linear bandits. Banker-OMD leads to the first delayed scale-free adversarial MAB algorithm achieving $\tilde{\mathcal O}(\sqrt{K} L (\sqrt{T} + \sqrt{D}))$ regret and the first delayed adversarial linear bandit algorithm achieving $\tilde{\mathcal O}(\mathrm{poly}(n)(\sqrt{T} + \sqrt{D}))$ regret. As a corollary, the first application also implies $\tilde{\mathcal O}(\sqrt{KT} L)$ regret for non-delayed scale-free adversarial MABs, which is the first to match the $\Omega(\sqrt{KT} L)$ lower bound up to logarithmic factors and can be of independent interest.
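To make the delayed-feedback setting concrete, the following is a minimal sketch of the interaction protocol, not of Banker-OMD itself. It runs plain EXP3 (exponential weights with importance-weighted loss estimates) and applies each estimate only once its feedback arrives. The function name `delayed_exp3`, the fixed learning rate `eta`, the assumption that losses lie in [0, 1], and the convention that feedback from round t arrives at the end of round t + delays[t] are all illustrative assumptions rather than details taken from the paper.

```python
import math
import random


def delayed_exp3(losses, delays, eta, seed=0):
    """Plain EXP3 baseline with delayed feedback (illustrative, not Banker-OMD).

    losses[t][a] in [0, 1] is the loss of arm a at round t (hidden from the learner).
    delays[t] >= 0: feedback of round t arrives at the end of round t + delays[t].
    Returns (total loss incurred, total delay D = sum of all delays).
    """
    rng = random.Random(seed)
    T, K = len(losses), len(losses[0])
    cum_est = [0.0] * K   # cumulative importance-weighted loss estimates
    pending = {}          # arrival round -> list of (arm, loss, sampling probability)
    total_loss = 0.0

    for t in range(T):
        # Exponential weights over the estimates received so far
        # (shifted by the minimum for numerical stability).
        m = min(cum_est)
        w = [math.exp(-eta * (c - m)) for c in cum_est]
        s = sum(w)
        p = [x / s for x in w]

        # Sample an arm and incur its loss; the learner does not see it yet.
        arm = rng.choices(range(K), weights=p)[0]
        loss = losses[t][arm]
        total_loss += loss
        pending.setdefault(t + delays[t], []).append((arm, loss, p[arm]))

        # Process every piece of feedback that arrives at the end of round t.
        for a, observed, prob in pending.pop(t, []):
            cum_est[a] += observed / prob

    return total_loss, sum(delays)
```

Here `sum(delays)` plays the role of the total feedback delay D from the abstract. Feedback scheduled to arrive after the last round is simply never used in this sketch.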
