Remote contextual bandits
File(s)
PGZ_ISIT22.pdf (332.52 KB)
Accepted version
Author(s)
Pase, Francesco
Gunduz, Deniz
Zorzi, Michele
Type
Conference Paper
Abstract
We consider a remote contextual multi-armed bandit (CMAB) problem, in which the decision-maker observes the context and the reward, but must communicate the actions to be taken by the agents over a rate-limited communication channel. This can model, for example, a personalized ad placement application, where the content owner observes the individual visitors to its website, and hence has the context information, but must convey the ads to be shown to each visitor to a separate entity that manages the marketing content. In this remote CMAB (R-CMAB) problem, the constraint on the communication rate between the decision-maker and the agents imposes a trade-off between the number of bits sent per agent and the acquired average reward. We are particularly interested in characterizing the rate required to achieve sub-linear regret. Consequently, this can be considered as a policy compression problem, where the distortion metric is induced by the learning objectives. We first study the fundamental information-theoretic limits of this problem by letting the number of agents go to infinity, and study the regret achieved when a Thompson sampling strategy is adopted. In particular, we identify two distinct rate regions resulting in linear and sub-linear regret behavior, respectively. Then, we provide upper bounds for the achievable regret when the decision-maker can reliably transmit the policy without distortion.
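As a rough illustration of the setting described in the abstract, the following sketch runs Thompson sampling on a small contextual bandit and counts the bits needed to send each agent its action index without any compression. It is not the authors' scheme: the Beta-Bernoulli reward model, the uniformly drawn contexts, and all names (`BetaBernoulliThompson`, `simulate`, `uncompressed_bits_per_agent`) are assumptions made for this example only, and the rate-limited channel itself is not modeled.

```python
import math
import random


class BetaBernoulliThompson:
    """Per-context Beta-Bernoulli Thompson sampling (illustrative assumption)."""

    def __init__(self, n_contexts, n_arms, seed=0):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior (uniform) for every (context, arm) pair.
        self.alpha = [[1.0] * n_arms for _ in range(n_contexts)]
        self.beta = [[1.0] * n_arms for _ in range(n_contexts)]

    def select(self, context):
        # Draw one posterior sample per arm and play the arm with the largest sample.
        samples = [
            self.rng.betavariate(a, b)
            for a, b in zip(self.alpha[context], self.beta[context])
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, context, arm, reward):
        # Conjugate update for a binary reward in {0, 1}.
        self.alpha[context][arm] += reward
        self.beta[context][arm] += 1 - reward


def simulate(true_means, horizon, seed=0):
    """Return the cumulative (pseudo-)regret over `horizon` rounds.

    true_means[c][a] is the hypothetical success probability of arm a in context c.
    """
    n_contexts, n_arms = len(true_means), len(true_means[0])
    learner = BetaBernoulliThompson(n_contexts, n_arms, seed)
    env_rng = random.Random(seed + 1)
    regret = 0.0
    for _ in range(horizon):
        context = env_rng.randrange(n_contexts)  # contexts drawn uniformly (assumption)
        arm = learner.select(context)
        reward = 1 if env_rng.random() < true_means[context][arm] else 0
        learner.update(context, arm, reward)
        regret += max(true_means[context]) - true_means[context][arm]
    return regret


def uncompressed_bits_per_agent(n_arms):
    """Bits needed to send one action index with a fixed-length code."""
    return math.ceil(math.log2(n_arms))


if __name__ == "__main__":
    means = [[0.9, 0.1], [0.1, 0.9]]  # two contexts, two arms (made-up values)
    print("cumulative regret:", simulate(means, horizon=2000))
    print("bits per agent, no compression:", uncompressed_bits_per_agent(2))
```

The fixed-length count of `ceil(log2(K))` bits for `K` arms is only the naive baseline; the question raised in the abstract is how far below such a rate the decision-maker can go while the regret still grows sub-linearly.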
Date Issued
2022-08-03
Date Acceptance
2022-08-01
Citation
2022 IEEE International Symposium on Information Theory (ISIT), 2022, pp.1665-1670
URI
http://hdl.handle.net/10044/1/101695
URL
https://ieeexplore.ieee.org/document/9834399
DOI
https://www.dx.doi.org/10.1109/isit50566.2022.9834399
Publisher
IEEE
Start Page
1665
End Page
1670
Journal / Book Title
2022 IEEE International Symposium on Information Theory (ISIT)
Copyright Statement
Copyright © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Identifier
https://ieeexplore.ieee.org/document/9834399
Source
2022 IEEE International Symposium on Information Theory (ISIT)
Publication Status
Published
Start Date
2022-06-26
Finish Date
2022-07-01
Coverage Spatial
Espoo, Finland
Date Publish Online
2022-08-03