Combinatorial Semi-Bandits with Knapsacks.
We unify two prominent lines of work on multi-armed bandits: bandits withknapsacks (BwK) and combinatorial semi-bandits. The former concerns limited"resources" consumed by the algorithm, e.g., limited supply in dynamic pricing.The latter allows a huge number of actions but assumes combinatorial structureand additional feedback to make the problem tractable. We define a commongeneralization, support it with several motivating examples, and design analgorithm for it. Our regret bounds are comparable with those for BwK andcombinatorial semi- bandits.
Continue reading and listening
Stay in the loop.
Subscribe to our newsletter for a weekly update on the latest podcast, news, events, and jobs postings.