Randomized allocation with dimension reduction in a bandit problem with covariates

Wei Qian, Yuhong Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

The multi-armed bandit problem is an important optimization game that requires an exploration-exploitation tradeoff to achieve optimal total reward. We consider a setting where the rewards of the bandit machines are associated with covariates, and we focus on nonparametric estimation of the reward functions together with a randomized allocation rule that balances exploration and exploitation. To overcome the curse of dimensionality in nonparametric learning, we propose using dimension reduction methods such as sliced inverse regression (SIR) and likelihood acquired directions (LAD) to reduce the dimension of the covariates. To achieve variable selection and dimension reduction simultaneously, we use coordinate-independent sparse estimation (CISE) for the dimension reduction step. Because it is not known in advance which individual dimension reduction method is best, we show that adaptively combining these dimension reduction methods works well.
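
As an illustration of two ingredients named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the standard textbook form of sliced inverse regression (standardize the covariates, slice on the sorted response, eigen-decompose the weighted covariance of the slice means) and a simple randomized allocation rule in which the arm with the highest estimated reward is pulled with high probability and every other arm with a small exploration probability. The function names `sir_directions` and `randomized_allocation`, the parameters `n_slices`, `n_dirs`, and `explore_prob`, and the simulated data are all hypothetical; the abstract does not specify these details.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Estimate dimension reduction directions with textbook SIR (assumed form)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardize the covariates: Z = (X - mean) @ Sigma^{-1/2}.
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt

    # Slice the observations by sorted response and average Z within each slice.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Leading eigenvectors of M, mapped back to the original covariate scale.
    _, vecs = np.linalg.eigh(M)
    top = vecs[:, ::-1][:, :n_dirs]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)


def randomized_allocation(est_rewards, explore_prob, rng):
    """Pick the best-looking arm with high probability, any other arm with explore_prob."""
    est = np.asarray(est_rewards, dtype=float)
    k = est.size
    probs = np.full(k, explore_prob)
    probs[np.argmax(est)] = 1.0 - (k - 1) * explore_prob
    return int(rng.choice(k, p=probs))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated data (an assumption): the reward depends on the first covariate only.
    X = rng.normal(size=(2000, 5))
    y = X[:, 0] ** 3 + 0.1 * rng.normal(size=2000)
    B = sir_directions(X, y)
    print("estimated direction:", np.round(B[:, 0], 2))
    print("chosen arm:", randomized_allocation([0.1, 0.9, 0.3], 0.05, rng))
```

In a covariate bandit, the reduced covariates `X @ B` would then feed a nonparametric estimate of each arm's reward function, and those estimates would be passed to the allocation rule. How the exploration probability is scheduled over time, and how several dimension reduction methods are combined adaptively, is left open here because the abstract does not describe it.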

Original language: English (US)
Title of host publication: Proceedings - 2012 9th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2012
Pages: 1537-1541
Number of pages: 5
State: Published - 2012
Event: 2012 9th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2012 - Chongqing, China
Duration: May 29, 2012 to May 31, 2012

Publication series

Name: Proceedings - 2012 9th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2012

