Learning to interact with users: A collaborative-bandit approach

Konstantina Christakopoulou; Arindam Banerjee

doi:10.1137/1.9781611975321.69

Computer Science and Engineering

Research output: Contribution to conference › Paper › peer-review

15 Scopus citations

Abstract

Learning to interact with users and discover their preferences is central in most web applications, with recommender systems being a notable example. From such a perspective, merging interactive learning algorithms with recommendation models is natural. While recent literature has explored the idea of combining collaborative filtering approaches with bandit techniques, there exist two limitations: (1) they usually consider Gaussian rewards, which are not suitable for implicit feedback data powering most recommender systems, and (2) they are restricted to the one-item recommendation setting while typically a list of recommendations is given. In this paper, to address these limitations, apart from Gaussian rewards we also consider Bernoulli rewards, the latter being suitable for dyadic data. Also, we consider two user click models: the one-item click/no-click model, and the cascade click model which is suitable for top-K recommendations. For these settings, we propose novel machine learning algorithms that learn to interact with users by learning the underlying parameters collaboratively across users and items. We provide an extensive empirical study, which is the first to illustrate all pairwise empirical comparisons across different interactive learning algorithms for recommendation. Our experiments demonstrate that when the number of users and items is large, propagating the feedback across users and items while learning latent features is the most effective approach for systems to learn to interact with the users.
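The two click models named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes the textbook form of each model: a one-item click is a single Bernoulli draw, and in the cascade model the user scans a top-K list from the top and clicks the first item found attractive. The function names and the attraction probabilities are hypothetical and chosen only for illustration.

```python
import random
from typing import Optional, Sequence


def one_item_click(attraction: float, rng: random.Random) -> int:
    """One-item click/no-click model: a Bernoulli reward (1 = click, 0 = no click)."""
    return 1 if rng.random() < attraction else 0


def cascade_click(attractions: Sequence[float], rng: random.Random) -> Optional[int]:
    """Cascade click model for a ranked list.

    The simulated user examines items top to bottom and clicks the first one
    found attractive, then stops. Returns the 0-based position of the click,
    or None if no item in the list is clicked.
    """
    for position, attraction in enumerate(attractions):
        if rng.random() < attraction:
            return position
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical attraction probabilities for a top-3 list.
    print(cascade_click([0.2, 0.5, 0.9], rng))
```

Under the cascade model, items ranked above the clicked position count as examined and skipped, while items below it are unobserved, which is why list feedback carries more information than a single click/no-click signal.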

Original language	English (US)
Pages	612-620
Number of pages	9
DOIs	https://doi.org/10.1137/1.9781611975321.69
State	Published - 2018
Event	2018 SIAM International Conference on Data Mining, SDM 2018 - San Diego, United States
Duration	May 3 2018 → May 5 2018

Other

Other	2018 SIAM International Conference on Data Mining, SDM 2018
Country/Territory	United States
City	San Diego
Period	5/3/18 → 5/5/18

Bibliographical note

Funding Information:
The research was supported by NSF grants IIS-1563950, IIS-1447566, IIS-1447574, IIS-1422557, CCF-1451986, CNS-1314560, IIS-0953274, IIS-1029711, NASA grant NNX12AQ39A, and gifts from Adobe, IBM, and Yahoo.

Publisher Copyright:
© 2018 by SIAM.

Cite this

@conference{226199a5929c4d2899401430c7c7f47a,

title = "Learning to interact with users: A collaborative-bandit approach",

abstract = "Learning to interact with users and discover their preferences is central in most web applications, with recommender systems being a notable example. From such a perspective, merging interactive learning algorithms with recommendation models is natural. While recent literature has explored the idea of combining collaborative filtering approaches with bandit techniques, there exist two limitations: (1) they usually consider Gaussian rewards, which are not suitable for implicit feedback data powering most recommender systems, and (2) they are restricted to the one-item recommendation setting while typically a list of recommendations is given. In this paper, to address these limitations, apart from Gaussian rewards we also consider Bernoulli rewards, the latter being suitable for dyadic data. Also, we consider two user click models: the one-item click/no-click model, and the cascade click model which is suitable for top-K recommendations. For these settings, we propose novel machine learning algorithms that learn to interact with users by learning the underlying parameters collaboratively across users and items. We provide an extensive empirical study, which is the first to illustrate all pairwise empirical comparisons across different interactive learning algorithms for recommendation. Our experiments demonstrate that when the number of users and items is large, propagating the feedback across users and items while learning latent features is the most effective approach for systems to learn to interact with the users.",

author = "Konstantina Christakopoulou and Arindam Banerjee",

note = "Funding Information: The research was supported by NSF grants IIS-1563950, IIS-1447566, IIS-1447574, IIS-1422557, CCF-1451986, CNS-1314560, IIS-0953274, IIS-1029711, NASA grant NNX12AQ39A, and gifts from Adobe, IBM, and Yahoo. Publisher Copyright: {\textcopyright} 2018 by SIAM. 2018 SIAM International Conference on Data Mining, SDM 2018; Conference date: 03-05-2018 through 05-05-2018",

year = "2018",

doi = "10.1137/1.9781611975321.69",

language = "English (US)",

pages = "612--620",

}

TY - CONF

T1 - Learning to interact with users: A collaborative-bandit approach

T2 - 2018 SIAM International Conference on Data Mining, SDM 2018

AU - Christakopoulou, Konstantina

AU - Banerjee, Arindam

N1 - Funding Information: The research was supported by NSF grants IIS-1563950, IIS-1447566, IIS-1447574, IIS-1422557, CCF-1451986, CNS-1314560, IIS-0953274, IIS-1029711, NASA grant NNX12AQ39A, and gifts from Adobe, IBM, and Yahoo. Publisher Copyright: © 2018 by SIAM.

PY - 2018

Y1 - 2018

N2 - Learning to interact with users and discover their preferences is central in most web applications, with recommender systems being a notable example. From such a perspective, merging interactive learning algorithms with recommendation models is natural. While recent literature has explored the idea of combining collaborative filtering approaches with bandit techniques, there exist two limitations: (1) they usually consider Gaussian rewards, which are not suitable for implicit feedback data powering most recommender systems, and (2) they are restricted to the one-item recommendation setting while typically a list of recommendations is given. In this paper, to address these limitations, apart from Gaussian rewards we also consider Bernoulli rewards, the latter being suitable for dyadic data. Also, we consider two user click models: the one-item click/no-click model, and the cascade click model which is suitable for top-K recommendations. For these settings, we propose novel machine learning algorithms that learn to interact with users by learning the underlying parameters collaboratively across users and items. We provide an extensive empirical study, which is the first to illustrate all pairwise empirical comparisons across different interactive learning algorithms for recommendation. Our experiments demonstrate that when the number of users and items is large, propagating the feedback across users and items while learning latent features is the most effective approach for systems to learn to interact with the users.

UR - http://www.scopus.com/inward/record.url?scp=85048342911&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048342911&partnerID=8YFLogxK

U2 - 10.1137/1.9781611975321.69

DO - 10.1137/1.9781611975321.69

M3 - Paper

AN - SCOPUS:85048342911

SP - 612

EP - 620

Y2 - 3 May 2018 through 5 May 2018

ER -
