E-commerce search engines can fail to retrieve results that satisfy a query's product intent for two reasons: (i) conventional retrieval approaches, such as BM25, may ignore important query terms because of their low inverse document frequency (IDF), and (ii) for long queries, which are common among rare queries (i.e., tail queries), they may fail to determine which terms are representative of the query's product intent. In this paper, we leverage the historical query reformulation logs of a large e-retailer (walmart.com) to develop a distant-supervision-based approach that identifies the terms characterizing a query's product intent. The key idea underpinning our approach is that the terms retained in the reformulation of a query are more important in describing the query's product intent than the discarded terms. We also exploit the fact that the significance of a term depends on its context (the other terms in its neighborhood within the query) when estimating the term's importance to the query's product intent. We show that identifying and emphasizing the terms that define the query's product intent leads to a 3% improvement in ranking and outperforms the context-unaware baselines.
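The distant-supervision idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `label_terms` and `retention_rates`, the whitespace tokenization, and the toy query pairs are all assumptions made for illustration. Each term of the original query is labeled 1 if it survives into the reformulation and 0 otherwise, and the labels are then averaged per term, which corresponds to a context-unaware estimate of the kind the abstract contrasts against.

```python
from collections import defaultdict


def label_terms(query, reformulation):
    """Label each term of `query` with 1 if it is retained in
    `reformulation`, else 0 (hypothetical helper, whitespace tokens)."""
    kept = set(reformulation.lower().split())
    return [(term, int(term in kept)) for term in query.lower().split()]


def retention_rates(pairs):
    """Context-unaware aggregate: for each term, the fraction of its
    occurrences in original queries that were retained on reformulation."""
    kept = defaultdict(int)
    seen = defaultdict(int)
    for query, reformulation in pairs:
        for term, label in label_terms(query, reformulation):
            seen[term] += 1
            kept[term] += label
    return {term: kept[term] / seen[term] for term in seen}


# Toy (query, reformulation) pairs, invented for illustration only.
pairs = [
    ("cheap red running shoes for men", "red running shoes men"),
    ("best running shoes for flat feet", "running shoes flat feet"),
]

rates = retention_rates(pairs)
```

In this toy log, `rates["running"]` is 1.0 and `rates["for"]` is 0.0. A context-aware model, as described in the abstract, would instead predict each label from the term together with its neighboring terms.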
|Original language||English (US)|
|Title of host publication||CIKM 2019 - Proceedings of the 28th ACM International Conference on Information and Knowledge Management|
|Publisher||Association for Computing Machinery|
|Number of pages||4|
|State||Published - Nov 3 2019|
|Event||28th ACM International Conference on Information and Knowledge Management, CIKM 2019 - Beijing, China (Nov 3 2019 → Nov 7 2019)|
|Name||International Conference on Information and Knowledge Management, Proceedings|
|Conference||28th ACM International Conference on Information and Knowledge Management, CIKM 2019|
|Period||11/3/19 → 11/7/19|
Bibliographical note
Funding Information:
This work was supported in part by NSF (1447788, 1704074, 1757916, 1834251) and Walmart Labs. Access to research and computing facilities was provided by the Minnesota Supercomputing Institute.
© 2019 Association for Computing Machinery.
Keywords
- Query intent
- Query refinement
- Query reformulation
- Term weighting