Personalized query suggestion diversification in information retrieval
Wanyu CHEN, Fei CAI, Honghui CHEN, Maarten DE RIJKE
Query suggestions help users refine their queries after they input an initial query. Previous work on query suggestion has mainly concentrated on similarity-based or context-based approaches, developing models that focus either on adapting to a specific user (personalization) or on diversifying query aspects in order to maximize the probability of the user being satisfied (diversification). We consider the task of generating query suggestions that are both personalized and diversified. We propose a personalized query suggestion diversification (PQSD) model, in which a user's long-term search behavior is injected into a basic greedy query suggestion diversification model that considers the user's search context in the current session. Query aspects are identified from clicked documents, based on the Open Directory Project (ODP), using a latent Dirichlet allocation (LDA) topic model. We quantify the improvement of the proposed PQSD model over a state-of-the-art baseline on the public America Online (AOL) query log and show that it outperforms the baseline on metrics used for query suggestion ranking and diversification. The experimental results show that PQSD achieves its best performance when only queries with clicked documents, rather than all queries, are taken as search context, especially when more query suggestions are returned in the list.
query suggestion / personalization / query suggestion diversification
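The idea of injecting long-term user behavior into a greedy diversification step can be illustrated with a minimal sketch. It is not the PQSD scoring function, which the abstract does not specify. It assumes an xQuAD-style objective in which each candidate's marginal aspect coverage is weighted by a hypothetical long-term user preference over aspects. The function name, the parameters, and the `trade_off` weight are all assumptions introduced for illustration.

```python
from typing import Dict, List


def greedy_personalized_diversify(
    relevance: Dict[str, float],           # assumed: candidate -> relevance to the session context
    aspect_probs: Dict[str, List[float]],  # assumed: candidate -> P(aspect | candidate), e.g. from a topic model
    user_aspect_prefs: List[float],        # assumed: long-term P(aspect | user)
    k: int,
    trade_off: float = 0.5,                # hypothetical weight between relevance and diversity
) -> List[str]:
    """Greedily pick k suggestions, trading off relevance against
    user-weighted coverage of aspects not yet covered by earlier picks."""
    n_aspects = len(user_aspect_prefs)
    # Probability that each aspect is still unsatisfied by the selected list.
    uncovered = [1.0] * n_aspects
    selected: List[str] = []
    remaining = set(relevance)

    while remaining and len(selected) < k:

        def score(q: str) -> float:
            # Marginal gain: aspects the user cares about that are still uncovered.
            gain = sum(
                user_aspect_prefs[a] * aspect_probs[q][a] * uncovered[a]
                for a in range(n_aspects)
            )
            return (1.0 - trade_off) * relevance[q] + trade_off * gain

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
        # Aspects covered by the chosen suggestion become less valuable.
        for a in range(n_aspects):
            uncovered[a] *= 1.0 - aspect_probs[best][a]

    return selected


# Toy data (made up for illustration): two aspects, "cars" and "animals".
relevance = {"jaguar car": 0.9, "jaguar price": 0.8, "jaguar animal": 0.5}
aspect_probs = {
    "jaguar car": [1.0, 0.0],
    "jaguar price": [1.0, 0.0],
    "jaguar animal": [0.0, 1.0],
}

# A user with balanced interests gets one suggestion per aspect.
print(greedy_personalized_diversify(relevance, aspect_probs, [0.5, 0.5], k=2))
# A user whose history is all about cars keeps both car-related suggestions.
print(greedy_personalized_diversify(relevance, aspect_probs, [1.0, 0.0], k=2))
```

In this toy setting, the balanced user receives the less relevant "jaguar animal" as the second suggestion because the "cars" aspect is already covered, while the car-focused user does not. Setting `trade_off` to 0 reduces the sketch to plain relevance ranking.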