RESEARCH ARTICLE

Using log mining to analyze user behavior on search engine

  • Ke XIE
  • Huijia YU
  • Rongwei CEN
  • State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Received date: 28 Apr 2011

Accepted date: 12 Oct 2011

Published date: 05 Jun 2012

Copyright

2014 Higher Education Press and Springer-Verlag Berlin Heidelberg

Abstract

Owing to the rapid growth in the number of search engine users, user behavior analysis has become one of the most important research topics for search engines, especially for performance optimization, architecture analysis, and system maintenance. By analyzing log data appropriately, researchers and Internet companies can obtain guidance for improving search engines. In this paper, we analyze approximately 750 million search request entries obtained from the log of a real commercial search engine. We study several aspects of user behavior, including query length, the ratio of query refinement, and recommendation access. Different information needs may lead to different behaviors, and we discuss this relationship in the paper. We believe that these analyses will help improve both the effectiveness and the efficiency of search engines.

Cite this article

Ke XIE, Huijia YU, Rongwei CEN. Using log mining to analyze user behavior on search engine[J]. Frontiers of Electrical and Electronic Engineering, 2012, 7(2): 254-260. DOI: 10.1007/s11460-011-0177-4

