Approach to extracting hot topics based on network traffic content

Yadong ZHOU, Xiaohong GUAN, Qindong SUN, Wei LI, Jing TAO

Front. Electr. Electron. Eng. ›› 2009, Vol. 4 ›› Issue (1) : 20-23. DOI: 10.1007/s11460-009-0002-5
RESEARCH ARTICLE

Abstract

This article presents the formal definition and description of popular topics on the Internet, analyzes the relationship between popular words and topics, and finally introduces a method that uses statistics and correlation of the popular words in traffic content and network flow characteristics as input for extracting popular topics on the Internet. Based on this, this article adapts a clustering algorithm to extract popular topics and gives formalized results. The test results show that this method has an accuracy of 16.7% in extracting popular topics on the Internet. Compared with web mining and topic detection and tracking (TDT), it can provide a more suitable data source for effective recovery of Internet public opinions.

Keywords

hot topic extraction / network traffic content / Internet public opinion analysis

Cite this article

Yadong ZHOU, Xiaohong GUAN, Qindong SUN, Wei LI, Jing TAO. Approach to extracting hot topics based on network traffic content. Front Elect Electr Eng Chin, 2009, 4(1): 20‒23 https://doi.org/10.1007/s11460-009-0002-5

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 60574087), the Hi-Tech Research and Development Program of China (2007AA01Z475, 2007AA01Z480, 2007AA01Z464), and the 111 International Collaboration Program of China.

RIGHTS & PERMISSIONS

© 2014 Higher Education Press and Springer-Verlag Berlin Heidelberg