PSG: a two-layer graph model for document summarization
Heng CHEN, Hai JIN, Feng ZHAO
Graph models have been widely applied to document summarization, using sentences as graph nodes and the similarity between sentences as edges. In this paper, we present a novel graph model for document summarization that exploits not only the relevance between sentences but also the relevance between the phrases those sentences contain. Specifically, we construct a phrase-sentence two-layer graph model (PSG) to summarize one or more documents, and we apply it to both generic and query-focused summarization. Experimental results show that our model greatly outperforms existing work.
relationship graph / Markov random walk / document summarization
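
To make the baseline idea in the abstract concrete (sentences as nodes, sentence similarity as edge weights, and a Markov random walk to score nodes), the following is a minimal sketch of a generic single-layer sentence graph. It is not the PSG model itself: the phrase layer is omitted, and the function names, the bag-of-words cosine similarity, the damping factor of 0.85, and the fixed iteration count are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a damped random walk over the similarity graph.

    Hypothetical helper: a one-layer sentence graph only, assumed here
    for illustration; it does not model phrase-level relevance.
    """
    n = len(sentences)
    bags = [Counter(tokenize(s)) for s in sentences]
    # Edge weights: pairwise sentence similarity, with no self-loops.
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Row-normalize into transition probabilities.
    # A sentence with no similar neighbors jumps uniformly to any node.
    transitions = []
    for row in weights:
        total = sum(row)
        transitions.append([w / total for w in row] if total else [1.0 / n] * n)
    # Power iteration from a uniform starting distribution.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * transitions[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores
```

Calling `rank_sentences` on a list of sentences returns one score per sentence, and the scores sum to 1; an extractive summary would then take the highest-scoring sentences, typically after a redundancy check.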