PSG: a two-layer graph model for document summarization

Heng CHEN; Hai JIN; Feng ZHAO

doi:10.1007/s11704-013-2292-2

Front. Comput. Sci., 2014, Vol. 8, Issue 1: 119-130. DOI: 10.1007/s11704-013-2292-2

RESEARCH ARTICLE

Abstract

Graph models have been widely applied to document summarization, with sentences as the graph nodes and the similarity between sentences as the edges. This paper presents a novel graph model for document summarization that uses not only the relevance between sentences but also the relevance between the phrases those sentences contain. Specifically, we construct a phrase-sentence two-layer graph model (PSG) to summarize one or more documents. We apply this model to both generic and query-focused document summarization. The experimental results show that our model greatly outperforms existing work.
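To make the conventional sentence-graph baseline mentioned in the first sentence concrete, here is a minimal sketch of it: sentences are nodes, pairwise similarity gives the edge weights, and a damped Markov random walk scores each node. It is not the PSG model itself, whose phrase layer is not specified in this abstract. The function names, the whitespace tokenizer, the bag-of-words cosine similarity, and the damping factor of 0.85 are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, damping=0.85, iterations=100):
    """Score sentences with a damped Markov random walk on a similarity graph.

    Illustrative baseline only (assumed parameters); not the authors' PSG model.
    """
    n = len(sentences)
    if n == 0:
        return []
    # Assumed tokenizer: lowercase and split on whitespace.
    bags = [Counter(s.lower().split()) for s in sentences]
    # Edge weights: pairwise similarity, with no self-loops.
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Row-normalize into transition probabilities.
    # A sentence with no similar neighbor jumps uniformly to any node.
    transitions = []
    for row in weights:
        total = sum(row)
        transitions.append([w / total for w in row] if total else [1.0 / n] * n)
    # Power iteration from the uniform distribution.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * transitions[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores
```

Calling `rank_sentences` on a list of sentence strings returns one score per sentence, and the scores sum to 1. An extractive summary would then keep the highest-scoring sentences, typically after a redundancy-removal step that this sketch omits.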

Keywords

relationship graph / Markov random walk / document summarization

Cite this article

Heng CHEN, Hai JIN, Feng ZHAO. PSG: a two-layer graph model for document summarization. Front. Comput. Sci., 2014, 8(1): 119‒130 https://doi.org/10.1007/s11704-013-2292-2
