IncPregel: an incremental graph parallel computation model

Qiang LIU; Xiaoshe DONG; Heng CHEN; Yinfeng WANG

doi:10.1007/s11704-016-6109-y

PDF(781 KB)

Front. Comput. Sci. ›› 2018, Vol. 12 ›› Issue (6) : 1076-1089. DOI: 10.1007/s11704-016-6109-y

RESEARCH ARTICLE

IncPregel: an incremental graph parallel computation model

Author information +

History +

Abstract

Large-scale graph computation is often required in a variety of emerging applications such as social network computation and Web services. Such graphs are typically large and frequently updated with minor changes. However, re-computing an entire graphwhen a fewvertices or edges are updated is often prohibitively expensive. To reduce the cost of such updates, this study proposes an incremental graph computation model called IncPregel, which leverages the nonafter- effect property of the first-order Markov chain and provides incremental programming abstractions to avoid redundant computation and message communication. This is accomplished by employing an efficient and fine-grained reuse mechanism. We implemented this model on Hama, a popular open source framework based on Pregel, to construct an incremental graph processing system called IncHama. IncHama automatically detects changes in input in order to recognize “changed vertices” and to exchange reusable data by means of shuffling. The evaluation results on large-scale graphs show that, compared with Hama, IncHama is 1.1–2.7 times faster and can reduce communication messages by more than 50% when the incremental edges increase in number from 0.1 to 100k.

Keywords

graph computation / Pregel / cloud computing

Cite this article

EndNote

Ris (Procite)

Bibtex

Download citation ▾

Qiang LIU, Xiaoshe DONG, Heng CHEN, Yinfeng WANG. IncPregel: an incremental graph parallel computation model. Front. Comput. Sci., 2018, 12(6): 1076‒1089 https://doi.org/10.1007/s11704-016-6109-y

References

Publishing order | Descend order by publishing year | Descend order by cited within

[1]	Dean J, Ghemawat S. MapReduce: simplified data processing on large clusters. Communications of the ACM, 2008, 51(1): 107–113 CrossRef Google scholar

[2]	Malewicz G, Austern M H, Bik A J C, Dehnert J C, Horn I, Leiser N, Czajkowski G. Pregel: a system for large-scale graph processing. In: Proceedings of ACM SIGMOD International Conference on Management of Data. 2010, 135–146 CrossRef Google scholar

[3]	Low Y C, Gonzalez J, Kyrola A, Bickson D, Guestrin C E, Hellerstein J M. GraphLab: a new framework for parallel machine learning. In: Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence. 2010, 340–349

[4]	Low Y C, Bickson D, Gonzalez J, Guestrin C E, Kyrola A, Hellerstein J M. Distributed GraphLab: a framework for machine learning and data mining in the cloud. Proceedings of the Very Large Data Base Endowment, 2012, 5(8): 716–727 CrossRef Google scholar

[5]	Power R, Li J Y. Piccolo: building fast, distributed programs with partitioned tables. In: Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation. 2010, 1–14

[6]	Roy A, Mihailovic I, Zwaenepoel W. X-stream: edge-centric graph processing using streaming partitions. In: Proceedings of the 24th ACM Symposium on Operating Systems Principles. 2013, 472–488 CrossRef Google scholar

[7]	Wilson C, Sala A, Puttaswamy K P N, Zhao B Y. Beyond social graphs: user interactions in online social networks and their implications. ACM Transactions on the Web, 2012, 6(4): 17 CrossRef Google scholar

[8]	Fan W F, Wang X, Wu Y H. Incremental graph pattern matching. ACM Transactions on Database Systems, 2013, 38(3): 18 CrossRef Google scholar

[9]	Logothetis D, Olston C, Reed B, Webb K C, Yocum K. Stateful bulk processing for incremental analytics. In: Proceedings of the 1st ACM Symposium on Cloud Computing. 2010, 51–62 CrossRef Google scholar

[10]	Bhatotia P, Wieder A, Rodrigues R, Acar U A, Pasquin R. Incoop: MapReduce for incremental computations. In: Proceedings of the 2nd ACM Symposium on Cloud Computing. 2011 CrossRef Google scholar

[11]	Sagharichian M, Naderi H, Haghjoo M. ExPregel: a new computational model for large-scale graph processing. Concurrency and Computation: Practice and Experience, 2015, 27(17): 4954–4969 CrossRef Google scholar

[12]	Brin S, Page L. Reprint of: the anatomy of a large-scale hypertextual Web search engine. Computer Networks, 2012, 56(18): 3825–3833 CrossRef Google scholar

[13]	Gyöngyi Z, Garcia-Molina H, Pedersen J. Combating Web spam with trustrank. In: Proceedings of the 30th International Conference on Very Large Data Base. 2004, 576–587

[14]	Bu Y Y, Howe B, Balazinska M, Ernst M D. HaLoop: efficient iterative data processing on large clusters. Proceedings of the Very Large Data Base Endowment, 2010, 3(1–2): 285–296 CrossRef Google scholar

[15]	Kang U, Tsourakakis C E, Faloutsos C. Pegasus: a peta-scale graph mining system implementation and observations. In: Proceedings of the 9th IEEE International Conference on Data Mining. 2009, 229–238 CrossRef Google scholar

[16]	Kang U, Tsourakakis C E, Appel A P, Faloutsos C, Leskovec J. Hadi: mining radii of large graphs. ACM Transactions on Knowledge Discovery from Data, 2011, 5(2): 8 CrossRef Google scholar

[17]	Valiant L G. A bridging model for parallel computation. Communications of the ACM, 1990, 33(8): 103–111 CrossRef Google scholar

[18]	Prabhakaran V, Wu M, Weng X T, McSherry F, Zhou L D, Haridasan M. Managing large graphs on multi-cores with graph awareness. In: Proceedings of USENIX Annual Technical Conference. 2012, 41–52

[19]	Gonzalez J E, Xin R S, Dave A, Crankshaw D, Franklin M J, Stoica I. GraphX: graph processing in a distributed dataflow framework. In: Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation. 2014, 599–613

[20]	Zaharia M, Chowdhury M, Franklin M J, Shenker S, Stoica I. Spark: cluster computing with working sets. In: Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing. 2010

[21]	Gonzalez J E, Low Y C, Gu H J, Bickson D, Guestrin C. PowerGraph: distributed graph-parallel computation on natural graphs. In: Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation. 2012, 17–30

[22]	Chen R, Ding X, Wang P, Chen H B, Zang B Y, Guan H B. Computation and communication efficient graph processing with distributed immutable view. In: Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed computing. 2014, 215–226 CrossRef Google scholar

[23]	Desikan P, Pathak N, Srivastava J, Kumar V. Incremental page rank computation on evolving graphs. In: Proceedings of Special Interest Tracks and Posters of the 14th International Conference onWorldWide Web. 2005, 1094–1095 CrossRef Google scholar

[24]	Chien S, Dwork C, Kumar R, Simon D R, Sivakumar D. Link evolution: analysis and algorithms. Internet Mathematics, 2004, 1(3): 277–304 CrossRef Google scholar

[25]	Popa L, Budiu M, Yu Y, Isard M. DryadInc: reusing work in large-scale computations. In: Proceedings of the 2009 Conference on Hot Topics in Cloud Computing. 2009

[26]	Peng D, Dabek F. Large-scale incremental processing using distributed transactions and notifications. In: Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation. 2010, 1–15

[27]	Cheng R, Hong J, Kyrola A, Miao Y S, Weng X T, Wu M, Yang F, Zhou L D, Zhao F, Chen E H. Kineograph: taking the pulse of a fastchanging and connected world. In: Proceedings of the 7th ACM European Conference on Computer Systems. 2012, 85–98 CrossRef Google scholar

[28]	Lovász L. Random walks on graphs: a survey. Combinatorics, Paul Erdos is Eighty, 1993, 2(1): 1–46

[29]	Puterman M L. Markov Decision Processes: Discrete Dynamic Stochastic Programming. New York: John Wiley & Sons, 1994 CrossRef Google scholar

[30]	Shao Y X, Cui B, Ma L. PAGE: a partition aware engine for parallel graph computation. IEEE Transactions on Knowledge and Data Engineering, 2015, 27(2): 518–530 CrossRef Google scholar

RIGHTS & PERMISSIONS

2018 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature

AI Summary AI Mindmap

PDF(781 KB)

Accesses

Citations

Detail

Sections

Recommended

Received	Accepted	Published
23 Feb 2016	14 Oct 2016	04 Dec 2018
Just Accepted Date	Online First Date	Issue Date
08 Nov 2016	20 Dec 2017	04 Dec 2018

About the journal

Aims & scope

Description

Editorial board

Abstracting / Indexing

Contact us

Browse

Just accepted

Online first

Latest issue

All volumes and issues

Collections

Featured articles

Most accessed

Most cited

Collections

Multimedia collections

Authors & reviewers

Online submisson

Call for papers

Guidelines for authors

Download templates