A communication-reduced and computation-balanced framework for fast graph computation

Yongli CHENG; Fang WANG; Hong JIANG; Yu HUA; Dan FENG; Lingling ZHANG; Jun ZHOU

doi:10.1007/s11704-018-6400-1

PDF(1064 KB)

Front. Comput. Sci. ›› 2018, Vol. 12 ›› Issue (5) : 887-907. DOI: 10.1007/s11704-018-6400-1

RESEARCH ARTICLE

A communication-reduced and computation-balanced framework for fast graph computation

Yongli CHENG¹ ,
Fang WANG²^,³ ,
Hong JIANG⁴ ,
Yu HUA²^,³ ,
Dan FENG²^,³ ,
Lingling ZHANG²^,³ ,
Jun ZHOU²^,³

Author information +

History +

Abstract

The bulk synchronous parallel (BSP) model is very user friendly for coding and debugging parallel graph algorithms. However, existing BSP-based distributed graphprocessing frameworks, such as Pregel, GPS and Giraph, routinely suffer from high communication costs. These high communication costs mainly stem from the fine-grained message-passing communication model. In order to address this problem, we propose a new computation model with low communication costs, called LCC-BSP. We use this model to design and implement a high-performance distributed graphprocessing framework called LCC-Graph. This framework eliminates high communication costs in existing distributed graph-processing frameworks. Moreover, LCC-Graph also balances the computation workloads among all compute nodes by optimizing graph partitioning, significantly reducing the computation time for each superstep. Evaluation of LCC-Graph on a 32-node cluster, driven by real-world graph datasets, shows that it significantly outperforms existing distributed graph-processing frameworks in terms of runtime, particularly when the system is supported by a highbandwidth network. For example, LCC-Graph achieves an order of magnitude performance improvement over GPS and GraphLab.

Keywords

graph computation / communication decrease / computation balance

Cite this article

EndNote

Ris (Procite)

Bibtex

Download citation ▾

Yongli CHENG, Fang WANG, Hong JIANG, Yu HUA, Dan FENG, Lingling ZHANG, Jun ZHOU. A communication-reduced and computation-balanced framework for fast graph computation. Front. Comput. Sci., 2018, 12(5): 887‒907 https://doi.org/10.1007/s11704-018-6400-1

This is a preview of subscription content, contact us for subscripton.

References

Publishing order | Descend order by publishing year | Descend order by cited within

[1]	Valiant L G. A bridging model for parallel computation. Communications of the ACM, 1990, 33(8): 103–111 CrossRef Google scholar

[2]	Malewicz G, Austern M H, Bik A J C. Pregel: a system for large-scale graph processing. In: Proceedings of ACM International Conference on Management of Data. 2010, 135–146 CrossRef Google scholar

[3]	Salihoglu S, Widom J. GPS: a graph processing system. In: Proceedings of ACM International Conference on Scientific and Statistical Database Management. 2013, 22–32 CrossRef Google scholar

[4]	Ching A, Edunov S, Kabiljo K, Logothetis D, Muthukrishnan S. One trillion edges: graph processing at facebook-scale. Proceedings of the VIDB Endowment, 2015, 8(12): 1804–1815 CrossRef Google scholar

[5]	Wang G, Xie W, Demers A J, Gehrke J. Asynchronous large-scale graph processing made easy. in: Proceedings of International Conference on Innovation Database Research. 2013, 135–146

[6]	Simmhan Y, Kumbhare A, Wickramaarachchi C, Nagarkar S, Ravi S, Raghavendra C, Prasanna V. Goffish: a sub-graph centric framework for large-scale graph analytics. In: Proceedings of European Conference on Parallel Processing. 2014, 451–462 CrossRef Google scholar

[7]	Kyrola A, Blelloch G E, Guestrin G. Graphchi: large-scale graph computation on just a PC. In: Proceedings of Usenix Conference on Operating Systems Design and Implementation. 2012, 31–46

[8]	Cheng Y L, Wang F, Jiang H, Hua Y, Feng D, Wang X N. DD-graph: a highly cost-effective distributed disk-based graph-processing framework. In: Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing. 2016, 259–262 CrossRef Google scholar

[9]	Cheng Y L, Wang F, Jiang H, Hua Y, Feng D, Wang X N. LCC-graph: a high-performance graph-processing framework with low communication costs. In: Proceedings of IEEE/ACM International Symposium on Quality of Service. 2016, 91–100

[10]	Cheng Y L, Jiang H, Wang F, Hua Y, Feng D. BlitzG: exploiting highbandwidth networks for fast graph processing. In: Proceedings of IEEE International Conference on Computer Communications. 2017, 2340–2348

[11]	Page L. The pagerank citation ranking : bringing order to the web. Stanford Digital Libraries Working Paper, 1998, 9(1): 1–14

[12]	Stefano L D, Bulgarelli A. A simple and efficient connected components labeling algorithm. In: Proceedings of International Conference on Image Analysis and Processing. 1999, 322–327 CrossRef Google scholar

[13]	Fortunato S. Community detection in graphs. Physics Reports, 2010, 486(3): 75–174 CrossRef Google scholar

[14]	Kothari R, Jain V. Learning from labeled and unlabeled data. In: Proceedings of IEEE International Joint Conference on Neural Networks. 2002, 2803–2808 CrossRef Google scholar

[15]	Lee M, Kim E J, Yousif M. Security enhancement in infiniband architecture. In: Proceedings of IEEE International Parallel and Distributed Processing Symposium. 2005, 105–114

[16]	Karypis G, Kumar V. A fast and high quality multilevel scheme for partitioning irregular graphs. Journal of Scientific Computing, 1998, 20(1): 359–392 CrossRef Google scholar

[17]	Kernighan B W, Lin S. An efficient heuristic procedure for partitioning graphs. Journal of Bell System Technical, 1970, 49(2): 291–307 CrossRef Google scholar

[18]	Bui T N, Moon B R. Genetic algorithm and graph partitioning. IEEE Transactions on Computers, 1996, 45(7): 841–855 CrossRef Google scholar

[19]	Roy A, Mihailovic I, Zwaenepoel W. X-stream: edge-centric graph processing using streaming partitions. In: Proceedings of the 24th ACM Symposium on Operating Systems Principles. 2013, 472–488 CrossRef Google scholar

[20]	Nishimura J, Ugander J. Restreaming graph partitioning: simple versatile algorithms for advanced balancing. In: Proceedings of ACM International Conference on Knowledge Discovery and Data Mining. 2013, 1106–1114 CrossRef Google scholar

[21]	Low Y, Bickson D, Gonzalez J E, Guestrin C, Kyrola A. Distributed graphlab: a framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment, 2012, 5(8): 716–727 CrossRef Google scholar

[22]	Gonzalez J E, Low Y, Haijie G, Danny B, Carlos G. Powergraph: distributed graph-parallel computation on natural graphs. In: Proceedings of Usenix Conference on Operating Systems Design and Implementation. 2012, 17–30

[23]	Backstrom L, Huttenlocher D, Kleinberg J, Lan X. Group formation in large social networks: membership, growth, and evolution. In: Proceedings of the 12th ACM International Conference on Knowledge Discovery and Data Mining. 2006, 44–54 CrossRef Google scholar

[24]	Yan D, Cheng J, Xing K, Lu Y, Ng W, Bu Y G. Pregel algorithms for graph connectivity problems with performance guarantees. Proceedings of the VLDB Endowment, 2014, 7(14): 1821–1832 CrossRef Google scholar

[25]	Han M, Daudjee K. Giraph unchained: barrierless asynchronous parallel execution in pregel-like graph processing systems. Proceedings of the VLDB Endowment, 2015, 8(9): 950–961 CrossRef Google scholar

[26]	Yang S, Yan X, Zong B, Khan A. Towards effective partition management for large graphs. In: Proceedings of ACM Conference on Management of Data. 2012: 517–528 CrossRef Google scholar

[27]	Xin R S, Crankshaw D, Dave A, Gonzalez J E, Franklin M J, Stoica I. Graphx: unifying data-parallel and graph-parallel analytics, Computer Science, 2014, 8(3): 125–137

[28]	Da Y, Yingyi B, Yuanyuan T, Deshpande A. Big graph analytics platforms. Foundations and Trends in Databases, 2017, 7(2): 180–195

[29]	Lu Y, Cheng J, Yan D, Wu H. Large-scale distributed graph computing systems: an experimental evaluation. Proceedings of the VLDB Endowment, 2014, 8(3): 281–292 CrossRef Google scholar

[30]	Zhou C, Gao J, Sun B, Yu J X. Mocgraph: scalable distributed graph processing using message online computing. Proceedings of the VLDB Endowment, 2014, 8(4): 377–388 CrossRef Google scholar

[31]	Yan D, Cheng J, Lu Y, Ng Y. Blogel: a block-centric framework for distributed computation on real-world graphs. Proceedings of the VLDB Endowment, 2014, 7(14): 1981–1992 CrossRef Google scholar

[32]	Yan D, Cheng J, Ozsu M T, Lu Y. A general-purpose query-centric framework for querying big graphs. Proceedings of the VLDB Endowment, 2016, 9(7): 564–575 CrossRef Google scholar

[33]	Devine K D, Boman E G, Heaphy R T. New challenges in dynamic load balancing. Applied Numerical Mathematics, 2005, 52(2): 133–152 CrossRef Google scholar

[34]	Cheng J, Liu Q, Li Z, Fan W, Lui J C S. VENUS: vertex-centric streamlined graph computation on a single PC. In: Proceedings of IEEE International Conference on Data Engineering. 2015, 1131–1142 CrossRef Google scholar

[35]	Malicevic J, Roy A, Zwaenepoel W. Scale-up graph processing in the cloud:challenges and solutions. In: Proceedings of International Workshop on Cloud Data and Platforms. 2014, 1–6 CrossRef Google scholar

[36]	Pearce R, Gokhale M, Amato N M. Scaling techniques for massive scale-free graphs in distributed (external) memory. In: Proceedings of International Symposium on Parallel and Distributed Processing. 2013, 825–836 CrossRef Google scholar

[37]	Roy A, Bindschaedler L, Malicevic J, Zwaenepoel W. Chaos: scale-out graph processing from secondary storage. In: Proceedings of Symposium on Operating Systems Principles. 2015, 410–424 CrossRef Google scholar

[38]	Zhu X, Han W, Chen W. GridGraph: large-scale graph processing on a single machine using 2-level hierarchical partitioning. In: Proceedings of Usenix Conference on Usenix Technical Conference. 2015, 375–386

[39]	Xu N, Chen L, Cui B. LogGP: a log-based dynamic graph partitioning method. Proceedings of the VLDB Endowment, 2014, 7(14): 1917–1928 CrossRef Google scholar

[40]	Ming W, Fan Y, Jilong X, Xiao W, Miao Y. GRAM: scaling graph computation to the trillions. In: Proceedings of ACM Symposium on Cloud Computing. 2015, 408–421