Adaptive and scalable load balancing for metadata server cluster in cloud-scale file systems

Quanqing XU; Rajesh Vellore ARUMUGAM; Khai Leong YONG; Yonggang WEN; Yew-Soon ONG; Weiya XI

doi:10.1007/s11704-015-4560-9

Front. Comput. Sci., 2015, Vol. 9, Issue 6: 904-918. DOI: 10.1007/s11704-015-4560-9

RESEARCH ARTICLE

Abstract

Big data is an emerging term in the storage industry; it refers to data analytics on big storage, i.e., Cloud-scale storage. In Cloud-scale (or EB-scale) file systems, balancing request workloads across a metadata server cluster is critical for avoiding performance bottlenecks and improving quality of service. Many good approaches have been proposed for load balancing in distributed file systems. Some of them focus on global namespace balancing, making metadata distribution across metadata servers as uniform as possible. However, they do not work well under skewed request distributions, which impair load balancing but simultaneously increase the effectiveness of caching and replication. In this paper, we propose Cloud Cache (C²), an adaptive and scalable load balancing scheme for metadata server clusters in EB-scale file systems. It combines adaptive cache diffusion with an adaptive replication scheme to cope with the request load balancing problem, and it can be integrated into existing distributed metadata management approaches to efficiently improve their load balancing performance. C² runs as follows: 1) adaptive cache diffusion runs first: if a node is overloaded, load-shedding is used; otherwise, load-stealing is used; and 2) the adaptive replication scheme runs second: if a very popular metadata item (or at least two items) causes a node to be overloaded, the adaptive replication scheme is used, in which the very popular item is not split across several nodes by adaptive cache diffusion because of its knapsack property. Trace-driven simulations demonstrate the efficiency and scalability of C².
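The two-step decision described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the `Node` class, the per-node `capacity` threshold, the largest-first shedding order, and the `replicas_needed` rule are all assumptions made for illustration, and the "knapsack property" is read here as meaning that a single hot item is indivisible and so cannot be spread by moving cache entries.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical metadata server: item id -> request load it attracts."""
    name: str
    capacity: float                      # assumed per-node load limit
    items: dict = field(default_factory=dict)

    @property
    def load(self) -> float:
        return sum(self.items.values())

    @property
    def overloaded(self) -> bool:
        return self.load > self.capacity


def diffusion_action(node: Node) -> str:
    """Step 1: overloaded nodes shed load, all other nodes steal load."""
    return "load-shedding" if node.overloaded else "load-stealing"


def shed(src: Node, dst: Node) -> list:
    """Move items from src to dst (largest first, an assumed policy)
    while src is overloaded and dst has room for the item."""
    moved = []
    for key, load in sorted(src.items.items(), key=lambda kv: -kv[1]):
        if not src.overloaded:
            break
        if dst.load + load <= dst.capacity:
            dst.items[key] = src.items.pop(key)
            moved.append(key)
    return moved


def hot_items(node: Node) -> list:
    """Step 2 trigger: items whose load alone exceeds the node capacity.
    Moving such an item only moves the overload, so it is replicated."""
    return sorted(k for k, v in node.items.items() if v > node.capacity)


def replicas_needed(item_load: float, capacity: float) -> int:
    """Assumed rule: enough copies that each serves at most `capacity`."""
    return math.ceil(item_load / capacity)


a = Node("a", 10, {"x": 6, "y": 5, "z": 2})   # load 13, overloaded
b = Node("b", 10, {"w": 3})                   # load 3, has spare room
moved = shed(a, b)
```

In this toy run, node `a` sheds item `x` to node `b` and stops as soon as it is back under its limit. An item with load 25 on a node of capacity 10 would instead be flagged by `hot_items` and, under the assumed rule, served from three replicas.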

Keywords

metadata management / load balancing / adaptive cache diffusion / adaptive replication / cloud-scale file systems

Cite this article

Quanqing XU, Rajesh Vellore ARUMUGAM, Khai Leong YONG, Yonggang WEN, Yew-Soon ONG, Weiya XI. Adaptive and scalable load balancing for metadata server cluster in cloud-scale file systems. Front. Comput. Sci., 2015, 9(6): 904‒918 https://doi.org/10.1007/s11704-015-4560-9

RIGHTS & PERMISSIONS

2014 Higher Education Press and Springer-Verlag Berlin Heidelberg

Received: 09 Dec 2014
Accepted: 22 May 2015
Just Accepted Date: 08 Jun 2015
Published: 10 Nov 2015
Issue Date: 10 Nov 2015
