Fault-tolerant precise data access on distributed log-structured merge-tree

Tao ZHU, Huiqi HU, Weining QIAN, Huan ZHOU, Aoying ZHOU

PDF(857 KB)
PDF(857 KB)
Front. Comput. Sci. ›› 2019, Vol. 13 ›› Issue (4) : 760-777. DOI: 10.1007/s11704-018-7198-6
RESEARCH ARTICLE

Fault-tolerant precise data access on distributed log-structured merge-tree

Author information +
History +

Abstract

Log-structured merge tree has been adopted by many distributed storage systems. It decomposes a large database into multiple parts: an in-writing part and several read-only ones. Records are firstly written into a memoryoptimized structure and then compacted into in-disk structures periodically. It achieves high write throughput. However, it brings side effect that read requests have to go through multiple structures to find the required record. In a distributed database system, different parts of the LSM-tree are stored in distributed fashion. To this end, a server in the query layer has to issues multiple network communications to pull data items from the underlying storage layer. Coming to its rescue, this work proposes a precise data access strategy which includes: an efficient structure with low maintaining overhead designed to test whether a record exists in the in-writing part of the LSM-tree; a lease-based synchronization strategy proposed to maintain consistent copies of the structure on remote query servers.We further prove the technique is capable of working robustly when the LSM-Tree is re-organizing multiple structures in the backend. It is also fault-tolerant, which is able to recover the structures used in data access after node failures happen. Experiments using the YCSB benchmark show that the solution has 6x throughput improvement over existing methods.

Keywords

distributed data storage / log-structured merge tree / linearizability / fault tolerance

Cite this article

Download citation ▾
Tao ZHU, Huiqi HU, Weining QIAN, Huan ZHOU, Aoying ZHOU. Fault-tolerant precise data access on distributed log-structured merge-tree. Front. Comput. Sci., 2019, 13(4): 760‒777 https://doi.org/10.1007/s11704-018-7198-6

References

[1]
Chen J C, Chen Y G, Du X Y, Li C P, Lu J H, Zhao S Y, Zhou X. Big data challenge: a data management perspective. Frontiers of Computer Science, 2013, 7(2): 157–164
CrossRef Google scholar
[2]
O’Neil P E, Cheng E, Gawlick D, O’Neil E J. The log-structured merge-tree. Acta Informatica, 1996, 33(4): 351–385
CrossRef Google scholar
[3]
Chang F, Dean J, Ghemawat S, Hsieh W C, Wallach D A, Burrows M, Chandra T, Fikes A, Gruber R E. Bigtable: a distributed storage system for structured data. ACM Transactions on Computer Systems, 2008, 26(2): 4
CrossRef Google scholar
[4]
Lakshman A, Malik P. Cassandra: a decentralized structured storage system. ACM SIGOPS Operating Systems Review, 2010, 44(2): 35–40
CrossRef Google scholar
[5]
Ghemawat S, Gobioff H, Leung S T. The Google file system. In: Proceedings of ACM Symposium on Operating Systems Principles. 2003, 29–43
CrossRef Google scholar
[6]
Baker J, Bond C, Corbett J C, Furman J J, Khorlin A, Larson J, Leon J M, Li Y W, Lloyd A, Yushprakh V. Megastore: providing scalable, highly available storage for interactive services. In: Proceedings of the 5th Biennial Conference on Innovative Data System Research. 2011, 223–234
[7]
Peng D, Dabek F. Large-scale incremental processing using distributed transactions and notifications. In: Proceedings of USENIX Symposium on Operating Systems Design and Implementation. 2010, 1–15
[8]
Sears R, Ramakrishnan R. BLSM: a general purpose log structured merge tree. In: Proceedings of ACM SIGMOD International Conference on Management of Data. 2012, 217–228
CrossRef Google scholar
[9]
Bloom B H. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 1970, 13(7): 422–426
CrossRef Google scholar
[10]
Herlihy M, Wing J M. Linearizability: a correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, 1990, 12(3): 463–492
CrossRef Google scholar
[11]
Levandoski J J, Lomet D B, Sengupta S. The Bw-Tree: a B-tree for new hardware platforms. In: Proceedings of the 29th IEEE International Conference on Data Engineering. 2013, 302–313
CrossRef Google scholar
[12]
Mohan C, Haderle D J, Lindsay B G, Pirahesh H, Schwarz P M. Aries: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging. ACM Transactions on Database Systems, 1992, 17(1): 94–162
CrossRef Google scholar
[13]
DeWitt D J, Katz R H, Olken F, Shapiro L D, Stonebraker M, Wood D A. Implementation techniques for main memory database systems. In: Proceedings of ACM SIGMOD International Conference on Management of Data. 1984, 1–8
CrossRef Google scholar
[14]
Gray J, Helland P, O’Neil P E, Shasha D E. The dangers of replication and a solution. In: Proceedings of ACM SIGMOD International Conference on Management of Data. 1996, 173–182
CrossRef Google scholar
[15]
Tang Y, Sun H L, Wang X, Liu X D. An efficient and highly available framework of data recency enhancement for eventually consistent data stores. Frontiers of Computer Science, 2017, 11(1): 88–104
CrossRef Google scholar
[16]
Ongaro D, Ousterhout J. In search of an understandable consensus algorithm. In: Proceedings of USENIX Annual Technical Conference. 2014, 305–319
[17]
Wang D H, Cai P, Qian W N, Zhou A Y, Pang T Z, Jiang J. Fast log replication in highly available data store. In: Proceedings of Asia-Pacific Web and Web-Age Information Managemet Joint Conference on Web and Big Data. 2017, 245–259
CrossRef Google scholar
[18]
Severance D G, Lohman G M. Differential files: their application to the maintenance of large databases. ACM Transactions on Database Systems, 1976, 1(3): 256–267
CrossRef Google scholar
[19]
Ahmad M Y, Kemme B. Compaction management in distributed keyvalue datastores. Proceedings of the VLDB Endowment, 2015, 8(8): 850–861
CrossRef Google scholar
[20]
Tan W, Tata S, Tang Y Z, Fong L L. Diff-index: differentiated index in distributed log-structured data stores. In: Proceedings of International Conference on Extending Database Technology. 2014, 700–711
[21]
Zhu T, Hu H Q, Qian W N, Zhou A Y, Liu M Z, Zhao Q. Precise data access on distributed log-structured merge-tree. In: Proceedings of Asia-Pacific Web and Web-Age Information Managemet Joint Conference on Web and Big Data. 2017, 210–218
CrossRef Google scholar

RIGHTS & PERMISSIONS

2018 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature
AI Summary AI Mindmap
PDF(857 KB)

Accesses

Citations

Detail

Sections
Recommended

/