WOBTree: a write-optimized B+-tree for non-volatile memory
Haitao WANG, Zhanhuai LI, Xiao ZHANG, Xiaonan ZHAO, Song JIANG
WOBTree: a write-optimized B+-tree for non-volatile memory
The emergence of non-volatile memory (NVM) has introduced new opportunities for performance optimizations in existing storage systems. To better utilize its byte-addressability and near-DRAM performance, NVM can be attached on the memory bus and accessed via load/store memory instructions rather than the conventional block interface. In this scenario, a cache line (usually 64 bytes) becomes the data transfer unit between volatile and non-volatile devices. However, the failureatomicity of write on NVM is the memory bit width (usually 8 bytes). This mismatch between the data transfer unit and the atomicity unit may introduce write amplification and compromise data consistency of node-based data structures such as B+-trees. In this paper, we propose WOBTree, a Write-Optimized B+-Tree for NVM to address the mismatch problem without expensive logging. WOBTree minimizes the update granularity from a tree node to a much smaller subnode and carefully arranges the write operations in it to ensure crash consistency and reduce write amplification. Experimental results show that compared with previous persistent B+-tree solutions, WOBTree reduces the write amplification by up to 86× and improves write performance by up to 61× while maintaining similar search performance.
non-volatile memory / B+-tree / atomic granularity mismatch / write amplification / performance optimization
[1] |
Mittal S, Vetter J S. A survey of software techniques for using non-volatile memories for storage and main memory systems. IEEE Transactions on Parallel and Distributed Systems, 2015, 27(5): 1537–1550
CrossRef
Google scholar
|
[2] |
Li D, Liao X, Jin H, Tang Y, Zhao G.Writeback throttling in a virtualized system with SCM. Frontiers of Computer Science, 2016, 10(1): 82–95
CrossRef
Google scholar
|
[3] |
Barber R, Bendel P, Czech M, Draese O, Ho F, Hrle N, Idreos S, Kim M, Koeth O, Lee J, Li T T, Lohman G M, Morfonios K, Müller R, Murthy K, Pandis I, Qiao L, Raman V, Szabo S, Sidle R, Stolze K. Blink: not your father’s database! In: Proceedings of the 37th International Conference on Very Large Databases. 2011, 1–22
CrossRef
Google scholar
|
[4] |
Diaconu C, Freedman C, Ismert E, Larson P, Mittal P, Stonecipher R, Verma N, Zwilling M. Hekaton: SQL server’s memory-optimized OLTP engine. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. 2013, 1243–1254
CrossRef
Google scholar
|
[5] |
Plattner H. The impact of columnar in-memory databases on enterprise systems. Proceedings of the VLDB Endowment, 2014, 7(13): 1722–1729
CrossRef
Google scholar
|
[6] |
Nishtala R, Fugal H, Grimm S, Kwiatkowski M, Lee H, Li H C, McElroy R, Paleczny M, Peek D, Saab P, Stafford D, Tung T, Venkataramani V. Scaling memcache at facebook. In: Proceedings of the 10th USENIX Conference on Networked Systems Design and Implementation. 2013, 385–398
|
[7] |
Ousterhout J K, Agrawal P, Erickson D, Kozyrakis C, Leverich J, Mazières D,Mitra S, Narayanan A, Parulkar GM, Rosenblum M, Rumble S M, Stratmann E, Stutsman R. The case for ramclouds: scalable highperformance storage entirely in dram. ACM SIGOPS Operating Systems Review, 2010, 43(4): 92–105
CrossRef
Google scholar
|
[8] |
Zaharia M, Chowdhury M, Das T, Dave A, Ma J, McCauly M, Franklin M J, Shenker S, Stoica I. Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation. 2012, 15–28
|
[9] |
Chen S, Jin Q. Persistent B+-trees in non-volatile main memory. Proceedings of the VLDB Endowment, 2015, 8(7): 786–797
CrossRef
Google scholar
|
[10] |
Condit J, Nightingale E B, Frost C, Ipek E, Lee B C, Burger D, Coetzee D. Better I/O through byte-addressable, persistent memory. In: Proceedings of the 22nd ACM Symposium on Operating Systems Principles. 2009, 133–146
CrossRef
Google scholar
|
[11] |
Hwang D, Kim W, Won Y, Nam B. Endurable transient inconsistency in byte-addressable persistent B+-tree. In: Proceedings of the 16th USENIX Conference on File and Storage Technologies. 2018, 187–200
|
[12] |
Götze P, Baumann S, Sattler K. An NVM-aware storage layout for analytical workloads. In: Proceedings of the 34th IEEE International Conference on Data Engineering Workshops. 2018, 110–115
|
[13] |
Lee S K, Lim K H, Song H, Nam B, Noh S H. Wort: write optimal radix tree for persistent memory storage systems. In: Proceedings of the 15th USENIX Conference on File and Storage Technologies. 2017, 257–270
|
[14] |
Danowitz A, Kelley K, Mao J, Stevenson J P, Horowitz M. CPU DB: recording microprocessor history. Communications of the ACM, 2012, 55(4): 55–63
CrossRef
Google scholar
|
[15] |
Yang J, Wei Q, Chen C, Wang C, Yong K L, He B. Nv-tree: reducing consistency cost for nvm-based single level systems. In: Proceedings of the 13th USENIX Conference on File and Storage Technologies. 2015, 167–181
|
[16] |
Oukid I, Lasperas J, Nica A, Willhalm T, Lehner W. Fptree: a hybrid scmdram persistent and concurrent b-tree for storage class memory. In: Proceedings of the 2016 International Conference on Management of Data. 2016, 371–386
CrossRef
Google scholar
|
[17] |
Rixner S, Dally W J, Kapasi U J, Mattson P R, Owens J D. Memory access scheduling. In: Proceedings of the 27th International Symposium on Computer Architecture. 2000, 128–138
CrossRef
Google scholar
|
[18] |
Wang Y, Kapritsos M, Ren Z, Mahajan P, Kirubanandam J, Alvisi L, Dahlin M. Robustness in the salus scalable block store. In: Proceedings of the 10th USENIX Conference on Networked Systems Design and Implementation. 2013, 357–370
|
[19] |
Moraru I, Andersen D G, Kaminsky M, Tolia N, Ranganathan P, Binkert N L. Consistent, durable, and safe memory management for byteaddressable non volatile main memory. In: Proceedings of the 1st ACM SIGOPS Conference on Timely Results in Operating Systems. 2013, 1–17
CrossRef
Google scholar
|
[20] |
Volos H, Magalhaes G, Cherkasova L, Li J. Quartz: a lightweight performance emulator for persistent memory software. In: Proceedings of the 16th Annual Middleware Conference. 2015, 37–49
CrossRef
Google scholar
|
[21] |
Arulraj J, Pavlo A, Dulloor S. Let’s talk about storage & recovery methods for non-volatile memory database systems. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. 2015, 707–722
CrossRef
Google scholar
|
[22] |
Kim W, Kim J, Baek W, Nam B, Won Y. NVWAL: exploiting nvram in write-ahead logging. ACM SIGOPS Operating Systems Review, 2016, 50(2): 385–398
CrossRef
Google scholar
|
[23] |
Chen A. A review of emerging non-volatile memory (NVM) technologies and applications. Solid-state Electronics, 2016, 125(Nov): 25–38
CrossRef
Google scholar
|
/
〈 | 〉 |