Big data storage technologies: a survey

Aisha SIDDIQA, Ahmad KARIM, Abdullah GANI

Front. Inform. Technol. Electron. Eng, 2017, Vol. 18, Issue 8: 1040-1070. DOI: 10.1631/FITEE.1500441
Review


Abstract

There is a strong push in industry toward developing more feasible and viable tools for storing data of fast-growing volume, velocity, and diversity, termed ‘big data’. The structural shift in storage mechanisms from traditional data management systems to NoSQL technology is driven by the need to meet big data storage requirements. However, the available big data storage technologies cannot yet efficiently provide consistent, scalable, and available solutions for continuously growing heterogeneous data. Storage is the first stage of big data analytics for real-world applications such as scientific experiments, healthcare, social networks, and e-business. So far, Amazon, Google, and Apache are among the leading providers of big data storage solutions, yet the literature does not report an in-depth survey of the storage technologies available for big data that investigates their performance and magnitude gains. The primary objective of this paper is to conduct a comprehensive investigation of state-of-the-art storage technologies available for big data. A well-defined taxonomy of big data storage technologies is presented to help data analysts and researchers understand and select the storage mechanism that best fits their needs. To evaluate the performance of different storage architectures, we compare and analyze the existing approaches using Brewer’s CAP theorem. The significance and applications of storage technologies, and their support for other categories, are discussed. Several future research challenges are highlighted with the aim of expediting the deployment of a reliable and scalable storage system.

Keywords

Big data / Big data storage / NoSQL databases / Distributed databases / CAP theorem / Scalability / Consistency-partition resilience / Availability-partition resilience

Cite this article

Aisha SIDDIQA, Ahmad KARIM, Abdullah GANI. Big data storage technologies: a survey. Front. Inform. Technol. Electron. Eng, 2017, 18(8): 1040‒1070 https://doi.org/10.1631/FITEE.1500441


RIGHTS & PERMISSIONS

2017 Zhejiang University and Springer-Verlag GmbH Germany