MiNT-OLAP cluster: minimizing network transmission cost in OLAP cluster for main memory analytical database

Min JIAO; Yansong ZHANG; Zhanwei WANG; Shan WANG

doi:10.1007/s11704-012-1080-8

Front. Comput. Sci., 2012, Vol. 6, Issue 6: 668-676. DOI: 10.1007/s11704-012-1080-8

RESEARCH ARTICLE



Abstract

Powerful storage, high performance, and scalability are the most important issues for analytical databases. These three factors interact with each other: powerful storage needs less scalability but higher performance; high performance means less storage consumed by indexes and other materializations, and fewer processing nodes; larger scale relieves stress on powerful storage and on the high performance processing engine. Some analytical databases (ParAccel, Teradata) bind their performance to advanced hardware support, some (Asterdata, Greenplum) rely on the highly scalable MapReduce framework, and some (MonetDB, Sybase IQ, Vertica) emphasize the performance of the processing engine and storage engine. All these approaches can be integrated into a storage-performance-scalability (S-P-S) model, and future large scale analytical processing can be built on moderate clusters to minimize dependency on expensive hardware. Most importantly, a simple software framework is fundamental to keeping pace with the development of hardware technologies. In this paper, we propose a schema-aware on-line analytical processing (OLAP) model with deep optimization based on native features of the star or snowflake schema. The OLAP model divides the whole process into several stages, and each stage pipes its output to the next stage; we minimize the size of the output data in each stage, whether in centralized processing or clustered processing. We extend this mechanism to cluster processing using two major techniques: one uses NetMemory as a broadcasting-protocol-based dimension mirror synchronizing buffer; the other is a predicate-vector based DDTA-OLAP cluster model, which minimizes the data dependency of star-join using bitmap vectors. Our OLAP model aims to minimize network transmission cost (MiNT for short) for OLAP clusters and to support a scalable but simple distributed storage model for large scale cluster processing. Finally, the experimental results show the speedup and scalability performance.

Received June 24, 2011; accepted August 16, 2012. E-mail: shingle@ruc.edu.cn
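The predicate-vector idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that dimension tables use dense integer surrogate keys, that each dimension predicate is evaluated once into a byte vector indexed by that key, and that fact rows are then filtered by probing the vectors through their foreign keys. The names `build_predicate_vector`, `star_join_filter`, and the sample tables are hypothetical.

```python
from typing import Callable, Dict, List


def build_predicate_vector(dim_rows: Dict[int, dict],
                           predicate: Callable[[dict], bool]) -> bytearray:
    """Evaluate a predicate once per dimension row.

    Assumption: keys of dim_rows are dense, non-negative surrogate keys,
    so the key can be used directly as an index into the vector.
    """
    size = max(dim_rows, default=-1) + 1
    vec = bytearray(size)  # all zeros: no row qualifies yet
    for key, row in dim_rows.items():
        if predicate(row):
            vec[key] = 1
    return vec


def star_join_filter(fact_rows: List[dict],
                     vectors: Dict[str, bytearray]) -> List[dict]:
    """Keep fact rows whose foreign keys hit a 1 in every predicate vector."""
    return [row for row in fact_rows
            if all(vec[row[col]] for col, vec in vectors.items())]


# Hypothetical sample data in the spirit of a star schema.
customer = {0: {"region": "ASIA"}, 1: {"region": "EUROPE"}, 2: {"region": "ASIA"}}
date = {0: {"year": 1997}, 1: {"year": 1998}}
fact = [
    {"custkey": 0, "datekey": 0, "revenue": 10},
    {"custkey": 1, "datekey": 0, "revenue": 20},
    {"custkey": 2, "datekey": 1, "revenue": 30},
    {"custkey": 2, "datekey": 0, "revenue": 40},
]

vectors = {
    "custkey": build_predicate_vector(customer, lambda r: r["region"] == "ASIA"),
    "datekey": build_predicate_vector(date, lambda r: r["year"] == 1997),
}
matched = star_join_filter(fact, vectors)
total = sum(row["revenue"] for row in matched)
print(total)
```

Under these assumptions, only the compact vectors (one byte per dimension row here, one bit in a real bitmap) would need to be shipped to the nodes holding fact data, rather than the dimension tables themselves, which is the kind of saving the abstract attributes to bitmap vectors.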

Keywords

OLAP cluster / MiNT / NetMemory / schema-aware OLAP

Cite this article


Min JIAO, Yansong ZHANG, Zhanwei WANG, Shan WANG. MiNT-OLAP cluster: minimizing network transmission cost in OLAP cluster for main memory analytical database. Front. Comput. Sci., 2012, 6(6): 668-676. DOI: 10.1007/s11704-012-1080-8



RIGHTS & PERMISSIONS

Higher Education Press and Springer-Verlag Berlin Heidelberg
