Redesign of the gStore system

Li ZENG, Lei ZOU

PDF(601 KB)
PDF(601 KB)
Front. Comput. Sci. ›› 2018, Vol. 12 ›› Issue (4) : 623-641. DOI: 10.1007/s11704-018-7212-z
RESEARCH ARTICLE

Redesign of the gStore system

Author information +
History +

Abstract

gStore is an open-source native Resource Description Framework (RDF) triple store that answers SPARQL queries by subgraph matching over RDF graphs. However, there are some deficiencies in the original system design, such as answering simple queries (including onetriple pattern queries). To improve the efficiency of the system, we reconsider the system design in this paper. Specifically, we propose a new query plan generation module that generates different query plans according to the structures of query graphs. Furthermore, we re-design our vertex encoding strategy to achieve more pruning power and a new multi-join algorithm to speed up the subgraph matching process. Extensive experiments on synthetic and real RDF datasets show that our method outperforms the state-of-the-art algorithms significantly.

Keywords

graph database / subgraph matching / RDF management / SPARQL query

Cite this article

Download citation ▾
Li ZENG, Lei ZOU. Redesign of the gStore system. Front. Comput. Sci., 2018, 12(4): 623‒641 https://doi.org/10.1007/s11704-018-7212-z

References

[1]
Bollacker K D, Cook R P, Tufts P. Freebase: a shared database of structured general human knowledge. In: Proceedings of the 22nd AAAI Conference on Artificial Intelligence. 2007, 1962–1963
[2]
Lehmann J, Isele R, Jakob M, Jentzsch A, Kontokostas D, Mendes P N, Hellmann S, Morsey M, van Kleef P, Auer S, Bizer C. Dbpedia – A largescale, multilingual knowledge base extracted from Wikipedia. Semantic Web, 2015, 6(2): 167–195
[3]
Neumann T, Weikum G. RDF-3X: a RISC-style engine for RDF. Proceedings of the VLDB Endowment, 2008, 1(1): 647–659
CrossRef Google scholar
[4]
Neumann T, Weikum G. The RDF-3X engine for scalable management of RDF data. VLDB Journal, 2009, 19(1): 91–113
CrossRef Google scholar
[5]
Weiss C, Karras P, Bernstein A. Hexastore: sextuple indexing for semantic Web data management. Proceedings of the VLDB Endowment, 2008, 1(1): 1008–1019
CrossRef Google scholar
[6]
Zou L, Mo J H, Chen L, Özsu M T, Zhao D Y. gStore: answering SPARQL queries via subgraph matching. Proceedings of the VLDB Endowment, 2011, 4(8): 482–493
CrossRef Google scholar
[7]
Zeng K, Yang J C, Wang H X, Shao B, Wang Z Y. A distributed graph engine forWeb scale RDF data. Proceedings of the VLDB Endowment, 2013, 6(4): 265–276
CrossRef Google scholar
[8]
Zou L, Özsu M T, Chen L, Shen X C, Huang R Z, Zhao D Y. gStore: a graph-based SPARQL query engine. VLDB Journal, 2014, 23(4): 565–590
CrossRef Google scholar
[9]
Aluç G. Workload matters: a robust approach to physical RDF database design. Dissertation for the Doctoral Degree. Waterloo: University of Waterloo, 2015
[10]
Ingalalli V, Ienco D, Poncelet P, Villata S. Querying RDF data using a multigraph-based approach. In: Proceedings of the 19th International Conference on Extending Database Technology. 2016, 245–256
[11]
Nabti C, Seba H. A simple algorithm for subgraph queries in big graphs. 2017, arXiv preprint arXiv:1703.05547
[12]
Erling O. Virtuoso, a hybrid rdbms/graph column store. IEEE Data Engineering Bulletin, 2012, 35(1): 3–8
[13]
Mcbride B. Jena: a semantic Web toolkit. IEEE Educational Activities Department, 2002
[14]
Guo Y B, Pan Z X, Heflin J. LUBM: a benchmark for OWL knowledge base systems. Web Semantics Science Services and Agents on the World Wide Web, 2005, 3(2): 158–182
CrossRef Google scholar
[15]
Ullmann J R. An algorithm for subgraph isomorphism. Journal of the ACM, 1976, 23(1): 31–42
CrossRef Google scholar
[16]
Cordella L P, Foggia P, Sansone C, Vento M. A (sub)graph isomorphism algorithm for matching large graphs. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2004, 26(10): 1367–1372
CrossRef Google scholar
[17]
Zhao P X, Han J W. On graph query optimization in large networks. VLDB Endowment, 2010, 3(3): 340–351
CrossRef Google scholar
[18]
Zhu K, Zhang Y, Lin X M, Zhu G P, Wang W. Nova: a novel and efficient framework for finding subgraph isomorphism mappings in large graphs. In: Proceedings of the International Conference on Database Systems for Advanced Applications. 2010, 140–154
CrossRef Google scholar
[19]
Peng P, Zou L, Chen L, Lin X M, Zhao D Y. Subgraph search over massive disk resident graphs. In: Proceedings of the International Conference on Scientific and Statistical Database Management. 2011, 312–321
CrossRef Google scholar
[20]
Peng P, Zou L, Chen L, Lin X M, Zhao D Y. Answering subgraph queries over massive disk resident graphs. World Wide Web, 2016, 19(3): 417–448
CrossRef Google scholar
[21]
Lee J, Han W S, Kasperovics R, Lee J H. An in-depth comparison of subgraph isomorphism algorithms in graph databases. Proceedings of the VLDB Endowment, 2012, 6(2): 133–144
CrossRef Google scholar
[22]
Han W S, Lee J, Lee J H. Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases. In: Proceedings of ACM SIGMOD International Conference on Management of Data. 2013, 337–348
CrossRef Google scholar
[23]
Kim J, Shin H, Han W S, Hong S, Chafi H. Taming subgraph isomorphism for RDF query processing. Proceedings of the VLDB Endowment, 2015, 8(11): 1238–1249
CrossRef Google scholar
[24]
Ren X G, Wang J H. Exploiting vertex relationships in speeding up subgraph isomorphism over large graphs. Proceedings of the VLDB Endowment, 2015, 8(5): 617–628
CrossRef Google scholar
[25]
McKay B D, Piperno A. Practical graph isomorphism, II. Journal of Symbolic Computation, 2014, 60(1): 94–112
CrossRef Google scholar
[26]
Shang H C, Zhang Y, Lin X M, Yu F X. Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. Proceedings of the VLDB Endowment, 2008, 1(1): 364–375
CrossRef Google scholar
[27]
Bi F, Chang L J, Lin X M, Qin L, Zhang W J. Efficient subgraph matching by postponing Cartesian products. In: Proceedings of ACM International Conference on Management of Data. 2016, 1199–1214
CrossRef Google scholar
[28]
Atre M, Chaoji V, Zaki M J, Hendler J A. Matrix “bit” loaded: a scalable lightweight join query processor for RDF data. In: Proceedings of the International Conference on World Wide Web. 2010, 41–50
CrossRef Google scholar
[29]
Peng P, Zou L, Özsu M T, Chen L, Zhao D Y. Processing SPARQL queries over distributed RDF graphs. VLDB Journal, 2016, 25(2): 243–268
CrossRef Google scholar

RIGHTS & PERMISSIONS

2018 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature
AI Summary AI Mindmap
PDF(601 KB)

Accesses

Citations

Detail

Sections
Recommended

/