A subgraph matching algorithm based on subgraph index for knowledge graph

Yunhao SUN, Guanyu LI, Jingjing DU, Bo NING, Heng CHEN

Front. Comput. Sci. ›› 2022, Vol. 16 ›› Issue (3) : 163606.

PDF(12660 KB)
PDF(12660 KB)
Front. Comput. Sci. ›› 2022, Vol. 16 ›› Issue (3) : 163606. DOI: 10.1007/s11704-020-0360-y
Information Systems
RESEARCH ARTICLE

A subgraph matching algorithm based on subgraph index for knowledge graph

Author information +
History +

Abstract

The problem of subgraph matching is one fundamental issue in graph search, which is NP-Complete problem. Recently, subgraph matching has become a popular research topic in the field of knowledge graph analysis, which has a wide range of applications including question answering and semantic search. In this paper, we study the problem of subgraph matching on knowledge graph. Specifically, given a query graph q and a data graph G, the problem of subgraph matching is to conduct all possible subgraph isomorphic mappings of q on G. Knowledge graph is formed as a directed labeled multi-graph having multiple edges between a pair of vertices and it has more dense semantic and structural features than general graph. To accelerate subgraph matching on knowledge graph, we propose a novel subgraph matching algorithm based on subgraph index for knowledge graph, called as F G q T-Match. The subgraph matching algorithm consists of two key designs. One design is a subgraph index of matching-driven flow graph ( F G q T), which reduces redundant calculations in advance. Another design is a multi-label weight matrix, which evaluates a near-optimal matching tree for minimizing the intermediate candidates. With the aid of these two key designs, all subgraph isomorphic mappings are quickly conducted only by traversing F G q T. Extensive empirical studies on real and synthetic graphs demonstrate that our techniques outperform the state-of-the-art algorithms.

Graphical abstract

Keywords

knowledge graph / subgraph matching / subgraph index / matching tree

Cite this article

Download citation ▾
Yunhao SUN, Guanyu LI, Jingjing DU, Bo NING, Heng CHEN. A subgraph matching algorithm based on subgraph index for knowledge graph. Front. Comput. Sci., 2022, 16(3): 163606 https://doi.org/10.1007/s11704-020-0360-y

References

[1]
Hu S , Zou L , Yu J X , Wang H , Zhao D . Answering natural language questions by subgraph matching over knowledge graphs. IEEE Transactions on Knowledge and Data Engineering, 2018, 30( 5): 824– 837
[2]
Xu Q, Wang X, Li J, Gan Y, Chai L, Wang J. StarMR: an efficient star-decomposition based query processor for SPARQL basic graph patterns using MapReduce. In: proceedings of Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data. 2018, 415-430
[3]
Cai T , Li J , Mian A S , Sellis T , Yu J X . Target-aware holistic influence maximization in spatial social networks. IEEE Transactions on Knowledge and Data Engineering, 2020,
[4]
Shekhar S, Xiong H, Zhou X. Encyclopedia of GIS: Resource Description Framework(RDF). 1st ed. Cham: Springer International Publishing, 2017
[5]
Garey M R, Johnson D S. Computers and Intractability: A Guide to the Theory of NP-Completeness. 1st ed. New York: W. H. Freeman, 1979
[6]
Kim J , Shin H , Han W H , Hong S , Chafi H . Taming subgraph isomorphism for RDF query processing. Proceedings of the VLDB Endowment, 2015, 8( 11): 1238– 1249
[7]
Ingalalli V, Ienco D, Poncelet P, Villata S. Querying RDF data using a multigraph-based approach. In: Proceedings of the 19th International Conference on Extending Database Technology. 2016, 245-256
[8]
Ma H , Langouri M A , Wu Y , Chiang F , Pi J . Ontology-based entity matching in attributed graphs. Proceedings of the VLDB Endowment, 2019, 12( 10): 1195– 1207
[9]
Cordella L P , Foggia P , Sansone C , Vento M . A (sub)graph isomorphism algorithm for matching large graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 26( 10): 1367– 1372
[10]
He H, Singh A K. Graphs-at-a-time: query language and accessmethods for graph databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2008, 405-418
[11]
Zhao P , Han J . On graph query optimization in large networks. Proceedings of the VLDB Endowment, 2010, 3( 1): 340– 351
[12]
Han W, Lee J, Lee J H. Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2013, 337–348
[13]
Bi F, Chang L, Lin X, Qin L, Zhang W. Efficient subgraph matching by postponing cartesian products. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2016, 1199−1214
[14]
Shang H , Zhang Y , Lin X , Yu J X . Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. Proceedings of the VLDB Endowment, 2008, 1( 1): 364– 375
[15]
Kim K, Seo I, Han W S, Hong S, Chafi H, Shin H, Jeong G. Turboflux: A fast continuous subgraph matching system for streaming graph data. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2018, 411-426
[16]
Ullmann J R . An algorithm for subgraph isomorphism. Journal of the ACM, 1976, 23( 1): 31– 42
[17]
Jin X, Lai L. MPMatch: A Multi-core Parallel Subgraph Matching Algorithm. In: Proceedings of IEEE 35th International Conference on Data Engineering Workshops. 2019, 241−248
[18]
Bhattarai B, Liu H, Huang H. CECI: Compact Embedding Cluster Index for Scalable Subgraph Matching. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2019, 1447−1462
[19]
Peng P , Zou L , Du Z , Zhao D . Using partial evaluation in holistic subgraph search. Frontiers of Computer Science, 2017, 12( 5): 966– 983
[20]
Ma Y , Yuan Y , Liu M , Wang G , Wang Y . Graph simulation on large scale temporal graphs. GeoInformatica, 2020, 24( 1): 199– 220
[21]
Lin P , Song Q , Wu Y . Fact checking in knowledge graphs with ontological subgraph patterns. Data Science and Engineering, 2018, 3 : 341– 358
[22]
Xu Y, Tong Y, Shi Y, Tao Q, Xu Ke, Li W. An Efficient Insertion Operator in Dynamic RideSharing Services. In: Proceedings of IEEE 35th International Conference on Data Engineering. 2019, 1022−1033
[23]
Zou L , Özsu M T , Chen L , Shen X , Huang R , Zhao D . gStore: a graph-based SPARQL query engine. The VLDB Journal, 2014, 23( 4): 565– 590
[24]
Zeng L , Zou L . Redesign of the gStore system. Frontiers of Computer science, 2018, 12( 4): 1– 19
[25]
Wang X , Chai Le , Xu Q , Yang Y , Li J , Wang J , Chai Y . Efficient subgraph matching on large RDF graphs using MapReduce. Data Science and Engineering, 2019, 4 : 24– 43
[26]
Xu Q , Wang X , Li J , Zhang Q , Chai L . Distributed subgraph matching on big knowledge graphs using pregel. IEEE Access, 2019, 7 : 116453– 116464
[27]
Malewicz G, Austern M H, Bik, A J C, Dehnert J C. Pregel: A system for large-scale graph processing. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2010, 135−146
[28]
Li J , Cai T , Deng K , Wang X , Sellis T , Xia F . Community-diversified influence maximization in social networks. Information Systems, 2020, 92 : 101522–
[29]
Ma Y , Yuan Y , Wang G , Bi X , Wang Z , Wang Y . Rising star evaluation based on extreme learning machine in geo-social networks. Cognitive Computation, 2020, 12( 1): 296– 308
[30]
Wang Y, Tong Y, Long C, Xu P, Xu K, Lv W. Adaptive dynamic bipartite graph matching: a reinforcement learning approach. In: Proceedings of IEEE 35th International Conference on Data Engineering, 2019, 1478−1489
[31]
Zheng W , Zou L , Peng W , Yan X , Song S , Zhao D . Semantic SPARQL similarity search over RDF knowledge graphs. Proceedings of the VLDB Endowment, 2016, 9( 11): 840– 851

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61976032, 62002039).

RIGHTS & PERMISSIONS

2022 Higher Education Press
AI Summary AI Mindmap
PDF(12660 KB)

Accesses

Citations

Detail

Sections
Recommended

/