A subgraph matching algorithm based on subgraph index for knowledge graph

Yunhao SUN, Guanyu LI, Jingjing DU, Bo NING, Heng CHEN

PDF(12660 KB)
PDF(12660 KB)
Front. Comput. Sci. ›› 2022, Vol. 16 ›› Issue (3) : 163606. DOI: 10.1007/s11704-020-0360-y
Information Systems
RESEARCH ARTICLE

A subgraph matching algorithm based on subgraph index for knowledge graph

Author information +
History +

Abstract

The problem of subgraph matching is one fundamental issue in graph search, which is NP-Complete problem. Recently, subgraph matching has become a popular research topic in the field of knowledge graph analysis, which has a wide range of applications including question answering and semantic search. In this paper, we study the problem of subgraph matching on knowledge graph. Specifically, given a query graph q and a data graph G, the problem of subgraph matching is to conduct all possible subgraph isomorphic mappings of q on G. Knowledge graph is formed as a directed labeled multi-graph having multiple edges between a pair of vertices and it has more dense semantic and structural features than general graph. To accelerate subgraph matching on knowledge graph, we propose a novel subgraph matching algorithm based on subgraph index for knowledge graph, called as F G q T-Match. The subgraph matching algorithm consists of two key designs. One design is a subgraph index of matching-driven flow graph ( F G q T), which reduces redundant calculations in advance. Another design is a multi-label weight matrix, which evaluates a near-optimal matching tree for minimizing the intermediate candidates. With the aid of these two key designs, all subgraph isomorphic mappings are quickly conducted only by traversing F G q T. Extensive empirical studies on real and synthetic graphs demonstrate that our techniques outperform the state-of-the-art algorithms.

Graphical abstract

Keywords

knowledge graph / subgraph matching / subgraph index / matching tree

Cite this article

Download citation ▾
Yunhao SUN, Guanyu LI, Jingjing DU, Bo NING, Heng CHEN. A subgraph matching algorithm based on subgraph index for knowledge graph. Front. Comput. Sci., 2022, 16(3): 163606 https://doi.org/10.1007/s11704-020-0360-y

References

[1]
Hu S , Zou L , Yu J X , Wang H , Zhao D . Answering natural language questions by subgraph matching over knowledge graphs. IEEE Transactions on Knowledge and Data Engineering, 2018, 30( 5): 824– 837
[2]
Xu Q, Wang X, Li J, Gan Y, Chai L, Wang J. StarMR: an efficient star-decomposition based query processor for SPARQL basic graph patterns using MapReduce. In: proceedings of Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data. 2018, 415-430
[3]
Cai T , Li J , Mian A S , Sellis T , Yu J X . Target-aware holistic influence maximization in spatial social networks. IEEE Transactions on Knowledge and Data Engineering, 2020,
[4]
Shekhar S, Xiong H, Zhou X. Encyclopedia of GIS: Resource Description Framework(RDF). 1st ed. Cham: Springer International Publishing, 2017
[5]
Garey M R, Johnson D S. Computers and Intractability: A Guide to the Theory of NP-Completeness. 1st ed. New York: W. H. Freeman, 1979
[6]
Kim J , Shin H , Han W H , Hong S , Chafi H . Taming subgraph isomorphism for RDF query processing. Proceedings of the VLDB Endowment, 2015, 8( 11): 1238– 1249
[7]
Ingalalli V, Ienco D, Poncelet P, Villata S. Querying RDF data using a multigraph-based approach. In: Proceedings of the 19th International Conference on Extending Database Technology. 2016, 245-256
[8]
Ma H , Langouri M A , Wu Y , Chiang F , Pi J . Ontology-based entity matching in attributed graphs. Proceedings of the VLDB Endowment, 2019, 12( 10): 1195– 1207
[9]
Cordella L P , Foggia P , Sansone C , Vento M . A (sub)graph isomorphism algorithm for matching large graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 26( 10): 1367– 1372
[10]
He H, Singh A K. Graphs-at-a-time: query language and accessmethods for graph databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2008, 405-418
[11]
Zhao P , Han J . On graph query optimization in large networks. Proceedings of the VLDB Endowment, 2010, 3( 1): 340– 351
[12]
Han W, Lee J, Lee J H. Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2013, 337–348
[13]
Bi F, Chang L, Lin X, Qin L, Zhang W. Efficient subgraph matching by postponing cartesian products. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2016, 1199−1214
[14]
Shang H , Zhang Y , Lin X , Yu J X . Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. Proceedings of the VLDB Endowment, 2008, 1( 1): 364– 375
[15]
Kim K, Seo I, Han W S, Hong S, Chafi H, Shin H, Jeong G. Turboflux: A fast continuous subgraph matching system for streaming graph data. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2018, 411-426
[16]
Ullmann J R . An algorithm for subgraph isomorphism. Journal of the ACM, 1976, 23( 1): 31– 42
[17]
Jin X, Lai L. MPMatch: A Multi-core Parallel Subgraph Matching Algorithm. In: Proceedings of IEEE 35th International Conference on Data Engineering Workshops. 2019, 241−248
[18]
Bhattarai B, Liu H, Huang H. CECI: Compact Embedding Cluster Index for Scalable Subgraph Matching. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2019, 1447−1462
[19]
Peng P , Zou L , Du Z , Zhao D . Using partial evaluation in holistic subgraph search. Frontiers of Computer Science, 2017, 12( 5): 966– 983
[20]
Ma Y , Yuan Y , Liu M , Wang G , Wang Y . Graph simulation on large scale temporal graphs. GeoInformatica, 2020, 24( 1): 199– 220
[21]
Lin P , Song Q , Wu Y . Fact checking in knowledge graphs with ontological subgraph patterns. Data Science and Engineering, 2018, 3 : 341– 358
[22]
Xu Y, Tong Y, Shi Y, Tao Q, Xu Ke, Li W. An Efficient Insertion Operator in Dynamic RideSharing Services. In: Proceedings of IEEE 35th International Conference on Data Engineering. 2019, 1022−1033
[23]
Zou L , Özsu M T , Chen L , Shen X , Huang R , Zhao D . gStore: a graph-based SPARQL query engine. The VLDB Journal, 2014, 23( 4): 565– 590
[24]
Zeng L , Zou L . Redesign of the gStore system. Frontiers of Computer science, 2018, 12( 4): 1– 19
[25]
Wang X , Chai Le , Xu Q , Yang Y , Li J , Wang J , Chai Y . Efficient subgraph matching on large RDF graphs using MapReduce. Data Science and Engineering, 2019, 4 : 24– 43
[26]
Xu Q , Wang X , Li J , Zhang Q , Chai L . Distributed subgraph matching on big knowledge graphs using pregel. IEEE Access, 2019, 7 : 116453– 116464
[27]
Malewicz G, Austern M H, Bik, A J C, Dehnert J C. Pregel: A system for large-scale graph processing. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2010, 135−146
[28]
Li J , Cai T , Deng K , Wang X , Sellis T , Xia F . Community-diversified influence maximization in social networks. Information Systems, 2020, 92 : 101522–
[29]
Ma Y , Yuan Y , Wang G , Bi X , Wang Z , Wang Y . Rising star evaluation based on extreme learning machine in geo-social networks. Cognitive Computation, 2020, 12( 1): 296– 308
[30]
Wang Y, Tong Y, Long C, Xu P, Xu K, Lv W. Adaptive dynamic bipartite graph matching: a reinforcement learning approach. In: Proceedings of IEEE 35th International Conference on Data Engineering, 2019, 1478−1489
[31]
Zheng W , Zou L , Peng W , Yan X , Song S , Zhao D . Semantic SPARQL similarity search over RDF knowledge graphs. Proceedings of the VLDB Endowment, 2016, 9( 11): 840– 851

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61976032, 62002039).

RIGHTS & PERMISSIONS

2022 Higher Education Press
AI Summary AI Mindmap
PDF(12660 KB)

Accesses

Citations

Detail

Sections
Recommended

/