Reachability-state encoding: bridging tradeoffs between the minimal storage and the fastest query for reachability query

Hu-Quan KANG , Xing-Li WANG , Zhou-Yang JIN , Yu-Ang DING , Yi-Ming LIU , Luo-Yi FU , Jia-Xin DING , Xin-Bing WANG

Front. Comput. Sci. ›› 2026, Vol. 20 ›› Issue (10) : 2010502

PDF (4617KB)
Front. Comput. Sci. ›› 2026, Vol. 20 ›› Issue (10) : 2010502 DOI: 10.1007/s11704-025-50122-8
Theoretical Computer Science
RESEARCH ARTICLE

Reachability-state encoding: bridging tradeoffs between the minimal storage and the fastest query for reachability query

Author information +
History +
PDF (4617KB)

Abstract

Reachability query, which determines whether a path exists between any two nodes in a graph, has been the cornerstone of graph operation research for decades. Traditional algorithms, such as depth-first search and transitive closure, are often limited by either slow query or substantial storage. Extensive efforts have increasingly focused on designing reachability indexes to balance the tradeoff between index space and query time. However, existing solutions are limited to specific tradeoff points and lack theory for adjustable, quantifiable tradeoffs. In this work, we introduce Reachability-state Encoding (RE), which adjusts the number of encoded node pairs per chunk to affect compression ratio and decoding time, thus allowing for a smooth transition between the minimal storage and the fastest query. We establish theoretical upper bounds for the minimum encoding length of RE in relation to the size and entropy of reachability-states, which quantifies the limits of compression within the encoding process and ensures superior performance in dense graphs compared to traditional methods. We have implemented a prototype algorithm, RE-toy, to validate the effectiveness of RE. Through experiments conducted on both synthetic and real-world graphs, RE-toy consistently achieves numerous optimal tradeoff points in space-time, demonstrating robust performance across graphs of varying densities. RE thus provides a practical and adaptable approach for managing the balance between storage demands and computational efficiency in reachability queries.

Graphical abstract

Keywords

reachability query / reachability index / reachability-state encoding / graph algorithm

Cite this article

Download citation ▾
Hu-Quan KANG, Xing-Li WANG, Zhou-Yang JIN, Yu-Ang DING, Yi-Ming LIU, Luo-Yi FU, Jia-Xin DING, Xin-Bing WANG. Reachability-state encoding: bridging tradeoffs between the minimal storage and the fastest query for reachability query. Front. Comput. Sci., 2026, 20(10): 2010502 DOI:10.1007/s11704-025-50122-8

登录浏览全文

4963

注册一个新账户 忘记密码

References

[1]

Newman M. Networks: An Introduction. Oxford: Oxford University Press, 2010

[2]

Zhang C, Bonifati A, Özsu M T. An overview of reachability indexes on graphs. In: Proceedings of Companion of 2023 International Conference on Management of Data. 2023, 61−68

[3]

Sakr S, Bonifati A, Voigt H, Iosup A, Ammar K, Angles R, Aref W, Arenas M, Besta M, Boncz P A, Daudjee K, Valle E D, Dumbrava S, Hartig O, Haslhofer B, Hegeman T, Hidders J, Hose K, Iamnitchi A, Kalavri V, Kapp H, Martens W, Özsu M T, Peukert E, Plantikow S, Ragab M, Ripeanu M R, Salihoglu S, Schulz C, Selmer P, Sequeda J F, Shinavier J, Szárnyas G, Tommasini R, Tumeo A, Uta A, Varbanescu A L, Wu H Y, Yakovets N, Yan D, Yoneki E . The future is big graphs: a community view on graph processing systems. Communications of the ACM, 2021, 64( 9): 62–71

[4]

Sahu S, Mhedhbi A, Salihoglu S, Lin J, Özsu M T . The ubiquity of large graphs and surprising challenges of graph processing. Proceedings of the VLDB Endowment, 2017, 11( 4): 420–431

[5]

Sahu S, Mhedhbi A, Salihoglu S, Lin J, Özsu M T . The ubiquity of large graphs and surprising challenges of graph processing: extended survey. The VLDB Journal, 2020, 29( 2): 595–618

[6]

Simon K. An improved algorithm for transitive closure on acyclic digraphs. Theoretical Computer Science, 1988, 58(1−3): 325−346

[7]

van Schaik S J, de Moor O. A memory efficient reachability data structure through bit vector compression. In: Proceedings of 2011 ACM SIGMOD International Conference on Management of Data. 2011, 913−924

[8]

Nuutila E . Efficient transitive closure computation in large digraphs. Acta Polytechnica Scandinavica: Mathematics and Computing in Engineering, 1995, 74: 1–124

[9]

Agrawal R, Borgida A, Jagadish H V . Efficient management of transitive relationships in large data and knowledge bases. ACM SIGMOD Record, 1989, 18( 2): 253–262

[10]

Jin R, Ruan N, Xiang Y, Wang H . Path-tree: an efficient reachability indexing scheme for large directed graphs. ACM Transactions on Database Systems (TODS), 2011, 36( 1): 7

[11]

in R, Xiang Y, Ruan N, Wang H. Efficiently answering reachability queries on very large directed graphs. In: Proceedings of 2008 ACM SIGMOD International Conference on MANAGEMENT of Data. 2008, 595−608

[12]

Trißl S, Leser U. Fast and practical indexing and querying of very large graphs. In: Proceedings of 2007 ACM SIGMOD International Conference on Management of Data. 2007, 845−856

[13]

Wang H, He H, Yang J, Yu P S, Yu J X. Dual labeling: answering graph reachability queries in constant time. In: Proceedings of the 22nd International Conference on Data Engineering (ICDE’06). 2006, 75

[14]

Bertino E, Atzeni P, Tan K L, Chen Y, Tay Y C, Yildirim H, Chaoji V, Zaki M J. GRAIL: scalable reachability index for large graphs. Proceedings of the VLDB Endowment, 2010, 3(1−2): 276−284

[15]

Seufert S, Anand A, Bedathur S, Weikum G. FERRARI: flexible and efficient reachability range assignment for graph indexing. In: Proceedings of 2013 IEEE 29th International Conference on Data Engineering (ICDE). 2013, 1009−1020

[16]

Cohen E, Halperin E, Kaplan H, Zwick U . Reachability and distance queries via 2-hop labels. SIAM Journal on Computing, 2003, 32( 5): 1338–1355

[17]

Jin R, Xiang Y, Ruan N, Fuhry D. 3-HOP: a high-compression indexing scheme for reachability query. In: Proceedings of 2009 ACM SIGMOD International Conference on Management of Data. 2009, 813−826

[18]

Cai J, Poon C K. Path-hop: efficiently indexing large graphs for reachability queries. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management. 2010, 119−128

[19]

Hanauer K, Schulz C, Trummer J . O’Reach: even faster reachability in large graphs. ACM Journal of Experimental Algorithmics, 2022, 27( 1): 1–27

[20]

Jin R, Wang G . Simple, fast, and scalable reachability oracle. Proceedings of the VLDB Endowment, 2013, 6( 14): 1978–1989

[21]

Cheng J, Huang S, Wu H, Fu A W C. TF-Label: a topological-folding labeling scheme for reachability querying in a large graph. In: Proceedings of 2013 ACM SIGMOD International Conference on Management of Data. 2013, 193−204

[22]

Yano Y, Akiba T, Iwata Y, Yoshida Y. Fast and scalable reachability queries on graphs by pruned labeling with landmarks and paths. In: Proceedings of the 22nd ACM international conference on Information & Knowledge Management. 2013, 1601−1606

[23]

Zhu A D, Lin W, Wang S, Xiao X. Reachability queries on large dynamic graphs: a total order approach. In: Proceedings of 2014 ACM SIGMOD International Conference on Management of Data. 2014, 1323−1334

[24]

Wei H, Yu J X, Lu C, Jin R . Reachability querying: an independent permutation labeling approach. Proceedings of the VLDB Endowment, 2014, 7( 12): 1191–1202

[25]

Su J, Zhu Q, Wei H, Yu J X. Reachability querying: can it be even faster? IEEE Transactions on Knowledge and Data Engineering, 2017, 29(3): 683−697

[26]

Lyu Q, Li Y, He B, Gong B. DBL: efficient reachability queries on dynamic graphs. In: Proceedings of the 26th International Conference on Database Systems for Advanced Applications. 2021, 761−777

[27]

Zhang C, Bonifati A, Kapp H, Haprian V I, Lozi J P. A reachability index for recursive label-concatenated graph queries. In: Proceedings of the 39th IEEE International Conference on Data Engineering (ICDE). 2023, 67−81

[28]

Valstar L D J, Fletcher G H L, Yoshida Y. Landmark indexing for evaluation of label-constrained reachability queries. In: Proceedings of 2017 ACM International Conference on Management of Data. 2017, 345−358

[29]

Peng Y, Zhang Y, Lin X, Qin L, Zhang W . Answering billion-scale label-constrained reachability queries within microsecond. Proceedings of the VLDB Endowment, 2020, 13( 6): 812–825

[30]

Chen X, Peng Y, Wang S, Yu J X . DLCR: efficient indexing for label-constrained reachability queries on large dynamic graphs. Proceedings of the VLDB Endowment, 2022, 15( 8): 1645–1657

[31]

Chen Y, Singh G . Graph indexing for efficient evaluation of label-constrained reachability queries. ACM Transactions on Database Systems (TODS), 2021, 46( 2): 8

[32]

Jin R, Peng Z, Wu W, Dragan F, Agrawal G, Ren B. Parallelizing pruned landmark labeling: dealing with dependencies in graph algorithms. In: Proceedings of the 34th ACM International Conference on Supercomputing. 2020, 11

[33]

Zhou J, Zhou S, Yu J X, Wei H, Chen Z, Tang X. Dag reduction: fast answering reachability queries. In: Proceedings of 2017 ACM International Conference on Management of Data. 2017, 375−390

[34]

Zhou J, Yu J X, Qiu Y, Tang X, Chen Z, Du M. Fast reachability queries answering based on RCN reduction (extended abstract). In: Proceedings of the 38th IEEE International Conference on Data Engineering (ICDE). 2022, 1543−1544

[35]

Tarjan R . Depth-first search and linear graph algorithms. SIAM Journal on Computing, 1972, 1( 2): 146–160

[36]

Bonamy M, Esperet L, Groenland C, Scott A. Optimal labelling schemes for adjacency, comparability, and reachability. In: Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing. 2021, 1109−1117

[37]

Williams V V, Xu Y, Xu Z, Zhou R. New bounds for matrix multiplication: from alpha to omega. In: Proceedings of 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). 2024, 3792−3835

[38]

Liu X, Liu Y, Yin B, Yang H, Luan Z, Qian D . swSpAMM: optimizing large-scale sparse approximate matrix multiplication on Sunway Taihulight. Frontiers of Computer Science, 2023, 17( 4): 174104

[39]

Awada U, Zhang J, Chen S, Li S, Yang S . Machine learning driven latency optimization for internet of things applications in edge computing. ZTE Communications, 2023, 21( 2): 40–52

[40]

Rissanen J, Langdon G G . Arithmetic coding. IBM Journal of Research and Development, 1979, 23( 2): 149–162

RIGHTS & PERMISSIONS

Higher Education Press

AI Summary AI Mindmap
PDF (4617KB)

Supplementary files

Highlights

354

Accesses

0

Citation

Detail

Sections
Recommended

AI思维导图

/