Supporting nearest neighbors query on high-dimensional data in P2P systems

LI Mei1, LEE Wang-Chien1, SIVASUBRAMANIAM Anand1, ZHAO Jizhong2

PDF(455 KB)
PDF(455 KB)
Front. Comput. Sci. ›› 2008, Vol. 2 ›› Issue (3) : 234-247. DOI: 10.1007/s11704-008-0026-7

Supporting nearest neighbors query on high-dimensional data in P2P systems

  • LI Mei1, LEE Wang-Chien1, SIVASUBRAMANIAM Anand1, ZHAO Jizhong2
Author information +
History +

Abstract

Peer-to-peer systems have been widely used for sharing and exchanging data and resources among numerous computer nodes. Various data objects identifiable with high dimensional feature vectors, such as text, images, genome sequences, are starting to leverage P2P technology. Most of the existing works have been focusing on queries on data objects with one or few attributes and thus are not applicable on high dimensional data objects. In this study, we investigate K nearest neighbors query (KNN) on high dimensional data objects in P2P systems. Efficient query algorithm and solutions that address various technical challenges raised by high dimensionality, such as search space resolution and incremental search space refinement, are proposed. An extensive simulation using both synthetic and real data sets demonstrates that our proposal efficiently supports KNN query on high dimensional data in P2P systems.

Cite this article

Download citation ▾
LI Mei, LEE Wang-Chien, SIVASUBRAMANIAM Anand, ZHAO Jizhong. Supporting nearest neighbors query on high-dimensional data in P2P systems. Front. Comput. Sci., 2008, 2(3): 234‒247 https://doi.org/10.1007/s11704-008-0026-7

References

1. Ratnasamy S, Francis P, Handley M, et al.. Scalable, distributed object location and routingfor large-scale peer-to-peer systems . In: Proceedings of ACM SIGCOMM 2001New York: ACM Press, 2001, 161–172
2. Stoica I, Morris R, Karger D, et al.. Chord: A scalable peer-to-peer lookup servicefor Internet applications. In: Proceedingsof ACM SIGCOMM 2001. New York: ACM Press, 2001, 149–160
3. Rowstron A I T, Druschel P . Pastry: Scalable, distributedobject location and routing for large-scale peer-to-peer systems. In: Proceedings of IFIP/ACM International Conferenceon Distributed Systems Platforms (Middleware).New York: ACM Press, 2001, 329–350
4. Zhao B Y, Kubiatowicz J D, Joseph A D . Tapestry: an infrastructure for fault-tolerant wide-arealocation and routing. Technical Report UCS/CSD-01-1141, Computer ScienceDivision, U. C.Berkeley, 2001
5. Andrzejak A, Xu Z . Scalable, efficient rangequeries for grid information services. In: Proceedings of IEEE International Conference on Peer-to-Peer Computing(P2P).Wahsington D.C.: IEEE ComputingSoceity, 2002, 33–40
6. Aspnes J, Shah G . Skip graphs. In: Proceedings of ACM-SIAM Symposium on Discrete Algorithms (SODA).New York: ACM Press, 2003, 384–393
7. Bharambe A R, Agrawal M, Seshan S . Mercury: Supporting scalable multi-attribute range queries. In: Proceedings of ACM SIGCOMM.New York: ACM Press, 2004, 353–366
8. Ganesan P, Bawa M, Garcia-Molina H . Online balancing of range-partitioned data with applicationsto peer-to-peer systems. In: Proceedingsof International Conference on Very Large Data Bases (VLDB).VLDB Endowment, 2004, 444–455
9. Gao J, Steenkiste P . An adaptive protocol forefficient support of range queries in DHT-based systems. In: Proceedings of IEEE International Conferenceon Network Protocols (ICNP). WashingtonD.C.: IEEE Computer Society, 2004, 239–250
10. Gupta A, Agrawal D, Abbadi A E . Approximate range selection queries in peer-to-peer systems. In: Proceedings of Biennial Conference on InnovativeData Systems Research (CIDR), 2003
11. Sahin O, Gupta A, Agrawal D, et al.. A peer-to-peer framework for caching range queries. In: Proceedings of International Conference onData Engineering (ICDE).WashintonD.C.: IEEE Computer Society, 2004, 165–176
12. Shu Y, Ooi B C, Tan KL, et al.. Supporting multi-dimensional range queries inpeer-to-peer systems. In: Proceedings ofIEEE International Conference on Peer-to-Peer Computing (P2P).Washington D.C.: IEEE Computer Society, 2005, 173–180
13. Banaei-Kashani F, Shahabi C . SWAM: a family of accessmethods for similarity-search in peer-to-peer data networks. In: Proceedings of ACM Conference on Informationand Knowledge Management (CIKM).New York: ACM Press, 2004, 304–313
14. Jagadish H V, Ooi BC, Vu Q H, et al.. VBI-Tree: a peer-to-peer framework for supportingmulti-dimensional indexing schemes. In: Proceedings of International Conference on Data Engineering (ICDE), 2006
15. Li M, Lee W-C, Sivasubramaniam A . DPTree: a balanced tree based indexingframework for peer-to-peer systems. In: Proceedings of International Conference on Network Protocols (ICNP). Washington D.C.: IEEE Computer Society, 2006, 12–21
16. Liu B, Lee W-C, Lee D L . Supporting complex multi-dimensional queries in P2P systems. In: Proceedings of International Conference onDistributed Computing Systems (ICDCS), 2005, 155–164
17. Tanin E, Nayar D, Samet H . An efficient nearest neighbor algorithm for P2P settings. In: Proceedings of National Conference on DigitalGovernment Research, 2005, 21–28
18. Li M, Lee W-C, Sivasubramaniam A . Semantic small world: An overlay networkfor peer-to-peer search. In: Proceedingsof International Conference on Network Protocols (ICNP).Washington D.C.: IEEE Computer Society, 2004, 228–238
19. Li M, Lee W-C, Sivasubramaniam A, et al.. Ssw: a small world based overlayfor peer-to-peer search. IEEE Transactionon Parallel and Distributed Systems, 2008, 19(2): 735–749. doi:10.1109/TPDS.2007.70757
20. Ganesan P, Yang B, Garcia-Molina B . One torus to rule them all: Multidimensional queriesin P2P systems. In: Proceedings of InternationalWorkshop on the Web and Databases (WebDB), 2004, 19–24
21. Tang C, Xu Z, Dwarkadas S . Peer-to-peer information retrieval using self-organizingsemantic overlay networks. In: Proceedingsof ACM SIGCOMM. New York: AMC Press, 2003, 175–186
22. Müller W, Henrich A . Fast retrieval of high-dimensionalfeature vectors in P2P networks using compact peer data summaries. In: Proceedings of ACM SIGMM International Workshopon Multimedia Information Retrieval (MIR).New York: ACM Press, 2003, 79–86
23. Aberer K . P-Grid:a self-organizing access structure for P2P information systems. In: Proceedings of International Conference onCooperative Information Systems (CoopIS) 2001, 179–194
24. Crainiceanu A, Linga P, Gehrke J, et al.. Querying peer-to-peer networks using P-trees. In: Proceedings of International Workshop on theWeb and Databases (WebDB).New York: ACM Press, 2004, 25–30
25. Houle M . E, Sakuma J . Fast approximate similaritysearch in extremely high-dimensional data sets. In: Proceedings of International Conference on Data Engineering (ICDE). Washinton DC.: IEEE Computer Society, 2005, 619–630
AI Summary AI Mindmap
PDF(455 KB)

Accesses

Citations

Detail

Sections
Recommended

/