53
IRUS Total
Downloads
  Altmetric

Efficient partial-pairs SimRank search on large graphs

File Description SizeFormat 
partial-simrank.pdfAccepted version387.96 kBAdobe PDFView/Open
Title: Efficient partial-pairs SimRank search on large graphs
Authors: Yu, W
McCann, J
Item Type: Journal Article
Abstract: The assessment of node-to-node similarities based on graph topology arises in a myriad of applications, e.g., web search. SimRank is a notable measure of this type, with the intuition that "two nodes are similar if their in-neighbors are similar". While most existing work retrieving SimRank only considers all-pairs SimRank s(*, *) and single-source SimRank s(*, j) (scores between every node and query j), there are appealing applications for partial-pairs SimRank, e.g., similarity join. Given two node subsets A and B in a graph, partial-pairs SimRank assessment aims to retrieve only {s(a, b)}∀aεA,∀bεB. However, the best-known solution appears not self-contained since it hinges on the premise that the SimRank scores with node-pairs in an h-go cover set must be given beforehand. This paper focuses on efficient assessment of partial-pairs SimRank in a self-contained manner. (1) We devise a novel "seed germination" model that computes partial-pairs SimRank in O(k|E| min{|A|, |B|}) time and O(|E| + k|V|) memory for k iterations on a graph of |V| nodes and |E| edges. (2) We further eliminate unnecessary edge access to improve the time of partial-pairs SimRank to O(m min{|A|, |B|}), where m ≤ min{k|E|, Δ2k}, and Δ is the maximum degree. (3) We show that our partial-pairs SimRank model also can handle the computations of all-pairs and single-source SimRanks. (4) We empirically verify that our algorithms are (a) 38x faster than the best-known competitors, and (b) memory-efficient, allowing scores to be assessed accurately on graphs with tens of millions of links.
Issue Date: 1-Jan-2015
Date of Acceptance: 1-Jan-2015
URI: http://hdl.handle.net/10044/1/23679
DOI: 10.14778/2735479.2735489
ISSN: 2150-8097
Publisher: VLDB Endowment
Start Page: 569
End Page: 580
Journal / Book Title: Proceedings of the VLDB Endowment International Conference on Very Large Data Bases
Volume: 8
Issue: 5
Copyright Statement: © ACM, 2015. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the VLDB Endowment International Conference on Very Large Data Bases, 8(5), June 2015. https://dx.doi.org/10.14778/2735479.2735489
Sponsor/Funder: NEC Corporation
European Institute of Innovation and Technology - EIT
European Institute of Innovation and Technology - EIT
Intel Corporation (UK) Ltd
Commission of the European Communities
Funder's Grant Number: N/A
.
EIT ICT Labs 2014
PO #3000922346
619795
Keywords: Science & Technology
Technology
Computer Science, Information Systems
Computer Science, Theory & Methods
Computer Science
SIMILARITY
0802 Computation Theory and Mathematics
0806 Information Systems
0807 Library and Information Studies
Publication Status: Published
Appears in Collections:Computing
Faculty of Engineering