Relaxing Relationship Queries on Graph Data

02/24/2020
by   Shuxin Li, et al.
0

In many domains we have witnessed the need to search a large entity-relation graph for direct and indirect relationships between a set of entities specified in a query. A search result, called a semantic association (SA), is typically a compact (e.g., diameter-constrained) connected subgraph containing all the query entities. For this problem of SA search, efficient algorithms exist but will return empty results if some query entities are distant in the graph. To reduce the occurrence of failing query and provide alternative results, we study the problem of query relaxation in the context of SA search. Simply relaxing the compactness constraint will sacrifice the compactness of an SA, and more importantly, may lead to performance issues and be impracticable. Instead, we focus on removing the smallest number of entities from the original failing query, to form a maximum successful sub-query which minimizes the loss of result quality caused by relaxation. We prove that verifying the success of a sub-query turns into finding an entity (called a certificate) that satisfies a distance-based condition about the query entities. To efficiently find a certificate of the success of a maximum sub-query, we propose a best-first search algorithm that leverages distance-based estimation to effectively prune the search space. We further improve its performance by adding two fine-grained heuristics: one based on degree and the other based on distance. Extensive experiments over popular RDF datasets demonstrate the efficiency of our algorithm, which is more scalable than baselines.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset