Algorithms and Analysis for the SPARQL Constructs
As the Resource Description Framework (RDF) becomes a popular data modelling standard, the efficient processing of Basic Graph Pattern (BGP) SPARQL queries (analogous to SQL inner joins) has been a focus of the research community over the past several years. In our recently published work, we brought the community's attention to another equally important component of SPARQL: OPTIONAL pattern queries (analogous to SQL left outer joins). We proposed novel optimization techniques, the first of their kind, and showed experimentally that, compared to state-of-the-art methods, they perform better on low-selectivity queries and on par on highly selective queries. BGPs and OPTIONALs (BGP-OPT) are the basic building blocks of the SPARQL query language. In this paper, treating our BGP-OPT query optimization techniques as primitives, we extend them to handle other components of SPARQL, such as UNION, FILTER, and DISTINCT. We focus mainly on the procedural (algorithmic) aspects of these extensions. We also make several important observations about the structural aspects of complex SPARQL queries that intermix these clauses, and we relax some of the previously proposed constraints on the cyclic properties of queries. We do so without affecting the correctness of the results, thus providing more flexibility in using the BGP-OPT optimization techniques.