GraphBolt: Streaming Graph Approximations on Big Data

10/05/2018
by   Miguel E. Coimbra, et al.
0

Graphs are found in a plethora of domains, including online social networks, the World Wide Web and the study of epidemics, to name a few. With the advent of greater volumes of information and the need for continuously updated results under temporal constraints, it is necessary to explore novel approaches that further enable performance improvements. In the scope of stream processing over graphs, we research the trade-offs between result accuracy and the speedup of approximate computation techniques. We believe this to be a natural path towards these performance improvements. Herein we present GraphBolt, through which we conducted our research. It is an innovative model for approximate graph processing, implemented in Apache Flink. We analyze our model and evaluate it with the case study of the PageRank algorithm, perhaps the most famous measure of vertex centrality used to rank websites in search engine results. In light of our model, we discuss the challenges driven by relations between result accuracy and potential performance gains. Our experiments show that GraphBolt can reduce computational time by over 50 quality above 95 PageRank without any summarization or approximation techniques.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset