aDFS: An Almost Depth-First-Search Distributed Graph-Querying System

Published in USENIX ATC, 2021

Graph processing is an invaluable tool for data analytics. In particular, pattern-matching queries enable flexible graph exploration and analysis, similar to what SQL provides for relational databases. Graph queries focus on following connections in the data; they are a challenging workload because even seemingly trivial queries can easily produce billions of intermediate results and irregular data access patterns.

In this paper, we introduce aDFS, a distributed graph-querying system that can process practically any query fully in memory while maintaining bounded runtime memory consumption. To achieve this behavior, aDFS relies on (i) almost depth-first (aDFS) graph exploration with some breadth-first characteristics for performance, and (ii) non-blocking dispatching of intermediate results to remote edges. We evaluate aDFS against state-of-the-art graph-querying systems (Neo4J and GraphFrames for Apache Spark), graph-mining systems (G-Miner, Fractal, and Peregrine), and a dataflow-join system (BiGJoin), and show that aDFS significantly outperforms prior work on a diverse selection of workloads.
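To illustrate why depth-first exploration helps bound memory, the following is a minimal single-machine sketch, not the aDFS engine itself. It counts all walks of a fixed number of hops in a small directed graph twice: once breadth-first, materializing every partial match per hop, and once depth-first, finishing one partial match before starting the next. The function names (`build_adjacency`, `match_paths_bfs`, `match_paths_dfs`) and the toy graph are assumptions made for illustration; the sketch does not model distribution, non-blocking messaging, or the breadth-first characteristics that make aDFS "almost" depth-first.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an outgoing-adjacency map from (src, dst) pairs."""
    adj = defaultdict(list)
    for src, dst in edges:
        adj[src].append(dst)
    return adj


def match_paths_bfs(adj, hops):
    """Breadth-first: materialize all partial matches of one hop before the next.

    Returns (number of matches, peak number of partial matches held at once).
    """
    frontier = [(v,) for v in list(adj)]
    peak = len(frontier)
    for _ in range(hops):
        frontier = [
            path + (nxt,)
            for path in frontier
            for nxt in adj.get(path[-1], [])
        ]
        peak = max(peak, len(frontier))
    return len(frontier), peak


def match_paths_dfs(adj, hops):
    """Depth-first: extend one partial match to completion before the next.

    Returns (number of matches, peak number of pending partial matches).
    """
    count = 0
    peak = 0
    stack = []
    for start in list(adj):
        stack.append((start,))
        while stack:
            peak = max(peak, len(stack))
            path = stack.pop()
            if len(path) == hops + 1:
                count += 1  # a complete match; it is counted, not stored
                continue
            for nxt in adj.get(path[-1], []):
                stack.append(path + (nxt,))
    return count, peak


# Toy input (hypothetical): a complete directed graph on 20 vertices, no self-loops.
n = 20
edges = [(u, v) for u in range(n) for v in range(n) if u != v]
adj = build_adjacency(edges)

bfs_count, bfs_peak = match_paths_bfs(adj, hops=3)
dfs_count, dfs_peak = match_paths_dfs(adj, hops=3)
print(f"BFS: {bfs_count} matches, peak partial matches {bfs_peak}")
print(f"DFS: {dfs_count} matches, peak partial matches {dfs_peak}")
```

Both strategies find the same matches. In the breadth-first variant the peak grows with the number of intermediate results, whereas in the depth-first variant the pending work is bounded by roughly the query depth times the maximum vertex degree.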

Recommended citation: Trigonakis, V., Lozi, J.P., Faltín, T., Roth, N.P., Psaroudakis, I., Delamare, A., Haprian, V., Iorgulescu, C., Koupy, P., Lee, J. and Hong, S., 2021. aDFS: An Almost Depth-First-Search Distributed Graph-Querying System. In 2021 USENIX Annual Technical Conference (USENIX ATC 21) (pp. 209-224).