Abstract:
Path queries are a central feature of all modern graph query languages and standards, such as SPARQL, Cypher, SQL/PGQ, and GQL. While SPARQL returns \emph{endpoints} of path queries, it is possible in Cypher, SQL/PGQ, and GQL to return \emph{entire paths}. In this paper, we present the first framework for returning paths that match regular path queries under all twenty seven modes supported by the SQL/PGQ and GQL standards. At the core of our approach is the product graph construction combined with a way to compactly represent a potentially exponential number of results that can match a path query. We describe the approach on a conceptual level and provide runtime guarantees for evaluating path queries. We also show how it can be incorporated into the SPARQL query language, thus extending RDF engines with the ability to return witnesses to property path queries (under the GQL semantics). To show the feasibility of our methods in practice, we develop a reference implementation on top of an existing open-source graph processing engine MillenniumDB, and perform a detailed analysis of path querying over several synthetic and real-world datasets. Most notably, we test our implementation over Wikidata using property path queries posted by users, extending them with the ability to return paths. We obtain order-of-magnitude speedups over modern graph database engines and remarkably stable performance, even for theoretically intractable queries.