Measuring the potential of client-side adaptive query optimisation for link traversal over decentralised Linked Data documents

Tracking #: 3922-5136

This paper is currently under review
Authors: 
Jonni Hanski
Simon Van Braeckel
Ruben Verborgh
Ruben Taelman

Responsible editor: 
Axel Polleres

Submission type: 
Full Paper
Abstract: 
Alongside the emergence of decentralisation initiatives that address regulatory compliance and barriers to entry to data-driven markets, the need arises for client-side query engines that reduce the overhead of service development atop such decentralised environments by abstracting away the complexities of data access. These engines, however, are responsible for performant data access in interactive applications, where user-perceived sluggishness can ultimately inhibit the adoption of the underlying decentralisation initiatives themselves. The performance cost consists of the network overhead of acquiring the data and the local processing of that data, the latter of which is the focus of our work. Prior work has demonstrated how the structure of certain decentralised environments can assist query engines in efficiently locating and accessing query-relevant data, reducing the relative impact of data access and exposing local processing as a major bottleneck. Within this work, we demonstrate the potential of client-side adaptive query planning over decentralised Linked Data documents, using the Solid ecosystem as an example environment. We also consider the impact of request rate limiting and increased network latency, to ensure our findings also apply under more realistic circumstances. Through the implementation of a restart-based query planning technique, we achieve average query execution time reductions of up to 15% compared to a baseline of unchanged query plan execution. Through the use of request rate limiting, we also identify optimisation potential in the Comunica query engine framework, with reductions of up to 60% in data transfer and 75% in system resource usage possible through smarter resource allocation.
This illustrates the importance and potential of client-side optimisation even in distributed environments, and motivates further investigation into adaptive query processing techniques for link traversal.
Tags: 
Under Review