Abstract:
Finding a good query plan is key to optimizing query runtime. This holds in particular for cost-based federation
engines, which rely on cardinality estimates to achieve this goal. A number of studies compare SPARQL federation
engines across different performance metrics, including query runtime, result set completeness and correctness, number of sources
selected and number of requests sent. Although informative, these metrics are generic and cannot quantify or evaluate the
accuracy of the cardinality estimators of cost-based federation engines. To thoroughly evaluate cost-based federation engines, the
effect of cardinality estimation errors on overall query runtime must be measured. In this paper, we address this
challenge by presenting novel evaluation metrics targeted at a fine-grained benchmarking of cost-based federated SPARQL query
engines. We evaluate five cost-based federated SPARQL query engines with both existing and novel evaluation metrics on
LargeRDFBench queries. Our results provide a detailed analysis of the experimental outcomes and reveal novel insights useful
for the development of future cost-based federated SPARQL query processing engines.