Review Comment:
This document reports the second version of the work on planning and execution techniques for [R2]RML mapping rules. The proposed method relies on partitioning the mapping rules. Evaluating the groups in the partition reduces the duplicated generation of RDF triples and maximizes the parallel execution of the mapping rules. The proposed techniques are implemented in MorphKGC, an RML-compliant engine; the behavior of MorphKGC is assessed on three existing benchmarks. The reported results suggest that the proposed methods can accelerate the execution of mapping rules such as those composing the studied benchmarks.
Overall, this second version addresses some of the comments on the previous version. The new experimental results provide evidence of the benefits that planning the execution of the mapping rules brings to the process of KG construction. Moreover, the competitive behavior with respect to other engines puts the critical role played by logical and physical planning into perspective. Despite these improvements, this version still contains imprecise statements that need to be addressed in a new version of the paper.
Definitions:
Definitions 1 and 2 are still not well formulated; please reuse the notation and conventions of Section 2.2. A formal proof of when, given a data source, the two mapping documents produce the same set of RDF triples is necessary; the correctness of the proposed approach depends on it.
I do not agree with this statement:
“Answer: We have included an additional function const(.) in Section 2.2 to solve this. Note that `if` here does not refer to logics, we have replaced it with `when` to avoid reader confusion. As the values of a term map must be {constant, template or reference}, the invariant is well-defined with the bullets that consider the three possibilities.”
Since both “when” and “if” represent logical conditionals, please clarify the necessary and sufficient conditions in Definition 5.
Proposed Algorithms:
Please clearly state the assumptions under which Algorithm 3 is able to generate the Maximal Mapping Partition of an [R2]RML document.
Empirical Evaluation:
Please include absolute values and explain the impact of the selectivity of the joins on the performance of the compared engines.
Minor comments:
SDM-RDFizer v4.1.1 and Chimera v2.1 are interpreters of RML, not only parsers. Please clarify this point.