Review Comment:
The paper presents a methodology for managing knowledge graph inconsistencies that originate at the rule/ontology level. Case studies on DBpedia and DBLP serve as application scenarios.
The results are significant for this journal (and its special issue). The presentation is nicely elaborated, built around easy-to-follow examples. In the following I will focus on the perceived shortcomings, mostly with respect to how the problem is motivated in the earlier sections of the paper:
Issue 1:
Methods for lifting RDF graphs out of tables are well-known and have even been subject to standardization efforts; see the W3C RDB2RDF work (R2RML) or the D2RQ tool. Their rules are generic for any kind of relational table-based structure, so the reason for having rules such as those in Listing 3 is not clear to the reader. The argument of the paper seems to be that rules 3 and 4 conflict while producing triples 1 and 2, but a typical table-to-graph conversion method would not employ such explicit, error-prone rules. It would simply assign the type based on the entity represented by the table. Under such generic rules, the ID 0 is only used in the people data source, so it would never be typed as furniture.
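To make the point concrete, the generic typing behavior can be sketched roughly as follows (a minimal illustration with hypothetical table, column, and class names; real RDB2RDF/R2RML mappings are declarative, but the typing logic is the same):

```python
# Minimal sketch of a generic table-to-triples rule, in the spirit of
# RDB2RDF direct mappings. Names ("people", "ex:Person") are hypothetical.
def rows_to_triples(table_name, entity_class, rows):
    """Type every row by the entity its table represents, then emit
    one triple per non-key column. There are no per-value typing
    rules, so a row can never receive a class from another table."""
    triples = []
    for row in rows:
        subject = f"ex:{table_name}/{row['id']}"
        triples.append((subject, "rdf:type", entity_class))
        for column, value in row.items():
            if column != "id":
                triples.append((subject, f"ex:{column}", value))
    return triples

# ID 0 appears only in the people source, so it is typed ex:Person
# and can never be typed ex:Furniture:
people = [{"id": 0, "name": "Alice"}]
triples = rows_to_triples("people", "ex:Person", people)
```

A conflict like the one between rules 3 and 4 simply cannot arise here, because the class assignment is tied to the source table rather than to hand-written per-entity conditions.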
Consequently, the example employed as motivation makes the problem feel artificial. The authors must improve this rationale - why would someone write such explicitly inconsistent rules when the existing generic transformations work fine? A more general and powerful example should support the motivation - perhaps one of the "more than 2000 inconsistencies" in DBpedia, with data sources that are not tables. The initial rule examples could also be given in the same rule language employed later in the paper; currently there is a drastic gap between the example that motivates the work and the examples that later illustrate the contribution.
Issue 2:
Also, it is not immediately convincing to state that rules and ontologies are sources of inconsistencies (they are rather the means for detecting inconsistencies). In my practical experience, the vast majority of inconsistencies come from the data sources; misalignments in ontology axioms or rules are short-lived and temporary (occurring in some intermediate, in-progress ontology draft before it is released to production - i.e., before it has anything to do with the generation of knowledge graphs). Knowledge graphs are typically produced only after a certain level of quality and stability has been reached for their vocabulary.
The authors make some distinction between "root causes" under Table 2, but it is brief and fails to introduce a credible scenario where rules/axioms are to blame - again, references to DBpedia inconsistencies could lend credibility to this argument, but in the current form the reader is left wondering about the plausibility of the problem statement (in the first half of the paper).
Issue 3:
The rule clustering approach seems to be central to this paper's proposal - yet Section 3.2.1 is one of the briefest in the entire paper (and 4.3.1 does not add much to it). How automated is this clustering? What does "the entity to which a rule relates" mean? Since the examples in Listing 3 are given in natural language, it cannot be assessed how this clustering really works or how it detects the relevant entities. Is the term "entity" used in the traditional sense of the Entity-Relationship model? Are entities classes, instances, properties, or any of these? How formal are the clustering rules?
To conclude, the authors make several assumptions that casual journal readers will not necessarily share, and clarifications are still necessary to motivate the work in a generalized context. Additionally, more detailed explanations of the clustering approach should be given.