Review Comment:
In the context of ontology learning, enrichment, and validation, axiom scoring is the task of evaluating the acceptability of a (candidate) axiom against the known facts. The authors of this article study axiom scoring in a scenario of ontology evolution, whereby new facts (represented by RDF triples) are added to a knowledge base at different times and the underlying ontology needs to be revised based on the newly acquired knowledge.
To this aim, the authors adapt a possibilistic axiom-scoring heuristic proposed in the literature to the case of ontology evolution based on RDF data streams. After defining the problem of axiom testing against a sliding window of a stream of RDF data, focusing on the property axioms of functionality, inverse functionality, transitivity, irreflexiveness, symmetry and asymmetry, they introduce their adaptation of the possibilistic scoring approach and validate it through two experiments.
In the first experiment, they use the CMT ontology to test the extent to which the possibilistic approach is capable of correctly scoring some axioms that are known to hold, using three different sliding window sizes, and compare it to traditional information-retrieval measures such as precision and the F1-score. The results allow them to conclude that the possibilistic approach is robust and applicable when scoring axioms in streams and with limited data, more so than a strictly probability-based one.
The second experiment considers an actual scenario of ontology evolution, using the game-related fictional domain of Pokémon, where successive generations (I through IX) provide sets of instances with different properties. To deal with this scenario, the authors compare three approaches: (i) using the plain possibilistic score, (ii) using the same score together with a user-defined threshold for axiom acceptance, and (iii) defining an evolving possibilistic score as a weighted average of the past and present score of each axiom. The results suggest that approach (ii) is the most effective of the three at capturing, in the ontology, the changes occurring in the stream of instances.
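For concreteness, the three approaches can be sketched as follows (a minimal illustration with hypothetical function names; `w_p` is the past-score weight used in the paper, everything else is my assumption, not the authors' code):

```python
def accept_with_threshold(ari_score: float, threshold: float) -> bool:
    # Approach (ii): accept the axiom only when the plain possibilistic
    # score (approach (i)) reaches a user-defined threshold.
    return ari_score >= threshold

def evolving_score(past_score: float, present_score: float, w_p: float) -> float:
    # Approach (iii): evolving score as a weighted average of the score
    # computed on past windows and the score on the current window.
    return w_p * past_score + (1.0 - w_p) * present_score
```

With w_p = 0.5, for instance, a past score of 0.8 and a present score of 0.4 average to 0.6: the larger w_p, the more previous knowledge is favoured and the slower the score reacts to changes in the stream.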
The idea of applying the possibilistic axiom scoring heuristics to RDF data streams for ontology evolution is novel and the proposed adaptation of the approach is original.
The article is well-written and easy to read. I found a few typos, which are detailed below.
The empirical validation is convincing, although choosing a real-world ontology of practical relevance would have made Experiment II even more compelling; nevertheless, I am inclined to believe that the domain of Pokémon can serve as a simplified model of the phenomena one could observe in real-world scenarios.
Overall, the paper is technically sound. My comments on the adaptation of the possibilistic approach, detailed below, have to do with the presentation more than with the substantial correctness of the proposed approach.
Detailed Comments
In Section 2 the authors argue that the possibilistic axiom-scoring approach of [8] is "implicitly working under the implication of the knowledge available being complete". That's debatable, given that that framework provides for the existence of "basic statements" that are entailed by the axiom being tested but are neither confirmations nor counterexamples; in addition, it makes an explicit open-world assumption. On the other hand, it appears that the assumption of completeness is made by the authors themselves in this work, when they state that E0 = E+ + E- (cf. the definition of computeAxiom in §3.1).
In Section 3.2, the formulas for necessity and possibility (conjunctive and disjunctive) can be derived from the formulas proposed in [8] only under the assumption that E0 = E+ + E-. That's perfectly fine, as long as this assumption is justified and made explicit. However, in that case, the formulas of "conjunctive" necessity and "disjunctive" possibility could (and should) be simplified, because when E- = 0, E-/E0 = 0, and when E+ = 0, E+/E0 = 0, yielding:
- for the conjunctive necessity, N(Ax(P)) = 1, if E- = 0, and 0, if E- > 0;
- for the disjunctive possibility, \Pi(Ax(P)) = 0, if E+ = 0, and 1, if E+ > 0.
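In other words, under the assumption E0 = E+ + E-, the two simplified definitions above collapse to crisp values (a minimal sketch of the simplification, not the authors' code):

```python
def conjunctive_necessity(e_minus: int) -> float:
    # Simplified conjunctive necessity N(Ax(P)): a single
    # counterexample drives the necessity of the axiom down to 0.
    return 1.0 if e_minus == 0 else 0.0

def disjunctive_possibility(e_plus: int) -> float:
    # Simplified disjunctive possibility \Pi(Ax(P)): a single
    # confirmation drives the possibility of the axiom up to 1.
    return 1.0 if e_plus > 0 else 0.0
```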
I find that the justification for using the "disjunctive" definition of possibility and necessity for transitivity and symmetry opens an opportunity for a more in-depth discussion. As a matter of fact, the two definitions are proposed in [8] to deal with two alternative forms that the logical development of an axiom can take: if the development is in conjunctive normal form (i.e., a conjunction of many "basic statements"), the "conjunctive" definition should be used; if it is in disjunctive normal form (i.e., a disjunction of many "basic statements"), the "disjunctive" definition should be used instead. However, as is clear from Table 1, the logical developments of all the axioms dealt with here are in conjunctive normal form, because that is what one obtains from grounding a universally quantified formula. This means that the choice of the "disjunctive" definition of possibility and necessity is not justified by the theory (on the contrary, it would be wrong from a theoretical point of view), but by empirical considerations alone! At the very least, this should be pointed out, and the reasons why the theoretically wrong choice turns out to work better than the theoretically sound one in those two specific cases should be further elucidated.
Typos and minor issues:
In the Abstract:
- This paper presented -> presents
- a knowledge evolving scenario -> an evolving knowledge scenario
In the body of the paper:
- different sliding windows sizes -> different sliding window sizes
- references to other sections should be given as "Section 2", etc., and not as "(2)", etc., which is much less clear and potentially ambiguous.
- "possibility theory" does not take the article, much like "probability theory"!
- (Definition 3): one can use to describe -> one can use it to describe
- not considered either a confirmation or counterexample -> neither considered a confirmation nor a counterexample
- Any two individuals with different URIs must be \ considered as / distinct
- Shouldn't #iUri and #pUri read $iUri and $pUri, like in Code snippet 1?
- The experiments described below were done -> ... were carried out
- In code snippets 2 and 3, the FILTER clauses must be wrong: they appear to have been inverted. I think one should read FILTER ( ?o1 = ?o2 ) in code snippet 2 and FILTER ( ?o1 != ?o2 ) in code snippet 3, and not the other way around.
- In Table 7 (and in the text citing it) the negation symbol is not the usual one...
- known to not be present -> known not to be present
- accessing how the relevance -> assessing ...
- The same conclusion can be withdrawn -> ... drawn
- for axioms \ for / which there may be
- An w_p -> A w_p
- With of the introduction -> With the introduction
- in the property was not present -> if ...
- the decision reached \ using / ARI, ARI + cf, ...
- by favouring previous knowledge, it may -> ... ARI_e may
- conjunctive-form using properties -> properties using the conjunctive form
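To illustrate the remark about the FILTER clauses in code snippets 2 and 3: for a functionality axiom, a pair of triples sharing subject and property confirms the axiom when the objects coincide (?o1 = ?o2) and refutes it when they differ (?o1 != ?o2). A rough plain-Python analogue of what the two queries should compute (hypothetical names, not the authors' code):

```python
def count_functionality_evidence(triples, prop):
    # Group objects by subject for the tested property, then count the
    # bindings the SPARQL join  ?s ?p ?o1 . ?s ?p ?o2  would produce:
    # FILTER(?o1 = ?o2) -> confirmations, FILTER(?o1 != ?o2) -> counterexamples.
    by_subject = {}
    for s, p, o in triples:
        if p == prop:
            by_subject.setdefault(s, []).append(o)
    confirmations = counterexamples = 0
    for objects in by_subject.values():
        for o1 in objects:
            for o2 in objects:
                if o1 == o2:
                    confirmations += 1
                else:
                    counterexamples += 1
    return confirmations, counterexamples
```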
Assessment of the data file
The authors provide a link to the GitHub repository of the TICO_Lite tool.
Upon inspection, this repository appears to be well organized, but a README file is missing. I warmly recommend that the authors add one to the repository.
The provided resources appear to be complete: all the data artifacts used in the article are there, allowing for replication of the experiments.