Ontology Verbalization using Semantic-Refinement

Tracking #: 1788-3001

Vinu Ellampallil Venugopal
P Sreenivasa Kumar

Responsible editor: Philipp Cimiano

Submission type: Full Paper

In this paper, we propose an inference-based technique to generate redundancy-free natural language (NL) descriptions of Web Ontology Language (OWL) entities. The existing approaches for verbalizing OWL ontologies generate NL text segments that are close to their counterpart statements in the ontology. Some of these approaches also perform grouping and aggregation of these NL text segments to generate a more fluent and comprehensive form of the content. Restricting our attention to descriptions of individuals and atomic concepts, we find that the approach currently used in the available tools is that of determining the set of all logical conditions that are satisfied by the given individual/concept name and translating these conditions verbatim into corresponding NL descriptions. Human-understandability of such descriptions is affected by the presence of repetitions and redundancies, as they have high fidelity to the OWL representation of the entities. In the literature, no major effort has been made to remove redundancies and repetitions at the logical level before generating the NL descriptions of entities, and we find this to be the main reason for the lack of readability of the generated text. In this paper, we propose a technique called semantic-refinement to generate meaningful and easily-understandable (what we call redundancy-free) text descriptions of individuals and concepts of a given OWL ontology. We identify the combinations of OWL/DL constructs that lead to repetitive/redundant descriptions and propose a series of refinement rules to rewrite the conditions that are satisfied by an individual/concept in a meaning-preserving manner. The reduced set of conditions is then employed for generating textual descriptions. Our experiments show that the semantic-refinement technique can significantly improve the readability of the descriptions of ontology entities, especially for domain experts.
We have also tested the effectiveness and usefulness of the generated descriptions in validating the ontologies and found that the proposed technique is indeed helpful in this context. Details of the empirical study and the results of statistical tests to support our claims are provided in the paper.
Minor Revision

Solicited Reviews:
Review #1
By Basil Ell submitted on 10/Apr/2018
Major Revision
Review Comment:

=== summary ===

The authors present an approach to verbalizing OWL axioms about individuals and concepts. The approach consists of a redundancy removal step and a verbalization step. Core to their method is the first step, which they refer to as semantic refinement, whereas the second step is straightforward. Semantic refinement is carried out via a set of rules and an algorithm that executes those rules in a specific order, marking certain restrictions as provisionally reduced until conditions are fulfilled under which they can actually be removed. The authors carry out an evaluation with two ontologies, PD and DSA, where they show that semantic refinement increases the understandability of verbalizations.

Verbalization of OWL axioms about individuals and concepts is relevant in the context of validating formalized domain knowledge by domain experts, who are typically not experts in OWL - thus it is a relevant research problem. The relevance of removing redundant statements is well motivated.

=== assessment ===

The work is original since redundancies have so far not been handled in ontology verbalization. However, elimination of redundancies in OWL ontologies has been studied before, and the paper should mention those works and state how and why the present approach deviates from them. Since the rule sets for redundancy detection / semantic refinement are core to the approach, the significance of the present work depends on this point. The quality of writing is good, the content is mostly understandable, and the given examples illustrate concepts well. However, the use of commas should be revisited.

=== major comments ===

0. What I am missing, as already stated above, is a consideration of redundancy elimination in OWL ontologies, e.g., Grimm, Stephan, and Jens Wissmann. "Elimination of redundancy in ontologies." Extended Semantic Web Conference. Springer, Berlin, Heidelberg, 2011.

1. It would be nice to have an online demo or at least results of the refinements of the ontologies used in the evaluation as well as the verbalization results.

2. The authors state that ontology verbalization enables domain experts to validate ontologies. It should be made clear what valid means in this context. For example, is an ontology invalid if none of its concepts are satisfiable? Or is an ontology invalid if it does not contain key concepts from a certain domain? Or is an ontology invalid if it contains anti-patterns as defined in the OntoClean methodology? Or is an ontology valid only if it captures the beliefs of most domain experts about a domain? Related to that: according to p14, the purpose of the verbalizations is to validate the domain knowledge, whereas according to p16 the purpose of the verbalizations is to validate the ontology, which is another thing.

3. Related to the previous point, depending on how validity is defined by the authors, it would be necessary to construct invalid example ontologies and check whether domain experts are enabled to identify them as invalid using the verbalizations.

4. It should be made clear that the authors use the inferred ontology O'. For example, in the definition of non-vacuous role restrictions etc. on p7, these definitions should actually be built on O' because I believe that the algorithm actually makes use of O' here. Same for the definition of node-label-set. It can be conceptually defined on O via the entailment relation and implemented via O' as set membership. The authors should make clear that the inferred ontology O' is not the set of axioms that are either explicitly contained in O or can be inferred from O, but rather a subset of what could actually be inferred. They should make clear why they have decided to have this limitation. That other possible entailments cannot be considered by the patterns defined later is not a sufficient explanation. One might ask: then why not define more patterns?

5. Would it be possible to refine an ontology and use an existing OWL verbalizer to create more sophisticated verbalization? Or is this somehow incompatible, e.g., would they carry out some reasoning steps and reintroduce redundancies?

6. p10, Condition 1. This fact deserves more description. Otherwise one might wonder whether an exhaustive search might sometimes find a label set which is smaller than the refined label set obtained when applying the rule sets in the order specified in the paper.

7. p10, Condition 2. It would be nice to have an example.

8. p10. For all three conditions an explanation would be nice regarding which lines of the algorithm realize them.

9. p11. It is not clear to me why it is necessary to introduce individuals when describing concepts. What about unsatisfiable concepts? Is it just more convenient to describe concepts (implementation-wise or when defining the concept) in this way?

10. p11, Figure 1 could be improved. Initially it was not clear to me which label set is input and which label set is the result of applying the refinement rules.

11. Regarding the evaluation I see some issues:
Were all rule sets relevant (= ever applied) given the two example ontologies PD and DSA / given the selected individuals and concepts? Were evaluators aware that two approaches were compared? If they were, then this might introduce a certain bias in favour of the proposed approach. A blind setting would help here. It would be very interesting to calculate inter-rater agreement, e.g., via Krippendorff's alpha. Given that the numbers in Figure 2 sum up to 41 for both approaches and given that for the PD ontology 41 generated descriptions were selected, it seems like there was always a majority / it was never the case that fewer than 4 experts decided on either poor/medium/good? Was every verbalization assessed by every evaluator? E.g., given the PD ontology, were all 41 descriptions assessed by seven experts?
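The agreement calculation suggested above could be done along these lines: a minimal sketch of nominal Krippendorff's alpha computed via a coincidence matrix. The rating data here is invented; the paper's actual data would be the per-description poor/medium/good judgments of the experts.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """`units`: one list of ratings per verbalized description.
    Units with fewer than two ratings carry no agreement information.
    (Assumes at least two distinct rating values occur overall.)"""
    coincidences = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue
        # Each ordered pair of ratings within a unit contributes 1/(m-1).
        for a, b in permutations(ratings, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)
    n_c = Counter()  # marginal totals per rating value
    for (a, _), w in coincidences.items():
        n_c[a] += w
    n = sum(n_c.values())
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(n_c[a] * n_c[b] for a in n_c for b in n_c if a != b)
    return 1.0 - (n - 1) * observed / expected

# Invented example: three descriptions, three raters each.
ratings = [["good", "good", "medium"],
           ["poor", "poor", "poor"],
           ["medium", "good", "medium"]]
print(round(krippendorff_alpha_nominal(ratings), 3))  # -> 0.407
```

Perfect agreement yields 1.0, and systematic disagreement yields negative values, so the measure would directly show how much the seven experts actually agreed beyond chance.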

12. p15, "choosing one individual's description from each group". I do not understand how descriptions are grouped based on their label sets.

13. p15. The project website mentioned on p12 is actually not a project website (as of April 10th, 2018), because it only provides information about the mentioned ontologies (PD, DSA, HP) and not about the approach proposed in the paper.

=== minor comments ===

p2. In the context of that section it is not clear to me what the authors mean with "systems that comprehend ..."

p3. It appears as though [12] is the wrong reference?

p4. "domain-specific NL definition": how can an ontology verbalization be not domain-specific? Probably you could remove "domain-specific"

p6/p7. "The first task is to identify the labels that induce redundancy from the label-set. We call this task as content selection". Within the NLG literature, the term content selection usually refers to the content that is selected for verbalization and not the content to be removed from the content that is to be verbalized.

p7. Your example is "All advisors of IIT students are teaching staffs" but the restrictions you write about before that sentence is "\forall hasAdvisor.Professor."

p8, Existential Role Refinement rule. It should be "\exists R.U" and "\exists S.V" instead of "\exists R.U and \exists S.U".

p8. Rule 3a is the first rule to be referred to with its rule number, whereas 1a and 2a are not.

p14. "feedbacks are collected from the experts to get suggestions on improving the system": any outcome worth mentioning?

p17, section 7.1.2. The section header has some problem.

Review #2
By Tobias Kuhn submitted on 05/Jul/2018
Review Comment:

This paper presents a verbalization approach for OWL axioms that aims at removing redundancy in a sophisticated way to provide more helpful and more concise sentences to show formal facts to users. The overall aim is interesting and the paper contains interesting parts, but it also suffers from some major issues: it doesn't appropriately discuss some serious limitations; it is too technical, too complicated, and too long for the type of its contribution; core elements contain errors or are at least very confusing; and the positive evaluation results only seem to stem from an inappropriate choice of baseline. Therefore I cannot recommend accepting the paper.

My main points of criticism:

- The restriction to OWL "individuals and concepts" is not well motivated. OWL axioms involving properties can be expected to be harder to understand, given that properties are logically more complex than individuals and classes. Therefore the restriction to individuals and classes is a serious one that limits the potential impact and usefulness.

- The use of axioms like "cat subClassOf animal" to remove "has as pet an animal" if we already know "has as pet a cat" implies that we are starting from a consistent ontology. This means that we cannot use the presented approach as a method to find the errors in inconsistent ontologies, which actually is an important problem that such verbalization tools could help us with. This assumption and the resulting limitation are not discussed.
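The removal step described here can be sketched as follows (a minimal illustration of the subsumption-based idea, not the authors' implementation; the class hierarchy and role names are invented). Note that, as argued above, dropping the weaker restriction is only meaning-preserving if the asserted hierarchy is trusted, i.e., the ontology is consistent.

```python
# Asserted subclass axioms (child -> parent), invented for illustration.
SUBCLASS_OF = {"Cat": "Animal", "Animal": "Creature"}

def is_strict_subclass(sub, sup):
    """True if `sub` is a proper (transitive) subclass of `sup`."""
    cur = SUBCLASS_OF.get(sub)
    while cur is not None:
        if cur == sup:
            return True
        cur = SUBCLASS_OF.get(cur)
    return False

def remove_subsumed(restrictions):
    """Drop an existential restriction (role, filler) when another
    restriction on the same role has a strictly more specific filler,
    e.g. drop ("hasPet", "Animal") given ("hasPet", "Cat")."""
    return [
        (r, f) for (r, f) in restrictions
        if not any(r2 == r and is_strict_subclass(f2, f)
                   for (r2, f2) in restrictions)
    ]

label_set = [("hasPet", "Cat"), ("hasPet", "Animal"), ("advises", "Student")]
print(remove_subsumed(label_set))
# -> [('hasPet', 'Cat'), ('advises', 'Student')]
```

If the ontology is inconsistent, the subsumption lookup itself is unreliable, which is exactly why the approach cannot serve as an error-finding aid in that setting.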

- The paper contains much formal notation that can be better explained in text. These formal notations are moreover not always used consistently (see below).

- It is unclear in what sense the generated descriptions are "redundancy-free". Some steps actually seem to add redundancy (see below).

- The definition of strictness is either wrong or highly counter-intuitive (see below).

- Overall, the approach seems overly complex given the relatively simple nature of the problem and lacks elegance and/or concise intuitive notations and descriptions. The paper is also too long in my view, and I had difficulty seeing the broader picture while reading through the technical descriptions.

- The baseline for the empirical evaluation is not a fair one, and the positive results seem to stem just from that unfair baseline choice (see below).

Detailed comments:


- Remove comma: "show that, semantic-refinement"


- "Web Ontology Language (OWL/DL)": DL part of acronym is not explained

- Remove commas: "so that, an intelligent agent with the help of a reasoning system, can"

- I don't find this to be a helpful or accurate explanation of what ontologies are: "Ontologies play an important role in the development and deployment of the Semantic Web since they help in enhancing the understanding of the contextual meaning of data."

- "complex relational context" ?

- "mainly strive for one-to-one conversion of logical statements to NL texts": This is not true for the cited ACE work [8,9]. These relations are not one-to-one there (different equivalent OWL axioms can lead to the same ACE text).

- The introduction contains many unsupported claims. These need evidence/references:
- "Typically, ontologies are developed by a group of knowledge engineers with the help of domain experts."
- "... the process usually follows a spiral model ..."
- "... the quality of the ontology might degrade"
- "... usually an ontology development cycle is accompanied by a validation phase"
- and many more!

Related Work:

- I don't understand how the last part ("hence ...") follows from the rest: "The Semantic Web Authoring (SWAT) NL verbalization tools have given much importance to the fluency of the verbalized sentences [13], rather than removing redundancies from their logical forms, hence have deficiencies in interpreting the ontology contents"

Section 4:

- "In this paper, we use the words “reduction” and “refinement” interchangeably.": I don't see the benefit of this, and I think the authors should commit to just one of them to avoid confusion.

- Often formal notation is used in a way that rather hinders than clarifies the reader's understanding. For example, Definition 2 is just a complicated way of referring to all inferred relations R(x,y) where R is an atomic role. Algorithm 1 is another example, which could be explained in just one sentence.

- L_O(x) is defined in Definition 1 as a structure that is fixed given x and O. However, in Algorithm 1, this notation is misused as a data structure that can be updated (and therefore can have different values for a given x and O throughout the execution of the algorithm).

- Step 2 of the algorithm on page 6 seems to introduce redundancy, which seems strange for an approach that claims "redundancy-free" representations. At least, I didn't understand what purpose is served by introducing this redundancy at that point.

- "Clearly, including the latter description in the verbalization may confuse a reader." I don't think that this statement is so obvious that it doesn't need justification.

- "if a role restriction R1 is implied by another role restriction R2 (i.e., R2 ⇒ R1), then R1 can be said as a stricter version of R2.": That doesn't seem to be right. If R1 follows from R2 then R1 is less strict and not more. Whenever R1 is violated so is R2, but not necessarily vice versa. So, R2 can be violated while R1 is not, therefore R2 is "stricter" in its intuitive sense.
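The direction of implication questioned here can be checked with a small stand-in example (invented names), using min-cardinality restrictions: if R2 = "has at least two advisors" entails R1 = "has at least one advisor", then R2 is the harder one to satisfy, i.e., the stricter one.

```python
# R2 => R1 over individuals modeled by their number of advisors.
R1 = lambda n_advisors: n_advisors >= 1  # implied, weaker restriction
R2 = lambda n_advisors: n_advisors >= 2  # implying, stricter restriction

# R2 => R1: every individual satisfying R2 also satisfies R1.
assert all(R1(n) for n in range(10) if R2(n))

# But not conversely: one advisor satisfies R1 while violating R2,
# so R2 (the antecedent), not R1, is the stricter restriction.
assert R1(1) and not R2(1)
print("R2 implies R1, and R2 is the stricter restriction")
```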

- I don't find the illustrative example on page 11 ("Illustration of the usefulness of the approach") particularly convincing. The original verbalization for a start seems to suffer from issues other than redundancy, specifically the grammatical error of number agreement in "two advisors who is a teaching staff". Then, more importantly, I learn from the original sentence that Sam is a PhD student, which I am not getting from the refined version, which makes me wonder whether the refined version is really more useful.

- The empirical evaluation doesn't seem to be a fair one, as the "traditional approach" is not really what is normally done. The Harry Potter ontology, for example, doesn't contain axioms that Hermione Granger "has as pet only creature, has at least one creature, has at most one creature, as pet". These are inferred properties, and inferring all such properties of course (by definition) introduces unnecessary redundancy. Other verbalization approaches normally don't do that, and so this is not a valid baseline to compare the proposed approach to. By only verbalizing the explicit axioms one gets a result that is very similar to the "proposed approach" one, and there is no reason to believe that it would have performed worse than the proposed approach with a user evaluation like the presented one. The worse performance of the "traditional approach" can easily be explained by the very artificial and abstract phrases like "has as factor only organism and has as factor something", which are not explicitly in the original ontologies.