Review Comment:
This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing.
Thank you for this interesting paper on reducing the repetitiveness of information in natural language verbalisations of ontologies by application of rules to OWL axioms pre-verbalisation. The aim of doing so is to improve the readability of verbalisations and make them more useful for domain experts to read.
The work presented is certainly original and seems highly likely to be a significant and useful approach to ontology verbalisation, although I believe there could be some improvements in the presentation, and there are some points that I'd like to have clarified.
Firstly, the rules are presented as all in some sense performing a similar task, that of refining which semantic concepts should be verbalised and which should not. However, there appear to be at least two different and quite distinct sub-tasks within this, and it would be interesting and helpful to see this distinction drawn out, and perhaps separated in the text. These tasks are content selection and inference. For quite a few of the examples given, it seems that a great deal of the redundancy reduction which is achieved comes from only mentioning the most specific class in a subclass chain, rather than listing the same property repeatedly for each class in the chain (content selection), whereas others involve deriving information from, e.g., conjunctions (inference). I wonder if the rules, particularly the inference-based ones, might be easier to follow if they were stated in terms of the inferences being carried out rather than in syntactic terms.
In several of the examples, the selection of content seems a little arbitrary, with the presented "simplified" text containing less information than the original, where some of the "missing" information does *not* appear to be redundant. For example, on p3, section 1, the "simplified" example "Sam is a cat-owner having at least one cat as pet" certainly communicates the description of the individual Sam, but is missing the generic/definitional semantics in the original text - "A cat-owner is a person having at least one cat as pet". The former does not rule out the possibility of non-Sam individuals being cat-owners who have no cat as pet. Similarly on p13, the example relating to Florida in Table 10 omits that Florida is a major administrative subdivision, which, at least on a surface reading, does not appear to be redundant when compared to the rest of the information which is selected to be shown about Florida. All of this raises questions about the approach and how we can be sure in general that potentially important information isn't being omitted, and it would be very helpful if the reasons behind the selection of content were made clear.
Finally, and potentially related to the above, is the question of context-sensitivity of the word "redundant". It is possible that, in the Florida example, "major administrative subdivision" is omitted because there is a more specific subclass mentioned. However, I wouldn't say that this constitutes redundancy for someone who does not know the ontology in question and who wouldn't always recognise that a more generic class was being omitted. That is to say, the idea of redundancy depends on what the reader already knows. It would therefore be useful to have some discussion on what is intended by "redundancy" in this paper.
There is other work on redundancy in ontology verbalisation which might be relevant. Apologies for the self-citation, but http://dl.acm.org/citation.cfm?id=2392726 covers another form of redundancy which might complement the approach given here.