Review Comment:
This paper presents a visual syntax, G-OWL, for representing ontologies. As the name suggests, G-OWL is closely aligned with OWL. The authors cover a lot of material in this paper: (a) presentation of the G-OWL syntax, (b) an evaluation compared to other notations using Moody’s physics of notations, and (c) a relatively small user study to evaluate G-OWL against other notations. The authors claim, for the most part, that these two evaluations establish that G-OWL is more human-readable than competing notations.
Overall, I have mixed feelings about this paper. On the positive side, it is great to see such a diverse coverage of material in the paper. It is very evident that a lot of work has gone into the design of G-OWL, with particular consideration given to human aspects, reflecting the inaccessible nature of textual languages to non-computing end users. The multi-faceted approach to evaluation is also to be commended, here exhibited by exploiting Moody’s work and the user study. On the negative side, however, I find significant shortcomings in the research, or at least how it is presented, as I will explain in the coming paragraphs. These shortcomings, in my view, arise precisely due to the diverse nature of the paper’s content: so much has been included that none of it is covered to the depth I would expect to see in a scientific paper. This paper could be split into (at least) three journal papers, each covering one of (a), (b) and (c) above.
From this perspective, then, making a recommendation to the editor on a decision for this paper is not easy. I can see many merits in the work and the shortcomings would suggest a recommendation of major revisions. In turn, these revisions would make for a very long paper indeed, and perhaps far too much for one publication. Given this line of reasoning, I feel I can only recommend ‘reject’, which feels rather harsh given the potential for the paper to be transformed into a high quality piece of research. However, this recommendation would more readily give the authors the option of considering whether to sub-divide their research into a set of individually stronger papers, each with a more directed contribution.
So, focusing on the shortcomings, I will consider each of (a), (b) and (c) in turn. My most significant reservations lie with (a) and (b).
Regarding (a), this contribution is largely covered in the paper by sections 2 and 3 and my concern relates to two specific claims made by the authors in the introduction: that G-OWL has a _completely visual_ syntax, and its symbols have semantic correspondents in W3C recommended semantic web syntax. Here, I also include one quote, taken from page 18: “The G-OWL Model is a totally visual language …. It achieves a complete visual symbolization of OWL”.
Regarding the claim that G-OWL is completely visual, which is made in many places in the paper, I find this highly disputable. Aside from the rather philosophical question of what it means to be ‘visual’ or ‘completely visual’, how can G-OWL be considered ‘completely visual’ when it exploits textual annotations for _fundamental_ parts of its syntax? For instance, the standard textual notations for the union and intersection of classes are exploited within ‘containers’ to assert that the container represents the union, resp. intersection, of the contained classes. Other standard textual notations are used in a similar way. In addition, links are annotated with opaque textual abbreviations. These abbreviated annotations are paramount to understanding the semantic content provided by the link and the linked graphical components. Surely this points towards G-OWL being a hybrid notation, using the authors’ own definition of hybrid? Indeed, the authors make a very similar point about other languages to which they compare G-OWL: “In TBCGraph, … all the kinds of properties are covered by putting OWL 2 textual expression directly on the links, making the representation dual, only partially visual…” As a further quote (my highlighting) “Although G-OWL contains *some semantic aspects denoted by textual elements* on the figures representing its entities or relations, *there are no semantic aspects that are represented textually*.” I am left with the impression that these kinds of inconsistent arguments arise in various places in the paper, including places where a criticism applied to another notation could equally be applied to G-OWL, but it is not.
Further, given the goal that G-OWL sets out to improve ontology readability, including by non-computer scientists, why were standard textual symbols exploited? Why were (readable) naming conventions for arrow labels avoided and, instead, replaced by single letters that must be internalised and have their semantics recalled when needed? Why use textual annotations in the corners of the containers used for property types? It is particularly confusing, to me at least, why a dash is used as part of some labels: S for symmetric, -S for asymmetric, R for reflexive, -R for irreflexive. The dash appears more like a negation of the first property, which would clearly have the potential for misunderstanding (obviously, asymmetric is not the same as not symmetric, and likewise irreflexive is not the same as not reflexive). Overall, the design choices made for G-OWL need more careful explanation and justification, and claims about being ‘entirely visual’ should be either more clearly made or entirely removed.
Now, regarding the quote from page 18, nowhere in the paper is it actually _proved_ that G-OWL is a complete visualization of OWL. One approach to proving this result would be to establish that the basic vocabularies of G-OWL and OWL align and, in addition, that the grammatical rules used to construct statements in each notation also align. Of course, this approach may not be appropriate. But the point is that the authors have made an unsubstantiated claim about the expressivity of G-OWL. The paper should actually prove that this property holds, not merely assert it. The extent to which this is a major addition to the paper will depend on the ease with which the result can be established.
I have some positive comments on the design, however. The use of containers to form groups of related concepts is a nice feature of the language. Visual ontology languages rarely use spatial relations between their graphical objects to convey semantics and are largely based on topological relations (being based on node-link diagrams). G-OWL is not unique, though, in its blending of topological and spatial relations in its grammar. One example is concept diagrams, which go much further than G-OWL in their use of spatial relations. Indeed, they avoid many of the textual annotations that G-OWL exploits, although not all of them. In this regard, it would have been interesting for the authors to discuss why they have not further reduced the textual annotations used by G-OWL. Based on how concept diagrams have been designed, specifically for modelling ontologies, there may be many ways to further reduce the textual elements of G-OWL via the use of more spatial syntax. It would be nice to see if this was possible.
Relatedly, the design could also be further explored and justified by appealing to the work of Shimojima, amongst others, on so-called free rides, which has been more recently extended to the notion of an observational advantage. This idea is related to work by Peirce, whom the authors already cite. Peirce writes on the direct observation of truths from an object that arise from the object’s construction. Does the G-OWL notation give rise to many observable facts, beyond those that are intended to be encoded? This is a notable point, since the ability of diagrams, and visual notations generally, to convey information explicitly that would usually require deduction is widely seen as one of their important strengths: effective visualizations should, ideally, support observability. See Shimojima’s book [e] for more on this aspect of visual notation design and effectiveness.
I will now turn my attention to (b): the evaluation compared to other notations using Moody’s physics of notations. In one sense, this evaluation is very carefully structured, as it proceeds by considering nine principles in turn that are suggestive of an effective visual language. However, the execution of this analysis, which forms sections 5.3 and 6 of the paper, is, in my view, lacking sufficient depth and objectivity to be scientifically robust. The evaluation needs to be deeply expanded, so that each of the nine principles is given its due attention. At present, the presentation suggests that the authors have cherry-picked parts of notations to argue about whether a principle is met or not and, invariably, the ‘best’ notation is G-OWL. Now, it may well be the case that Moody’s framework can indeed be used to make such a deduction. However, to be convincing, the evaluation should consider the notations in their entirety, not just carefully selected examples to illustrate the possibility that a principle is either met or not. In my view, the current write-up has a biased feel: the impression is that the authors selected properties of the notations that are geared towards supporting the superiority of G-OWL. They can readily overcome this (likely false) impression by giving a much more in-depth, carefully structured analysis of each notation with respect to the principles. Indeed, I would encourage them to carefully articulate a method by which the notations were explored in the context of seeing whether they meet, or to what extent they meet, the nine principles.
In addition to a more methodological approach to this evaluation, the authors should exercise more caution in what they can deduce from it. Whilst an entirely valid approach to evaluation, it is not possible to necessarily deduce that a notation that better meets the principles is guaranteed to be more effective. All that can be said is that, for example, notation A better meets the principles than notation B, which _suggests_ that notation A may be more effective for some tasks. However, a true judgement of whether a notation is more effective for certain tasks can never be fully established and the best evidence to support such a hypothesis is an empirical evaluation designed to test competing notations for specific tasks. A simple example of such a study can be seen in [a] below. In light of this, even accounting for the user study in the paper, I do not believe the authors can justify these two claims given in the introduction: “its semantic is easily interpretable by humans from the visual representation;” and “compared to semantic web ontology language its syntax contains a limited number of visual symbols to be easily manageable for modeling and communication to human readers and designers”. Other claims on accessibility, readability or any kind of claims about the relative effectiveness of G-OWL compared to other notations should be examined by the authors and either justified or rephrased as conjecture or opinion.
Relatedly, but not a major concern regarding (b), I was surprised that Warren et al.’s research papers on the accessibility of description logics and, specifically, the Manchester OWL syntax were not referenced, not least because they highlight accessibility issues with traditional textual ontology languages. Notable omissions include [b,c,d], although I would encourage the authors to explore other publications by these authors as they may help to motivate or justify some claims made in the paper.
My last major concern is with regard to (c): the small user study to evaluate G-OWL against other notations. Here, I can be brief: any study should be reported on in sufficient detail to allow it to be reproduced by other scientists. The details given in this paper fall far short of this requirement. I appreciate that this was a small-scale study but, nonetheless, the authors are making some deductions about the relative effectiveness of the evaluated notations. The study design is missing, with only brief details provided. No indication is given about how the participants were recruited nor confirmation that they were independent from the authors. Did they know that the authors developed G-OWL? If they did, this is a threat to the validity of the study. In fact, the threats to validity are not discussed at all. The analysis of the collected responses is also weak, with deductions, even though tentative, based on comparing absolute average scores with no attempt to check for significant differences.
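To make the last point concrete, the following is a minimal sketch of one way to check whether a difference between two average scores is larger than chance would produce. It uses a two-sided permutation test on the difference of means. Everything in it is an assumption made for illustration: the function name `permutation_test`, the variable names, and above all the scores, which are invented and are not taken from the paper. It also assumes a between-subjects design with independent groups; if the same participants used every notation, a paired analysis would be needed instead.

```python
import random
from statistics import mean


def permutation_test(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Assumes the two groups are independent (between-subjects design).
    Returns an approximate p-value.
    """
    rng = random.Random(seed)
    observed = abs(mean(scores_a) - mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign the pooled scores to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_resamples + 1)


# Hypothetical comprehension scores, invented purely for illustration.
notation_a = [7, 8, 6, 9, 7, 8]
notation_b = [6, 7, 7, 8, 6, 7]

p_value = permutation_test(notation_a, notation_b)
print(f"mean A = {mean(notation_a):.2f}, "
      f"mean B = {mean(notation_b):.2f}, p = {p_value:.3f}")
```

With samples as small as those in the study, a visible gap between averages can easily correspond to a large p-value, which is why reporting the averages alone does not support a conclusion about relative effectiveness.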
Moving on from (a), (b) and (c), it would also be useful if the authors clarified the intended use of G-OWL. At times, it felt like the goal was to (i) visualise OWL ontologies and at other times to (ii) visualise the domains being described by ontologies. These are clearly not the same goal. Specifically, the authors suggest that one goal was to help those less able to read computing-like languages to understand ontologies. Is this in the sense of (i) or (ii)? There could be some tension between _effectively visualizing an OWL ontology_ and _effectively visualizing the domain being modelled by the OWL ontology_. These different perspectives could be more clearly explored in the paper, particularly in the context of the design of G-OWL. Do these two different design goals lead to trade-offs in the design of G-OWL, for example?
I could make many other comments, some significant and others more minor, on the content of the paper. However, my main concerns raised above are sufficient to justify the recommendation made. In any revised version of this paper, I encourage the authors to carefully check all of the claims they make and ensure that they are fully justified. Any claims based on opinion are mere conjecture and should be phrased as such.
I very much hope to see a resubmission of this paper, as I believe there is the kernel of some high quality research here but it needs a lot more space dedicated to its dissemination to reveal its depth and robustness.
[a] Alharbi et al. Visual logics help people: An evaluation of diagrammatic, textual and symbolic notations.
[b] Warren et al. Improving comprehension of Knowledge Representation languages: a case study with Description Logics.
[c] Warren, Ontology Users’ Survey report, available at http://kmi.open.ac.uk/people/member/paul-warren.
[d] Warren et al. The usability of Description Logics: Understanding the cognitive difficulties presented by Description Logics.
[e] Shimojima, Semantic Properties of Diagrams and their Cognitive Potentials.