A Conceptual Model for Ontology Quality Assessment

Tracking #: 3003-4217

Authors: 
Shyama Wilson
J. S. Goonetillake
W.A. Indika
Athula Ginige

Responsible editor: 
Aldo Gangemi

Submission type: 
Survey Article
Abstract: 
With the continuous advancement of methods, tools, and techniques in ontology development, ontologies have emerged in various fields such as machine learning, robotics, biomedical informatics, agricultural informatics, crowdsourcing, database management, and the Internet of Things. Nevertheless, the nonexistence of a universally agreed methodology for specifying and evaluating the quality of an ontology hinders the success of ontology-enabled systems in such fields as the quality of each component is required for the overall quality of a system and in turn impact the usability in use. Moreover, a number of anomalies in definitions of ontology quality concepts are visible, and in addition to that, the ontology quality assessment is limited only to a certain set of characteristics in practice even though some other significant characteristics have to be considered for the specified use-case. Thus, in this research, a comprehensive analysis was performed to uncover the existing contributions specifically on ontology quality models, characteristics, and the associated measures of these characteristics. Consequently, the characteristics identified through this review were classified with the associated aspects of the ontology evaluation space. Furthermore, the formalized definitions for each quality characteristic are provided through this study from the ontological perspective based on the accepted theories and standards. Additionally, a thorough analysis on the extent to which the existing works have covered the quality evaluation aspects is presented and the areas further to be investigated are outlined.
Tags: 
Reviewed

Decision/Status: 
Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 15/Feb/2022
Suggestion:
Accept
Review Comment:

I would like to thank the authors for their revisions. I really like the new version of the paper and I believe it is ready for publication, providing a useful contribution to the community.

Review #2
By Stefano De Giorgis submitted on 22/Aug/2022
Suggestion:
Major Revision
Review Comment:

-- General Review --

Thanks to the authors for their work and the time dedicated to this paper.
Although I recognise the considerable improvements and the amount of work the authors have done to integrate the missing resources and include them properly in the text, I'm sorry to say that the quality of the paper does not seem to me appropriate for a "minor" revision, for the following reasons:

1) Content issues
2) Readability issues

1)
The "Survey articles" for SWJ have to be evaluated under the following four dimensions:
(a) Suitability as introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic.
(b) How comprehensive and how balanced is the presentation and coverage.
(c) Readability and clarity of the presentation.
(d) Importance of the covered material to the broader Semantic Web community.

Regarding (d): this paper tries to tackle a well-known problem, therefore it fully satisfies the requirement.
Regarding (b): considerable improvements have been made since the previous version, so the paper now satisfies the requirement.
Regarding (a) and (c): I still have many concerns, detailed below.

The paper structure could be further improved, e.g. parts of Section 4 (e.g. quality models already mentioned in Sec. 3) should go before Section 3, maybe better integrated with section 2.
How is "Relevancy" different from "External Consistency" ?
The Conclusion should better underline the work done and the classification produced.
The Discussion section should highlight critical open issues (ideally proposing possible solutions) and the main areas of disagreement among the reviewed works.
Finally, as a minor issue, prefixes like "rdfs" (for RDF Schema) are not expanded anywhere; this could be done in another Appendix.

2)
Although I haven't spotted any spelling mistakes, unfortunately the overall readability of the paper has not improved since the previous version.
I found it really difficult to proceed, for 3 main reasons:
- A considerable number of syntactic errors and a lack of proper punctuation: this seems to have been partially tackled since the previous version, as there are fewer errors, but there still seems to be at least one per paragraph, which makes it really difficult to keep reading (further info in the "Detailed Review")
- Vagueness of terms: I perfectly understand the need to vary the terms in a long paper like this one, in order to avoid too many repetitions, but the usage of e.g. "context", or expressions like "the semantic nature of the ontology", especially in the first 15 pages, makes it difficult to follow and understand what we are talking about.

Here is a detailed review, page by page.

-- Detailed Review --

Page 1 - First line: "Ontology" --> "an ontology"; furthermore, the reference is correct, but the sentence is too generic in saying "for some purpose": if it's a quotation, please put it in quotation marks; if not, please rephrase.

Page 2
1st column:
What do you mean by "domain knowledge from operational knowledge"? Please rephrase.
"To this end, it would be..." --> this sentence is incorrect
"However, such a methodology, model or..." --> this sentence is incorrect
"For instance, it has been revealed that..." --> what do you mean by "ontology enabled system"?
"Furthermore, quality is considered as a judgement and not a feature." --> Do you mean that ontologies are ranked according to their quality? If so, rephrase please.

2nd column
"From the ontological point of view, a user would be the person who interacts with an ontology or an ontology-enabled IS" --> what do you mean by "ontology-enabled IS"?
"The context not only considers the..." --> insert "the" before "user type" and before "available equipment".
"Figure 1 shows the abstract process" --> It should probably be "abstraction process", but I'm not sure this definition fits here; it is more a schematization of the quality requirement specification process (change accordingly also in Fig. 1)
"There are countable quality models" --> several?
"None of them are..." --> "None of them is..."
"due to some of them are generic" --> this sentence is incorrect
"Moreover, some of the quality models are specific to the context such as..." --> Do you mean "domain dependent?" The notion of "context" is very generic, please be more specific

Page 3

1st column
"face difficulties as mentioned follows due to..." --> this sentence is incorrect + it ends with ";" while followed by a numbered list, should it be a ":" ?

Page 4

2nd column
"Exclusion criteria" --> the sentence should end with "."

Page 6

2nd column

"These measures are often useful to observe the dispersion of the ontology structure..." --> "dispersion" seems to be used here as in machine learning, but it sounds in some way incorrect due to its being related to the "ontology structure" (Tbox, better "ontology schema") while it is usually applied to data (Abox), please clarify

Page 7

1st column

XD [49,50], and AMOD [4], and CDOAM framework [19] --> remove "and" before "AMOD"

Page 8

2nd column
"In the following sections we have illustrated..." --> we illustrate

Page 9

1st column
"By adopting ROMEO methodology [75], the following questions were formulated to derive the intrisinc quality requirements." --> "are formulated"; in general, all the verbs should be modified consistently.
"For instance, to provide accurate pest and disease knowledge for farmers, ontology representation should be correctly modeled." --> "ontological modeling" or "domain knowledge representation"
"For that, all required definitions (i.e., axioms) with regards to pest and disease management should be correctly defined in the ontology." --> definitions and axioms are not exactly the same thing, a definition could be provided e.g. via rdfs:comment in order to be human readable, axioms allow further automatic reasoning, it would be better to keep them separated.
"An interesting reader may refer" --> "interested"?
"- Q1:" --> this seems to refer to the "ontology completeness" matter to some extent, it could be worth mentioning it.

2nd column
"For our context, external consistency can be evaluated using the measure of precision which provides a ratio between the correctly defined definitions (i.e., axioms) and the total definitions defined in the ontology." --> Does "For our context" mean "in our framework" or "in the use case under examination"? Please clarify. Furthermore, this measure still does not provide a way to assess completeness in terms of the presence or absence of a definition: that is, not only whether some axiomatized class generates inconsistencies or fails to respect the domain experts' definitions, but also whether the domain knowledge is properly covered by the classes and properties in the ontology.
"To this end, if the value of precision is one..." --> Do you mean "if the precision value is equal to 1" ? if so, please clarify also if it's a boolean value or not, and if you consider any acceptable threshold and how.
"To this end, if the value of precision is one then it implies that the particular ontology has correctly captured all axioms in the considered context. " --> "has properly represented the domain in object, and properly satisfied domain experts competency questions."
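To illustrate the point about precision versus completeness, here is a minimal sketch. It assumes axioms can be compared as simple triples against a reference set agreed with domain experts; the triples, the `gold` set, and the function names are hypothetical and not taken from the paper.

```python
def precision(ontology_axioms, gold_axioms):
    """Share of the ontology's axioms that also appear in the reference set."""
    if not ontology_axioms:
        return 0.0
    return len(ontology_axioms & gold_axioms) / len(ontology_axioms)


def recall(ontology_axioms, gold_axioms):
    """Share of the reference axioms that the ontology actually contains."""
    if not gold_axioms:
        return 0.0
    return len(ontology_axioms & gold_axioms) / len(gold_axioms)


# Hypothetical reference axioms for a pest and disease use case.
gold = {
    ("Aphid", "subClassOf", "Pest"),
    ("Blight", "subClassOf", "Disease"),
    ("Aphid", "affects", "Tomato"),
    ("Blight", "affects", "Potato"),
}

# Hypothetical ontology: every axiom is correct, but half the domain is missing.
ontology = {
    ("Aphid", "subClassOf", "Pest"),
    ("Blight", "subClassOf", "Disease"),
}

p = precision(ontology, gold)  # 1.0: nothing in the ontology is wrong
r = recall(ontology, gold)     # 0.5: only half of the reference is covered
```

In this toy case precision equals 1 even though half of the reference axioms are absent, which is why a precision value alone cannot show that the domain is properly covered.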

Page 10

1st column
"Design patterns" in the table --> unnecessary space between words
"Coupling - The degree of relatedness between ontology module" --> "modules"

2nd column
"Quality Requirements and Evaluation" --> "a" should be in bold

Page 11

1st column
Garvin's quality model --> "Performance" has a "p" from another font

2nd column
"Semiotic theory [123] has taken as the foundation to this model as the ontology has a semiotic nature" --> This sentence is incorrect
"precision, Recall (i.e., coverage)" --> "recall"

Page 12

1st column
"To this end the attributes: presence, amount, completeness, and reliability have been described under three levels: recognition annotation, efficiency annotation, and interfacing annotation." --> In your model or by whom? please clarify
"Based on our survey, it has been recognized that OQuaRE [6] has not focused on the semantic nature of the ontology that is an essential component..." --> "...ontology, which is an essential..." but it is still unclear what is meant by "the semantic nature of the ontology": do you refer to the actual coverage of domain knowledge?

2nd column
"cognitive ergonomic." --> "ergonomics" ?
"Gangemi et al's" --> "Gangemi et al.'s"
"This can be viewed as dimensions" --> "These can be considered..."

Page 13

2nd column
"...on constructing a relational (i.e. non-hierarchical)" --> the taxonomic and relational organization of entities in an ontology are not at all in conflict, please rephrase or provide a better example.

Page 14
"conciseness (Precision)" --> "precision" + maybe unnecessary space before the parenthesis

Page 15

2nd column
"Then, the identified characteristics were grouped under the four evaluation aspects also can be viewed as dimensions namely:" --> this sentence is incorrect

Page 16

"Thus, the evaluation does not depend on the knowledge of the domain that an ontology is being modeled." --> This sentence is incorrect
"As well, the characteristics can be quantitively evaluated and are automatable." --> What is meant here? The evaluation can be automated? How?
"Therefore, many artifacts have been..." --> is "artifact" the correct word here?

Page 17

1st column
"...understand the ontology and to interpret the knowledge." --> remove "the" before "knowledge", or use a different term; also, it is unclear whether what is meant here is the knowledge base modeled in the ontology or the knowledge in some real-world domain.

2nd column
"For instance, Sánchez..." --> remove the "are" after the parenthesis before the comma.
"with respect to description Logic" --> put both upper case or both lower case
"which stated that" --> "states"

Page 18

1st column
"design patterns (i.e., best practices)" --> design patterns constitute one of the good modelling practices, but it seems to me unnecessary to mention "best practices" here, since they are not an example of design patterns (although the inverse is true).
"and its denial not-S is true at the same time" --> "as true" ? This sentence is incorrect
"However, all the definitions mentioned so far describe the term module" --> Is this sentence lacking a negation ("don't describe the term...")?

2nd column
"Other articles have described the attributes related to modularity mainly cohesion, and coupling" --> This sentence is lacking punctuation + there is an unnecessary "." .
"In the light of this, we defined" --> the first expression is a bit conversational, but mainly: "we define" (present)
"which is given in definition 4" --> I would put all references to "Definitions" uppercase
"modularity can be placed in both structural and..." --> "Modularity can be classified both as... and...as shown in Table 6"

Page 19

Table 7 --> Relationship Richness missing subscripts for "Ci"
Table 7 --> Coupling: "class in external ontologies which referenced by the..." This sentence is incorrect

Page 20

1st column
"design principles (i.e. design patterns [18,85], principles)" --> insert "and" after comma
"in a way that an ontology consists of instances only with unique formal definitions." --> I don't understand this sentence: individuals are different by default because they have different URIs; if they have the same URI, either they are different types of entities (e.g. classes and individuals realized via punning in OWL2) or they are the same individual.

2nd column
"an ontology is complete if and only if;" --> "...if:"

Page 21

1st column
"the author has also discussed" --> "The author..."

2nd column
"also it has been named as ease of use in [93]" --> highlight in some way that "ease of use" is the name used, e.g. italic, quotation marks etc.

Page 23

1st column
"Fox and Grüninger [104] defined functional completeness as “can ontology..." --> please integrate the quotation better, since it seems to be a question but has no question mark.

Page 24

1st column
"the ontology are required to interpret by the..." --> "to be interpreted by" ? If so, I'm not sure "interpret" is the best choice here.
"particular application the ontology is integrated." --> "integrated in."

2nd column

"changes in application needs that the ontology is integrated," --> "integrated in" ? or "requires integrations to the original ontology"
"Sometimes, when changes are performed to the ontology that may cause inconsistencies" --> Review the punctuation or replace "that" with "these may cause"

Page 25

2nd column
"real domain" --> "real world use case" ?

Page 26

1st column
Definition 15:
"information" seems here an unfortunate lexical choice, since "information" is semantic content before formal organisation, which becomes "knowledge" in ontological structure, I would suggest to revise this definition

2nd column
"Credibility is the quality of being trusted and believe in" --> "believed in"
"Ontology is not directly accessed by the users and usually, it is accessed through applications." --> "Ontologies are..." + this seems to me false: ontologies can be accessed via applications, which often show only the inference results, if the end user is using ontology based reasoning systems, but assuming that ontologies are in general not directly accessed by users seems in conflict with FAIR principles and with the "Accessibility" dimension itself.

Page 27

1st column
Definition 19 --> I would remove the "through its application" part of the definition, since it is exactly a high Accessibility which could foster developing an ontology application.

2nd column
"although it is an important characteristic." --> why? according to whom? I do agree that it is an important characteristic that has to be taken into account, but without a proper explanation expressions like this one are of no use to the reader.

Recoverability definition --> It seems to me that the final definition and the one proposed by Duque-Ramos are pointing in slightly different directions: the original definition is focused on "how much an ontology can recover its performance level when incurring in some failure", while the definition proposed in the paper seems to focus more on "the degree of consistency maintained in the event of failure". The difference to me is that the one proposed in the paper introduces some sort of quantitative factor, which seems pretty interesting, and tries to tackle the question "how much of the ontology is still consistent in the eventuality of some failure?". E.g. imagine having in one ontology the class :HorseOrUnicorn, which includes both horses and unicorns under the same class, and in another ontology two classes :Horse and :Unicorn. Now imagine discovering that unicorns don't exist. (Pity.) The first ontology would need to be remodelled, since it was collapsing two different entities under the same class, while the second one could simply deprecate the :Unicorn class, or remove it from the module, while still being consistent in its :Horse class and axioms.
It seems an interesting proposal, but 1. I would not call this "Recoverability" and 2. it would need some further explanation.
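The horse and unicorn example can be made concrete with a rough sketch. It assumes, purely for illustration, that each class is modelled as the set of real-world concepts it covers and that a class "survives" a retraction if it does not cover the retracted concept; `surviving_fraction` is a hypothetical measure and is not the definition given in the paper or by Duque-Ramos.

```python
def surviving_fraction(classes, retracted):
    """Fraction of classes left untouched when one concept is retracted.

    `classes` maps a class name to the set of concepts it covers
    (a simplification assumed only for this illustration).
    """
    if not classes:
        return 0.0
    intact = [name for name, concepts in classes.items()
              if retracted not in concepts]
    return len(intact) / len(classes)


# First ontology: one class collapsing two different entities.
merged = {"HorseOrUnicorn": {"horse", "unicorn"}}

# Second ontology: one class per entity.
split = {"Horse": {"horse"}, "Unicorn": {"unicorn"}}

merged_score = surviving_fraction(merged, "unicorn")  # 0.0: must be remodelled
split_score = surviving_fraction(split, "unicorn")    # 0.5: :Horse is unaffected
```

Under these assumptions the second ontology keeps half of its classes intact, while the first keeps none, which is the kind of quantitative factor the proposed definition appears to introduce.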

Table 11
History: "the number" --> "The number"
Availability: all the references present a "?" next to them, please remove it.

Page 29

1st column
"papers for review. Of which..." --> Please connect these two sentences in a better way
"The authors in [51] have not specifically..." --> This, for many reasons including the list of characteristics, is a 13-line sentence. Although I can appreciate long and convoluted sentences, they usually do not help readability and, as a rule of thumb, should be avoided.

2nd column
"In the light of the survey..." --> This is part of "Future works" rather than Discussion.

Page 31

2nd column

"The results of its also have..." --> the results of what?

Table 13 exceeds the right paper border
Table 13 - OntoCheck [44] --> Plug-in (i.e. protégé) --> "Protégé" + Structural intrinsic --> Structural Intrinsic (either all uppercase or not.), the same with XD analyzer Aspects
Table 13 - OntoDebug... protégé --> Protégé

Page 32

Acknowledgement

Stefano de Giorgis --> Stefano De Giorgis (Although I'm not sure that thanks to the editors have to be included in the Acknowledgement)

Review #3
Anonymous submitted on 13/Nov/2022
Suggestion:
Accept
Review Comment:

I think that the authors have done a great job in improving the paper. As said in my previous review, ontology evaluation is a very challenging topic for research in ontology engineering. As far as I can tell, the authors have contributed in clearly identifying the state of the art with a broad review and analysis of the literature. I think that all actors involved in the ontology design lifecycle will benefit from this research; the paper will (at least) help readers in having an overview of what it means to evaluate an ontology and understanding why it is so challenging.

As a remark (that I hope can contribute to further enhance the quality of this work), I think that the paper still misses a discussion about the evaluation of conceptual modeling aspects relative to ontology design. For instance, some (foundational) ontologies conceive material objects (washing machines, desks, trees, etc.) as being primarily extended in space (in the philosophical sense of three-dimensionalism), whereas other ontologies conceive them as being extended in both space and time (in the sense of philosophical four-dimensionalism). This is just an example; there are lots of different modeling options that one can take at the conceptual level (e.g., representing social roles as concepts, qualities, relations, etc.). How to evaluate conceptual modeling choices remains a challenging topic for research, and it is in many cases hard to come up with precise evaluation metrics. For this reason, it is common in the ontology engineering community to document the alternative choices for a single modeling problem, showing their pros and cons (see, e.g., the special issue of Applied Ontology on foundational ontologies, vol 17(1), 2022; for an example of alternative modeling options in industrial engineering, see Terkaj, W. et al. (2022), Ontology for Industrial Engineering: A DOLCE Compliant Approach, CEUR-WS Vol-3240).

I think that the addition of a paragraph about these aspects will help to strengthen the research presented by the authors. It will indeed give readers a broader overview of the ontology design lifecycle. At first glance, looking at Table 6, I think that the evaluation of the conceptual modeling choices cuts across the structural intrinsic and domain intrinsic/extrinsic dimensions.