Ontology of Descriptions and Observations for Integrated Modelling (ODO-IM)

Tracking #: 3663-4877

Authors: 
Greta Adamo
Ferdinando Villa

Responsible editor: 
Stefano Borgo

Submission type: 
Ontology Description
Abstract: 
Knowledge integration and interoperability have central roles in connecting information and extracting evidence to support sustainability agendas at local and global levels. Several initiatives, such as FAIR, Open Science, and many EU policies, have been proposed to curate and manage information. However, researchers endure continued fragmentation of scientific knowledge, e.g. data, models, and approaches, which are typically compartmentalized by field and within communities. In addition, relevant scientific knowledge that can be valuable for critical assessments can be locked or have limited access, a phenomenon that further constrains scientific enterprise and cohesive efforts to address pressing challenges, such as climate change. This paper presents the ontological commitments of the core ontology Ontology of Descriptions and Observations for Integrated Modelling (ODO-IM), which is designed to capture scientific observations and related descriptions for integrating scientific assets. The ontology has been developed to serve knowledge.LAB (k.LAB), an open-source semantic web software for integrated modeling, in which scenarios are modeled using a dedicated English-like declarative language. Some examples are provided that demonstrate the role of ODO-IM in the context of social and environmental sustainability, which is currently the main application of the integrated modeling software.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
Anonymous submitted on 23/May/2024
Suggestion:
Major Revision
Review Comment:

The paper introduces a core ontology called ODO-IM devoted to the representation of (scientific) observations (and related concepts). This ontology is imported into k.LAB worldviews via "semantic anchors" and used in the k.LAB framework.

This paper puzzles me. On the one hand, it addresses an extremely important topic and the proposed ontology considers several interesting and fundamental aspects of observations (like their context, perspective, etc.). Furthermore, I found Section 2 on related work well written, up to date, and informative.
On the other hand, the ontological analysis of the main concepts introduced in the ontology is, at least for me, confused and lacks the precision that these complex concepts deserve.
In Section 4 the authors refer to a huge number of works without, however, being clear about what exactly they take from these works and/or how they modify them. Furthermore, each cited work concerns only a small fragment of the concepts considered in the proposed ontology, and it is not clear how they are combined into a consistent picture; actually, a systematic comparison of ODO-IM with (at least some of) the ontologies discussed in Section 2 is not present, and this does not help the reader (also note that in the cited works, several distinctions are grounded on primitive relations like inheritance, dependence, parthood, etc., while the submitted paper describes almost exclusively the categories of ODO-IM; even the two main relations in Fig. 1, namely "characterizes" and "contextualizes", are only very partially discussed).
These limitations prevented me from reviewing the ontological choices of the authors in detail; I am simply unable to understand them in depth. Consequently, most of my comments are limited to highlighting unclear points.
From a more applicative perspective, the paper lacks a clear explanation of the whole framework in which the ontology is introduced and of the systematic way ODO-IM is imported into k.LAB. Furthermore, the applicative examples considered in Section 5 touch on only a few aspects of the proposed ontology.

I think that the paper touches on several interesting points, but additional effort is necessary to publish it. I therefore encourage the authors to submit a revised version. I have the feeling that the submitted manuscript is derived from a longer paper by omitting several details (by the way, the list of references includes several works that are not cited in the paper). I don't know whether there are space limits for submissions to the Semantic Web journal, but I suggest adding details (increasing the length of the paper), or focusing just on the ontological analysis (describing ODO-IM independently of k.LAB), or, alternatively, focusing on the applicative/practical aspects (giving few details on ODO-IM and showing its power in applicative terms). I also think that the introduction of one or two illustrative examples (involving all the main concepts and relations included in ODO-IM), progressively analysed throughout the paper, would significantly increase its readability (please add some details on k.LAB if the examples use this language; personally, I don't think the use of k.LAB is strictly necessary if the authors focus on ODO-IM). I also think that the authors could minimise the number of references but make explicit, in a systematic way, the relation they have with ODO-IM.

I don't want to push the authors to adopt a foundational ontology, but I don't understand their argument for avoiding alignment with foundational ontologies. First, I think that several distinctions in ODO-IM are very general (e.g., most of the categories in Fig. 2 appear in several foundational ontologies). Second, even though foundational ontologies include some non-pertinent elements, the ones that are pertinent could be reused (the authors already "import" notions from several other works).
To me, more than a core ontology (for observations), ODO-IM seems a sort of top-level ontology integrated with the notions necessary for modelling observations. I can understand that there are no foundational ontologies that satisfy all the requirements of the authors, but I would like to understand these requirements and, in any case, to have a more systematic comparison between the choices in ODO-IM and the ones in other existing foundational ontologies.
The authors include classes like event, process, substantial, quality, etc. under Observable. First, Observable seems more a role than a kind; is this just an applicative choice to collect all the entities that can be observed, or is there a strong ontological reason behind it? Second, how do the authors decide the link between the classes under Observable and the ones considered in other ontologies (e.g., why is the class "Agent" of PROV-O not included under Observable, and similarly for "InstantaneousEvent")?
(The latest ODO-IM OWL version does not seem completely aligned with the ODO-IM discussed in the paper. For instance, it contains the class DirectObservable, which is not considered in the paper and which does not correspond to the entities with "arity of dependence" 0, since these do not include Process, whereas Process is included under DirectObservable.)

----

(p.2, l20-l25) I think that the illustration of how to represent the example of "the retained soil mass" is not understandable at this point of the paper because it requires several concepts that are introduced only later.

(p.3, l1-l2) "For example, when a scientific description invokes a countable observable, such as an insect or a plant, the concept is concretized."
What do "invokes" and "concretized" mean?

(p.3, l2-l4) "The inclusion of the concrete-abstract distinction..."
I don't understand this sentence.

(p.5, l25-l26) "the scientific process is seen as the transformation of existing knowledge artifacts into others that incorporate and define scientific advancement"
It would be useful to understand if there are (and what they are) *basic* knowledge artifacts, i.e., knowledge artifacts that cannot be reduced to other knowledge artifacts.

(p.6, l26-l27) "Nevertheless, the label "observation" carries an ambiguity that lies between its meaning in terms of activity, either as type or an execution ("doing an observation of x"), and an information, i.e. a content, that can be replicated, copied, transcribed, and analyzed to create more content."
Here you use the notions of information and its content which, as far as I know, are not used in [33], which you cite here. The same holds for the notions of replication, copy, analysis, etc.

(p.7, l14-l16) "we interpret scientific observations as information entities derived and elaborated through hypothesis and contextualization of other scientific artifacts (e.g. datasets, data models, and images) taken as phenomenological evidence and interpreted based on scientific human-driven perspectives [46], perspectives that are also domain-based [47]."
It would be nice to make clearer and more explicit the way these perspectives are represented; I have some intuitions but not a clear understanding. Also, the distinction between structural and functional perspectives is never explained in detail.
Why are structural perspectives linked to Subject and Quality while functional perspectives are linked to Event and Process (see Table 1)? In which sense is an object or an event a perspective (according to [46] and [47])?

(p.8, l3-l12) The notion of event the authors adopt seems close to that of "event-type" as opposed to "event-token". If this is the case, the notion of event differs from the one embraced in BFO, DOLCE, and UFO. Why are events not predicates? Some motivation for this choice would be useful.
When one observes the quality of an event, does one observe the quality of a specific individual happening in a specific spatiotemporal region or not? Do the authors have some mechanism to "particularise" the "event-type" in a spatiotemporal region? If this is the case, why has a different choice been made for substantials?

(p.8, l33-l41) Can the authors better motivate the choice of embracing Galton's notion of process? Galton's processes are quite unusual and peculiar. Several other possibilities may be considered to model the composition of events (as done, for instance, in BFO or DOLCE; one could also just refer to some mereology). Which notion of composition do the authors need?

(p.9, l3-l4) "For example "temperature" cannot be measured without a reference entity, such as water, atmosphere, and a substantial body."
What kind of entity is "temperature"? Is temperature an instance of Observable, and more specifically of Quantifiable Quality? If yes, can "temperature" inhere in different "reference/intermediate entities" or not? This would clarify whether the ODO-IM qualities are similar to the ones in BFO, DOLCE, and UFO or whether, as in the case of events, they are more abstract, closer to types than to tokens.

(p.9, l9-l10) "ODO-IM focuses on the informational representation of time as a quality that defines the granularity and extent of a description"
Intuitively, this sentence makes sense to me, but given the importance of time (and change) for observations, I think that the formal characterisation of time and how it is used in the representation of observations must be described more precisely in the paper.

(p.9, l12-l13) "Spatial qualities are also accounted for with a similar topological description, noting important differences compared with temporal qualities, for example, variable dimensionality."
What is a "topological description" (time does not necessarily require a topology; an order can be enough)? What is the "variable dimensionality" of spatial qualities?

(p.10, l2-l3) "priority describes the monotonic orderings of concepts"
I cannot understand this sentence. Can "orderings of concepts" have qualities? Are "orderings of concepts" observables (if yes, of what kind)?

(p.10, l7) I think that the notion of Configuration and the way configurations are represented in k.LAB deserve more details. More generally, it is important to understand whether all the classes in ODO-IM (and their instances) are uniformly represented in k.LAB or whether there are some differences (and, in this case, to motivate these differences). The authors discuss semantic anchors at p.14, but a more detailed and complete discussion would be useful.

(p.10, table 1)
Table 1 talks about dependence and arity of dependence. How these notions are represented in ODO-IM is not clear to me.

(p.11, l18) "Attribute is the most generic Predicate that can be used to specialize Observables"
I'm not sure I understand this sentence (in Fig. 1 the only relation between predicates and observables is named "characterizes").

(p.12, l8-l9) "In other words, epistemic Predicates are often, although not always, themselves kinds of knowledge systems, more specifically taxonomies."
I'm not sure I understand in which sense a Predicate is a taxonomy. More generally, a clarification about the nature of predicates would be useful. At the beginning of the paper the authors claim: "While a commitment to universal and particular is not part of ODO-IM, concrete and abstract categories are implicitly included. Indeed, Observables and Predicates are abstractions that can be concretized to compute an observation". First, this claim is not clear to me. Second, ODO-IM predicates seem to be a sort of linguistic counterpart of properties (where "properties" is here intended as a general term that includes universals). Please clarify this point.

(p.12 l36-l37) "More specifically, Descriptions are derived from the Resource-in-input and Resource-in-output processes."
What does "derived" mean here? What are Resource-in-input and Resource-in-output processes?

(p.14 l14-l15) In the example

abstract process Process
equals core odo:Process;

what does the first lowercase "process" (modifier) stand for? Also, the informal description of the modifier "abstract" is not clear to me: what does it mean that a concept requires a further concept to be observed (in which sense can concepts be observed)?
The same applies to the declaration of Transformation.

Review #2
By Simon Scheider submitted on 14/Jun/2024
Suggestion:
Reject
Review Comment:

In this article (an ontology description), the authors introduce ODO-IM, an ontology intended to help describe observations in various scientific models. The article includes an extensive review of related work and introduces the ontological concepts (re-)used in the authors' own model, mostly picked from a large list of references in ontology engineering. The ontology is presented in an informal, anecdotal manner and is illustrated with a couple of examples using a software implementation that allows querying for observations. The ontology is available as an OWL file. The authors state that the ontology reflects a phenomenological/perspectivist approach towards observations.

The endeavor of clarifying observations in terms of conceptual models and various forms of context is laudable; however, the submitted article is too preliminary for acceptance and fails to meet various basic scientific requirements regarding quality, relevance and clarity. Here are the details:

- First and most importantly, the article lacks a clear knowledge gap and a corresponding research question/goal. What exactly is this new ontology good for, and to what degree is this not yet covered by previous attempts? The introduction immediately jumps into the technicalities of an ontology/implementation and leaves this unclear. Even after reading the entire article, I am unsure about its purpose. The authors mention some potential gaps here and there (including linguistic focus, parsimony, resolution), however, these gaps are not systematically addressed. The introduction throws tons of technicalities at the reader without making clear why any of it is needed. The central idea, it seems, of a perspectivist/context-dependent approach towards observations does not become graspable in the paper.

- The related work section is very detailed and may be the most valuable part of the article. However, it remains unclear how all the discussed ontological work is reflected in the design of the ontology. That there are "differences in the formalization" is not a reason for a new ontology (there may be good reasons for these differences in those ontologies).

- The introduction of ODO-IM resembles a tour de force through decades of work in ontology engineering, properly referenced, but only vaguely understood and largely confusing. It seems the authors tried to pack as much as possible into their model and thereby did justice to only very little of it. Furthermore, there are numerous claims about these borrowed ontological concepts that do not seem to make much sense, or are on such an abstract level (lacking examples) that they are hard to grasp: Why is a scientific process only a knowledge transformation (given that science is a knowledge-generating endeavor)? Why do time and space occur only as contextual entities (counterexample: a moving car trajectory, where the location does not form the context, but the measurement outcome)? Why is context an information resource (when I am placing a thermometer, I might not produce any formal description of the location), and why is there no distinction between data and the phenomena they are about? To what degree is this a linguistic ontology? Why is the relation between processes and events (parthood) modeled like the one between qualities and their bearers (inherence)? Furthermore, among the concepts, I missed essential ones needed for geographic information, e.g. the concept of a field, needed to describe the given application example corresponding to the spatial distribution of vegetation density (similar to process, but only in space instead of time). What is the difference between configurations and networks/relationships? And finally, how does one distinguish between Predicates, Observable categories and Descriptions? I doubt this is really possible, since all categories of observables also seem to be usable both as predicates and as descriptions demarcating the context of measurement. For example, I can also use a spatial extent to determine and define the observable (e.g. the vegetation mass contained in a certain spatial window).
Thus, there are fundamental problems regarding a consistent and useful interpretation of the concepts when applied, and the fact that the ontology is not formalized beyond basic OWL does not help.

- Finally, the article lacks any kind of evaluation or evidence of the ontology's quality. The described examples are not clear enough to see the use or purpose of the ontology, and since a goal/question is lacking, I could not even start thinking about how to evaluate this ontology (e.g. in terms of competency questions, or of user studies of the system).

In general, the prose is hard to follow, with many arguments and claims not really fleshed out, leaving the impression that the authors are not sure yet what their contribution really should be.


Comments

Reference [22] cited as Hitzler et al. appears to correspond to https://ebooks.iospress.nl/publication/45580 which is Presutti & Gangemi.

While they provided the URL for SSN (p4, line 12), the authors appear not to have considered the W3C Recommendation version of the SSN Ontology, from 2017.
All the discussion in the manuscript appears to relate to earlier drafts of SSN, such as the 2011 report from the W3C incubator group.

It is notable that the 2017 edition of SSN has explicit alignments to both OBOE and PROV-O, which overlap with or perhaps supersede the discussion in the paper, and which also supersede intermediate work such as reference [25].

Furthermore, a new edition of SSN is in preparation, which incorporates extensions proposed in https://www.w3.org/TR/vocab-ssn-ext/ (2020) and in OMS (reference [15]*).
The alignments to OBOE and PROV-O are further improved - see https://w3c.github.io/sdw-sosa-ssn/ssn/#PROV-alignment and https://w3c.github.io/sdw-sosa-ssn/ssn/#OBOE-alignment

* [15] is mis-cited.
- The OGC publication is not a Technical Report, it is 'OGC Abstract Specification, Topic 20'. The URI is http://www.opengis.net/doc/as/om/3.0 (which redirects to the URL https://docs.ogc.org/as/20-082r4/20-082r4.html)
- OMS is also published by ISO - https://www.iso.org/standard/82463.html (paywalled).