Review Comment:
The paper introduces a core ontology called ODO-IM devoted to the representation of (scientific) observations (and related concepts). This ontology is imported into k.LAB worldviews via "semantic anchors" and used in the k.LAB framework.
This paper puzzles me. On the one hand, it addresses an extremely important topic, and the proposed ontology considers several interesting and fundamental aspects of observations (like their context, perspective, etc.). Furthermore, I found Section 2 on related work well written, up to date, and informative.
On the other hand, the ontological analysis of the main concepts introduced in the ontology is, at least for me, confused and lacks the precision that these complex concepts deserve.
In Section 4, the authors refer to a huge number of works without, however, being clear about what exactly they take from these works and/or how they modify them. Furthermore, each cited work concerns only a small fragment of the concepts considered in the proposed ontology, and it is not clear how they are combined into a consistent picture. In fact, a systematic comparison of ODO-IM with (at least some of) the ontologies discussed in Section 2 is missing, which does not help the reader. (Also note that in the cited works several distinctions are grounded in primitive relations like inheritance, dependence, parthood, etc., while the submitted paper describes almost exclusively the categories of ODO-IM; even the two main relations in fig.1, namely "characterizes" and "contextualizes", are only very partially discussed.)
These limitations prevented me from reviewing the authors' ontological choices in detail; I am simply unable to understand them in depth. Consequently, most of my comments are limited to highlighting unclear points.
From a more application-oriented perspective, the paper lacks a clear explanation of the whole framework in which the ontology is introduced and of the systematic way ODO-IM is imported into k.LAB. Furthermore, the application examples considered in Section 5 touch on only a few aspects of the proposed ontology.
I think that the paper touches on several interesting points, but additional effort is necessary before it can be published. I therefore encourage the authors to submit a revised version. I have the feeling that the submitted manuscript is derived from a longer paper by omitting several details (by the way, the list of references includes several works that are not cited in the paper). I don't know whether there are space limits for submissions to the Semantic Web journal, but I suggest either adding details (increasing the length of the paper), or focusing only on the ontological analysis (describing ODO-IM independently of k.LAB), or, alternatively, focusing on the application-oriented/practical aspects (giving few details on ODO-IM and showing its power in practice). I also think that introducing one or two illustrative examples (involving all the main concepts and relations included in ODO-IM), progressively analysed throughout the paper, would significantly increase its readability (please add some details on k.LAB if the examples use this language; personally, I don't think the use of k.LAB is strictly necessary if the authors focus on ODO-IM). I also think that the authors could reduce the number of references but make explicit, in a systematic way, the relation each has with ODO-IM.
I don't want to push the authors to adopt a foundational ontology, but I don't understand their argument for avoiding alignment with foundational ontologies. First, I think that several distinctions in ODO-IM are very general (e.g., most of the categories in fig.2 appear in several foundational ontologies). Second, even though foundational ontologies include some non-pertinent elements, the ones that are pertinent could be reused (the authors already "import" notions from several other works).
To me, more than a core ontology (for observations), ODO-IM seems a sort of top-level ontology integrated with the notions necessary for modelling observations. I can understand that no foundational ontology satisfies all the requirements of the authors, but I would like to understand these requirements and, in any case, to have a more systematic comparison between the choices in ODO-IM and the ones in other existing foundational ontologies.
The authors include classes like event, process, substantial, quality, etc. under Observable. First, Observable seems more a role than a kind; is this just a practical choice to collect all the entities that can be observed, or is there a strong ontological reason behind it? Second, how do the authors decide the link between the classes under Observable and the ones considered in other ontologies (e.g., why is the class "Agent" of PROV-O not included under Observable, and similarly for "InstantaneousEvent")?
(The latest ODO-IM OWL version does not seem completely aligned with the ODO-IM discussed in the paper. For instance, it contains the class DirectObservable, which is not considered in the paper and which does not correspond to the entities with "arity of dependence" 0: those entities do not include Process, whereas Process is included under DirectObservable.)
----
(p.2, l20-l25) I think that the illustration of how to represent the example of "the retained soil mass" is not understandable at this point of the paper because it requires several concepts that are introduced only later.
(p.3, l1-l2) "For example, when a scientific description invokes a countable observable, such as an insect or a plant, the concept is concretized."
What do "invokes" and "concretized" mean?
(p.3, l2-l4) "The inclusion of the concrete-abstract distinction..."
I don't understand this sentence.
(p.5, l25-l26) "the scientific process is seen as the transformation of existing knowledge artifacts into others that incorporate and define scientific advancement"
It would be useful to understand whether there are *basic* knowledge artifacts (and what they are), i.e., knowledge artifacts that cannot be reduced to other knowledge artifacts.
(p.6, l26-l27) "Nevertheless, the label "observation" carries an ambiguity that lies between its meaning in terms of activity, either as type or an execution ("doing an observation of x"), and an information, i.e. a content, that can be replicated, copied, transcribed, and analyzed to create more content."
Here you use the notions of information and its content, which, as far as I know, are not used in [33], the work you cite here. The same holds for the notions of replication, copy, analysis, etc.
(p.7, l14-l16) "we interpret scientific observations as information entities derived and elaborated through hypothesis and contextualization of other scientific artifacts (e.g. datasets, data models, and images) taken as phenomenological evidence and interpreted based on scientific human-driven perspectives [46], perspectives that are also domain-based [47]."
It would be nice to make clearer and more explicit the way these perspectives are represented; I have some intuitions but not a clear understanding. Also, the distinction between structural and functional perspectives is never explained in detail.
Why are structural perspectives linked to Subject and Quality while functional perspectives are linked to Event and Process (see table 1)? In what sense is an object or an event a perspective (according to [46] and [47])?
(p.8, l3-l12) The notion of event the authors adopt seems close to that of "event-type" as opposed to "event-token". If this is the case, the notion of event differs from the one embraced in BFO, DOLCE, and UFO. Why are events not predicates? Some motivation for this choice would be useful.
When one observes the quality of an event, does one observe the quality of a specific individual happening in a specific spatiotemporal region or not? Do the authors have some mechanism to "particularise" the "event-type" in a spatiotemporal region? If so, why has a different choice been made for substantials?
(p.8, l33-l41) Can the authors better motivate the choice of embracing Galton's notion of process? Galton's processes are quite unusual and peculiar. Several other possibilities may be considered to model the composition of events (as done, for instance, in BFO or DOLCE; one could also just refer to some mereology). Which notion of composition do the authors need?
(p.9, l3-l4) "For example "temperature" cannot be measured without a reference entity, such as water, atmosphere, and a substantial body."
What kind of entity is "temperature"? Is temperature an instance of Observable, and more specifically of Quantifiable Quality? If so, can "temperature" inhere in different "reference/intermediate entities" or not? This would clarify whether the ODO-IM qualities are similar to the ones in BFO, DOLCE, and UFO or whether, as in the case of events, they are more abstract, closer to types than to tokens.
(p.9, l9-l10) "ODO-IM focuses on the informational representation of time as a quality that defines the granularity and extent of a description"
Intuitively, this sentence makes sense to me, but given the importance of time (and change) for observations, I think that the formal characterisation of time and how it is used in the representation of observations must be described more precisely in the paper.
(p.9, l12-l13) "Spatial qualities are also accounted for with a similar topological description, noting important differences compared with temporal qualities, for example, variable dimensionality."
What is a "topological description" (time does not necessarily require a topology; an order can be enough)? What is the "variable dimensionality" of spatial qualities?
(p.10, l2-l3) "priority describes the monotonic orderings of concepts"
I cannot understand this sentence. Can "orderings of concepts" have qualities? Are "orderings of concepts" observables (if yes, of what kind)?
(p.10, l7) I think that the notion of Configuration and the way configurations are represented in k.LAB deserve more details. More generally, it is important to understand whether all the classes in ODO-IM (and their instances) are uniformly represented in k.LAB or whether there are some differences (and, in this case, to motivate these differences). The authors discuss semantic anchors at p.14, but a more detailed and complete discussion would be useful.
(p.10, table 1)
Table 1 talks about dependence and arity of dependence. How these notions are represented in ODO-IM is not clear to me.
(p.11, l18) "Attribute is the most generic Predicate that can be used to specialize Observables"
I'm not sure I understand this sentence (in Fig.1 the only relation between predicates and observables is named "characterizes").
(p.12, l8-l9) "In other words, epistemic Predicates are often, although not always, themselves kinds of knowledge systems, more specifically taxonomies."
I'm not sure I understand in which sense a Predicate is a taxonomy. More generally, a clarification about the nature of predicates would be useful. At the beginning of the paper the authors claim "While a commitment to universal and particular is not part of ODO-IM, concrete and abstract categories are implicitly included. Indeed, Observables and Predicates are abstractions that can be concretized to compute an observation". First, this claim is not clear to me. Second, ODO-IM predicates seem to be a sort of linguistic counterpart of properties (where "properties" is intended here as a general term that includes universals). Please clarify this point.
(p.12 l36-l37) "More specifically, Descriptions are derived from the Resource-in-input and Resource-in-output processes."
What does "derived" mean here? What are Resource-in-input and Resource-in-output processes?
(p.14 l14-l15) In the example
abstract process Process
equals core odo:Process;
what does the first lowercase "process" (the modifier) stand for? Also, the informal description of the modifier "abstract" is not clear to me: what does it mean that a concept requires a further concept to be observed (in what sense can concepts be observed)?
Similarly for the declaration of Transformation.
Comments
References
Reference [22] cited as Hitzler et al. appears to correspond to https://ebooks.iospress.nl/publication/45580 which is Presutti & Gangemi.
SSN Ontology
While they provided the URL for SSN (p4, line 12), the authors appear not to have considered the W3C Recommendation version of the SSN Ontology, from 2017.
All the discussion in the manuscript appears to relate to earlier drafts of SSN, such as the 2011 report from the W3C incubator group.
It is notable that the 2017 edition of SSN has explicit alignments to both OBOE and PROV-O, which overlap with or perhaps supersede the discussion in the paper; the 2017 edition also supersedes intermediate work such as reference [25].
Furthermore, a new edition of SSN is in preparation, which incorporates extensions proposed in https://www.w3.org/TR/vocab-ssn-ext/ (2020) and in OMS (reference [15]*).
The alignments to OBOE and PROV-O are further improved - see https://w3c.github.io/sdw-sosa-ssn/ssn/#PROV-alignment and https://w3c.github.io/sdw-sosa-ssn/ssn/#OBOE-alignment
* [15] is mis-cited.
- The OGC publication is not a Technical Report, it is 'OGC Abstract Specification, Topic 20'. The URI is http://www.opengis.net/doc/as/om/3.0 (which redirects to the URL https://docs.ogc.org/as/20-082r4/20-082r4.html)
- OMS is also published by ISO - https://www.iso.org/standard/82463.html (paywalled).