An Ontological Analysis of Observation Collections

Tracking #: 1423-2635

Auriol Degbelo

Responsible editor: Krzysztof Janowicz

Submission type: Ontology Description

The Semantic Sensor Web community has extensively discussed the concept of ‘observation’, providing ontologies for it. ‘Observation collections’, however, have been comparatively less examined. Monitoring spatial and temporal variations of phenomena is a task which requires observation collections (not just single observations) for its completion. This paper presents an ontological analysis of observation collections. The analysis helps to identify five essential parameters for the characterization of observation collections in the Sensor Web, namely: collector, observable, members, spatial ordering, and temporal ordering. Changes in one of these parameters necessarily lead to a new observation collection. The article also presents an Ontology Design Pattern for observation collections which implements some of the ideas introduced in the analysis. The design pattern distinguishes three main types of observation collections: time series, trajectories, and coverage.

Solicited Reviews:
Review #1
By Simon Cox submitted on 25/Jul/2016
Major Revision
Review Comment:

1. The paper is an exploration of one type of observation collection. Though it is classified as an “Ontology Description”, the focus of the paper is not an axiom-based description of an ontology, but rather an analysis of the semantics of this special case. Furthermore, though a link to an Ontology Design Pattern is provided (using a URI), it resolves to an OWL ontology having a base URI which ends with the string “/untitled-ontology-20”. This is not a very convincing URI for an ontology that is the topic of a formal “Ontology Description”. The owl:Ontology element is poorly documented, with almost no metadata. Also, this OWL resource cannot be accessed at the URI, so is not really on the semantic _web_ and cannot be used via its namespace. Overall, the category “Ontology Description” is not appropriate.
2. In section 2.1 the paper proposes that the membership of an observation-collection should be fixed. I disagree. This is one option, but equally reasonable choices in practice might be that an observation collection has a dynamic membership – e.g. (i) continuously growing, or (ii) a moving window. There are perfectly valid applications in environmental science, meteorology, oceanography, involving observation collections whose membership can be defined in all these ways.
3. The paper proposes that the spatial location of an observation is the location of the sensor. This simplification effectively limits the application to in-situ sensors only, since the results of observations made using remote-sensing, or on specimens moved ex-situ from their sampling location, would be very misleading if ascribed to the location of the sensor during observation. Note that this concern (the potential ambiguity of observation location) was one of the primary motivations for the separation of the procedure, the feature-of-interest, and the observation in the OGC O&M theory [1,2], which provides a vocabulary that can be used for any of these applications.
4. The paper proposes to reserve the name “observation collection” for the single case of an invariant observed-property. As noted in [2] and [3] this is only one of the potential homogeneity conditions for a collection of observations.
5. OGC O&M v1 ([2] - section 6.5) and OM-JSON ([3] - section 7.10) include implementations of an ObservationCollection class, which also appears in the online version of the ontology described by Cox [2016] (though it is not explored in that paper). In these implementations the homogeneity constraint on member observations is more flexible than presented in the current paper, allowing for result-sets describing a greater range of use-cases than contemplated in the paper. A complete ontological analysis would explore the semantics of all the potential patterns within an observation-collection, e.g.
a. fixed feature of interest, fixed time, varying observed property (to get a full snapshot of the state of the feature)
b. fixed observed property, fixed feature of interest, varying time (to get a picture of the variation of a particular property with time – e.g. monitoring)
c. fixed observed property, fixed time, varying feature of interest/location (to get a snapshot of the variation of a particular property in space)
d. other homogeneity constraints are also possible, such as fixed procedure, fixed result-time, fixed uom.
6. There is an inconsistency in the membership of observation-collection between section 2, paragraph 2, list-item 3 (“An observation collection has multiple observations (and only observations) as members.”) and section 2.5, first sentence (“The depth of a collection refers to the fact that members of a collection can themselves be collections or not.”).
7. The “Different types of time series” analysis in section 3.1 could be compared with the taxonomy of time-series in [7].
8. There are a number of obvious omissions from the bibliography. Since they are mentioned in “5. Related Work”, please provide full references for the RDF framework [5], SPARQL [4], OGC Observations and Measurements [1,2], and OGC GeoSPARQL [6]. While I am aware that the original citations for these are not journal articles, they are easily citable. I note that a W3C Recommendation is included in the bibliography (the OWL2 Overview), so clearly the author recognises that standards and similar are citable.
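The homogeneity patterns listed in comment 5 can be illustrated with a minimal sketch. The `Observation` record and the `fixed_fields` helper are hypothetical; they are not taken from the paper, from O&M, or from OM-JSON, and the field names only loosely follow O&M terms. The sketch reports which properties are invariant across the members of a collection, which is one way to tell patterns (a) to (d) apart.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Observation:
    # Hypothetical, simplified record; not the O&M or SSN data model.
    feature_of_interest: str
    observed_property: str
    phenomenon_time: str
    procedure: str
    result: float


def fixed_fields(members):
    """Return the names of the fields whose value is shared by all members."""
    return {
        f.name
        for f in fields(Observation)
        if len({getattr(m, f.name) for m in members}) == 1
    }


# Pattern (b): fixed observed property and feature of interest, varying time.
time_series = [
    Observation("station-1", "air_temperature", "2016-07-25T00:00Z", "thermometer-A", 14.2),
    Observation("station-1", "air_temperature", "2016-07-25T01:00Z", "thermometer-A", 13.9),
]

# Pattern (a): fixed feature of interest and time, varying observed property.
snapshot = [
    Observation("station-1", "air_temperature", "2016-07-25T00:00Z", "thermometer-A", 14.2),
    Observation("station-1", "relative_humidity", "2016-07-25T00:00Z", "hygrometer-B", 81.0),
]
```

Under these assumptions, `fixed_fields(time_series)` contains `observed_property` but not `phenomenon_time`, while `fixed_fields(snapshot)` contains `phenomenon_time` but not `observed_property`. Both are collections of observations; only the first satisfies the single constraint the paper adopts.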
Overall, I find the paper interesting, but ultimately unsatisfactory and incomplete. It presents one kind of observation collection with a rather limited selection of homogeneity constraints and appropriates the name ‘observation collection’ for this.

[1] S.J.D. Cox, Topic 20 - Geographic Information - Observations and Measurements (same as ISO 19156:2011), OGC Abstr. Specif. 10-004r3 (2011) 54. doi:10.13140/2.1.1142.3042.
[2] S.J.D. Cox, Observations and Measurements - Part 1 - Observation schema, OGC 07-022r1. (2007) 73 + xi.
[3] S.J.D. Cox, P. Taylor, OGC Observations and Measurements – JSON implementation, OGC Discuss. Pap. 15-100r1 (2015) 1–46.
[4] S. Harris, A. Seaborne, SPARQL 1.1 Query Language, W3C Recomm. (2013).
[5] G. Klyne, J.J. Carroll, Resource Description Framework (RDF): Concepts and Abstract Syntax, W3C Recomm. (2004).
[6] M. Perry, J. Herring, OGC GeoSPARQL – a geographic query language for RDF data, OGC 11-052r4. (2012).
[7] P. Taylor, OGC® WaterML 2.0: Part 1- Timeseries - OGC 10-126r4, OGC Implement. Stand. (2014).

Review #2
By Kerry Taylor submitted on 03/Oct/2016
Review Comment:

This manuscript was submitted as 'Ontology Description' and should be reviewed along the following dimensions: (1) Quality and relevance of the described ontology (convincing evidence must be provided). (2) Illustration, clarity and readability of the describing paper, which shall convey to the reader the key aspects of the described ontology.

The paper proposes a small ontology for collections of observations. It focuses on the notion of a collection.

It is not terribly clear what an observation is, although we learn that "an observation is generated by observing the physical reality" (it should be much clearer) but ssn:Observation is mentioned and appears to be the driving concept, as is that an observation is a dul:Situation (consistent with ssn). On the other hand, all the examples show an observation as a qb:Observation. The popular (but non-ontological) notion of observation in the OGC's O&M appears to be unknown to the authors. If an observation collection is indeed an important and distinguished concept, then observation needs clear definition (or at least we need to know what collections are *not* observation collections).

While the OGC notion of "coverage" is acknowledged (and there is, to my knowledge, no published ontology representation of it), the paper seems to fail to recognise that all three kinds of 'observation collections' identified are actually coverages in the OGC sense. This appears to be a serious flaw (possibly inherited from reference 22).

I am afraid I have to disagree that the concept of "observation collection ...[has] been ignored... occasionally mentioned." On the contrary, it is rare to find only single observations published on their own. This might be because the authors find there is no need for a whole new notion for collections, as you propose. Certainly, some publishers use the RDF datacube for the use cases you describe (except, possibly, trajectories). And I note that in all 3 of your worked examples you use things that are both qb:Observation and your own obs:Observation (and you do not show any trajectories). The authors should have referenced other work that uses qb:Observation as this paper does.

Which brings me to the main objection to the paper. I cannot see any value that the proposed ontology (and corresponding design pattern) brings to the representation of multiple observations. The paper's examples (which are only examples of the ontology applied, but do not offer any "convincing evidence") barely use the few terms in the ontology anyway.

To step through the terms in the ontology:

The paper makes the point itself that "collector", being the agent responsible for bringing the collection together, could just as well be re-used from almost any other ontology (e.g. DC).

There are also the 3 types of collections (see comment re coverage above -- one type? -- what does the typing do for us?).

There is the (borrowed) idea about 4 different types of time series. Without any objection to those (and indeed recognising the utility gained by labelling a time series as one of those -- which the paper entirely fails to justify) -- 4 simple un-original concepts seem a bit underdone. And only one of those, once, is used in the paper's examples.

Finally, there are the 3 types of trajectories (also a borrowed notion, also barely used in the examples, and also hard to see what they would be useful for).

The ontology is well organised with appropriate use of annotation properties.

The discussion of design principles is thorough, but confusing in places, e.g. "spatial extent...can only be understood when talking about collections of observations" -- what about aerial photos?

The attempt to distinguish "single observation" from "observation collection" is self-contradictory: "it cannot be both", whereas it is both by definition in the previous paragraph.

At one point we are assured that observation collections are not sets due to the difference between abstract and concrete entities (which needs explanation) but later on it becomes apparent that observation collections are actually sequences -- so most certainly not sets! There is great confusion (p3) around the ideas of ordering when there may be multiple dimensions. It appears that the collector is obliged to choose exactly one total order over the collection, although the multiple alternative ways are "equally valid". As it turns out there does not seem to be any use made of the ordering anyway -- at least not in the examples presented. This is a pity, as there are indeed some useful things that could be done with it (or at least with an ordering over each dimension, separately, which is not considered in the paper).
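To make the alternative concrete, here is a minimal sketch of keeping one ordering per dimension instead of a single total order chosen by the collector. The tuple layout, the sample values, and the `orderings` helper are assumptions made for illustration only; nothing here comes from the paper or its ontology.

```python
from operator import itemgetter

# Hypothetical observations: (ISO 8601 timestamp, easting in metres, value).
observations = [
    ("2016-10-03T10:00Z", 300.0, 7.1),
    ("2016-10-03T09:00Z", 500.0, 6.8),
    ("2016-10-03T11:00Z", 100.0, 7.4),
]


def orderings(members):
    """Keep a separate ordering for each dimension of the collection."""
    return {
        # Timestamps share one format, so lexicographic order is time order.
        "temporal": sorted(members, key=itemgetter(0)),
        "spatial": sorted(members, key=itemgetter(1)),
    }


views = orderings(observations)
```

With these sample values the temporal and spatial views list the same members in opposite order, so neither view can stand in for the other, and a consumer can pick whichever dimension a query needs.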

Why must all observations of the collection be "generated by observing the same observed property"? Indeed the geolife trajectory example observes lat, long, alt and timeperiod -- are these somehow all the same observed property? Or if not, what is the observed property in that example?

An observation can have a spatial location and this location is fixed to be the location of the sensor -- is this a good design? What about an earth-observing satellite image?

The paper is well structured and well written in clear English. Reduced use of footnotes would aid readability, though. There is a very extensive reference list.

I urge the authors to check out the work of the W3C/OGC spatial data on the web working group, especially the work on ssn and coverage.