Basic Observations and Sampling Feature Ontology

Tracking #: 890-2101

Authors: 
Simon Cox

Responsible editor: 
Mark Gahegan

Submission type: 
Ontology Description
Abstract: 
We introduce new OWL ontologies for observations and sampling features, based on the O&M conceptual model from ISO 19156. Previous efforts, through the W3C SSN project, and following the ISO rules for conversion from UML, introduced dependencies on elaborate pre-existing ontologies and frameworks. The new ontologies, known as om-lite and sam-lite, minimize such dependencies, and can therefore be used to harmonize observational data with minimal ontological commitment beyond the conceptual model. Patterns for linking existing ontologies for time and space to stub-classes in the new ontologies are described, thus providing a route for harmonization of more specific observation applications. The PROV-O ontology is re-used to support certain requirements for the description of specimens, and a more general alignment of both observation and sampling feature ontologies with PROV-O is described, as well as mappings to some other observation models and ontologies.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
By Peter Fox submitted on 28/Jan/2015
Suggestion:
Minor Revision
Review Comment:

This is an excellent contribution to knowledge representation of observations and measurements, grounded in both specific discipline examples and prior experience with the practice of implementing ontologies for application use. While the details of the shortcomings of previous implementations are not explicitly included in the article, and the reasoning for why a different observational model (e.g. OBOE) was not adopted or extended is only lightly addressed, the article provides an excellent and well-formed presentation of the key concerns, considerations and evolution of the O&M ontology to advance specific capabilities in semantic representations in fields beyond the immediate motivating discipline.

Accept with minor revisions (which follow).

Misc.

P1
define/ expand/ provide link to GML

add (OGC) after Open Geospatial Consortium

expand UML and provide reference/ link.

P6

Link for TopBraid.

In Table 1, the entry "oml:observationContext oml:Observation oml:ObservationContext" raises the question of the use of "ObservationContext" and "observationContext", which differ only in the case of one letter (cf. procedure and Process). This is a candidate for mis-casing the "o" and perhaps introducing integrity violations in the ontology. Can the author address this?
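
The casing concern can be checked mechanically. The following is a minimal sketch, assuming the ontology's terms are available as a plain list of prefixed names. The list is illustrative only, built from the terms named in this comment, and the function name `case_collisions` is hypothetical.

```python
from collections import defaultdict


def case_collisions(terms):
    """Group terms that become identical when letter case is ignored."""
    groups = defaultdict(set)
    for term in terms:
        groups[term.lower()].add(term)
    # Keep only groups with more than one distinct spelling.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# Illustrative terms only; a real check would read them from the ontology file.
terms = [
    "oml:observationContext",
    "oml:ObservationContext",
    "oml:Observation",
    "oml:procedure",
    "oml:Process",
]

collisions = case_collisions(terms)
print(collisions)
```

Under these assumptions only the observationContext/ObservationContext pair is reported; procedure and Process differ in more than case, so they are not flagged.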

Is there a reason that oml:resultTime is xsd:dateTime rather than the SWEET numberline time representation? (There seems to be a partial answer early in Section 4 and in 5.2.1, but something here would be useful.)

P7

“sam-lite” (namespace prefix “saml:”) - saml is unfortunate... Security Assertion Markup Language - well known and while it is a very different usage the clashes may be a problem - see http://en.wikipedia.org/wiki/SAML_2.0

The phrase "Following the strategy used in om-lite..." does not quite indicate which strategy is meant (one mentioned earlier in the paper?). Can this be made more explicit? What follows looks like the implementation, not the strategy. What is done here is very good, so it would help the reader to know more.

Why saml:Location and not GML or another ontology? Or is it subclassed? The same question applies on P8 to the geometry narrative. Does saml: relate to another ontology?

P8

Any chance of a reference for "Specific subclasses restrict the type of saml:shape, corresponding to common practice particularly in earth and environmental sciences."?

P10

The sentence: "From this view it is clear that om-lite does provides a light-weight framework, in comparison with both SSN and OMU which any application to make a significant commitment to an existing framework." could be reworded for clarity. Is it meant to be "From this view it is clear that om-lite does provide a lighter-weight framework, in comparison to both SSN and OMU, in which a particular application needs to make a significant commitment to an existing framework"? I am still not sure what this means. Can this be made more specific?

I do not see a Section 0? "As anticipated in section 0, this may be done by ....".

P13

The first row in Listing 6 is intriguing. Can the author clarify this assertion (beyond the brief reference in 5.2.3, item 2)?
- oboe-core:Measurement rdfs:subClassOf oml:Observation .
Especially when O&M is Observations and Measurements.

P14

In the phrase: "PROV-O [24] is the only legacy ontology used directly in the new ontologies, apart from the basic RDF, RDFS and OWL infrastructure.", it is not clear that PROV-O would be considered legacy. Perhaps "existing"?

Title sub-section: 5.3.3. Information resources or real-world things8 - assume the "8" is a typographic error?

There is a somewhat hidden but nice strong claim in this sub-section "We thus demonstrate the applicability of PROV-O to real-world things."

Review #2
Anonymous submitted on 17/Mar/2015
Suggestion:
Major Revision
Review Comment:

The ontology of observations remains in bad shape, theoretically as well as practically. Papers proposing to advance it are therefore more than welcome. This manuscript presents valuable and sensible practical advances over earlier work. In particular, it nicely combines insights about observation processes with pragmatic goals of providing adequate metadata vocabularies. As such it should be published, but it also needs work, mainly regarding its style and readability.

The main contribution of the paper is to propose two new OWL ontologies, om-lite and sam-lite, which appear useful and an improvement on the state-of-the-art, particularly through the ideas of sampling features and samples as items with provenance. The main argument advanced in favor of these ontologies, namely that they minimize dependencies on other ontologies, however, is a bit one-sided. The paper would have more stature if the pros and cons of such dependencies were briefly discussed and the chosen solution was rationalized based on reasoning support (rather than talking mainly about complexity and the discomfort of users). Along the same lines, claiming (in the abstract) that om-lite and sam-lite can be used to *harmonize* observational data is rather courageous. Apart from the ill-defined notion of harmonization itself, it seems obvious that terms with less semantics can be more broadly applied, but this does not mean that they harmonize or integrate anything. Also, it would be good to see a more solid discussion of the strengths and weaknesses of stub classes.

The characterization of SSN as mainly introducing the idea of a stimulus (over O&M before) is narrow. At the time, for example, O&M still had the ambiguous notion of a single feature of interest. While Cox now sensibly introduces the distinction of sampled and sampling features, it is not true that this distinction was not possible before. SSN, for example, offers the Sensor concept for this purpose, defined as "physical devices, computational methods, a laboratory setup with a person following a method, or any other thing that can follow a Sensing Method to observe a Property".

The mapping to PROV-O, as sensible as the choice is, got me rather confused. In the paper, saml: and oml:Processes are subclassed from prov:Entity, which seemed odd, but the correction in the comment (below) to subclass them from prov:Agent seems equally questionable. At the least, the paper should argue the avoidance of prov:Activity more substantially. A one-paragraph explanation of PROV-O's basic concepts would help.

The weakest part of the paper is section 2, on the conceptual model, which oozes standard numbers and acronyms and leaves its diagrams often unexplained. In fact, figures in the paper as a whole lack explanations and sometimes appear redundant. What does Figure 6 add as an idea that Figure 2 did not contain already? Figure 8 is overloaded and not very informative, lacking explanations. Figure 9 may just not be worth the 100 words or so that it would take to explain the ideas in it, but in any case, it cannot replace these.

The main weakness of the paper overall is in fact its jargon- and acronym-loaded style and its shortage of explanations. Using standardization jargon so pervasively detracts from ideas and insights. The problem starts in the abstract, which should of course be generally intelligible, but uses many acronyms and an ISO standard number without even explaining what that standard is about or what context it is taken from.

The same style continues throughout the paper, with a highlight in this phrase at the beginning of the conceptual model (!) section: "Note the use of types and classes from other ISO 19100-series models, indicated by the prefixes GF, TM, GM, MD, DQ, LI (from ISO 19109, 19108, 19107 and 19115)." Or consider the sentence at the end of 4.1, ending with: "some information that is provided in additional xlink attributes alongside the href in the GML implementation is not available locally in the RDF.".

Many phrases and sentences also need to have their English revised to become clear, starting with the title (where the word "Basic" is used ambiguously), and continuing with spelling and grammar errors already in the first and second sentences. For another example, I could not understand this sentence (in the discussion section), even after adjusting the verb: "From this view it is clear that om-lite does provides a light-weight framework, in comparison with both SSN and OMU which any application to make a significant commitment to an existing framework".

Two more minor points:
- What are "data individuals" or "individual data instances"?
- The wfs.example.org links in the example listings are broken.

Overall, the paper reads like a submission that was somewhat hastily put together from standards documents and discussions, rather than providing the necessary reflection on and communication of the new ideas for a broader SWJ audience. It is based on good work that warrants publication, but the current exposition falls short of communicating it well, let alone making the read enjoyable.


Comments

After submission I realised that the mapping between O&M and PROV-O given in 5.2.1 and 5.2.2 was incorrect, concerning saml:Process and oml:Process. A better fit is
saml:Process rdfs:subClassOf prov:Agent .
oml:Process rdfs:subClassOf prov:Agent .
and correspondingly
oml:procedure rdfs:subPropertyOf prov:wasAssociatedWith .
This will be corrected in revision.
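
The effect of the corrected axioms can be illustrated with a small sketch. It assumes a toy rule engine covering only the rdfs:subClassOf and rdfs:subPropertyOf entailment rules, written with the Python standard library rather than a real RDF toolkit. The individuals ex:obs1 and ex:sensorA and the function name `entail` are hypothetical and do not come from the paper.

```python
# Corrected alignment axioms from the comment above, as (subject, predicate, object).
AXIOMS = {
    ("saml:Process", "rdfs:subClassOf", "prov:Agent"),
    ("oml:Process", "rdfs:subClassOf", "prov:Agent"),
    ("oml:procedure", "rdfs:subPropertyOf", "prov:wasAssociatedWith"),
}

# Hypothetical instance data: one observation made with one procedure.
DATA = {
    ("ex:obs1", "rdf:type", "oml:Observation"),
    ("ex:obs1", "oml:procedure", "ex:sensorA"),
    ("ex:sensorA", "rdf:type", "oml:Process"),
}


def entail(data, axioms):
    """Apply the two RDFS rules repeatedly until no new triples appear."""
    triples = set(data)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(triples):
            for a_s, a_p, a_o in axioms:
                new = None
                if a_p == "rdfs:subPropertyOf" and p == a_s:
                    # Property inheritance: s p o implies s superProperty o.
                    new = (s, a_o, o)
                elif a_p == "rdfs:subClassOf" and p == "rdf:type" and o == a_s:
                    # Class inheritance: s type C implies s type superClass.
                    new = (s, "rdf:type", a_o)
                if new is not None and new not in triples:
                    triples.add(new)
                    changed = True
    return triples


inferred = entail(DATA, AXIOMS)
for triple in sorted(inferred - DATA):
    print(triple)
```

With this toy data, the sketch derives that ex:obs1 prov:wasAssociatedWith ex:sensorA and that ex:sensorA is a prov:Agent, which is the PROV-O reading the corrected mapping is intended to give.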

Listing 1 on page 9 uses the class oml:Measurement. I can't find this elsewhere in the paper. Should it have been oml:Observation?