Review Comment:
This paper presents an ontology that models domain knowledge about exposures, environmental factors, person, dose, and risk, and how these concepts relate to each other. Although the paper is well motivated and makes a contribution to the field, it has weaknesses in four areas: how the ontology development follows established ontology development methodologies, some modeling decisions, unclear descriptions of parts of the ontology, and the limited evaluation of the ontology.
Therefore, my advice for this paper is Major Revision. My detailed comments are shown below.
In the abstract, the authors write “[…] systematically compare different methodological approaches, but also to better link and align […] scientific publications […]”. Reading through the paper, it is clear that the proposed ontology can link and align the content of scientific publications. However, it is not clear what the benefits of comparing different methodological approaches are. The six articles the authors chose come from different domains, such as food and air quality. Would it not be more reasonable to compare two methodological approaches from the same domain, to see whether they model exposures in the same way?
Section 2:
Competency questions 1 and 5 ask for provenance information such as articles and datasets. However, your ontology (ExposureBasis.ttl) does not contain the semantics needed to answer these two questions; the information is captured only implicitly in the RDF encodings of your six selected articles. It should be captured explicitly in your ODP, for example by specifying how provenance information is represented for datasets and environmental factors.
In your “Helbich_2016.ttl” file, there are triples such as:
_:accident a dcat:Dataset, expB:EnvironmentalFactor;rdfs:comment "accidents".
_:accidentdensity prov:wasDerivedFrom _:accident.
_:accidentdensity a expB:EnvironmentalFactor, dcat:Dataset ;rdfs:comment "accident density".
It is not clear why an instance can be a dataset and an environmental factor at the same time. Since competency question 5 depends on this modeling, taking that question into account during development means the ontology should define concepts such as dataset and environmental factor, as well as the relationship between them.
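To illustrate the kind of separation I have in mind, here is a minimal sketch, not a definitive modeling. The expB namespace IRI is a placeholder, and the property expB:representedBy is hypothetical (it is not taken from your ontology); the authors may well prefer a different relation.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
# Placeholder namespace: NOT the authors' actual ontology IRI.
@prefix expB: <http://example.org/expB#> .

# Datasets are typed only as dcat:Dataset; derivation holds between datasets.
_:accidentData a dcat:Dataset ;
    rdfs:comment "accidents (dataset)" .

_:accidentDensityData a dcat:Dataset ;
    prov:wasDerivedFrom _:accidentData ;
    rdfs:comment "accident density (dataset)" .

# The environmental factor is a separate individual that points to the
# dataset describing it. expB:representedBy is a hypothetical property.
_:accidentDensity a expB:EnvironmentalFactor ;
    expB:representedBy _:accidentDensityData ;
    rdfs:comment "accident density (environmental factor)" .
```

With such a split, a query for competency question 5 could follow an explicit link from the environmental factor to its dataset instead of relying on double typing.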
Section 3:
In the text, the authors mention “six articles on exposure to food, air quality, crime, active transport, and physical activity”. However, Table 1 lists more specific exposure types, such as neighborhood social norms and urban green space. These specific topics should be explained in detail and categorized under the five topics mentioned.
Although the authors state that the development follows the idea of pattern development [6, 11], they do not explain how it does so. For instance: to what extent do you follow [6, 11]? How are the steps in Figure 1 aligned with the development guidelines in [6, 11]? Did you have to make any adaptations when following [6, 11]?
The description of Figure 1 should be improved. It currently lacks an overview of the steps in Figure 1, and some sentences are unclear. For instance, you reused existing ontologies when developing your ontology, but this reuse step is not shown in Figure 1. Likewise, it is not clear which step in Figure 1 (or which missing step) the sentence “We then filled the slots of the pattern with examples manually extracted from exposure articles.” refers to.
Section 4:
The authors state that “Our ontology pattern can be used across many domains […]”. These domains should be mentioned explicitly in the text.
Although Section 4.1.3 describes active and passive exposure in a nutshell, the description could be clearer. Active and passive exposure seem to be distinguished by two different causal configurations. As a reader, I would like to learn directly and precisely at the beginning what these two causal configurations are and what chains they consist of. After this overview, examples such as food intake and noise exposure would be helpful. Currently, the modeling approach and the examples are mixed together, which makes the section difficult to understand. In addition, in some places you refer to an example or configuration as the "latter case" or the "second causal configuration", which makes the text difficult to follow.
Figure 2 should be explained in more detail and referred to where active and passive exposure are described. What do the labels (i.e., active and passive) actually mean in Figure 2? Do they mean that an exposure involving an environmental factor is always a passive exposure? The meaning of the arrows should be explained precisely; otherwise, readers will find Axiom 3 and Definition 4 confusing and open to conflicting interpretations.
Is Environment in Figure 2 the same as EnvironmentalFactor? The terminology should be unified (the same applies to the ontology file).
In addition, in the ontology file, the "EnvironmentalFactor" concept has an rdfs:comment “Environment playing a role in some exposure. Can be conceptualized in different ways (see core concepts)”. What are these different ways to conceptualize EnvironmentalFactor?
For Axiom 1, why are (Person AND Dose), (Person AND Risk), and (Dose AND Risk) not included on the left-hand side of the axiom?
Why are there no axioms regarding the connection between Activity and EnvironmentalFactor? Can an activity be caused by an EnvironmentalFactor?
Minor issues:
Page 1: (cf. [1]. -> missing right parenthesis
Page 2: “ontology design challenge [9, 10]” -> it is unclear why [9] and [10] are cited here.
Page 7: “) Furthermore, it can also be […]” -> missing period.
Page 8: NO2 -> $\mathrm{NO}_2$ (the 2 should be a subscript)
For the ontology: it would be better to declare the domains and ranges of some object properties (such as causedBy), so that people who reuse the ontology can understand them better.
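As a sketch of what such a declaration could look like: the namespace IRI below is a placeholder, and the chosen domain (expB:Exposure) and range (a union of expB:Activity and expB:EnvironmentalFactor) are my assumptions for illustration only. The authors should substitute whatever classes match their intended semantics.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
# Placeholder namespace: NOT the authors' actual ontology IRI.
@prefix expB: <http://example.org/expB#> .

expB:causedBy a owl:ObjectProperty ;
    # Assumed domain: the thing that is caused.
    rdfs:domain expB:Exposure ;
    # Assumed range: either kind of cause, expressed as an OWL class union.
    rdfs:range [
        a owl:Class ;
        owl:unionOf ( expB:Activity expB:EnvironmentalFactor )
    ] ;
    rdfs:comment "Relates an exposure to the activity or environmental factor that causes it (illustrative wording)." .
```

Note that rdfs:domain and rdfs:range license inferences rather than act as constraints, so overly narrow declarations can lead to unintended class memberships; this should be weighed when choosing them.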