Review Comment:
SUMMARY
This manuscript describes the process of linking and harmonizing across an
impressively broad cross-section of ontologies related to the food
domain, within the framework of the OBO Foundry. It covers both
technical processes for building and integrating these, as well as the
social aspects such as organizing networks and organizations and
various stakeholders.
The manuscript describes each ontology in turn, illustrating each
with screenshots or images showing part of the hierarchy, sometimes
describing how each is related to the others. Each ontology is
summarized in a fairly descriptive, qualitative fashion.
Overall the work being described is impressive, spanning many
different disparate ontologies with diverse stakeholders. It appears
there is quite a bit of sophisticated OWL being used to glue
everything together.
Unfortunately the manuscript in its current form does not do justice
to the work described.
Overall the manuscript is quite disjointed and repetitive, with
multiple screenshots (often taken from different tools with different
styles) all showing roughly similar things.
The reader isn't left with a clear sense of how all these efforts fit
together, how the ontologies truly cohere (or don't cohere). Some
parts feel rushed, and there are problems with the formatting, with many
blank areas and blank pages. The manuscript would benefit from a
clearer sense of who the audience is. It seems to be written for an
internal audience who knows a lot of jargon. Terms and ontology
acronyms are frequently introduced without any definition.
For a domain scientist or subject matter expert there isn't a strong
sense of what problem all these ontologies are solving, and how
successful they have been. For a semantic web audience, there is a
lack of technical detail or technical methodology. There aren't really
any lessons to be learned or generalized for similar communities in
different domains. The theme seems to be interconnectivity but we
aren't left with any kind of quantitative understanding of how well
connected these ontologies are, what the challenges in connecting are,
and what remains to be done.
As a kind of conference-report style descriptive summary paper, the
manuscript works fine, and could be published in its current format
after a lot of tidying. There are largely no strong claims being made
that require backup and evidence.
However, publishing in its current form would be a missed opportunity
to communicate what seems like an impressive and complex undertaking,
possibly even unique, and to present lessons learned in a way that
would be useful to other audiences.
I am structuring my review into major changes and discretionary
changes. If the major changes are addressed then the manuscript meets
the bar for publication. But I would urge the authors to consider the
discretionary changes, and try to write a manuscript that explains the
significance of what you have done to a broader, less internally
focused audience.
MAJOR / REQUIRED CHANGES
1. The manuscript should be read through and tidied. Figure legends
should accompany the actual figure. Figures should be legible and
correctly proportioned, and their legends should adequately describe
what is seen in them. There should be no missing figures (e.g. currently you skip from
4 to 6). There should be no blank pages. Normally these things might
be handled post-review but in this case I think these need to be
handled up front to make the manuscript more understandable. Acronyms
(e.g. QUDT, NCBI Taxon, IAO, etc) should be spelled out, and citations
given where appropriate.
2. There should be, at a minimum, some kind of quantitative information
provided for these ontologies - at least the number of classes per
ontology. Given the theme is reuse and interoperation, this needs to be
at least partly quantified. How many terms in each ontology are
native, and how many are imported? How many object properties are used
for linkage, and what kind of axioms? At the very least, make the
quantitative information consistent - currently some ontology-specific
sections report the number of terms (and object properties), others don't.
3. The methods description is very vague. There are off-hand mentions
of various tools and axiomatization strategies, but these lack
precision or basic descriptions. ROBOT is mentioned only in the FIDEO
section, leading the reader to think that FIDEO is the only ontology to
use ROBOT. Protege is not mentioned once, but I suspect many or all of
these ontologies are authored using Protege.
I suspect there is actually a lot of interesting methodology at work,
combining ontologist-driven editing and axiomatizing in Protege,
SME-driven term collection using ROBOT templates, and automatic
conversion from food databases. Remember your audience here may be
very unfamiliar with practices that may be common in your community. I
appreciate that you have a heterogeneous collection of ontologies with
different practices. But there needs to be a clearer and more precise
description of common methods up-front. Then in each ontology-specific
section, only mention what is specific to that ontology or how that
ontology differs.
4. All figures need to clearly indicate which terms are native to an
ontology and which are imported. Figure legends must explain what the
content is.
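To illustrate the kind of count requested in point 2, here is a minimal sketch. It assumes that class IRIs follow the standard OBO PURL pattern (http://purl.obolibrary.org/obo/PREFIX_localid) and that a term counts as imported whenever its ID prefix differs from the host ontology's prefix. The function names and the IRIs in the example are hypothetical placeholders, not data from the manuscript; in practice the list of class IRIs would be extracted from each ontology file.

```python
from collections import Counter

OBO_BASE = "http://purl.obolibrary.org/obo/"


def id_prefix(iri):
    """Return the ID prefix of an OBO PURL, or None if the IRI is not one."""
    if not iri.startswith(OBO_BASE):
        return None
    local = iri[len(OBO_BASE):]
    prefix, sep, _ = local.partition("_")
    return prefix if sep else None


def native_vs_imported(class_iris, native_prefix):
    """Count classes whose prefix matches the host ontology vs. all others."""
    counts = Counter()
    for iri in class_iris:
        prefix = id_prefix(iri)
        if prefix == native_prefix:
            counts["native"] += 1
        else:
            # Assumption: a foreign prefix means the term was imported.
            counts["imported:" + (prefix or "other")] += 1
    return dict(counts)


# Placeholder IRIs with made-up local IDs, for illustration only.
example_classes = [
    OBO_BASE + "FOODON_0000001",
    OBO_BASE + "FOODON_0000002",
    OBO_BASE + "CHEBI_0000001",
    OBO_BASE + "NCBITaxon_0000001",
]

print(native_vs_imported(example_classes, "FOODON"))
```

A table built from such counts, one row per ontology, would give readers the consistent quantitative picture that is currently missing.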
DISCRETIONARY
Given this is for the SWJ, I would strongly recommend an upfront table
of prefixes and their URL expansions for all CURIEs used in the manuscript.
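As a sketch of what such a prefix table encodes, the snippet below expands CURIEs to full IRIs and contracts them back. It assumes the usual OBO convention that a prefix P expands to http://purl.obolibrary.org/obo/P_; the prefix list is limited to OBO prefixes mentioned in this review, and the local ID in the example is a made-up placeholder.

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"

# Assumption: each OBO prefix expands to OBO_BASE + prefix + "_".
PREFIXES = {
    prefix: f"{OBO_BASE}{prefix}_"
    for prefix in ("FOODON", "CHEBI", "DOID", "RO", "IAO", "BFO", "ENVO", "NCBITaxon")
}


def expand(curie):
    """Expand a CURIE such as 'DOID:0000001' to a full IRI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or malformed CURIE: {curie!r}")
    return PREFIXES[prefix] + local


def contract(iri):
    """Contract a full IRI to a CURIE; return the IRI unchanged if no prefix matches."""
    for prefix, base in PREFIXES.items():
        if iri.startswith(base):
            return f"{prefix}:{iri[len(base):]}"
    return iri


def prefix_table():
    """Render the prefix map as tab-separated 'prefix, expansion' rows."""
    return "\n".join(f"{prefix}\t{base}" for prefix, base in sorted(PREFIXES.items()))


print(prefix_table())
print(expand("DOID:0000001"))  # placeholder local ID
```

Non-OBO vocabularies used in the manuscript (e.g. QUDT) would need their own rows with their actual namespace IRIs.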
There are 14 figures. Do you need so many? They are repetitive,
inconsistent in style, and as a whole don't communicate much of use,
or lead to unanswered questions or confusion. For example, Fig 9 is
purportedly for a food-drug interaction ontology, yet no interactions
are shown, and the hierarchy is confusingly similar to FoodOn's. The
legends are often incomplete. The most useful figure is figure 14, as
this shows interconnectivity between some of these ontologies -- yet
this is buried at the end.
I would recommend fewer figures, more tables, and a more consistent
presentation style. Don't have some figures with arrows from sub to
superclass, and other figures with the reverse. Given the theme of
interconnection, I recommend relying less on simple subclass
hierarchies, and more on a clear consistent way to show cross-ontology
axioms.
If your intended audience is SWJ readers I think it is worth being
very clear about what kinds of OWL axioms are being used. These are
hinted at in some screenshots (e.g. Fig 11) but with a lack of
details. Remember, most audiences may be unfamiliar with what is taken
for granted in your community. Many semantic web ontologies are simple
RDFS ontologies - if you are using richer OWL axioms both within
ontologies and for interconnectivity between ontologies, this is quite
interesting, and it is worth being clear and precise about this. A
table of example axioms in Manchester syntax would not be amiss for
this journal. Alternatively, if you wish to keep the paper accessible
to a non-technical audience that is fine, but be consistent. As it is,
figures like Fig 3 are hard to interpret by both SMEs and semantic web/OWL
experts (what is _blank? is this a TBox or ABox? how is lemon
related?)
There are a lot of random context-less details in each ontology
section. The manuscript would strongly benefit from a lot of editing,
and I would recommend removing extraneous details, references to random
tools that are being proposed or thought about. Instead stick to clear
descriptions of each ontology, why it exists, and how it connects to
the others. By all means highlight particular challenges specific to
each ontology, but do it clearly with concrete examples.
The manuscript could benefit greatly from more information on how
these ontologies are used, and how success is measured.
MINOR / OTHER COMMENTS
The screenshot for Fig 11 "vegan dietary pattern" shows that this
class has OWL axioms such as SubClassOf eats some algal diet. This
sounds odd (a diet doesn't actually eat things). But more importantly
it suggests you will have logical inconsistencies if you try and
introduce a subclass for a more restricted vegan diet that excludes
algal food.
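The concern can be made concrete with a toy model. The sketch below is not an OWL reasoner: it only tracks, per class, which foods are required ('eats some X') and which are excluded, lets subclasses inherit both sets, and reports any food that ends up in both. The class and field names are hypothetical, and the filler hierarchy is ignored. Strictly speaking, in OWL such a subclass would be unsatisfiable (it could have no instances); the ontology as a whole becomes inconsistent only if an individual is asserted to belong to it.

```python
from dataclasses import dataclass, field


@dataclass
class DietClass:
    name: str
    parent: "DietClass | None" = None
    eats_some: set = field(default_factory=set)  # stands in for 'eats some X'
    eats_none: set = field(default_factory=set)  # stands in for 'not (eats some X)'

    def inherited(self, attr):
        """Collect a restriction set from this class and all its ancestors."""
        result = set()
        node = self
        while node is not None:
            result |= getattr(node, attr)
            node = node.parent
        return result


def conflicts(cls):
    """Foods that a class is both required to eat and forbidden to eat."""
    return cls.inherited("eats_some") & cls.inherited("eats_none")


vegan = DietClass("vegan dietary pattern", eats_some={"algal food product"})
algae_free = DietClass(
    "algae-free vegan dietary pattern",  # hypothetical subclass
    parent=vegan,
    eats_none={"algal food product"},
)

print(conflicts(vegan))
print(conflicts(algae_free))
```

The second call reports a conflict because the required food is inherited from the parent class, which is why modelling optional foods with existential restrictions is risky.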
Half of page 7 is missing, and page 8 is blank. Same for 11 and
12. Same with 16 and 17.
"This view was created because of ChEBI’s inherent hierarchy of
molecular entities and their roles is not easy for nutritionists to
navigate"
What is an inherent hierarchy? Why is it hard to navigate? How does
the view help? This isn't clear. It seems there is a real problem
being addressed here, one that is potentially generalizable (making an
ontology developed for one audience usable by another). I would
recommend focusing on a few areas like this, and clearly explaining
them.
> The Human Disease Ontology (DO) uses Relation Ontology (RO) term
“has allergic trigger” to attach an allergic disease to the food(s)
that trigger it.
This implies that a single disease class can have multiple food
triggers. What are the semantics of this and how is this encoded?
Does this mean that all the foods or only one of the foods trigger the
disease? Is this encoded with UnionOf, what kind of OWL Restriction?
Also: be consistent in your technical language. Elsewhere you use
"object property" which is precise OWL terminology, albeit somewhat
abstruse. Here you use "term" to describe what is an object
property. Use consistent language and technical terminology throughout.
There are inconsistencies in the text regarding DO versus DOID. Again,
introduce all prefixes and use them consistently.
"This branch DOID also supports the Immune Epitope Database (IEDB)
[29] by way of an IEDB “slim” export file of almost the entire food
allergy branch. It appears that the vegetable allergy branch is
accidentally omitted from this slim file (shown in Figure 4), and a
Github issue has been raised to remedy the omission"
This is oddly worded and strangely specific. What is a branch DOID?
What is a slim export file, and how does it help this database? Rather
than quoting terms, use a standard vocabulary familiar to the readers
of the journal, or clearly describe methodological practices like
"slimming".
And why are you telling the reader random minor things about GitHub
issues? Just say on the figure "some terms omitted".
> OBO provides a web service for permanent links to term resolution
(called purls),
It's normally written PURLs, all caps. This makes it seem like the web
service is called purls. I think what you are trying to say is that
OBO provides a web service for resolving OBO PURLs, which have a
standardized form.
Consider citing https://content.iospress.com/articles/data-science/ds190022
> OBO’s standard OWL ontology format is suited to expressing
international standards in minute detail as data structures with
context-sensitive terminology, synonymy, and categorical, numeric
and textual variables
This is confusing. It makes it sound like OBO is responsible for the
OWL standard, which it is not.
I don't know what it means to express a standard in "minute detail".
It's not really clear what these data structures are or how they support
"context-sensitive terminology". Is this something in the OWL standard
itself? (I don't think so, as OWL only provides generic annotations for
terminology.) Maybe this is something OBO provides?
> Adoption of OBO standardized term URLs that usually resolve to an
ontology search engine term result like ontobee.org [21]. Along
with the MIREOT principle, this fulfills the encyclopedic FAIR data
vision of OBO
I think many readers will not understand what you mean here. Given
this is the SWJ I think clearly explaining what you mean when you talk
about standardized URLs will be useful. Consider using terms familiar
to this audience, e.g. PURLs.
I don't know what an "encyclopedic FAIR vision" is.
> FoodOn entered the OBO Foundry in 2016, calving-off food terms from ENVO
What is calving-off?
> An upcoming objective is to map to these branches more extensively
by way of ‘has member’ relation to FoodOn’s own food product
hierarchy, to enable data exchange and harmonization to the deepest
level of food product classes.
I don't know what this means.
> CheBI
I believe they refer to themselves as ChEBI.
The ONS section has a lot of jargon and it's hard to understand what
the ontology is and why it exists. Terms like "information content
entity" are not defined, and acronyms such as IAO are introduced
without any definition.
"In ONS conceptualization, ‘dietary pattern’ denotes ‘diet’."
I have no idea what this means, or why there are two parallel sets of
boxes in the figure.
There are similar problems with the description of ONE. E.g:
"The first version of ONE extends IAO document parts so they can cover
research paper structure, and as well description of food surveys
which form the underlying datasets for many studies"
It's unclear why we are suddenly talking about document parts and
research paper structures.
"Applications of ontology will unlock information contained in the
guidelines for automated modelling of trends to assess dietary habits"
This is hinting at a potentially interesting application but it's not
clear from the text how extending document parts achieves this goal.
"The ONE curators are currently developing a Natural Language
Processing (NLP)-SPARQL linkage to enable a natural language query of
ONE, as well as dashboard development to visualize nutritional
knowledge contained in research manuscripts and population based
recommendations"
The readers of SWJ may be very interested to hear more about
NLP-SPARQL linkages but the description here is extremely vague. I
strongly suggest trimming text that is not absolutely necessary, and
focusing on having clear descriptions of the parts that remain.
The PO2 DOI doesn't resolve to OWL.
> The development of the OBO food related ontologies is occurring in a
semi-autonomous parallel fashion, with interconnectivity issues
arising on a weekly basis
Some concrete examples here would be great. This is a fantastic
opportunity to be specific about some of the kinds of issues that
arise from coordinating the development of multiple ontologies.
> Knowledge is beginning to accrue within these OBO ontologies as they
express at a class level the subject predicate object facts or
assertions provided by the collective language that OBO can provide.
I don't understand this.
> The relative recency of the OWL standard has contributed to
methodological growing pains stemming from its roots in formal logic
and philosophy which are not easy to comprehend in terms of
capability, computability, and ontology and database infrastructure
especially for implementing term reuse
This is not an accurate summary. OWL has absolutely no roots in
philosophy (and it's not particularly recent, it is almost twenty
years old, and has its roots in DL systems from the 80s/90s). The
following sentence is about BFO - maybe that is what you mean? I'm
sorry, but I just don't understand what you're trying to say here. It
sounds important, in that it's about making ontologies easier to
develop, so it's important to state your points clearly.
Even simple points like "our experts have found contributing to
ontologies difficult. Some tools like X have helped, but it's still
hard to do Y" would be really useful to communicate to a broader
audience. Try and be simple, clear, and concrete.
SPECIFIC COMMENTS ON FIGURES
Fig 1: it's good to get an overall overview like this, this is a
useful figure. Is it accurate to say that CHEBI is part of the food
ontology stakeholders collective? I think you have an opportunity
here to talk a bit more precisely about the challenges of coordinating
ontology development involving multiple stakeholders. Remember your
readers probably have no idea about how these ontologies are funded,
how they operate in the absence of funding, and what some of the most
basic mechanisms for contribution are.
Fig 2: The figure legend says it is CDNO, but it looks just like
CHEBI. Does CDNO duplicate CHEBI? or does it take CHEBI terms and
patch them into its own hierarchy? The figures should clearly state
which terms are imported and which are native, and ideally the text
should describe the process for melding hierarchies.
Fig 3: it's not clear what is being drawn here. Are the edges object
property assertions? Or is this an OWL class expression? If the
latter, then what does _blank mean? Why is _blank there for ascorbic
acid and not for material entity? What are the unlabeled edges, and
what is the dotted edge from BFO to lemon? What does "inheres in"
mean? Why are the IDs not shown as CURIEs as they are in the rest of
the text? Consider using the same style of diagram for all ontology
figures.
Fig 4: it may not be obvious to readers how this branch of DO connects
to the other ontologies you mention. It's also not really clear why we
are looking at this hierarchy and why it's interesting (other than to
perhaps suggest that it is highly incomplete - e.g. the list of fish
allergies). In the text you seem to implicitly state that they connect
up to FoodOn classes, but when I look at some on OLS, e.g. zebrafish
allergy, it connects to the NCBI taxonomy.
Fig 5: there is no figure 5
Fig 6: the figure is oddly stretched. The significance of the figure is not
clear.
Fig 7: this is interesting as it shows the interconnection of
different terms across ontologies. A legend would be useful. What do
the dashed lines signify? Are we looking at a TBox or ABox? Why is
there a link between 'exposure to chlorpyrifos' and 'apple'? What
is the meaning of receptor vs medium and why is apple a receptor in
one case and a medium in another? What is "Individual"? An OWL
individual? Make sure all the prefixes in the figure are defined (the
text mentions HPO - is this the same as HP?). What is GO? (don't
assume readers know this)
Fig 8: what is the significance of the different arrow colors? Why are
some solid and some dashed? Why do the arrows flow in the opposite
direction compared to other diagrams?
Fig 9: the legend is disconnected from the figure. The legend says
this is FIDEO, but it looks like FoodOn terms. If FIDEO is food-drug
interactions, why doesn't the figure show interactions?
Fig 10: Same comments. I don't know what the different colors are,
what a dashed line is, or what the difference between a dietary pattern
and a diet is and why both are there.
Fig 11: The definition is not grammatically well constructed. There
isn't any explanation of what we are looking at - you are assuming the
reader has some knowledge of Protege displays and Manchester syntax
but this may not be the case: take care with figure legends to
clearly explain the contents. The axioms are unusual. You are saying
"every vegan dietary pattern eats some algal food product". This
doesn't make any sense. Also I assume the food terms are imported from
FoodOn, but this isn't explicitly stated.
UNDEFINED TERMS
- NCBITaxon, NCBI Taxon
- ENVO is introduced in the intro without spelling out what it is. It is
later mentioned again and cited, but it's never defined
- The Relation Ontology (RO) is first mentioned on page 10, but there
are many uses of the RO prefix before this
Comments
Semantic Web for the Global Food System
This paper is meant to be submitted for the Special Issue for the Semantic Web for the Global Food System.