Review Comment:
This manuscript describes initial development of the Plant-Pathogen Interactions Ontology (PPIO). The introduction is well written and clearly lays out the need for an ontology to cover plant-pathogen interactions. Furthermore, the manuscript contains many commendable ideas about how to develop such an ontology, such as using logical definitions (equivalence axioms) to classify instance data and importing existing classes from existing ontologies. Where it falls short is in the actual execution. As described in more detail below, there are many areas where improvement is needed in the PPIO before it is ready for publication.
Comments on specific sections of the manuscript:
Design principles and high-level overview of classes:
The “disease triangle” is a reasonable framework for modeling plant-pathogen interactions. I think a figure that links the conceptual model to the ontological model would be useful, especially for justifying how two ontology classes (Environmental parameter and Organism) can represent the three concepts in the triangle.
Organism and use of taxonomy classes:
The organism class should be imported from the Common Anatomy Reference Ontology (CARO), rather than creating a new PPIO class. Likewise, PPIO:plant should be imported from NCBI-taxon (Viridiplantae). NCBI hierarchy is imported, but generally not used in the ontology. The manuscript says that the subclasses of PPIO:Organism are linked to NCBI classes, but there are no links in the ontology.
Logical definitions:
The definition of PPIO:host plant includes both “Plant and susceptible to some plant pathogen” as well as “Plant and expresses phenotype some Susceptibility phenotype”. These seem somewhat redundant. The definitions of ‘susceptible to’ and ‘Susceptibility phenotype’ should imply one another. As it is now, the authors have to maintain two separate hierarchies (one for relations and one for phenotypes), which makes it harder to maintain the ontology. Nonetheless, I laud the authors’ effort to use logical definitions to automatically classify data.
Phenotype classes:
There are multiple published examples of phenotype ontologies (e.g., Gkoutos et al. 2005; Köhler et al. 2013; Dahdul et al. 2010; Park et al. 2013), and the authors should review their design patterns and consider using some type of entity-quality model. While the entity-quality model is not the only valid way to model phenotypes, it does have a lot of merit and is widely used. Currently, logical definitions of phenotypes in the PPIO omit the entity, which can lead to incorrect inferences. For example, PPIO:Abnormal growth development phenotype is defined as equivalent to “modifies some cell growth development trait”. When the reasoner is run, it infers that PPIO:Necrotic lesion is equivalent to Abnormal growth development trait, but Necrotic lesion is a subclass of PPIO:Phenotypic process, so this does not make sense.
Phenotypic process classes:
These terms are biological processes and should be developed in conjunction with the Gene Ontology, as subclasses of the GO:biological process class. The names for most of the classes here are ambiguous and seem to refer to material entities rather than processes. For example, “necrotic lesion” should refer to an actual lesion, which is a material entity, but this class instead refers to a process. A better name would be something like “lesion formation process”.
Trait classes:
There is no need for the class PPIO:Trait as a superclass of TO:plant trait. Instead, just replace PPIO:Trait with TO:plant trait. The manuscript says that the Trait class is axiomatically related to both the Phenotype and Phenotypic process classes within the PPIO, but in fact there are no axioms among those classes. What should be done is to add axioms linking specific PPIO phenotypes to specific subclasses of TO:plant trait.
Discussion:
How do you plan to integrate with Darwin Core? It is not clear what the relationship between PPIO and Darwin Core annotated data would be.
The link to the prototype data collection portal (http://1.tfguc3m.appspot.com/) does not work.
The intention to automate ontology development is admirable and important, but without proper (human generated) knowledge model behind it, the PPIO cannot effectively organize plant disease data.
General comments about the ontology:
Most classes have no text definitions, only comments on some terms. Textual definitions are very important for human users and can act as a check to ensure that logical definitions are correct.
All of the logical definitions lack a subject and are therefore very broad and likely to lead to incorrect inferences. For example, PPIO:Pathogen physiology trait is defined as PPIO:trait of some PPIO:Plant pathogen, where “trait of” is a property. Using an Aristotelian or genus-differentia definition (which is a more common practice in ontologies), this would be defined first as a subclass of trait: trait and trait_of some Plant pathogen. (Also, this definition is too broad, as it includes all traits of plant pathogens, not just physiological traits.) Another example of incorrect inference due to poorly constructed logical definitions is that TO:tiller number is inferred to be a subclass of PPIO:Phenotypic process.
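To make the genus-differentia point concrete, here is a minimal Python sketch. It is a toy structural matcher, not an OWL reasoner, and it does not use the actual PPIO axioms. The class names `growth rate` and `toxin secretion`, the small hierarchy, and the asserted restrictions are all hypothetical and chosen only for illustration. It compares a definition with no genus ("trait_of some plant pathogen") against one with a genus ("trait and trait_of some plant pathogen").

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# (property, filler), read as "property some filler"
Restriction = Tuple[str, str]

# Asserted superclasses: a hypothetical toy hierarchy, not the real PPIO.
PARENTS: Dict[str, Set[str]] = {
    "trait": set(),
    "process": set(),
    "growth rate": {"trait"},
    "toxin secretion": {"process"},
}

# Asserted existential restrictions (assumed purely for illustration).
RESTRICTIONS: Dict[str, Set[Restriction]] = {
    "growth rate": {("trait_of", "plant pathogen")},
    "toxin secretion": {("trait_of", "plant pathogen")},
}


def ancestors(cls: str) -> Set[str]:
    """Return cls together with all of its asserted superclasses."""
    seen = {cls}
    stack = [cls]
    while stack:
        for parent in PARENTS.get(stack.pop(), set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def satisfies(cls: str, genus: Optional[str],
              differentiae: FrozenSet[Restriction]) -> bool:
    """True if cls meets the genus (when given) and every differentia."""
    lineage = ancestors(cls)
    inherited: Set[Restriction] = set()
    for name in lineage:
        inherited |= RESTRICTIONS.get(name, set())
    has_genus = genus is None or genus in lineage
    return has_genus and differentiae <= inherited


def classify(genus: Optional[str],
             differentiae: FrozenSet[Restriction]) -> List[str]:
    """List the classes that would fall under the defined class."""
    return sorted(c for c in PARENTS if satisfies(c, genus, differentiae))


DIFFERENTIAE = frozenset({("trait_of", "plant pathogen")})

# No genus: anything with the restriction is captured, including a process.
without_genus = classify(None, DIFFERENTIAE)

# Genus "trait": only classes that are also traits are captured.
with_genus = classify("trait", DIFFERENTIAE)

print(without_genus)
print(with_genus)
```

In this toy setting, the definition without a genus also captures the process class, while the definition with the genus `trait` captures only the trait. A real reasoner additionally takes domain and range axioms into account, so the actual inferences in the PPIO may differ from this sketch.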
It is not clear why some of the TO traits show up as direct subclasses of “Thing” (e.g., internode color, leaf collar color) rather than in their appropriate place in the TO hierarchy.
All of the phenotype classes are defined logically using the “modifies” relation, but this relation is not defined, so it is hard to know what is actually meant by the definitions of phenotypes and phenotypic processes. There are also no relations among PPIO:Trait, PPIO:Phenotype, and PPIO:Phenotypic process. What should be added is axioms linking specific phenotypes in PPIO to specific PO:trait subclasses. In fact, none of the relations are defined. However, they do specify domains and ranges, which is good.
What is the justification for modeling pathogens as individuals rather than classes?
Class names should not be capitalized.
The construction of class URIs is good (use of numbers rather than term names).
References:
Dahdul, Wasila M, James P Balhoff, Jeffrey Engeman, et al.
2010 Evolutionary Characters, Phenotypes and Ontologies: Curating Data from the Systematic Biology Literature. PloS One 5(5): e10708.
Gkoutos, Georgios V., E.C.J. Green, A.M. Mallon, J.M. Hancock, and D. Davidson
2005 Using Ontologies to Describe Mouse Phenotypes. Genome Biology 6(1): R8.
Köhler, Sebastian, Sandra C. Doelken, Christopher J. Mungall, et al.
2013 The Human Phenotype Ontology Project: Linking Molecular Biology and Disease through Phenotype Data. Nucleic Acids Research: gkt1026.
Park, Carissa A, Susan M Bello, Cynthia L Smith, et al.
2013 The Vertebrate Trait Ontology: A Controlled Vocabulary for the Annotation of Trait Data across Species. Journal of Biomedical Semantics 4(1): 13.