Interpretable Ontology Extension in Chemistry

Tracking #: 3183-4397

Martin Glauer
Adel Memariani
Fabian Neuhaus
Till Mossakowski
Janna Hastings

Responsible editor: Guest Editors Ontologies in XAI

Submission type: Full Paper

Reference ontologies provide a shared vocabulary and knowledge resource for their domain. Manual construction and annotation enable them to maintain high quality, allowing them to be widely accepted across their community. However, the manual ontology development process does not scale for large domains. We present a new methodology for automatic ontology extension for domains in which the ontology classes have associated graph-structured annotations, and apply it to the ChEBI ontology, a prominent reference ontology for life sciences chemistry. We train Transformer-based deep learning models on the leaf node structures from the ChEBI ontology and the classes to which they belong. The models are then able to automatically classify previously unseen chemical structures, resulting in automated ontology extension. The proposed models achieved overall F1 scores of 0.80 and above, an improvement of at least 6 percentage points over our previous results on the same dataset. In addition, the models are interpretable: we illustrate that visualizing the model's attention weights can help to explain the results by providing insight into how the model made its decisions. We also analyse the performance for molecules that have not been part of the ontology and evaluate the logical correctness of the resulting extension.

Solicited Reviews:
Review #1
By Uli Sattler submitted on 02/Aug/2022
Review Comment:

This is a revised version, and so I will focus on relevant comments. In general it's improved and clearer, and I recommend acceptance. I have basically two remaining points for the authors to consider:

In response to the following comment from the first round, the authors added an explanatory paragraph to Section 4, changed its header to ‘Interpretability’, and mentioned the limitation to atomic subclass relationships (in Section 6). While this clarifies matters, I still think that the title, in particular, promises more than the paper provides: "Some of the claims made are not strongly supported by the evidence provided in the paper: the interpretability/explainability is discussed by an interesting example, but a suitable evaluation is left for future work. Furthermore, it seems that explanations will only be available for positive classification: what would one do for false negatives? Similarly, the current approach addresses ontology learning in a very weak form as it is restricted to learning of atomic subclass-relationships. While the results are interesting, one could also call this ‘class localisation’ or ‘class insertion’." Related to this, a sentence such as "Visualisations such as those in Figure 11 provide a representation of the attention structure that is more intuitive for chemists, and provide a sort of visual explanation for the classification." does still read a little strong, as we're missing any evidence that a chemist would find these helpful (or perhaps the authors have such evidence?).

Regarding the following comment: "Would the following be clearer? ‘Given the *documented, structured* design decisions by the ontology developers, how would they extend their ontology to cover a novel entity?’", the authors responded that their "approach has been developed under the assumption that there are certain reoccurring design decisions that are *implicitly* reflected in the structure of the ontology. The goal of the system is to understand these design decisions and reflect them in its classification. We rephrased the submission to put a higher emphasis on the exact kind of input data that is used." I am still confused: the current approach *does* consider the structured annotations of classes in the ontology, so one could argue that the design decisions are partly implicit in the structure of the ontology and partly explicitly documented in the structured annotations. That is, does the approach not use *both* the structure/logical axioms of the ontology and the (structured) annotations?

Related to this, page 5 still says "Our goal is to train a system that automatically extends the ChEBI ontology with new classes of chemical entities (such as molecules) based on the design decisions that are implicitly reflected in the structure of ChEBI." I maintain that this (‘the structure’ of ChEBI) is still confusing, as I read it as, e.g., the class hierarchy/graph of ChEBI and definitely not as including its annotations!


Page 5: "The preformance" -> "The performance"?

Review #2
Anonymous submitted on 15/Aug/2022
Minor Revision
Review Comment:

Comments were largely addressed.

Figure text on the x and y axes is still small in some cases.

Why does the introduction still have a sentence about explainability? I think that can be cut completely.

Review #3
Anonymous submitted on 11/Sep/2022
Review Comment:

I am satisfied with the authors' responses and with the current state of the paper, and I believe that it can be accepted.