CEO: Counterfactual Explanations for Ontologies

Tracking #: 3315-4529

Authors: 
Matthieu Bellucci
Nicolas Delestre
Nicolas Malandain
Cecilia Zanni-Merk

Responsible editor: 
Guest Editors Interactive SW 2022

Submission type: 
Full Paper
Abstract: 
Debugging and repairing Web Ontology Language (OWL) ontologies has been a key field of research since OWL became a W3C recommendation. One way to understand and fix errors is through explanations, which are usually extracted from the reasoner and displayed to the ontology authors as is. Meanwhile, there has been a recent call in the eXplainable AI (XAI) field to use expert knowledge in the form of knowledge graphs and ontologies. In this paper, a parallel is drawn between explanations for machine learning and explanations for ontologies. This link enables the adaptation of XAI methods to explain ontologies and their entailments. Counterfactual explanations have been identified as a good candidate for solving the explainability problem in machine learning. The CEO (Counterfactual Explanations for Ontologies) method is thus proposed to explain inconsistent ontologies using counterfactual explanations. A preliminary user study is conducted to confirm that using XAI methods for ontologies is relevant and worth pursuing.
Full PDF Version: 
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
Anonymous submitted on 22/May/2023
Suggestion:
Major Revision
Review Comment:

The main contribution of this research paper is the proposal of a new method called CEO (Counterfactual Explanations for Ontologies) to explain inconsistent ontologies using counterfactual explanations. The paper also draws a parallel between explanations for machine learning and ontologies, which enables the adaptation of eXplainable AI (XAI) methods to explain ontologies and their entailments. Additionally, the paper highlights the need for explainability methods in the Knowledge Representation and Reasoning (KRR) domain, which is widely used in medicine and the web, as demonstrated by ongoing projects such as the Gene Ontology and DBpedia.

This manuscript was submitted as 'full paper' and is reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing.

Originality
The CEO method is a novel approach to generating counterfactual explanations for OWL ontologies. While there have been previous efforts to generate explanations for machine learning models, the CEO method is unique in its focus on ontologies and its use of counterfactual reasoning.

Significance of the results
Small sample size: The evaluation of the CEO method relies on a small sample of experts to judge the quality of the counterfactual explanations generated by the system. This may limit the generalizability of the results and make it difficult to compare the performance of different methods.
Limited applicability: The CEO method is tested in a specific domain, which may limit its applicability in other domains or contexts.
Interpretability: While counterfactual explanations can be useful for understanding how an ontology works, they may not always be easy for non-experts to interpret or understand.

Quality of writing
The English needs to be checked and improved, ideally with a native English speaker, e.g.:
What the AI algorithm detected -> What did the AI algorithm detect?

Minor issues:
the of use -> the use of

Improvements to the organization of sections

Introduction. I think that introductions should be concise and focused. Aim to engage the reader, establish the relevance of your research, and clearly convey the purpose and objectives of your study.
Therefore, I would propose removing the subsections and organizing it as follows:
*Start with a general opening: Begin your introduction with a broad statement; you can mention a real-world problem.
*Provide background information: Give a brief overview of the background information related to your research topic. Explain the key concepts, theories, or existing research that are relevant to your study. This helps the reader understand the context and significance of your work.
*State the problem or research question: Clearly state the specific problem or research question that your study aims to address. Explain why it is important to investigate this problem and how it relates to existing knowledge or gaps in the field. This will help readers understand the purpose and relevance of your research.
*Summarize related work (you do not need to put here all the related work sections). Review the existing literature and summarize the key findings or approaches related to your research question. Identify the strengths and limitations of previous work and highlight the gaps that your research aims to fill. This demonstrates your familiarity with the field and positions your work within the broader research landscape.
*Present your research objectives: State the specific objectives or goals of your research. Clearly articulate what you intend to achieve through your study and how it will contribute to the existing body of knowledge in computer science. This helps readers understand the expected outcomes and significance of your research.
*Outline the paper's structure (which you already did)

Related Work. A dedicated related work section is missing. Enough relevant works are mentioned here and there, but all of them, together with how your approach differs from the state of the art, should be gathered in a single section named Related Work.

Approach. The proposed approach section is a crucial part of a research paper in computer science. This section outlines the methodology, algorithms, models, or techniques that you plan to use in your research. Here are some suggestions for organizing and clarifying the proposed approach:
*Clearly state the objective and provide an overview by explaining the main steps or components involved and how they contribute to achieving the research objective. This provides a roadmap for readers to follow and understand the structure of your approach.
*Describe the fundamental principles, theories, or concepts on which your proposed approach is based. This helps readers understand the theoretical foundations of your work.
*Provide a detailed description of the technical aspects of your approach, including flowcharts where necessary to clarify the steps involved. Use clear and concise language to explain each component and its purpose.
*If there are any assumptions or limitations associated with your proposed approach, explicitly state them. Discuss how these assumptions or limitations may impact the results or the applicability of your approach.
*Explain why you have chosen the specific approach and how it aligns with the research objective. Provide evidence or references to support your decision.
*It is also helpful to connect the proposed approach back to the research problem or objective, emphasizing how your chosen approach is suitable for addressing the specific challenges identified in your study.

Concern about the limitations of conducting experiments on a single dataset and the statistical significance of the number of users. When designing and reporting experiments in a research paper, it is crucial to address these limitations to ensure the credibility and generalizability of your findings. Here are some suggestions for addressing these concerns:
*Acknowledge that your experiment is conducted on a single dataset and explain the reasons behind this choice. For example, you can discuss the dataset's relevance to the research question, its availability, or its uniqueness in terms of characteristics or domain. However, it is important to acknowledge that the results may be specific to that particular dataset and caution readers against making broad generalizations.
*Highlight the potential limitations of generalizing the results beyond the specific dataset used in your experiment. Discuss the dataset's properties, such as size, diversity, and representativeness, and how these factors may impact the generalizability of your findings. Consider suggesting future work that could involve multiple datasets or different data sources to validate and strengthen the conclusions.
*Discuss the potential impact of the sample size on the statistical power and the precision of the results. Consider providing a justification for the chosen sample size, such as resource constraints or the uniqueness of the user population.
*Offer strategies or suggestions for addressing the limitations of a single dataset and a limited user sample size. For instance, you can propose cross-validation or bootstrapping techniques to evaluate the robustness of the results (a minimal bootstrap sketch is given after this list). Additionally, you could recommend future research that involves replication studies on different datasets or larger-scale user studies to enhance the reliability and generalizability of your findings.
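
To make the bootstrapping suggestion concrete, a minimal Python sketch is given below. The ratings array is invented for illustration and is not data from the paper's user study; only the resampling pattern is the point.

    import numpy as np

    # Hypothetical ratings from a small user study (7 participants, 1-5 scale).
    # These numbers are made up for illustration only.
    ratings = np.array([4, 5, 3, 4, 5, 2, 4])

    # Resample with replacement many times to estimate the spread of the mean.
    rng = np.random.default_rng(42)
    boot_means = np.array([
        rng.choice(ratings, size=ratings.size, replace=True).mean()
        for _ in range(10_000)
    ])

    low, high = np.percentile(boot_means, [2.5, 97.5])
    print(f"mean = {ratings.mean():.2f}, 95% bootstrap CI = [{low:.2f}, {high:.2f}]")

Reporting such an interval would make explicit how much uncertainty a seven-participant sample carries.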

Review #2
By Md Kamruzzaman Sarker submitted on 31/May/2023
Suggestion:
Major Revision
Review Comment:

The authors propose an interesting idea: using counterfactual explanations (CE) to guide ontology development. Counterfactual explanations are currently used heavily in machine learning explanation systems, whereas in the knowledge representation/ontology domain they have not been used so far. The authors' work applying CE to ontology development seems novel.

The authors took inspiration from CE in machine learning, where it is mostly used for supervised learning, and adapted it to suit the ontology reasoning case, which is nicely done.
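
For context, the supervised-learning notion of a counterfactual that the authors adapt can be illustrated with a toy sketch like the one below; it is entirely illustrative, not taken from the paper, and uses an invented classifier, data, and search strategy.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Toy binary classifier: class 1 when x0 + x1 > 0.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    clf = LogisticRegression().fit(X, y)

    x = np.array([-1.0, -0.5])            # instance predicted as class 0
    assert clf.predict([x])[0] == 0

    # Counterfactual: a small change to x (here, greedy steps along the
    # model's weight direction) that flips the prediction to class 1.
    direction = clf.coef_[0] / np.linalg.norm(clf.coef_[0])
    x_cf = x.copy()
    while clf.predict([x_cf])[0] == 0:
        x_cf += 0.05 * direction

    print("original:", x, "counterfactual:", x_cf, "change needed:", x_cf - x)

In the ontology setting targeted by CEO, the analogue of "change needed" is a change to the axioms that would make an inconsistent ontology consistent.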

This work is well motivated and may be helpful for ontology development. The authors provide a resource link, and the writing is easy to read and understand.

While this work is exciting and novel, I am concerned about the effectiveness/scalability of this method:
* Suppose a user is developing a large ontology that already contains millions of axioms, and then makes a mistake in a single axiom. Even though the mistake is a single one, there are many potential changes that would make the ontology consistent. I understand that the counterfactuals are sorted, but will it not take too long to generate all those counterfactuals and then sort them? It would be better to assess the effectiveness of the method by measuring its runtime, so that its scalability can be compared with that of a traditional reasoner (a rough timing sketch is given after this list).

* The experimental ontology is minuscule, and a larger ontology might change many of the results in the paper. More experimental data with ontologies of varying sizes is needed.
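
As an illustration of the kind of measurement meant above, a rough timing harness is sketched below. It builds a deliberately tiny inconsistent ontology with owlready2 and compares a plain consistency check against a naive one-edit repair search; the toy ontology, the naive search, and all names are invented here and are not the CEO algorithm from the paper.

    import time
    from owlready2 import *

    # Deliberately tiny inconsistent ontology (invented for this sketch).
    onto = get_ontology("http://example.org/toy.owl")
    with onto:
        class Pizza(Thing): pass
        class Dessert(Thing): pass
        AllDisjoint([Pizza, Dessert])
        item = Pizza("tiramisu")
        item.is_a.append(Dessert)          # introduces the inconsistency

    def is_consistent():
        try:
            sync_reasoner(onto, debug=0)   # HermiT shipped with owlready2; needs Java
            return True
        except OwlReadyInconsistentOntologyError:
            return False

    t0 = time.perf_counter()
    is_consistent()
    t_check = time.perf_counter() - t0

    # Naive baseline search: drop one class assertion at a time and re-check
    # consistency. Its cost grows with the number of candidate edits times the
    # reasoning time, which is exactly the scalability concern raised above.
    t0 = time.perf_counter()
    repairs = []
    for ind in list(onto.individuals()):
        for cls in list(ind.is_a):
            ind.is_a.remove(cls)
            if is_consistent():
                repairs.append((ind, cls))
            ind.is_a.append(cls)           # restore before trying the next edit
    t_search = time.perf_counter() - t0

    print(f"consistency check: {t_check:.2f}s, one-edit search: {t_search:.2f}s")
    print("candidate repairs (individual, dropped class):", repairs)

Recording these two numbers for ontologies of increasing size would make the requested scalability comparison straightforward to report.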

Seven participants took part in the study. Though the authors' explanation for this is reasonable, it would be better to have more human participants.