Searching for explanations of black-box classifiers in the space of semantic queries

Tracking #: 3380-4594

Jason Liartis
Edmund Dervakos
Orfeas Menis-Mastromichalakis
Alexandros Chortaras
Giorgos Stamou

Responsible editor: 
Guest Editors Ontologies in XAI

Submission type: 
Full Paper
Deep learning models have achieved impressive performance in various tasks, but they are usually opaque with regard to their complex inner operation, obfuscating the reasons for which they make decisions. This opacity raises ethical and legal concerns regarding the real-life use of such models, especially in critical domains such as medicine, and has led to the emergence of the eXplainable Artificial Intelligence (XAI) field of research, which aims to make the operation of opaque AI systems more comprehensible to humans. The problem of explaining a black-box classifier is often approached by feeding it data and observing its behaviour. In this work, we feed the classifier with data that are part of a knowledge graph, and describe the behaviour with rules that are expressed in the terminology of the knowledge graph, which is understandable by humans. We first theoretically investigate the problem to provide guarantees for the extracted rules and then we investigate the relation of "explanation rules for a specific class" with "semantic queries collecting from the knowledge graph the instances classified by the black-box classifier to this specific class". Thus we approach the problem of extracting explanation rules as a semantic query reverse engineering problem. We develop algorithms for solving this inverse problem as a heuristic search in the space of semantic queries, evaluate the proposed algorithms on four simulated use-cases, and discuss the results.
Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 16/Apr/2023
Minor Revision
Review Comment:

###Summary of the paper###

This paper considers the problem of explaining the behaviour of machine learning models. In particular, the authors propose a framework to explain the behaviour of a black-box classifier, which exploits the use of semantic annotations on the sample data. The meaning of such annotations is supported by an underlying ontology formulated using a Description Logic (DL).

The general idea of the framework is as follows.

- Given are: a classifier F, a set of data samples annotated with semantic descriptions from a vocabulary V and a DL ontology defined over V. For each data sample of interest, the assertional component of this ontology (the ABox) identifies it with a constant, states its classification according to F, and contains formulas expressing the corresponding semantic annotations.

- Goal: To produce a set of rules expressed over V that explain the behaviour of F in an understandable and meaningful way. More precisely, a rule produced by the framework is of the form:

R_C = Body(x,x_1,...,x_n) ----> C(x),

where Body is a conjunction of unary and binary predicates from V with parameters in {x,x_1,...,x_n}, and C is a unary predicate representing a classification class of F. Intuitively, a rule expresses sufficient conditions for an item to be classified in the class C.
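The rule format summarised above can be illustrated with a minimal sketch. It assumes an empty TBox, so that certain answers coincide with plain matches of the query body against the ABox; the toy ABox, the predicate names (`Image`, `depicts`, `Dog`, `Cat`) and the helper functions are hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical toy ABox. Concept assertions are (concept, individual);
# role assertions are (role, subject, object). All names are illustrative.
CONCEPTS = {("Image", "s1"), ("Image", "s2"), ("Image", "s3"),
            ("Dog", "d1"), ("Dog", "d2"), ("Cat", "c1")}
ROLES = {("depicts", "s1", "d1"), ("depicts", "s2", "d2"),
         ("depicts", "s3", "c1")}
INDIVIDUALS = {a for _, a in CONCEPTS} | {a for _, s, o in ROLES for a in (s, o)}


def holds(atom, env):
    """Check one atom of the rule body under a variable assignment."""
    if len(atom) == 2:  # unary predicate: (concept, var)
        return (atom[0], env[atom[1]]) in CONCEPTS
    return (atom[0], env[atom[1]], env[atom[2]]) in ROLES  # binary predicate


def answers(body, answer_var="x"):
    """Individuals a for which the body is satisfied with answer_var -> a.

    Brute force over all assignments; suitable for toy data only.
    Assumption: no TBox reasoning is performed.
    """
    variables = sorted({v for atom in body for v in atom[1:]})
    found = set()
    for values in product(sorted(INDIVIDUALS), repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(holds(atom, env) for atom in body):
            found.add(env[answer_var])
    return found


def rule_is_correct(body, positives):
    """Body(x, ...) -> C(x) is correct if every answer is a positive of C."""
    return answers(body) <= positives


# Samples that the (hypothetical) classifier F assigns to class C.
POSITIVES = {"s1", "s2"}
good_body = [("Image", "x"), ("depicts", "x", "y"), ("Dog", "y")]
loose_body = [("Image", "x"), ("depicts", "x", "y")]

print(sorted(answers(good_body)))             # ['s1', 's2']
print(rule_is_correct(good_body, POSITIVES))  # True
print(rule_is_correct(loose_body, POSITIVES)) # False
```

The second body also matches `s3`, which F does not place in C, so the corresponding rule fails the sufficiency condition.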

Besides the proposed framework, the other main contributions of the paper are:

- A suite of algorithms to compute (approximate) explanation rules. These algorithms are based on the fact that finding a correct rule R_C can be reduced to finding a conjunctive query Q_C whose certain answers w.r.t. the ontology are all positive instances of C.

- Experiments are conducted to evaluate the quality of the queries produced by the algorithms, in terms of how accurately they represent the behaviour of the classifier.

###General Evaluation###

This paper is a revised version of a submission that I have previously reviewed. It is concerned with the topic of eXplainable Artificial Intelligence (XAI), which has recently drawn considerable attention in AI research, and it is definitely relevant for this journal.

Overall, the paper has been considerably improved w.r.t. previous versions. I believe the results are of value and interest for the Semantic Web community. There is only one minor, but nevertheless important, issue, which I describe below. Once this is addressed/corrected, I would recommend the paper to be accepted for publication.

###Minor issue###

One of the strategies used to merge queries in the proposed algorithms is to compute the Query Least Common Subsumer (QLCS). The existence of such a query depends on assuming that every conjunctive query (CQ) contains an atom of the form TOP(x), where TOP is the well-known constructor from DLs. However, this assumption is not entirely consistent with the definition of CQs, i.e.,

- by definition, a CQ cannot have an empty body or an atom of the form TOP(x). Note that TOP is not a concept name. Therefore, it is wrong to assume that every CQ contains such an atom. This makes, in addition, the use of the empty query as a shorthand for { | TOP(x)} not well-defined.

One way to achieve the desired effect could be to add to the TBox the GCI $TOP \sqsubseteq A$ where $A$ is a fresh concept name, and then assume that all CQs contain the atom A(x). This does not change the set of certain answers, and should not be a problem for the query subsumption partial order.

Perhaps this is what the authors meant in the first place. However, this must be carefully explained.
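
The workaround suggested above can be written out as follows. This is only a sketch: it assumes that $A$ occurs neither in the TBox $\mathcal{T}$, nor in the ABox $\mathcal{A}$, nor in the query body $\varphi$, and the symbols $\mathcal{T}'$, $q'$ and $\mathsf{cert}$ are notation introduced here, not necessarily the paper's.

```latex
\mathcal{T}' = \mathcal{T} \cup \{\top \sqsubseteq A\}, \qquad
q = \{\, x \mid \varphi \,\}, \qquad
q' = \{\, x \mid \varphi \wedge A(x) \,\}
\\[4pt]
\mathsf{cert}\bigl(q', \langle \mathcal{T}', \mathcal{A} \rangle\bigr)
  = \mathsf{cert}\bigl(q, \langle \mathcal{T}, \mathcal{A} \rangle\bigr)
```

The equality holds because every model of $\mathcal{T}'$ is a model of $\mathcal{T}$ in which $A$ is interpreted as the whole domain, and every model of $\mathcal{T}$ can be extended to a model of $\mathcal{T}'$ by interpreting the fresh name $A$ as the whole domain, which does not affect $\varphi$.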

###Some typos###
- p.4, l.38: ...expressivity of a *knowledge base* (instead of $\mathcal{K}$).
- p.7, l.8: ... is *the* main strength...
- p.10, l.16: please, mind the calligraphy used in $$.
- p.11, l.41: there should be space after the comma in *D,C*.
- , l.44: ...knowledge *base*...
- p.12, l.23: ...but only *by* a set...
- p.17, l.33: remove space before the comma at * Alg.1 , and...*
- , l.34: a missing comma after $a_1$.

Review #2
Anonymous submitted on 17/Apr/2023
Minor Revision
Review Comment:

The paper presents a framework for extracting global rule-based explanations of black-box classifiers. Rules simulate, in a possibly understandable way, the behavior of the black-box classifier.
Section 1 introduces the motivation of the approach, discusses related work, and indicates the contributions of the paper.

Section 2 presents the background definitions: Description Logics and Knowledge Bases, Conjunctive Queries, graph based representations, classifiers.

Section 3 introduces the framework, where Example 1 is used as guiding example. An explanation is provided by selecting a language and a knowledge base and by introducing explanation rules. Briefly, an explanation rule is correct if the (conjunctive) queries are always answered according to the outputs of the classifier.
In this section, rules with exceptions are also considered, so the concepts of precision, recall, and a combination of the two are introduced to assess the performance of a rule.
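
As a rough illustration of how a rule with exceptions might be scored, the sketch below compares the set of answers of the rule's query with the set of samples the classifier assigns to the class. The function name is hypothetical, and the use of Jaccard similarity as the combined measure is an assumption made here for illustration; the paper's exact definition may differ.

```python
def rule_metrics(query_answers, positives):
    """Return (precision, recall, jaccard) for a rule Body -> C.

    query_answers: samples matched by the rule body.
    positives: samples the classifier assigns to class C.
    Convention (an assumption): empty denominators yield 0.0.
    """
    true_pos = len(query_answers & positives)
    precision = true_pos / len(query_answers) if query_answers else 0.0
    recall = true_pos / len(positives) if positives else 0.0
    union = len(query_answers | positives)
    jaccard = true_pos / union if union else 0.0
    return precision, recall, jaccard


# The rule matches four samples, but only two of them are positives of C.
print(rule_metrics({"s1", "s2", "s3", "s4"}, {"s1", "s2"}))  # (0.5, 1.0, 0.5)
```

A correct rule in the strict sense has precision 1.0; the two extra matches in the example are the rule's exceptions.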

Section 4 approaches the problem of computing explanations as the problem of reverse engineering queries. The problem is hard, so it is approached here by imposing a number of constraints, see page 13. Algorithms are studied that, given certain knowledge about a domain (an ABox here), produce a set of queries, which are then converted to rules.

Section 5 contains the experimental evaluation. Experiments are designed to cover various scenarios, depending, briefly, on whether and how the input data are associated with semantic descriptions.

Section 6 concludes and discusses future work.

The paper is fairly well written. The approach is well motivated and investigated in detail, although a few aspects of a fully developed model are left for future work (e.g. the case of TBox elimination, Boolean combination of concepts).

I only have clarification questions.

1) If I understand correctly, the approach is independent of the classifier and of the features that are used by the black-box model.
In particular, concept names of the Knowledge Base are in general independent of the features used by the classifier. On the one hand, this is quite general and is in principle applicable to any black-box. On the other hand, there is a sense in which the explanations provided by this framework are not explanations of the black-box model: we do not know how the model decided, and the explanation dataset and model do not mention information used by the classifier.
The framework, in fact, provides a way in which someone, possibly the experts of the domain, can rationalize the black-box model classification by means of semantic information about the samples.

2) Reading the paper, I was somehow missing a clear understanding of the contribution of the axioms of the TBox. In Example 1 and, if I understand correctly, in the rendering of the explanation models of the experimental evaluation, ontologies appear quite simple. This may be fine for the scope of this paper, which focuses on the general model and its properties.
However, the authors could improve the discussion of the motivations for ontologies (if this is so).
Are rich semantic characterizations of the concepts of the ontology useful? Do they affect the understandability for users?
Are richer DL dialects (e.g. with Booleans) significant for the quality of explanations?

3) It seems that in general the quality of the explanation model, in terms of correct explanation rules, depends on the language of the ontology and the number of its concepts, i.e. on its granularity. It seems that, in very abstract terms, an explanation model with a sufficient number of classes allows for defining rules for each prediction of the classifier.
The dependence on the explanation dataset is acknowledged by the authors on page 33 (Line 50).
If this is so, then, it seems that there is a trade-off to consider, between the richness of the knowledge base and its capability of providing correct explanation rules. Discussing this aspect is important to assess the significance of the approach.

4) A comment about exceptions. It seems that a trade-off is reasonable for exceptions as well. The fewer the exceptions, the closer the explanation model is to the black-box model, so in principle having no exceptions sounds like overfitting, thus possibly replicating the opacity of the black-box model.

Review #3
Anonymous submitted on 28/Apr/2023
Review Comment:

In this paper, the authors study the problem of searching for explanations of black-box classifiers. The assumption is that the data sets are equipped with a knowledge base, e.g. in Description Logic. The explanation, in the post-hoc explainability fashion, is given as (a set of) semantic queries. The problem of searching for such queries can be seen as the reverse query answering problem, i.e., given query answers, finding the most appropriate queries. In practice, there is often no "perfect" explanation. Therefore, several notions of the quality of such explanations (recall, precision, and degree) have been defined. The paper introduces several novel algorithms to search for such explanations. An extensive set of experiments has been conducted.

The paper is relatively easy to read. It is a bit heavy in notation, but this seems to be unavoidable, given that the paper touches several fields. Overall, the approach of this paper is convincing and the result is inspiring.

There are just some detailed comments/suggestions:

- Some abbreviations (e.g. sav and fol) are given in lowercase. It would be better to use uppercase for readability (e.g. SAV and FOL).

- The experimental part is rather rich. How about reproducibility? Are the developed system (KGrules) and the experimental data sets publicly available?