Neural Axiom Network for Knowledge Graph Reasoning

Tracking #: 2852-4066

Juan Li
Wen Zhang
Xiangnan Chen
Jiaoyan Chen
Huajun Chen

Responsible editor: 
Freddy Lecue

Submission type: 
Full Paper
Knowledge graph (KG) reasoning is essential for improving the quality of knowledge graphs, because the automatic mechanisms involved in KG construction are likely to introduce incompleteness and incorrectness. In recent years, various KG reasoning techniques, such as symbolic and embedding-based methods, have been proposed for inferring missing triples and detecting noise. Symbolic reasoning methods concentrate on inferring new knowledge according to predefined rules or ontologies; rules and axioms have proved effective but are difficult to obtain. Meanwhile, embedding-based reasoning methods learn low-dimensional representations of entities and relations primarily by utilizing structural information, and the learned embeddings achieve promising results in downstream tasks such as knowledge graph completion. These methods, however, ignore implicit axiom information, which is not predefined in KGs but can be reflected in the data. To be specific, each correct triple is considered to satisfy all axioms, as it is also a consistent triple. In this paper, we explore how to combine explicit structural and implicit axiom information to improve reasoning ability. Specifically, we present a novel NeuRal Axiom Network framework (NeuRAN) that uses only existing triples in KGs to address the issues in the above methods. The framework consists of a knowledge graph embedding module that preserves the semantics of a triple, and five axiom modules that are encoded based on the characteristics of five typical object property expression axioms defined in OWL2: ObjectPropertyDomain, ObjectPropertyRange, DisjointObjectProperties, IrreflexiveObjectProperty and AsymmetricObjectProperty. The knowledge graph embedding module and the axiom modules respectively calculate the probabilities that a triple conforms to the semantics and to the corresponding axioms. Evaluations on KG reasoning tasks including noise detection, triple classification and link prediction show the efficiency of our method.

Major Revision

Solicited Reviews:
Review #1
By Simon Halle submitted on 18/Feb/2022
Minor Revision
Review Comment:

1. Originality:
This paper introduces a neural network model where attention mechanisms are exploited to learn common object properties from the OWL2 ontology standard.
Similar to what some of the co-authors proposed for the IterE model, this paper exploits axiom properties throughout its learning, but instead of learning rules guided by those properties, it uses those rules to implement an attention mechanism affecting the model embeddings.
In conclusion, it exploits recent deep learning techniques, through an attention mechanism, along with ontology formalisms, which represents an original way of mixing deep learning and symbolic AI.

2. Significance of the results:
This paper provides tangible evidence of the value of ontology axiom information in the learning process of knowledge graph embeddings.
The test results and the method used to produce them are well presented.
Results on WN18RR with the Hits@1 metric, a very difficult task, are very good and show the potential advantage of this method for tasks where precision is needed in noisy (real-life) settings.
The significance of those results is satisfactory but somewhat modest, considering that additional tests and evaluation would be needed to fully assess the level of improvement this method brings to knowledge graph embeddings and its applicability to different types of datasets.
Nevertheless, it truly confirms the validity of the approach and applicability to certain use cases.
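
For readers unfamiliar with the metric mentioned above, Hits@k is the fraction of test queries for which the correct entity is ranked within the top k candidates. The following is a minimal sketch, assuming the 1-based rank of the true entity has already been computed for each test triple; the function name and the example ranks are illustrative and are not results from the paper.

```python
def hits_at_k(ranks, k):
    """Fraction of test queries whose correct entity is ranked in the top k.

    `ranks` holds the 1-based rank of the true entity for each test triple.
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Illustrative ranks only (not results from the paper).
example_ranks = [1, 3, 1, 12, 2]
hits1 = hits_at_k(example_ranks, 1)    # counts ranks equal to 1
hits10 = hits_at_k(example_ranks, 10)  # counts ranks of 10 or better
```

Hits@1 is the strictest setting, since only an exact top-ranked answer counts.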

3. Quality of writing:
The quality of the writing is very basic; many sentences are missing words or use the wrong word, verb tense, etc., which makes reading unpleasant but does not affect technical comprehension.

4. Paper Code/Resources:
The GitHub repository was created but is empty (at least the public branch), which makes it impossible to review the quality of the solution code, reproduce the results, or confirm whether artifacts will be available.

Review #2
By Christophe Guéret submitted on 08/Apr/2022
Major Revision
Review Comment:

This paper tackles the challenge of making use of semantic information when training knowledge graph embedding models. The expected outcome is better models that are capable of more accurate predictions and can also be trained on less data, with the semantic data making up for the smaller number of examples. The authors introduce an approach based on five semantic axioms derived from the OWL ontology of the data and show a positive outcome in the experimental campaign. This is a significant piece of work likely to have an impact on the community. However, the paper cannot be published as is, and some minor and major points need to be addressed.

* It is rightly recalled that maintaining manually defined semantic axioms for KGs is tedious, but the working assumption of NeuRAN is that such axioms are actually available, in particular OWL axioms defining the domain, range, symmetry, reflexivity and disjointness of all the classes. Providing these is already tedious and, although desirable, they are not always present in ontologies. It then sounds as though, with those axioms being a hard prerequisite for NeuRAN, the definition of axioms remains essential. This should be made clearer in the paper; if this is a misunderstanding, more explanation should be given about the creation of the five axioms.
* In Section 3, generating negative triples with a random process when ontological axioms are available sounds like a missed opportunity. Those axioms could be used to generate harder negative triples that are logically sound but factually wrong (instead of easier ones that are wrong on both counts). I wonder why this is not considered here. Furthermore, the choice of random corruption over other possible strategies is not motivated.
* After reading the paper, it is still unclear to me why a neural approach was needed to incorporate the axioms in the loss function. Since they are ontological rules, the probabilities could be Booleans, with 1 if the rule is satisfied and 0 otherwise. What exactly is gained by doing otherwise? As depicted in Figure 2, adopting this neural-like loss for the axioms implies having embeddings for the ontology ("Type embedding"), whereas it is common to discard the TBox triples when doing KGE with semantics-agnostic approaches such as TransE. However, it appears that two embeddings are generated (one for the ABox, one for the TBox), so NeuRAN seems to still follow those best practices.
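
The two alternatives raised in the points above, Boolean axiom checks and axiom-guided negative sampling, can be sketched together. This is a minimal illustration under stated assumptions: the toy entities, their type assignments, the `capital_of` relation and both function names are hypothetical and do not come from the paper or its datasets. Note also that a corrupted triple absent from the known set is only presumed false.

```python
import random


def satisfies_domain_range(triple, entity_types, domain, range_):
    """Boolean check of ObjectPropertyDomain and ObjectPropertyRange."""
    s, r, o = triple
    return domain[r] in entity_types[s] and range_[r] in entity_types[o]


def corrupt_tail(triple, known, entity_types, domain, range_, rng, hard=True):
    """Replace the object of `triple` to obtain a negative triple.

    hard=True keeps only candidates that still satisfy domain and range
    (logically sound, but not among the known facts). hard=False draws
    from all entities, as in plain random corruption.
    """
    s, r, o = triple
    candidates = []
    for entity in sorted(entity_types):
        candidate = (s, r, entity)
        if entity == o or candidate in known:
            continue
        if hard and not satisfies_domain_range(
            candidate, entity_types, domain, range_
        ):
            continue
        candidates.append(candidate)
    return rng.choice(candidates) if candidates else None


# Hypothetical toy KG, for illustration only.
entity_types = {
    "paris": {"City"},
    "berlin": {"City"},
    "france": {"Country"},
    "germany": {"Country"},
}
domain = {"capital_of": "City"}
range_ = {"capital_of": "Country"}
known = {
    ("paris", "capital_of", "france"),
    ("berlin", "capital_of", "germany"),
}
rng = random.Random(0)
positive = ("paris", "capital_of", "france")

hard_negative = corrupt_tail(positive, known, entity_types, domain, range_, rng)
easy_negative = corrupt_tail(
    positive, known, entity_types, domain, range_, rng, hard=False
)
```

In this toy graph the only type-consistent corruption is `("paris", "capital_of", "germany")`, whereas random corruption may also return objects that violate the range axiom.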

* There are a few grammatical errors in different places (e.g. "How could [...] is [...]" questions in Fig 1 - the "is" should be a "be").
* The running head title reads "Running head title"
* Notations are not always convenient and not always introduced. For example, using bold letters to indicate embeddings is not the most convenient approach for the reader. In Equation (3), the argument "t" of the loss function is neither used nor introduced; I assume "t" stands for "triple" and thus t=(s,r,o) \in T. Equations (1) and (2) are not aligned with the content depicted in Fig 2, where E is derived from all the P, whereas those two equations use only E functions.

* It is unclear from reading Table 4 what the numbers are. The text explains that they are AUC values, but this is not recalled as a reading guide in the caption. It is also unclear whether a difference of 0.003 (FB) to 0.008 (WN) on average between the CKRL and NeuRAN results is statistically significant.
* The gold standard for each of the three sets of experiments is unclear. In particular, for triple classification it is mentioned that negative triples are generated at random, but there is no indication of whether those negatives were also part of the negatives in the training set. For the other two experiments, it is assumed that the gold standard was part of the dataset, but this is not made clear in the paper.
* Considering the importance of the ontological axioms, there should be a table reporting the number of them available in both FB15K237 and WN18RR. This table could be accompanied by a short discussion explaining whether those are deemed sufficient, and perhaps also by a discussion of the potential downsides for predictions if some of those axioms turn out to be wrong.
* In experiments 4.3 the performance is announced to be evaluated against accuracy, precision and recall but the results for those metrics are not reported in the paper.
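
For reference, the three metrics named in the last point can be computed from gold and predicted labels as in the following sketch. The function name and the toy labels are illustrative assumptions, not data from the paper.

```python
def classification_metrics(gold, predicted):
    """Accuracy, precision and recall for binary triple classification.

    `gold` and `predicted` are equal-length sequences of booleans, where
    True means the triple is labelled (or predicted) as correct.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)          # true positives
    fp = sum(1 for g, p in pairs if not g and p)      # false positives
    fn = sum(1 for g, p in pairs if g and not p)      # false negatives
    tn = sum(1 for g, p in pairs if not g and not p)  # true negatives
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy labels, for illustration only.
gold = [True, True, True, False, False, False, False, True]
predicted = [True, True, False, False, True, False, False, True]
metrics = classification_metrics(gold, predicted)
```

Reporting all three together shows whether a high accuracy hides a poor recall on the positive class.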

Stable link for resources:
* This is a major concern: as of today (April 8, 2022) the repository contains only an empty README file pushed 9 months ago.

Review #3
Anonymous submitted on 03/May/2022
Minor Revision
Review Comment:

This paper presents a novel NeuRal Axiom Network framework (NeuRAN) that uses existing triples in KGs to derive new consistent knowledge, useful for downstream tasks such as noise detection, triple classification and link prediction. The contribution is towards neural axiom learning.

I would suggest adding the key quantitative results of the experiments to the Introduction; that would help position the impact of the approach and state clearly where the improvement lies.

I would suggest positioning the work against the authors' past work [a], as the concept of consistent knowledge is also captured there, and a comparison would help to better position the work.

On the methodology, I would suggest adding a textual description, with annotations on the picture, to make the flow diagram easier to follow; a paragraph in the caption may help. This is a nice picture, but a textual description needs to be added for it to contribute to the understanding of the approach.

Could you extend the framework to other embedding approaches? You mention that "Our framework considers two translation-based embedding models TransE and TransH as basic KG embedding models." It would be good to understand whether this is a limitation. In my understanding it is not, but it is better to be clear.
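
To make the question concrete: TransE and TransH differ only in how a distance is computed for a triple, so a framework that consumes that distance could in principle swap in other scoring functions. The sketch below uses the standard scoring functions of the two models in plain Python; the `triple_probability` mapping is an assumption made for illustration and is not NeuRAN's formula, and the vectors are toy values.

```python
import math


def l2_norm(vec):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def transe_score(h, r, t):
    """TransE distance ||h + r - t||; lower means more plausible."""
    return l2_norm([hi + ri - ti for hi, ri, ti in zip(h, r, t)])


def project_to_hyperplane(vec, unit_normal):
    """Project `vec` onto the hyperplane whose unit normal is `unit_normal`."""
    dot = sum(v * n for v, n in zip(vec, unit_normal))
    return [v - dot * n for v, n in zip(vec, unit_normal)]


def transh_score(h, d_r, t, w_r):
    """TransH distance: project h and t onto the relation hyperplane
    (unit normal w_r), then translate by d_r as in TransE."""
    h_perp = project_to_hyperplane(h, w_r)
    t_perp = project_to_hyperplane(t, w_r)
    return transe_score(h_perp, d_r, t_perp)


def triple_probability(distance):
    """Illustrative mapping from a distance to (0, 1].

    This is an assumption for the sketch, not the paper's formula.
    """
    return math.exp(-distance)


# Toy vectors, for illustration only.
perfect_transe = transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
perfect_transh = transh_score([1.0, 2.0], [2.0, 0.0], [3.0, 5.0], [0.0, 1.0])
far_transe = transe_score([0.0, 0.0], [3.0, 4.0], [0.0, 0.0])
```

Because both functions return a distance with the same meaning, any downstream component that depends only on that value is agnostic to the choice of base model.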

In the experiments, I would suggest that the authors explain why FB15K237 and WN18RR were chosen and what makes them suitable for evaluating this work. Please also state clearly why other datasets would not be eligible for your evaluation, e.g., what about ConceptNet?

[a] Jiaoyan Chen, Freddy Lécué, Jeff Z. Pan, Shumin Deng, Huajun Chen: Knowledge graph embeddings for dealing with concept drift in machine learning. J. Web Semant. 67: 100625 (2021)