Review Comment:
The paper proposes a new approach for incorporating ontology constraints into knowledge graph embeddings. The basic idea is to infer new positive or negative triples from the ontology constraints and then use a different loss function for each of these cases. A joint loss function with different coefficients on the two new loss functions is then used for optimization. The authors also propose a new Hits@k metric that takes these newly derived positive or negative triples into account. The presented results show that the proposed framework can improve performance. Although the paper is interesting, there are still some deficiencies, especially in the evaluation. Below are the more detailed comments:
- There are some articles missing from the related work:
[A] Meng Qu, Jian Tang: Probabilistic Logic Neural Networks for Reasoning. NeurIPS 2019: 7710-7720
[B] Zoi Kaoudi, Abelardo Carlos Martinez Lorenzo, Volker Markl: Towards Loosely-Coupling Knowledge Graph Embeddings and Ontology-based Reasoning. CoRR abs/2202.03173 (2022)
- The related work section does not contrast the proposed framework with existing approaches but simply outlines them. What are the differences between the proposed approach and the related works?
- What makes the inclusion of other axioms, such as assertion/class axioms, in the framework hard in this paper? Why focus only on relations? A discussion of this is missing.
- The definition of the loss functions in Section 4.2 is a bit confusing: for L_C, the authors state that I^- is replaced with null, yet Equation 4.1 still takes triples from that set. Similarly, in Equation 4.2, what does it mean for a positive triple to be taken from I_v? (See also my sketch of the joint objective after this list.)
- The evaluation results are missing a comparison with related work on ontology-guided knowledge graph embeddings, such as KALE [37], pLogicNet (see [A] above), and Iter [39]. How does the proposed approach compare with them?
- I assume that the training time results presented in Table A6 do not include the time to derive the sets D+ and D-. Is that correct? If so, these times should also be reported, since they are part of the training process.
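To make the point about Section 4.2 concrete: my reading is that the final training objective combines the two constraint-derived losses, presumably together with the base embedding loss, roughly as follows, where \lambda_{+} and \lambda_{-} are the coefficients mentioned in the summary, and \mathcal{L}_{\text{base}}, \mathcal{L}_{D^{+}}, \mathcal{L}_{D^{-}} are my own shorthand rather than the paper's notation:

\mathcal{L}_{\text{joint}} = \mathcal{L}_{\text{base}} + \lambda_{+}\,\mathcal{L}_{D^{+}} + \lambda_{-}\,\mathcal{L}_{D^{-}}

If this reading is correct, stating the joint objective explicitly in this form, together with how I^- and I_v enter each term, would resolve the confusion raised above.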