HiHo: A Hierarchical and Homogenous Subgraph Learning Model for Knowledge Graph Relation Prediction

Tracking #: 3654-4868

Authors: 
Jiangtao Ma
Yuke Ma
Fan Zhang
Yanjun Wang
Xiangyang Luo
Chenliang Li
Yaqiong Qiao

Responsible editor: 
Guest Editors KG Gen from Text 2023

Submission type: 
Full Paper
Abstract: 
Relation prediction in Knowledge Graphs (KGs) aims to anticipate the connections between entities. While both transductive and inductive models are incorporated for context comprehension, we need to focus on two primary issues. First, these models only collate relations at each layer of the subgraph, overlooking the potential sequential relationship between different layers. Second, these methods overlook the homogeneity of subgraphs, thus impeding their ability to effectively learn the importance of relationships within the subgraphs. To address this challenge, we propose a hierarchical and homogenous subgraph learning model for knowledge graph relation prediction (HiHo). Specifically, we adopt a subgraph-to-sequence mechanism (S2S) to learn the potential semantic associations between layers in the subgraph of a single entity, and thus model the hierarchy of the subgraph. Then, we implement a common preference inference mechanism (CPI) that assigns higher weights to co-occurrence relations while learning the importance of each relation in the subgraphs of two entities, and thus model the homogeneity of the subgraph. In our study, we sequentially employ induction on each layer of subgraphs pertaining to the two entities for relation prediction. To assess the efficacy of our method, we perform experiments on five publicly available datasets. The results of our experiments demonstrate that our method surpasses the current state-of-the-art baselines in both transductive and inductive settings.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
Anonymous submitted on 27/Aug/2024
Suggestion:
Minor Revision
Review Comment:

The paper proposes a new method for KG relation prediction built on three components: a subgraph-to-sequence mechanism (Subgraph2seq) that infers hierarchical information within a single subgraph; a common preference inference mechanism that learns homogeneous information between two subgraphs; and an alternating induction method that collects relations from the subgraphs of both the head and tail entities to infer the relation.

Please, find below a detailed review of the paper with comments, questions and suggestions.

(1) originality.

The paper is quite original in proposing a new alternating induction method compared with the current state of the art.
- In "may encounter challenges with rare relations or complex subgraph structures" - is it possible to give more details regarding what type of challenges and what can be done to mitigate them?

(2) significance of the results
The authors demonstrated with sound experiments that their results are far better compared to different existing approaches in the relation prediction settings.
- Page 8: What is the preferred value of lambda (Eq. 7) used for the experiment?
- The conclusion gives the impression that the solution does not scale on real-world KGs. Hence, how useful is the method if it is not to be used in real-world scenarios?

(3) quality of writing.
The paper is easy to read, but there are still many typos to be reviewed.
- Many capital letters used after a "," (e.g., Most, Specifically in Page 2 - Please review all such cases in the paper)
- Consistent use of concepts - Sometimes we read "Subgraph-to-Sequence (S2S)", and in a different part "Subgraph2seq" (P.5)
- I suggest reviewing the first part of section 4.2 to align the example with the figure.
- Consistent use of either "Eq. (x)" or "Formula (x)". On page 8 the two terms are mixed. Please choose one and use it throughout the paper.

(4) experiment replication
The paper lacks a reference to an online repository (e.g., GitHub, Zenodo, etc.) containing the datasets and an implementation of the algorithm in a given programming language, which is needed to replicate the experiments. I encourage the authors to make such data available for transparency.

Questions
========
- In Figure 2, what is the meaning of "e"? t and h are explained in the text. Where is this "e" coming from? Please, clarify and/or add a legend. Additionally, add a space between the number and the name of the task. s/(1)Subgraph preparation/(1) Subgraph preparation.
- I am curious to hear from the authors why they do not consider RDF graph databases from the Linked Open Data Cloud, such as DBpedia, Wikidata, etc., for their experiments (see section 4.1).
- P. 7: "Taking Figure 2 as an example, for the entity "William Shakespeare"" - Is it Figure 1? It is misleading. Please, review this sentence because maybe the example is not the same as in the Figure.
- P. 10: "the optimal parameters of are finally.." - It is missing something after "of".
- P. 11: Add a row to Table 1 with the total number of triples per dataset.
- P. 11: What does "sparse KG" mean? Please, add a definition of this term.
- P. 12: "As can be seen from the results in 3 to 7" - You mean from Figure 3 to Figure 7?
- I suggest replacing "Ours" by "HiHo" in the different Figures (3-7) and Table 3.
- Page 13: How is density quantified for the datasets mentioned? Are there any references for what is called "some real-world KGs"?

Review #2
By Janneth Chicaiza submitted on 03/Sep/2024
Suggestion:
Minor Revision
Review Comment:

The subject of the proposal is interesting. Below, the authors can find some comments.

(1) Originality. The manuscript entitled “HiHo: A Hierarchical and Homogenous Subgraph Learning Model for Knowledge Graph Relation Prediction” is original because it proposes a new learning model for knowledge graph relation prediction.

For context comprehension, the authors introduce two key components in their proposal. First, a subgraph-to-sequence mechanism (S2S) captures potential semantic associations between layers within a single entity's subgraph, thus modelling the subgraph hierarchy. Second, a common preference inference mechanism (CPI) assigns higher weights to co-occurring relations while learning the importance of each relation within the subgraphs of two entities, thus modelling subgraph homogeneity. With these two components, the proposal tackles two problems of traditional inductive and transductive models: 1) they disregard potential sequential relationships between different layers, and 2) they overlook the homogeneity of subgraphs, which hinders effective learning of relationship importance within them.

(2) Significance of the results. For experiments, the authors test their proposal by using 5 benchmark datasets, setting different parameters for training and using the metrics MRR and HIT to compare results with other relation extraction prediction methods. According to the experimental results shown in Table 2, HiHo, the proposed method, outperforms other methods in all datasets.
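
For reference, ranking metrics of this kind are usually computed from the rank of the correct relation among the scored candidates. The following is a minimal sketch, assuming that "HIT" refers to Hits@k and that ranks are 1-based; the function names and the example ranks are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all test queries (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of test queries whose correct answer is ranked in the top k."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct relation for four test triples.
ranks = [1, 2, 4, 10]
mrr = mean_reciprocal_rank(ranks)  # (1 + 1/2 + 1/4 + 1/10) / 4
h1 = hits_at_k(ranks, 1)           # only the first query is ranked first
h3 = hits_at_k(ranks, 3)           # the first two queries are in the top 3
print(f"MRR={mrr:.4f}  Hits@1={h1:.2f}  Hits@3={h3:.2f}")
```

MRR rewards every improvement in rank, while Hits@k only records whether the correct relation falls within the top k, so the two can diverge when a model improves ranks that remain outside the cutoff.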

In addition to highlighting the advantages of the proposed method, the authors identify the limitations or possible problems of their proposal—for example, the prediction of rare relations or complex subgraph structures. Additionally, the performance of the method could be affected by the density of subgraphs and the sparsity of relations in the knowledge graph.

To facilitate the replication of the experiments, it is important that the authors attach a resource file that includes the source code of the method or share a prototype for running tests.

(3) Quality of writing.

In general, the manuscript is easy to follow and the language used is appropriate, but it requires the revision of some points, for example, the nomenclature: in Figure 2, the terms and are incorporated, but they have not been found in the text. Likewise, in that figure, in the fourth component "Alternately induct mechanism" is mentioned, but in the explanatory text "Alternating Induction Mechanism" is mentioned.

Review #3
Anonymous submitted on 10/Sep/2024
Suggestion:
Minor Revision
Review Comment:

The paper introduces a novel approach to knowledge graph relation prediction using a hierarchical and homogeneous subgraph learning model, denoted as HiHo. This model integrates a subgraph-to-sequence (S2S) mechanism and a common preference inference (CPI) mechanism to enhance relation prediction accuracy. While the paper is well-written and presents a compelling approach, several aspects require further clarification for improved comprehensibility and reproducibility.
Key Points:
1. The terms "historical" and "future" (or "previous" and "later") state sequences used in the GRU processing are ambiguous. It is important to clarify what each term specifically refers to within the context of the model.
2. The paper lacks a detailed explanation of the Bidirectional Gated Recurrent Units (Bi-GRU) architecture. Bi-GRU processes input sequences in both forward and backward directions to capture contextual information from both past and future states. An in-depth explanation of this architecture is necessary to understand its role in the proposed model.
3. State Sequence Transformation:
◦ The process of state sequence transformation within the approach is not clearly described. A more detailed explanation of how state sequences are transformed would be beneficial for understanding the model’s operational mechanism.
4. Clarity on State Sequence Utilization:
◦ The paper does not clearly specify which state sequence is used for different operations. Clearly stating the sequence used at each step is essential for reproducibility and a complete understanding of the methodology.
5. Output from Bi-GRU:
◦ There is confusion regarding the use of the output from the Bi-GRU. It is unclear whether the N vector or another form of output from the GRU is being employed. This needs explicit clarification to avoid misunderstandings.
6. Explanation of Equations 3, 5, 7, and 8:
◦ The purpose and context of equations 3 and 5 are not adequately explained, which hampers comprehension of their role within the model. Similarly, the outputs of equations 7 and 8 are not well-defined. Providing clear explanations for these equations would aid in understanding the computational flow and their contributions to the model.
7. Clarity on Specific Sections:
◦ The content on lines 25-27 on page 8 is unclear. Revising this section for better clarity is recommended to improve the reader's understanding of the material presented.
8. Presentation of Algorithm 1:
◦ The presentation of Algorithm 1 is unclear, making it difficult to follow the proposed method. A step-by-step walkthrough or additional explanatory notes could enhance clarity and help readers grasp the algorithm’s implementation.
9. Redundancy in Lines 22 and 23:
◦ Lines 22 and 23 appear to be identical. Revising these lines to eliminate redundancy would improve the manuscript's readability.
10. Definition of "Score":
◦ The term "score" on line 32 is not clearly defined. Providing a definition or context for this term is necessary to understand its significance within the model.
11. Integration of CPI with GRU:
◦ The integration of the CPI mechanism with the GRU is not thoroughly described. A detailed explanation of how CPI interacts with the GRU would enhance understanding and demonstrate the interplay between these components.
12. Parameter Variation:
◦ It is not mentioned whether different values of parameters k and l were tested. Discussing the effects of varying these parameters could offer insights into the model’s robustness and performance under different settings.
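
To illustrate the level of detail requested in Key Points 2 and 5, the following is a minimal NumPy sketch of a generic bidirectional GRU applied to a short sequence of layer embeddings. It is not the authors' implementation: the weight names, the dimensions, the random inputs, and the choice to concatenate forward and backward states are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_gru(input_dim, hidden_dim, rng):
    """Random weights for one GRU direction (hypothetical initialisation)."""
    params = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate state
        params["W" + gate] = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
        params["U" + gate] = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
        params["b" + gate] = np.zeros(hidden_dim)
    return params


def gru_step(p, x, h):
    """One GRU update: combine input x with the previous hidden state h."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])
    # One common convention; some formulations swap the roles of z and 1 - z.
    return (1.0 - z) * h + z * h_cand


def run_gru(p, xs):
    """Run one direction over the sequence and return every hidden state."""
    h = np.zeros(p["bz"].shape[0])
    states = []
    for x in xs:
        h = gru_step(p, x, h)
        states.append(h)
    return np.stack(states)


def bi_gru(p_fwd, p_bwd, xs):
    """Concatenate forward states with backward states at each position."""
    fwd = run_gru(p_fwd, xs)
    bwd = run_gru(p_bwd, xs[::-1])[::-1]  # process reversed input, realign
    return np.concatenate([fwd, bwd], axis=1)


rng = np.random.default_rng(0)
xs = rng.normal(size=(3, 4))  # assumed: 3 subgraph layers, 4-dim embeddings
p_fwd = init_gru(4, 5, rng)
p_bwd = init_gru(4, 5, rng)
out = bi_gru(p_fwd, p_bwd, xs)
print(out.shape)
```

In this sketch each position's output has twice the hidden size: the first half summarises the layers up to that position and the second half summarises the layers from that position onward. Stating which of these states (per-position outputs, final states, or a pooled vector) feeds the later steps would resolve the ambiguity raised above.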
Summary
The paper presents an innovative approach to knowledge graph relation prediction through the HiHo model, employing S2S and CPI mechanisms. However, several key aspects of the methodology and implementation require further clarification. Providing additional details and resolving ambiguities in the paper would significantly enhance its comprehensibility and impact. Overall, while the approach is promising, clearer explanations and thorough descriptions of the model components and processes are necessary for readers to fully appreciate and replicate the proposed method.