Review Comment:
The paper proposes a new approach for KG relation prediction based on three subtasks: Subgraph2seq, which infers hierarchical information on a single subgraph; a common preference inference, which learns homogeneous information between two subgraphs; and an alternative induction method that collects the relations of each subgraph entity of both the head and tail entities to infer the relation.
Please, find below a detailed review of the paper with comments, questions and suggestions.
(1) originality.
The paper is quite original, proposing a new type of alternative induction method compared to the current state of the art (SoA).
- In "may encounter challenges with rare relations or complex subgraph structures" - is it possible to give more details regarding what types of challenges arise and what can be done to mitigate them?
(2) significance of the results
The authors demonstrate with sound experiments that their results are far better than those of several existing approaches in the relation prediction setting.
- Page 8: What is the preferred value of lambda (Eq. 7) used for the experiment?
- The conclusion gives the impression that the solution does not scale to real-world KGs. How useful is the method if it cannot be applied in real-world scenarios?
(3) quality of writing.
The paper is easy to read, but there are still many typos to be reviewed.
- Many capitalized words appear after a "," (e.g., "Most", "Specifically" on Page 2) - please review all such cases in the paper.
- Inconsistent use of concepts - we sometimes read "Subgraph-to-Sequence (S2S)", and elsewhere "Subgraph2seq" (P. 5).
- I suggest reviewing the first part of section 4.2 to align the example with the figure.
- Inconsistent use of "Eq. (x)" vs. "Formula (x)". On Page 8 the two terms are mixed; please choose one and use it throughout the paper.
(4) experiment replication
A reference to an online repository (e.g., GitHub, Zenodo) containing the datasets and the implementation is missing, which prevents replication of the experiments. I encourage the authors to make these artifacts available for transparency.
Questions
========
- In Figure 2, what is the meaning of "e"? t and h are explained in the text, but where does "e" come from? Please clarify and/or add a legend. Additionally, add a space between the number and the name of the task: s/(1)Subgraph preparation/(1) Subgraph preparation/.
- I am curious to hear from the authors why they do not consider RDF graph databases from the Linked Open Data Cloud, such as DBpedia or Wikidata, for their experiments (see Section 4.1).
- P. 7: "Taking Figure 2 as an example, for the entity "William Shakespeare"" - Should this be Figure 1? The reference is misleading; please review this sentence, as the example may not match the figure.
- P. 10: "the optimal parameters of are finally.." - Something is missing after "of".
- P. 11: Add to Table 1 a row with the total number of triples per dataset.
- P. 11: What does "sparse KG" mean? Please, add a definition of this term.
- P. 12: "As can be seen from the results in 3 to 7" - Do you mean from Figure 3 to Figure 7?
- I suggest replacing "Ours" with "HiHo" in the different figures (3-7) and Table 3.
- Page 13: How can "dense" be quantified for the datasets mentioned? Are there any references for what is called "some real-world KGs"?