Review Comment:
Summary:
This paper presents a new method for embedding horn rules in a knowledge graph as queries to improve link-prediction.
Discussion:
The paper opens with a very interesting idea but quickly loses its motivation in a great deal of dense definitions and methodology that seem mostly ad hoc. Combined with positive yet mediocre results, this level of seemingly arbitrary formalism does not feel warranted; it gives the impression that the experiment was designed through iterated trial and error until better numbers were achieved, rather than as the deliberate pursuit of a new method. Sections 3.2 - 4 are particularly concerning. It may be that every step was deliberate and designed to test an overarching idea, but this is not apparent from the writing. Additionally, the text contains many very small language errors; none is significant enough to cause concern, but they do interrupt the reading. Lastly, Table 3 feels disingenuous: where DegreEmbed is shown as optimal but is in fact tied, the formatting should be changed, or at least more decimal places should be shown to demonstrate it is not a tie. If the authors can motivate the work better, streamline some of the definitions, adjust the tables, and correct the many minor English errors, then I believe the paper may be good enough to accept, though this is not certain.
Evaluation:
Accept only with major revisions
Notes ([#,#,#] refers to page, column, line number of note):
Too many very minor English errors were found to list them all. None felt problematic, but the text should be thoroughly checked to make sure it reads smoothly.
It may be better not to speculate in the abstract and introduction on the causes of incompleteness in KGs. Incompleteness is certainly an issue, but exactly why it arises in any particular case is not certain, and it may even be unavoidable.
The notion of a path bounded by a maximum length seems highly susceptible to the presence of reflexive triples, which may not occur in the studied KGs but are presumably allowed (depending on the source, some may even be implicit) and could massively increase the number of possible paths with unhelpful, redundant information.
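To make this concern concrete, a toy sketch (the `count_paths` helper and the example triples are hypothetical, not the paper's definitions) of how a single reflexive triple inflates the number of bounded-length paths:

```python
from collections import defaultdict

# Toy KG as (head, relation, tail) triples; relations are ignored here
# since only the path structure matters for the counting argument.
triples = [("a", "r", "b"), ("b", "r", "c")]

def count_paths(triples, start, max_len):
    """Count all paths from `start` of length 1 up to max_len."""
    adj = defaultdict(list)
    for h, _, t in triples:
        adj[h].append(t)
    total = 0
    frontier = [start]
    for _ in range(max_len):
        # Extend every current path by one hop; each extension is a new path.
        nxt = []
        for node in frontier:
            nxt.extend(adj[node])
        total += len(nxt)
        frontier = nxt
    return total

paths_without = count_paths(triples, "a", 3)                      # a->b, a->b->c
paths_with = count_paths(triples + [("a", "r", "a")], "a", 3)     # self-loop added
```

With `max_len=3`, the self-loop alone grows the count from 2 paths to 8, none of the extra ones carrying new information.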
Definition 2 is almost computing precision, in which case it may be easier to redefine the reasoning as a prediction task with true positives, false positives, etc.
Given the previous comment, it feels like Definition 4 could be modified to represent a recall analog, in which case an F1 score would follow without needing any new definition.
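To spell out the suggestion, a minimal sketch with made-up counts (the function and numbers are illustrative, not taken from the paper): once the two definitions are recast as precision and recall over predicted facts, F1 is immediate.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision (Definition 2 analog), recall (Definition 4 analog),
    and the F1 score that follows for free from the two."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: 8 correctly predicted facts, 2 spurious
# predictions, 2 known facts the rule misses.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
# p = 0.8, r = 0.8, f1 = 0.8
```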
Sections 3.2 through the end of 4 either need motivation explaining why so many complex equations are presented, so that readers can connect them to the broader idea, or should at least be condensed into an overview that explains why particular methods were chosen, with the equations moved to a separate section. As far as I can tell the equations are not incorrect; it is simply quite difficult to understand why these particular methods were chosen. In short: the "how" of the method is abundantly clear, but the "why" is not obvious to me and possibly absent entirely.
[7,2,3] "Transferred" should be clarified: my understanding is that the model must be retrained on other sources, even if it can still work, and the current wording leaves this ambiguous.
Table 3 contains scores that are not bolded but are equal to the DegreEmbed results. This could be interpreted as a dishonest presentation unless it is clarified.
In Table 4, the RotatE WN18 MME value is missing a decimal point.
Tables 5-7 are quite large and could be reduced to a few selected examples; reproducing so many results explicitly does not help show the overall performance of the method.