Review Comment:
Summary:
This paper presents an innovative approach to noise-tolerant RDFS reasoning that combines a layered RDF graph model, its encoding in 3D tensors, and a GRU-based architecture mapping RDF graphs to their inference graphs. The inference graphs are generated with the rule-based reasoner Jena, whose performance also serves as the main point of comparison in the empirical evaluation, conducted on the LUBM dataset and a DBpedia subset. In this comparison, the RNN reasoner performed reasonably well on propagable noise (noise that affects the inference graph) but was outperformed by Jena on non-propagable noise (noise with no impact on the inference graph). In a rather exhaustive description, the paper contributes a typology of noise types, a layered graph model based on 3D tensors, a method for learning graph embeddings, and a GRU-based architecture for learning RDFS rules in a noisy environment.
In spite of a detailed and generally well-written mode of presentation, the explanations are exhaustive and could benefit from reduction and/or restructuring. For instance, a lengthy section on different graph types contributes neither to the readability of the paper nor substantially to its content. This tendency to go into excessive detail can be observed in other sections as well. Nevertheless, aside from the following comments, the research is well motivated, sufficiently novel, and well within the scope of this special issue. Several of the claims made, however, need proper reflection in the face of missing related work and events, and the overall argumentation and structure of the paper require a stronger focus (in particular Section 5).
Major comments:
- clear method overview: A succinct yet detailed overview of the overall methodology, with all its individual steps and the methods chosen for each step, placed near the beginning of the paper would strongly contribute to its readability. There is an outline; however, it does not provide a good overview of the process explained later, but merely indicates where the individual steps are explained in detail.
- evaluation: The current evaluation metrics are quite opaque and difficult to interpret. Without any comparison to other models or datasets as a baseline, not much can be gathered from the presented metrics. Since the datasets are not standard knowledge base completion datasets, it would be useful to apply a standard link prediction model, such as TransE, to the proposed datasets for better comparison, or to apply the proposed model architecture to one of the standard datasets.
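For concreteness, the TransE baseline suggested above scores a triple (h, r, t) by the distance between h + r and t in embedding space. A minimal sketch of the scoring function, using hypothetical toy embeddings rather than trained values:

```python
# Minimal sketch of the TransE scoring function suggested as a baseline above.
# TransE embeds entities and relations as vectors and scores a triple (h, r, t)
# by the distance ||h + r - t||: the smaller the distance, the more plausible
# the triple. The embeddings below are hypothetical toy values, not trained ones.

def transe_score(h, r, t):
    """L2 distance ||h + r - t||; lower means more plausible."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

# toy 2-dimensional embeddings for a single triple
h = [1.0, 2.0]  # subject (head) embedding
r = [3.0, 0.0]  # relation embedding
t = [4.0, 2.0]  # object (tail) embedding

print(transe_score(h, r, t))  # → 0.0, a perfectly consistent toy triple
```

Reporting such a score (or the ranking metrics derived from it) alongside the proposed model would make the evaluation far easier to interpret.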
- evaluation of corrupted triples: how does the approach handle valid GRU-generated triples that are not contained in the OWL LUBM knowledge base against which their validity is tested?
- design choice: why is this RNN architecture superior to others? The motivation for the chosen architecture is somewhat lacking. What are the merits of a simple sequence-to-sequence architecture over more recent graph-based architectures, such as the one used in the following work? Several design choices could be motivated, such as the positioning and high number of dropout layers. This motivation could also be of an empirical nature.
Michael Schlichtkrull, Thomas Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling (2018): "Modeling Relational Data with Graph Convolutional Networks". In: Proceedings of ESWC 2018.
- related work is lacking (see below)
- deep learning is not equivalent to "classification algorithms" as stated on page 3
Related work:
- embedding generation: what is the relation of the proposed embedding learning approach and RDF2Vec or other knowledge graph embedding methods? I think it would be good to explain why RDF2Vec should not be sufficient for this scenario.
- knowledge graph completion approaches are related enough to be included as related work (e.g. Dong, X., Gabrilovich, E., Heitz, G., Horn, W., Lao, N., Murphy, K., ... & Zhang, W. (2014, August). Knowledge vault: A web-scale approach to probabilistic knowledge fusion. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 601-610). ACM.). Additionally, Socher has a paper on knowledge base completion using a method similar to the work cited in this paper, which would be much more relevant than the one cited. In fact, the paper even claims later to be "capitalizing on the emerging research on deep learning for graphs" - but none of it is provided.
- I fail to comprehend how ontology learning from text and Section 2.2.2 are related to the proposed approach in any way. I strongly suggest removing those two sections and, throughout the whole paper, focusing on approaches and details that are actually related to your approach (this shortens the paper and resolves some issues of the argumentation).
- if the Bipartite Graph Model is not suitable for this approach, there is no reason to describe it in detail; the same goes for the Metagraph and the Hypergraph models
Other comments:
- architecture description: The architecture description mixes implementation-specific details with architectural details; for instance, cuDNN and the GRU are described at once. The hyperparameters should be described as parameters rather than as concrete values.
- some of the claims raised: as indicated above, a large bulk of related work is missing. Thus, claims that there are no approaches to bridge the "Neural-Symbolic Gap" are problematic, since there are so many knowledge graph embedding approaches, including some specific to RDF graphs (all of which are "suitable for neural network input").
- similarly, the claims about "initiating the communication" between Deep Learning and Semantic Web research are not quite justified: the authors submitted to a special issue on Semantic Deep Learning, a series that has been providing a platform to bring together Deep Learning experts and Semantic Web researchers for more than a year and is holding its 4th workshop this year. In line with this comment, I am also not sure that the authors' classification of Semantic Web and Deep Learning research reflects approaches to Semantic Web injection, e.g. disjointness axiom injection, in the process of training Deep Learning models. In fact, the whole classification does not contribute to the claims made in the paper.
Minor comments:
Many references are broken or missing (e.g. "Table ??"; Fig. 6 is missing completely - only the caption shows)
Please add to the running text that some algorithms and tables are in the appendix, e.g. Appendix B. Algorithm 2
- the references to the RDFS entailment patterns should point either to the online resource or to the table provided in the paper, so that the reader understands what a rule such as RDFS9 refers to
- thousands separator (e.g. 17,189 instead of 17189) strongly increases readability of larger numbers throughout the paper
- variables used in running text should be written in italics, e.g. (s,p,o) => the "s", "p", "o" afterwards in running text
Spelling of the properties:
- according to RDF Schema 1.1 the namespace is rdfs and not RDFS; the same applies to the namespace "rdf" in RDF properties
- dbr also camel-cases, e.g. dbr:Semantic_Web rather than dbr:Semantic_web => please ensure that all properties are spelled correctly
Minor comments on orthography/language in order of appearance:
"the noise can be as a consequence" => the noise can be a consequence of
"efforts in noise-tolerance" => "efforts on noise-tolerance"
references for the claims of "many researchers" and "current work" would be nice in the introduction Section 1.1.
"combing sound symbolic reasoning with" => combining?
"The research hypothesis are" => hypotheses
"described respectively in ??" => there is some reference missing
"based on Lehigh University Benchmark" => based on the LUBM
"In SDType" => In the SDType algorithm
"infer the the" => infer the
"aim full" => "aim at full"
"Spacial region" => did you mean "Spatial"?
"domain specific" => "domain-specific"
"classes hierarchy" => "class hierarchy"
"different than" => different from (a number of times)
quantification of "University" is missing in Table 4
"foundations of the graph theory" => foundations of graph theory
"order triple (.." => closing bracket missing (p. 9)
"S ubj-obj(T)" => Subj-obj(T)
the text does not fit Figure 3
Definition 6 misses closing brackets ")"
"non isomorphic" => "non-isomorphic"
Figure 5 is barely readable in the current size - enlarge font?
Caption in Table 7 contains ??
"to not update" => "not to update"
"non zeros values" => "non-zero values"
"encoding the inputs graph" => input graph
"This way when the" => "This way, when the"
"the layer dul:isDescribed By" => dul:isDescribedBy
"be searched' two" => be searched two
I am not sure Figure 7 truly contributes to the content of this paper
"set of subjects and objects resources" => subject and object resources
"layers through out" => throughout
"the goal of words embedding" => word embeddings
"words embedding also solves" => word embeddings also solve
"seventeen thousands" => thousand
"Few hyper-parameters needed to be changed though" => "A few hyperparameters"
"training speeds for both models are" => "training speed for both models is"
NVidia => NVIDIA
12 Gb => GB
this qusi => quasi
the formulas on page 24 extend into the text of the other column
There is a reference to Section 7.2.3 describing its content within Section 7.2.3 (at its end)
"scientists dataset" => Scientists dataset
"88 inference" => inferences
"adversarial generative models" => generative adversarial models
"it can not only" => cannot
Comments
Figure missing
Is Figure 6: 3D Adjacency matrix on page 12 missing?
Yes, thanks for the observation!
I uploaded the figure here:
https://drive.google.com/file/d/1KgHBXAPlCMHI0oAmDEHjSLufbLzpTv8K/view?u...