Review Comment:
Hybrid reasoning in knowledge graphs: Combining symbolic reasoning and statistical reasoning
Submitted by Guilin Qi on 05/06/2019 - 11:49
Tracking #: 2200-3413
Since this is an entry for the Editorial board papers, I am reviewing it in a lightweight fashion and did not mark any required revisions. Still, I'd recommend that the authors take the comments below into consideration.
Please find attached also an annotated PDF with handwritten notes.
Some points in more detail below:
* Please check the singular/plural mix in some sentences, as well as articles and word order; in general, maybe have the whole paper grammatically proof-read. I marked several things I noticed in the attached PDF. As my handwriting is probably hard to read, don't hesitate to get back to me if you can't decipher it.
* When you talk about "query answering" on page 2, you really mean "question answering (QA)", right? I think these terms shouldn't be mixed or used interchangeably, because they refer to different things: the latter to answering (natural language) questions, the former to structured queries in a query language, which I think you didn't mean.
* When you mention definitions of KGs, you may also want to look at our Dagstuhl report, where one chapter was about KG definitions:
cf. Piero Andrea Bonatti, Stefan Decker, Axel Polleres, and Valentina Presutti, editors. Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web (Dagstuhl Seminar 18371), volume 8, Dagstuhl, Germany, 2019. Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik. http://drops.dagstuhl.de/opus/volltexte/2019/10328
* On p. 3 something's wrong with footnote 0: it does not seem to be referenced in the text.
* p.3: I think “transnational distance models” should be "*translational* distance models"
* p. 4 TCE takes two structured information —> TCE takes two kinds of structured information
* I am not entirely clear about the separation between sections 3.2 and 3.3 topic-wise, as they seem to cover similar issues/methods: could or should they be combined, or can you make the separation a bit clearer? I find it hard to grasp the concrete common task addressed by the methods in 3.3; please clarify.
* 3.5, as mentioned above, should IMHO be Question Answering. FYI, you may also want to have a look at our latest CIKM paper on QA; I can send you a pre-print if you want:
Svitlana Vakulenko, Javier Fernández, Axel Polleres, Maarten de Rijke and Michael Cochez. Message Passing for Complex Question Answering over Knowledge Graphs. To appear in CIKM 2019.
* On the combination of rules and ML/statistical methods for KG enrichment, you may also want to check the thesis of Stefan Bischof, who did that for a concrete domain: https://aic.ai.wu.ac.at/~polleres/supervised_theses/Stefan_Bischof_Disse...
* I was wondering a bit why you didn't mention RDF2Vec and GloVe when talking about graph embeddings:
Cochez, M., Ristoski, P., Ponzetto, S.P., Paulheim, H.: Biased graph walks for RDF graph embeddings. In: WIMS 2017. pp. 21:1–21:12 (2017)
Cochez, M., Ristoski, P., Ponzetto, S.P., Paulheim, H.: Global RDF vector space embeddings. In: ISWC 2017. pp. 190–207. Springer (2017)
Last but not least: I like the quite comprehensive literature list of the paper! Maybe I would have hoped for one or two more critical take-aways in your open challenges in the conclusions, i.e., an estimate of how far off we are in solving these, or whether the open problems you mention are feasible/solvable at all in the near future (e.g., getting existentials, disjunction, or in general complex axioms into rule learning seems to be a pretty hard nut to crack). I think in the special issue article we are free to add opinions, and your view on that would be appreciated.
One more thing: as for my own references mentioned above, please feel free to ignore them. I just thought they might be interesting for you; I don't mean to push in citations to our own work.
best regards,
Axel
Comments
two comments: word2vec and your six categories
Thanks for submitting the paper. I'm not a reviewer, but have read the paper and have two questions:
1. the work on RDF2Vec seems strangely absent from your overview. Is that for a reason?
2. I'm somewhat confused by your six categories (section 3). Some of your categories seem to me to be methods ("statistical relational learning") that could be used for many different goals, while others seem to me to be goals ("knowledge alignment") that could be achieved with many different methods. Does it make sense to have such mixed categories in your list? Or am I mistaken?
reply to Prof. Frank van Harmelen's two comments
Thanks for the comments. For the first question, we did not include RDF2Vec because we cannot cite all the papers about KG embedding, but we agree that RDF2Vec is important and will add a reference for it. For the second question, thanks for pointing this out; indeed, we would like to classify methods according to goals, so we will make this clear and revise the paper.
additional notes to my review...
... on top of the review, I sent my handwritten notes with some additional editorial suggestions and typo corrections to the author by email.