Machine Learning for the Semantic Web: Lessons Learnt and Next Research Directions

Tracking #: 2191-3404

Authors: 
Claudia d'Amato

Responsible editor: 
Guest Editor 10-years SWJ

Submission type: 
Other
Abstract: 
Machine Learning methods have been introduced in the Semantic Web for solving problems such as link and type prediction, ontology enrichment and completion (both at terminological and assertional level). Whilst the focus was initially mainly on symbol-based solutions, numeric-based approaches have recently received major attention, motivated by the need to scale on the very large Web of Data. In this paper, the most representative proposals belonging to the aforementioned categories are surveyed, jointly with an analysis of their main peculiarities and drawbacks. Afterwards, the main envisioned research directions for further developing Machine Learning solutions for the Semantic Web are presented.
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Review #1
By Agnieszka Lawrynowicz submitted on 07/Aug/2019
Suggestion:
Minor Revision
Review Comment:

The paper surveys methods of machine learning as solutions developed for the Semantic Web, dividing them into symbolic ones and numeric ones.
Machine learning methods have proved efficient in supporting Semantic Web tasks, and there has been an increasing interest in their application in the Semantic Web, especially regarding the numeric approaches, which the paper also discusses.
Besides their strengths, the paper also points to drawbacks of current numeric machine learning approaches, such as non-interpretability or lack of reasoning capabilities with respect to standard languages (especially OWL).
The paper also points to next research directions in the development of machine learning solutions for the Semantic Web, and I fully agree with the author when it comes to these directions.

Below I provide some suggestions for improving the manuscript:

1) Overall, the manuscript contains several technical terms (ILP, propositionalization, embeddings etc.) which may not be known to a reader not knowledgeable in machine learning. I suggest explaining those that are not yet explained, to make the paper self-contained, e.g. by injecting phrases with explanations, as is already done in some places in the paper, e.g.: "latent attributes (i.e. attributes not directly observable in the data)".

2) The paper surveys methods developed by researchers active in the field, including the author. It would be much nicer to mention their names along with the citations, where suitable.

3) It would be valuable to summarize the main, recurring peculiarities and drawbacks of the methods discussed in Sections 2-3, maybe even using a table or a graphic.

4) Regarding definitions, they are in an informal style (which is perfectly OK for a position paper), but some care is still needed:
* "embedding models (also called energy-based models)" -> are energy-based models a class of embedding models, or are the two equivalent?
* "In this context, link prediction is also referred to as knowledge graph completion." -> in what context, the context of KGs? Are there other knowledge graph completion tasks beyond link prediction?

5) Numeric methods are described for one major task: link prediction. Are there any other tasks that have been tackled by numeric machine learning methods for the Semantic Web?

6) References:
It would also be nice to include a book on the topic, but of course this is up to the author:
Agnieszka Lawrynowicz, Semantic Data Mining - An Ontology-Based Approach. Studies on the Semantic Web 29, IOS Press 2017.

There is also a highly cited survey that deals with the topic of knowledge graph completion:
Heiko Paulheim, Knowledge graph refinement: A survey of approaches and evaluation methods. Semantic Web 8(3): 489-508 (2017)

Minor issues, typos:

*** Section 1. Introduction ***
Page 1: it would be valuable to provide a reference to OWL
Page 1: "and assertion" -> "assertions"
Page 1: "some these gaps" -> "some of these gaps"
Page 2: "are illustrated is Sect. 4" -> "are illustrated in Sect. 4"

*** Section 2. Symbol-based Methods for the Semantic Web ***
Page 2: "One of the first problem" -> "One of the first problems"
Page 3: "by the the employment" -> "by the employment"

*** Section 3. Numeric-based Methods for the Semantic Web ***
Page 4: "Almost any reasoning" -> "Almost no reasoning"

*** Section 4. Machine Learning for the Semantic Web: Next Research Directions ***

Page 5: "As a first step, the integration of numeric and symbolic approaches should be focused." -> "The first step should focus on the integration of numeric and symbolic approaches"?
Page 5: "The main the conclusion" -> "The main conclusion"
Page 5: "how representing expressive logics within neural networks" -> "how to represent expressive logics within neural networks"
Page 6: "background knowledges" -> "background knowledge"
Page 6: "and and makes it understandable" -> "and makes it understandable"

*** Section 5. Conclusions ***
"their main peculiarities and drawback" -> "their main peculiarities and drawbacks"