Building Relatedness Explanations from Knowledge Graphs

Tracking #: 2039-3252

Authors: 
Giuseppe Pirrò

Responsible editor: 
Guest Editors Knowledge Graphs 2018

Submission type: 
Full Paper
Abstract: 
Knowledge graphs (KGs) are a key ingredient to complement search results, discover entities and their relations, and support several knowledge discovery tasks. We tackle the problem of building relatedness explanations, that is, graphs that can explain how a pair of entities is related in a KG. Explanations can be used in a variety of tasks, from exploratory search to query answering. We formalize the notion of explanation and present two algorithms. The first, E4D (Explanations from Data), assembles explanations starting from all paths interlinking the source and target entity in the data. The second, E4S (Explanations from Schema), builds explanations focused on a specific relatedness perspective expressed by providing a predicate. E4S first generates candidate explanation patterns at the schema level; then, it assembles explanations by verifying these patterns in the data. Given a set of paths found by E4D or E4S, we describe different criteria to build explanations based on information theory, diversity, and their combination. As a concrete use case of relatedness explanations, we introduce relatedness-based KG querying, which revisits the query-by-example paradigm from the perspective of relatedness explanations. We implemented all the machinery in the RECAP tool, which is based on RDF and SPARQL. We discuss an evaluation of the explanation building algorithms and a comparison of RECAP with related systems on real-world data.
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Review #1
Anonymous submitted on 07/Nov/2018
Suggestion:
Accept
Review Comment:

The manuscript has been improved significantly with respect to the previous version. I am satisfied with the clarifications on my comments. I also appreciate the additional content the author added to answer my questions. I think that now the motivation for the problem and the role of the two different algorithms is clear, and the additional experimental details provided useful clarifications for reproducibility.

A few minor comments:
- Section 2.1: “The high-level objective of this paper is to tackle the problem of explaining knowledge in KGs.” I think this sentence is vague and too broad; I would replace “knowledge” with “relationships”.
- Section 7.2.1: “..for instance, were linked by a *larger* number…”
- Section 7.3.1: “Running times are *shown* in Fig. 20”
- Section 7.3.2.: “..to allow the generation of *focused* explanations..”

Review #2
Anonymous submitted on 15/Nov/2018
Suggestion:
Minor Revision
Review Comment:

I would like to thank the author for the additional work done to improve the paper and for the detailed answers to my comments. The additional content helped me better understand some aspects of the work.

However, some issues still remain, in my view.
1. The discussion of E4D vs E4S emphasizes that E4S is useful where one wants to explore relatedness between entities with a fixed target predicate. Isn’t it possible to achieve the same with E4D by using the relatedness predicate explicitly in the connectivity pattern queries (page 6)? Perhaps this should also be clarified.

2. The discussion of related work misses one relevant domain, namely semantic search, where a similar task often has to be tackled: a search system needs to find relationships between candidate entities that match the user keywords (e.g., see [1]).

3. A thorough check for typos and language issues would still be helpful. A couple of things I noticed:
p. 3: The remained of the paper -> The remainder of the paper
p. 5: The underlying assumption of the E4D algorithm is to the data via the query endpoint -> Sentence not clear, should be rephrased.
p. 6: E4D can lead to a potential large … -> potentially
p. 8: an integer d to bound the length of the pattern -> to bind, to restrict
p. 20: do not provide a high contribute -> contribution

References:
1. Peter Haase, Daniel M. Herzig, Mark A. Musen, Thanh Tran: Semantic Wiki Search. ESWC 2009: 445-460

Review #3
By Dennis Diefenbach submitted on 30/Nov/2018
Suggestion:
Minor Revision
Review Comment:

Thank you for the revised version and the replies. I still see two major points that, in my opinion, need to be addressed:

1) About the code.

- A README briefly explaining the project is missing. For example, describe the output format, which is not obvious.
- The code is stored on Dropbox, which is quite an unusual place to publish code (GitHub or Bitbucket would be better).
- The UI is not public.

2) The paper states: "All the experiments were performed on a MacBook Pro with a 2.8 GHz i7 CPU and 16GBs RAM. Results are the average of 3 runs." I didn't think much about this the first time, but the query result will be cached after the first run, so averaging 3 runs does not make sense. Could you comment on this?
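
The caching concern in point 2 can be illustrated with a minimal sketch, which is not taken from the paper or from RECAP. It times repeated runs of a query and reports the first (cold) run separately from the later (warm) runs. The names `time_runs`, `summarize`, and `fake_query` are hypothetical, and `fake_query` only simulates an endpoint that caches its result after the first call.

```python
import time
from statistics import mean


def time_runs(query_fn, runs=3):
    """Return the wall-clock time in seconds of each call to query_fn."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        query_fn()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Report the first (cold) run separately from the later (warm) runs."""
    cold, warm = timings[0], timings[1:]
    return {
        "cold": cold,
        "warm_mean": mean(warm) if warm else None,
        "overall_mean": mean(timings),
    }


# Hypothetical stand-in for a query endpoint with a result cache (assumption):
# the first call pays the full cost, later calls return the cached result.
_cache = {}


def fake_query():
    if "result" not in _cache:
        time.sleep(0.05)  # simulated cost of evaluating the query
        _cache["result"] = [("s", "p", "o")]
    return _cache["result"]


timings = time_runs(fake_query, runs=3)
report = summarize(timings)
print(report)
```

In this simulation the overall mean falls between the cold and warm figures and describes neither, which is why reporting them separately (or clearing the cache between runs) may be more informative than a plain average.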

Minor points

1. Introduction
I would add the following sentences to make it clear how E4S works.

To meet these needs, we introduce a second algorithm called Explanations from Schema (E4S). The goal is the same as in E4D: find paths between two entities in order to make their relatedness explainable. Differently from E4D, E4S assumes that a target predicate is also specified, which steers the selection of paths toward a specific knowledge domain. …..

2. Background and Problem description

A KG is a directed node ??? -> is a node- and edge-labeled, directed multigraph

2.1

wa -> was

3.1

the underlaying assumption of the E4D algorithm is to data via -> mistake

Definition 12 -> the type constraints are missing

Cordially
Dennis Diefenbach