Building Relatedness Explanations from Knowledge Graphs

Tracking #: 1945-3158

Giuseppe Pirrò

Responsible editor: Guest Editors Knowledge Graphs 2018

Submission type: Full Paper

Knowledge graphs (KGs) are a key ingredient for searching, browsing and knowledge discovery activities. Motivated by the need to harness knowledge available in a variety of KGs, we address the problem of building relatedness explanations. Relatedness explanations are graphs that can explain how a pair of entities is related in a KG and can be used in a variety of tasks, from exploratory search to query answering. We formalize the notion of relatedness explanation and present two algorithms. The first, E4D (Explanations from Data), assembles explanations starting from paths in the data. The second, E4S (Explanations from Schema), starts from the KG schema and finds paths driven by a specific relatedness perspective. We describe different criteria to build explanations based on information theory, diversity, and their combination. As a concrete application, we introduce relatedness-based KG querying, which revisits the query-by-example paradigm from the perspective of relatedness explanations. We implemented all of this machinery in the RECAP tool, which is based on RDF and SPARQL. We discuss an evaluation of our explanation building algorithms and a comparison of RECAP with related systems on real-world data.
Major Revision

Solicited Reviews:
Review #1
Anonymous submitted on 26/Aug/2018
Major Revision
Review Comment:

The paper is an extension of a previous conference paper [10] and concerns the generation of explanations from a knowledge graph: subgraphs showing different connections between given entities. The original paper [10] proposed an algorithm for extracting such explanations from data (E4D) and introduced the RECAP tool for visualizing them. The current paper proposes an additional method for generating explanations by verifying patterns extracted from the schema (E4S). To reduce the size of the explanation graph, the paper proposes a number of techniques for ranking alternative explanation paths and selecting the most informative ones.

The topic is relevant for the Special Issue call and considers an important practical issue: visualizing knowledge graph information in the most informative way, a problem arising in many real-world use cases. It is also important to note that explanations are hard to evaluate quantitatively: “informativeness” is often subjective. From this perspective, the paper makes a step towards addressing an important issue.

However, some aspects of the approach do not appear fully clear and convincing and should be motivated and explained more. In particular, the role and the value of the E4S algorithm are unclear. Is it merely an alternative to E4D aiming at improving scalability in some scenarios, or is it a complementary method whose results should be combined with those of E4D? Section 6 suggests an either/or approach, while the evaluation focuses on the comparison between E4S and E4D. The discussion provided in 7.3 merely compares E4D and E4S and does not clarify the conclusion: is one approach consistently better than the other in all respects? It is also not explicitly stated which algorithm produces more informative explanations: e.g., are the smaller explanation graphs obtained by E4S sufficient to discover the most informative/diverse explanation paths?

The role of the target predicate in the E4S approach is also not completely clear. Definition 11 suggests that the target predicate p^* should have its domain and range consistent with each pair of intermediate types t_x and t_y in the explanation pattern sequence. Is this a typo? That would be too restrictive, and it does not fit the examples, where the target predicate is the one whose domain and range are consistent with type(w_s) and type(w_t), respectively. Moreover, in the evaluation it appears that a predicate whose domain is consistent with the source entity, or whose range is consistent with the target entity, is sufficient.

One more aspect which potentially limits the applicability of E4S is its reliance on an explicit schema. Is it necessary that the ontology contains all domain/range axioms explicitly? This is rather rare in many, if not most, practically used KGs: the axioms are often missing or underspecified (domain/range too generic), while the data can contain exceptions. How strong is the impact of such issues, and is there a mechanism to cope with them? The evaluation of E4S was only performed on DBpedia. Was this because Wikidata does not use a standard RDFS schema? Can E4S be adapted to use the schema vocabulary of Wikidata?

Evaluation was performed on the DBpedia and Wikidata knowledge graphs. Did the experiments use the public endpoints of the datasets or local installations? It is unclear to which extent the performance results reflect different algorithm settings and to which extent the capabilities of the underlying triple stores (e.g., Virtuoso and Blazegraph). Could one query type have an advantage on one dataset and underperform on another? This factor probably did not have a strong impact on the evaluation, but it should be mentioned and discussed.

The notation used is sometimes confusing. E.g., what do PF and IPF, introduced in Section 4.1, stand for? Moreover, the abbreviation PF is then reused in Section 5 but defined differently. These should be clarified.

In summary, I would say that the paper should better discuss the motivation, intended role, added value, and applicability criteria of E4S (which is the main additional contribution in comparison with [10]).

Some typos:
- P. 1 “One common-need for many classes of knowledge discovery tasks is that explaining the relatedness between entities.” -> … is explaining …
- P.4 “We are now ready our first algorithm, which discovers paths useful to build different kinds of relatedness explanations by directly looking at the KG data” -> maybe, “ready to present our first algorithm” or “We now present our first algorithm”?
- P.8: “Can be done in constant type” -> “Can be done in constant time”
- P. 18: “common to the source of target entity” -> ? From the context, it seems to mean “source and target entity”, is this correct?

Review #2
By Dennis Diefenbach submitted on 04/Sep/2018
Major Revision
Review Comment:

This paper tackles the problem of finding, given two entities x1, x2, the most significant paths connecting them in a KG, so that a user is able to understand what relates the two entities. The main contributions of the paper are two algorithms to determine the paths between the two entities, some metrics to rank the found paths, and a user study that compares a tool based on the introduced techniques with existing ones.
Note: I'm not an expert in the field, in particular I'm not aware of the state-of-the-art. I reviewed this paper as someone seeing this problem for the first time but having expertise in the field of Semantic Web.

Usual dimensions for reviewing the paper:
(1) originality, (2) significance of the results: The results taken singularly do not represent a breakthrough, but as a whole they give a nice contribution and overview of the tackled problem.

(3) quality of writing: the paper contains many mistakes (typos, problems with the definitions). On the other hand, it is well organised and didactically well presented.

Main points:

- Please take care of integrating the comments below into a new version of the paper. There are things that are difficult to understand, and sometimes I found errors in the definitions (which definitely doesn't help). In general, E4S is not well introduced, neither in the introduction nor in Section 4. In particular, it was not clear to me at the beginning that you still fix two entities. Please also revise Section 4.1, in particular the rationale behind your measures.

- In Sections 2 and 3, you ignore schema properties like type, but you do not state this clearly. Right? Moreover, the query patterns defined in Section 3 do not take this into account. I guess that a huge performance gain can be expected there by filtering out (excluding) rdf:type and wdt:P31 edges, since these are generally very dense (e.g., dbo:Person, wd:Q5).

- The tool you created is not online. I see this as a big problem, since your approach is not easily reproducible. Please either make it available as a web service or put the code on GitHub.

- The evaluation has a big drawback. You describe all performance metrics on the client side, but most of the work happens on the remote SPARQL endpoints. And you are not using some SPARQL endpoint on a small infrastructure, but exploiting the DBpedia and Wikidata endpoints, which are designed to handle a lot of requests. The best thing would be to repeat the experiments on a local endpoint with some reasonable infrastructure and then collect the performance metrics. If this is not done, it must at least be critically discussed in the paper.

- You describe that you are only using SPARQL queries and consider this a big plus. Surely it has its advantages, but the big disadvantage is performance. I would also clearly state the bad part, i.e., that with suitable indexes you could have response times in milliseconds. This, for me, is possible future work. SPARQL is known not to perform well on typical graph search operations like breadth-first search.
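To illustrate the two points above about typing edges and path search, here is a rough sketch of the kind of SPARQL query such a tool would have to issue for a length-2 path between two fixed entities, with the dense typing edges filtered out. The entity IRIs and prefixes are purely illustrative; I have not seen RECAP's actual queries:

```sparql
# Illustrative sketch only: enumerate length-2 paths between two fixed
# entities while excluding the very dense typing predicates
# (rdf:type on DBpedia, wdt:P31 on Wikidata).
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT ?p1 ?m ?p2 WHERE {
  dbr:Fritz_Lang ?p1 ?m .
  ?m ?p2 <http://dbpedia.org/resource/Metropolis_(1927_film)> .
  FILTER (?p1 != rdf:type && ?p2 != rdf:type)
  FILTER (?p1 != wdt:P31 && ?p2 != wdt:P31)
}
```

Without the FILTER clauses, the intermediate variable ?m can bind to every co-instance of a hub class such as dbo:Person, which is exactly where I would expect most of the time to be lost; one such query is needed per path length and edge-direction combination, which is why a triple store without path-specific indexes struggles here.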

Minor Points

- First sentence: what do you mean by “searching activities”? In the KG, or in web search using a KG? The same for browsing.
- Second sentence: you motivate relatedness before saying what it is in the next sentence; moreover, the motivation is not so clear, especially “in a variety of KGs”, which sounds like multiple KGs at the same time.
- Before introducing E4D and E4S you should say how relatedness is coupled with paths
- check diversity!

- KG that are mantaining
- an even large number ? english ?
- extracting knowledge from KGs has applications in Twitter analysis? I think extracting is the wrong word here.
- “One common-need for many knowledge discovery tasks is the explanation of relatedness between entities.” The original sentence is not English.
- fig 1 strange order of letters

- Definition 1: triple pattern, not triple
- why did you not introduce a definition for KG and schema closure?
- In Definition 2 of the schema graph closure you start from the schema closure S. S already contains the inferred triples, so why do you write (asserted or “inferred via RDFS reasoning”)? The same thing in the example. In Fig. 2a you have the schema, not the schema closure.
- I’m not an expert in RDF formalization. Did you make the distinction between KG and Knowledge Base yourself, or can you cite a paper defining it like this?

- the objective is to tackle the problem of explaining knowledge in KGs: too fuzzy
- output: you speak about relatedness explanations, but it is not clear what they are at that point

- after definition 6, we are now ready “for”
- does the community agree on this definition of explanation, especially minimal explanation, or are you the first formalising it?

- example 8, the third pattern is wrong
- Figure 8, line 3: createThread, but the function is called pathThread
- the algorithm leverages a monitor class an instance “of” such a class
- definition 10, definition 11, schema graph was called before schema graph closure
- definition 11, is this correct? I mean the domain and range of EVERY property should not be equal to p*

For a long time it was not clear to me that you fix both the explanation pattern and the source and target entities. I thought you were only fixing the first one, and I found it weird. So be clearer about that at the beginning.

- write that PF = predicate frequency, IPF = inverse predicate frequency
- triples of the form: commas are needed to interpret the boolean expression correctly
- I would also describe in a sentence what these measures roughly capture, i.e., PF = the frequency of both pi and pj in the dataset, IPF = ?????
- to build a co-occurrence matrix “with entries”:
- the paragraph is too abstract; I’m still not sure if I completely got the measure
- Theorem 12: I’m not sure this is interesting. It is not a theoretical work; I would say the times in the experimental setting are more important

- the algorithm, for each of the most related predicates (line 4), “runs” in parallel
- length 1 explanation patterns are expanded (line 11-25 ???? not 19????)

Algorithm 1
- Ts subscript
- line 33: tau not defined

- On the other hand, the last pattern (maybe change the word “pattern”, otherwise one thinks of the explanation pattern)
- Definition 13: t_{x} and t_{y} depend on i!

- Definition 14: Given G=(V,E,T) -> what is G? Pay attention: you do not have a definition for KG
- Definition 17 -> no relation between \eta and p! You need to describe \eta
- Definition 19 -> you do not define Label, even if it is clear what it is

- People and films -> persons and films

- the tool and the dataset are available upon request -> I really dislike this. I mean, either the code is published on GitHub or the tool is available as an online service. The scientific community must be able to easily reproduce and use what you are doing.
- path and then merged -> are then merged

- usage for querying KGs. -> I find the naming very weird, please change it
- described in Section 6.2 -> section 6.2.1

- that it is a MacBook Pro is not important for the specs ; )

- Again the same question: do you traverse rdf:type and wdt:P31 properties? If you do, you will lose a lot of time there, I guess

- graph via path ranking ans -> and

- of defining of a -> defining a

Dennis Diefenbach

Review #3
Anonymous submitted on 16/Sep/2018
Minor Revision
Review Comment:

This paper addresses the task of providing graph-based explanations given a knowledge graph entity pair. It reviews an existing method (Explanations from Data - E4D) and proposes a novel method (Explanations from Schema - E4S). The authors provide an efficiency analysis and a human-based evaluation.

Overall, the paper is well written and easy to follow. The E4S method is a valid original contribution. The authors describe the method in detail, provide examples that help in the understanding of their method and provide experimental details. However, the paper has some drawbacks, which I describe below:

Section 1:
- The idea of E4S is a well-justified algorithmic approach for providing explanations, as it uses the target predicate as a constraint to filter the search space. Having said that, it would be beneficial for the reader to have a better understanding of the use of such a constraint in a practical application. Also, a very similar idea has been recently explored in the context of knowledge graph fact ranking given an entity pair and a relation between them [1], and thus it would be beneficial to point out how these works relate and differ.

Section 2.1:
- The authors use the SENSE4US FP7 project to justify the need for their system: “relatedness explanations were useful to investigate and show to the user topic connectivity, which enabled to find out previously unknown information that is of relevance, understand how it is of relevance, and navigate it”. However, I think the paper would be more self-contained if the authors provide specific cases where their approach helped, for instance by providing specific examples in the context of that project.

Section 6:
- Has there been a study with human assessors on how the number of nodes in the explanations affects their usefulness? I can imagine that showing users a graph like the one in Figure 12(b) can be visually overloading.

Section 7:
- There is a lack of insights in the experimental evaluation (Section 7.4). The authors report average numbers in Table 5. However, I am missing an error analysis w.r.t. specific entity pairs that perform better/worse for RECAP and for the baselines, or specific kinds of entity pairs (e.g., popular or not).

- It is not entirely clear to me which parts of E4D and E4S are used in the version of RECAP evaluated in Section 7.4. The authors should make this clearer in this section.

- (minor comment): I appreciate the detailed study on the performance and scalability of E4D. However, I was expecting to see the empirical evaluation of the method (7.4) first in the results section.

- Section 7.4: “The inter-annotator agreement was 0.85.” Which metric was used for measuring agreement?

- In my opinion, the entity pair IDs in Figures 16 and 17 make the visualization confusing. Also, it would be natural to show the mean and variance of each setting before diving into the per-entity-pair metrics.

[1] Voskarides et al. Weakly-supervised Contextualization of Knowledge Graph Facts. In SIGIR 2018.
[2] Cheng et al. Explass: Exploring Associations between Entities. In ISWC 2014.

Section 1, paragraph 2: “is that explaining” → “is that of explaining”
Section 1, paragraph 3: “starred Metropolis” → “starred in Metropolis”
Section 4, definition 10: “type(w_s), p*, type(w_s)” → “type(w_s), p*, type(w_t)”
Section 4.2. paragraph 2: “The algorithm for each of the most related predicates (line 4) in parallel”.--> “The algorithm is run in parallel (?) ….”
Section 4.2. “Length-1 explanation patterns are expanded (line 11-25)”: The provided line numbers seem to be inaccurate
Section 4.3.1. “and the target predicate is the same”: “and the target predicate are the same”.
Section 7.2.3. “We hypnotize” → “We hypothesize (?)” :-)