FrameBase: Enabling Integration of Heterogeneous Knowledge

Tracking #: 1517-2729

Jacobo Rouces
Gerard de Melo
Katja Hose

Responsible editor: Guest Editors ESWC2015

Submission type: Full Paper

Large-scale knowledge graphs such as those in the Linked Data cloud are typically stored as subject-predicate-object triples. However, many facts about the world involve more than two entities. While n-ary relations can be converted to triples in a number of ways, unfortunately, the structurally different choices made in different knowledge sources significantly impede our ability to connect them. They also increase semantic heterogeneity, making it impossible to query the data concisely and without prior knowledge of each individual source. This article presents FrameBase, a wide-coverage knowledge base schema that uses linguistic frames to represent and query n-ary relations from other knowledge bases, providing multiple levels of granularity connected via logical entailment. Overall, this provides a means for semantic integration from heterogeneous sources under a single schema and opens up possibilities to draw on natural language processing techniques for querying and data mining.

Solicited Reviews:
Review #1
By Valentina Presutti submitted on 09/Dec/2016
Review Comment:

I am happy with the authors' response and with the new version of the paper.
I recommend that the authors include in the paper all explanations given in the cover letter that are not yet included.
I have some minor remarks that the authors may want to consider while writing their camera-ready version.

The authors say that the annotators were familiar with both FrameNet and the Semantic Web, and hence assume that they can assess whether a Semantic Web or FrameNet-specific term or sense is present. That is fine, but I recommend making this clear in the paper and removing any reference to the aim of evaluating whether the meaning of a frame element is obvious to a lay reader, since lay users were not involved in the study.

Have you tried without the semantic pointers? I think that comparing the results would better support your final choice.
--> Yes, the lexical overlap was too sparse.

Please include this clarification in the paper.

As for this comment/reply from the previous review:
Section 6.1:
The examples given of the manually built Class-Frame integration rules refer to classes that were designed with an n-ary shape in mind. The authors are invited to comment on the issue of aligning frames to classes that are not designed to represent n-ary relations and how this issue can be (or has been?) overcome.
--> Following the model from FrameNet, we assume that all classes can be represented in terms of frames (issues with current coverage aside). Of course, in many cases, these would be very simple ones for which an n-ary representation is not necessary. In such cases, the rules will simply create a frame instance without additional properties, similar to the dbr:SocietalEvent example in our paper.

I think the authors did not address this point; possibly I was not clear, or I did not understand the reply.
What I mean is that some classes, e.g. in DBpedia, were designed with a frame or event-like model in mind. Aligning these classes to frames is hence easier than aligning other entities that do not explicitly show such a modelling approach. Still, other classes that do not show an n-ary-relation-like model can be aligned to frames, at least conceptually. This may be hard and may require additional heuristics or a different approach. My question is about discussing this problem, or describing an example of a manually built rule for such cases, so as to help the reader appreciate this type of case.

Review #2
By Bonaventura Coppola submitted on 24/Mar/2017
Review Comment:

This work has now come to its third revised submission, and has reached a very good level of clarity and consistency. The paper addresses thoroughly, and from several perspectives, the relevant issue of representing complex events with an unrestricted number of participants (i.e. n-ary relations) by the convenient means of standard RDF triples. The relevance of this topic is supported by the high number of papers in the Semantic Web community which, while addressing very interesting tasks (e.g. on-the-fly novelty detection), still fall short of an appropriate, effective representation model. In fact, many of them rely on toy assumptions like the frequent "one triple per event" oversimplification. Hence, a first important contribution is the preliminary survey of the current standard methods for projecting arbitrary n-ary relations into consistent sets of triples. All such methods are discussed and their pros and cons considered. The authors' eventual choice of the Semantic Frames representation is extremely well motivated by considerations about expressiveness, space complexity, and reliance on sound, widely supported theories of meaning (Frame Semantics) on the linguistic-theoretical side. Building on this analysis, the practical creation process of a frame-based knowledge base schema ("FrameBase") is presented. FrameBase is founded on a dual representation of n-ary relations: 1) a neo-Davidsonian reification allowing expressive and compact representation of complex events, and 2) a bare direct binary predicate (DBP)-based representation intended to preserve both compatibility with other/source KB schemas and simplified/legacy querying when the expressivity of n-ary relations is not required. The general methodology, algorithmic details and an adequate number of practical examples are given.
As a final and most relevant result, the practical method for the consistent integration of several popular heterogeneous KBs into a single instance of FrameBase is presented.

The paper is now in its third revision stage. The authors have fully addressed a wide set of editing and clarification requests, and have introduced the additional material that I requested in my previous reviews (ref. 1239-2451 and 1392-2604). The current version of the paper has reached a high level of clarity and consistency and is definitely suitable for publication in the Semantic Web Journal.

As a last minor flaw, following the authors' decision to rename "cluster microframes" to "miniframes", Figures 3 and 4 should be updated to match the new terminology.