Building Knowledge Graphs About Political Agents in the Age of Misinformation

Tracking #: 1948-3161

Daniel Schwabe
Carlos Laufer
Antonio Busson

Responsible editor: 
Guest Editors Knowledge Graphs 2018

Submission type: 
Full Paper
This paper presents the construction of a Knowledge Graph about relations between agents in a political system. It discusses the main modeling challenges, with emphasis on the issue of trust and provenance. Implementation decisions are also presented.

Solicited Reviews:
Review #1
Anonymous submitted on 06/Aug/2018
Major Revision
Review Comment:

This paper describes a new ontology to create a KG of political agents. This has been put in practice with data from Brazil.

I have some concerns with this paper:

* The introduction is not sufficiently motivated to emphasize the novelties of the work. How is this work different from other similar KGs? Why does this particular case need a new ontology?

* Is this ontology generalizable and will it be generalized to other countries?

* Will the KG be released?

* The "in the age of misinformation" in the paper title seems rather irrelevant and unnecessary. It doesn't seem to inform about the contents of the paper.

* Overall, I struggle to see the scientific contribution of this work. It seems more like a technical contribution over an existing scientific method. Motivation in this regard should be strengthened.

Minor comments:
* There are some issues with references, i.e. "Error! Reference source not found"

Review #2
Anonymous submitted on 14/Aug/2018
Review Comment:

This paper describes a set of vocabularies for describing political entities and their relations, where the overall aim is to provide a KB of political relations and claims.

While the topic is interesting, the work seems premature, with respect to both content and presentation. The contribution remains unclear (details below), no actual implementation is provided (neither of the vocabularies nor of the KB), and the presented work merely seems to bundle considerations (often seemingly ad hoc) on modeling of political processes. It lacks clarity, grounding in related work, and actual implementations/evaluations.

In particular, the paper lacks a clear description of the intended contribution: if the contribution is intended to be the set of vocabularies, the authors should provide a sound description of the modeling process, the addressed questions, the way stakeholders/domain experts were involved, etc. Other claimed/mentioned objectives (e.g. a KB populated by extracting political entities/relations from various sources) are never actually addressed beyond some conceptual considerations.

Section 2 introduces a "Domain Model" (some vocabularies, mostly based on established terms complemented with a few additional ones) but it's not clear what elements are an original contribution of this work and how these go beyond the state of the art.

It remains unclear how the vocabulary was designed, what research process led to it, why certain design choices were made, and whether actual users/stakeholders were involved in the process (presumably not). In addition, it is not clear what state the proposed vocabulary is in: is it already developed/final or still work in progress? Is it actually available?

The authors seem to ignore state-of-the-art modeling patterns and upper-level ontologies (e.g. DOLCE), and plenty of questions remain: Why are proposals/laws modeled but not other types of artifacts in the political process? Figure 5: why is a "Voter" voted (by a Person)? Isn't the voter doing the voting? How about starting off by modeling key actors in the political sphere (parties, MPs, voters, etc.)?

Section 3 ("Trust Framework"): unfortunately the notations used are not very clear, and the concepts shown are unclear too. E.g. in Figure 10: isn't an agent part of the "context"? What exactly is the contribution of the ProvHeart ontology? Fig. 11 seems to show mostly established terms, and the definition of provenance chains was already intended by PROV-O.

I also fail to understand the semantics of the model shown in Figure 12: why does a "DirectRel" have a "provenance" linking to an entity? You seem to be making statements here about the *claim* about a direct relationship or referral, not about the actual relationship. This claim has some sort of provenance (of course) but this would be modeled in a different way and would require an explicit distinction between the claim and the relationship. Metadata about the claim is NOT the same as metadata about the relationship.
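The distinction drawn above can be illustrated with a minimal sketch. It is not taken from the paper; the class names (`Relationship`, `Claim`), the field names, and the sample data are all hypothetical and serve only to show provenance attached to the claim rather than to the relationship itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """The relationship itself; it carries no provenance (hypothetical model)."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class Claim:
    """A statement *about* a relationship; provenance lives here."""
    about: Relationship
    asserted_by: str  # who made the claim
    source: str       # where the claim was published


def asserters_of(relationship, all_claims):
    """Return, in sorted order, the agents that claimed the given relationship."""
    return sorted(c.asserted_by for c in all_claims if c.about == relationship)


# Hypothetical sample data: one relationship, two independent claims about it.
rel = Relationship("PersonA", "memberOf", "PartyX")
claims = [
    Claim(rel, "NewspaperY", "article-1"),
    Claim(rel, "BlogZ", "post-7"),
]

print(asserters_of(rel, claims))
```

In this sketch the same relationship can be asserted by several agents, each with its own provenance, without the relationship object itself changing.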

Section 4 ("Implementation Approach") contains merely some general considerations on how a KB could be populated but not any actual implementation (it's apparent that there is none yet).

In summary, I believe the topic certainly deserves further attention but my recommendation is to get back to the drawing board, identify the (first) contribution you are aiming for with a first publication (e.g. the vocabularies) and then design a research process capable of resulting in a convincing and consensual vocabulary which meets the requirements of actual domain users/experts.

- the abstract appears very odd and contains only 3 sentences
- Section 1: lengthy introduction on political systems not required
- Section 2.2.1: "shown in Error! Reference source not found.."
- avoid ", etc..."
- plenty of typos throughout

Review #3
Anonymous submitted on 15/Oct/2018
Major Revision
Review Comment:

The manuscript describes a quite comprehensive approach for modeling the Brazilian political arena using Semantic Web tools and techniques.
It presents concepts and a methodology to describe agents, roles they may play over time, connections between them (e.g. family links or through organizations), a model for attaching provenance to statements, and an approach for modeling the trust that users may place in certain statement(s), based on who made them, in what context, etc.

The manuscript is based on previous small publications by the authors, duly referenced. It makes sense to see them together in a single place.

The manuscript also discusses several aspects concerning the concrete platform that serves to build this environment (Section 4). I found these very interesting for the practical impact of the work. It is one (good) thing to set up a vocabulary etc. It is another to work with actual stakeholders (journalists) and real data, spend time gathering and integrating all this, and ask oneself the questions needed in order for people to really use the platform, feed it with data, improve it, etc. By nature, database researchers have a "take" on (angle to look at) the data which specialists from the "real world" may have a hard time grasping and adopting.
However, I am not sure of the significance and actual application of the effort; I have looked up "Se liga na politica Brasil" and I found few references, including a Facebook page with posts from 2016. It is hard for me to tell now, after reading the paper, if the authors actually have users, if the effort is still going or if it has been stopped.

Here are requests that I would like to make of a revision.

1. As is common practice, please add to the beginning of each section a short text explaining what will be presented in that section, referring to each subsection of the respective section, to briefly explain how they fit together. I had a hard time following the structure of Section 2; its beginning was too vague and not connected to the subsections.
Section 4 also has a pretty vague header, Section 5 does not have one at all etc.

Please do the same for sub-sections (they should start with a short text explaining what each sub-subsection handles, and how they all relate/what is the meaning of having them together).

2. Please put Related Work as a standalone section. Please also refer there more extensively to the related works, references to which you interspersed in the Trust section. In particular, the comparison with the "microstatement" model and more information about the trust works you mentioned would be welcome here.
It appears the "Believe it or not" paper of Suciu et al. is also a pertinent part of the related work.

Also, the "all or nothing" model you adopt for belief (trust) may be too extreme. I agree that "I believe this with probability 0.7" is not suitable, but I am not convinced that one necessarily believes everything a given source says. It would be good if the authors could elaborate and give more flexibility in this aspect.

3. Please fix the error on page 3 ("Error! Reference source not found")

4. Please clarify the interest of 2.3 "Using SHACL". I didn't understand "In order to characterize the particular composition patterns intended to be instantiated in the knowledge graph, the same information... is expressed using... the Shapes Constrain" (I guess "Constraint"?) "Language" etc.
Why do we see SHACL here, how does it matter?

5. Please state how far the application of this methodology has advanced. Provide elements such as:
- how many actors are currently described in the SNLP warehouse (or database)
- how many people have been contributing this information
- since when (how much effort has it taken)
- who is using it or has used it
- what is the typical interaction of a user with the platform (how are they added, vetted, who has access to what parts of the data etc. ...) What is the actual lifecycle of this platform, how does it function?
- if there are or were obstacles/hurdles toward adopting it or using it, which are they? In an era where we need to automate fact-checking as much as possible, the authors' experience can be very valuable

6. Are there connections of this work with computational fact-checking? Please develop.

7. It would be good to sharpen the discussion of SKOS vs. OWL in Section 2.1.2. As it is, it has rather made me doubt the various choices.