Ontology-mediated query answering over temporal and inconsistent data

Tracking #: 1770-2982

Authors: 
Camille Bourgaux
Patrick Koopmann
Anni-Yasmin Turhan

Responsible editor: 
Guest Editors Stream Reasoning 2017

Submission type: 
Full Paper
Abstract: 
Stream-based reasoning systems process data stemming from different sources and that are received over time. In this kind of application, reasoning needs to cope with the temporal dimension and should be resilient against inconsistencies in the data. Motivated by such settings, this paper addresses the problem of handling inconsistent data in a temporal version of ontology-mediated query answering. We consider a recently proposed temporal query language that combines conjunctive queries with operators of propositional linear temporal logic and extend to this setting three inconsistency-tolerant semantics that have been introduced for querying inconsistent description logic knowledge bases. We investigate their complexity for EL_bottom and DL-Lite_R temporal knowledge bases. In particular, we consider two different cases, depending on the presence of negations in the query. Furthermore we complete the complexity picture for the consistent case. We also provide two approaches toward practical algorithms for inconsistency-tolerant temporal query answering.
Full PDF Version: 
Tags: 
Reviewed

Decision/Status: 
Minor Revision

Solicited Reviews:
Click to Expand/Collapse
Review #1
Anonymous submitted on 28/Jan/2018
Suggestion:
Minor Revision
Review Comment:

In temporal reasoning, it is common to have inconsistency in the
data. This paper lifts the classical results of inconsistency handling
KBs in lightweight DLs (DL-LiteR and EL) to the level of temporal KBs
(TKBs). The definition of TKB is natural and relatively simple: it
consists of a fixed TBox, a sequence of ABoxes for timestamps, and
possibly a set of rigid predicates. The query language adopted is
conjunctive query combined with LTL operators. The main reasoning task
considered in this paper is temporal query answering under three
inconsistency-tolerant semantics, i.e., AR, IAR, and brave semantics.

The main contribution of this work is a systematic study of complexity
analysis of temporal query answering under different semantics and
configuration: (1) DL language (DL-LiteR and EL), (2) semantics of query
answering (AR, IAR, and brave semantics), (3) presence of negation in
the query language, and (4) presence of rigid predicates in the TKB.
The obtained complexities range from AC0 to NExpTime. This detailed
complexity analysis results provide a complete picture of temporal
query answering.

Another contribution is towards practical algorithm. The first
algorithm is for TCQ answering under classical semantics in the
presence of rigid predicates. The second algorithm is for
inconsistency-tolerant TCQ answering without rigid predicates. These
two techniques are important for the future adoption of TCQ in
practice.

The paper is in general well-written with many novel
contributions. The obtained results are significant and extensive. The
paper is structured well and self-contained. Provided examples are
adequate. The reviewer just wanted to recommend a few suggestions to
further improve the paper so that the readers can benefit more from
reading the paper.

Suggestions:

1. Since temporal knowledge base is a rather new notion, it would be
helpful to provide more examples in the preliminaries. Some examples
on rigid predicates and/or usage of LTL operators are particularly
useful.

2. This paper is targeting the special issue on stream reasoning. The
authors should spend more efforts on explaining how to apply the
results in stream reasoning. For instance, is it to possible to
incorporate the window operator in stream reasoning into the
framework?

3. The timestamps are given in discrete values as in LTL. In practice,
it is more suitable to use metric temporal logic (MTL) to model the
timestamps. Can we lift the results of this paper to MTL?

Minor issues/typos:

p4. Def 2.1. The author may want to explicitly say that in the rest of this
paper, we always use "n" as the max timestamp of a TKB. In the current
version, "n" is often used in the text but without any context.

p5. "We denote by NKC, NKR , NKRC, NKRR, and NKI respectively the sets
of concepts, roles, rigid concepts, rigid roles, and individuals that
occur in the TKB K." -> These notions are introduced bit used rarely
in the paper. In fact, throughout the paper, several occurrences should
use the K-version.

p10. ~K is defined in Figure 4. Since Figure 4 does not take up much
space, the content could be put directly into the text without a
separate float Figure.

p11. We thus shown a direct -> show

p14. Proof of Proposition 6.10. Please be consistent about \phi and
\varphi

p26. This allow us to -> allows

Review #2
Anonymous submitted on 16/Mar/2018
Suggestion:
Accept
Review Comment:

The paper aims at handling the problem of data inconsistency in query answering over lightweight DL temporal knowledge bases. This paper is an extension of results published in ISWC2017 with comprehensive complexity analysis.

The authors investigate the data and combined complexities for BTCQs with/without negation.They have established the complexity results for three inconsistency-tolerant semantics namely, AR, IAR, and brave. Authors have considered these semantics wiith respect to three cases of rigid predicates in knowledge bases. The results presented in this paper are based on the general algorithms discussed in Section 5.

Additionally, authors have presented techniques for handling rigid predicates while answering queries in DL-Lite(R). They achieve some reasonable theoretical results that can be used to develop practical algorithms for inconsistency-tolerant temporal query answering.

Paper is well-written and the flow of the paper is easy to follow. Contributions in this paper are clearly laid out. All necessary proofs are provided.
A few minor comments/quesitons;

Section 5: It is not clear to understand that what is the benefit of using ARNonEntailment algorithm instead of AREntailment? (Similar for other brave and IAR semantics).

Section 8: Could you please further explain in the paper that why data complexity in the case of BTCQs without negation reduce only over DL-LiteR with the absence of rigid predicates?

Overall, the paper provides a significant extension from the previously published work of authors in the same domain and this extension provides a reasonable contribution beyond state of the art giving a clear picture of overall complexity analysis for query answering over temporal and inconsistent lightweight DL knowledge bases. I have a positive feedback for this paper and I would accept this paper to be published in SWJ, but I will encourage authors to address above mentioned two points.

Review #3
Anonymous submitted on 19/Apr/2018
Suggestion:
Minor Revision
Review Comment:

Ontology-mediated query answering over temporal and inconsistent data

Contents
--------

This article sonsider the issue of answering queries, formulated in a
temporal logic on top of conjunctive queries, on (possibly infinite)
sequences of knowledge bases, K_0, K_1,... where elements of the
sequence may be inconsistent. The temporal logic is linear time logic
(LTL) with a variety of operators looking to the future and the past;
the knowledge bases K_i= consist of a taxonomy T and
assertions (facts) A_i from a description logic (DL), which is either
DL-Lite_R or EL_\bot; these are prominent low-complexity description
logics related to OWL profiles.

Inconsistency of K_i is assumed to be due to the factual part A_i,
which may be tolerated by resorting to repairs for reasoning, i.e., to
subsets A'_i of A_i that restore consistency. To this end, according
to a regime, selected repairs are used for query answering.

The paper thus studies a setting that combines two aspects that have
been considered in the literature before: (1) temporal query answering
(with no special attention to inconsistency) and (2)
inconsistency-tolerant reasoning for "atemporal" (ordinary) DL
knowledge bases. Specifically, it considers the widely used (a) AR
semantics, in which all inclusion-maximal subsets A'_i of A_i are
considered; (b) IAR semantics, in which the intersection of all
inclusion-maximal subsets A'_i is considered; and (c) brave
semantics, where some arbitrary inclusion-maximal subset A'_i is
considered to sufficient for deriving the query.

The main part of the paper is devoted to analyze and characterize the
computational complexity of answering Boolean temporal conjunctive
queries over sequences K_0,K_1, ... of knowledge bases for the three
semantics (a)-(c) above, both in the setting of data complexity (where
just the data varies) and the combined complexity (general case). The
study considers restrictions on the queries, more specifically whether
negation is allowed or not, and on the dynamic nature of concepts and
roles, viz. whether they are rigid or susceptible to change. Furthermore,
the paper considers properties and reduction techniques that might be
exploited for practical implementations of query answering (e.g., to
eliminate rigid concepts and roles for query answering).

The authors report exact complexity characterizations for almost all
of the many cases considered, with very few exceptions. They reveal a
rich landscape of complexity from tractable (in few cases for IAR
semantics, under data complexity) to intractable up to co-nexptime
complete problems; as shown in several cases, inconsistency tolerance
does not increase the worst case complexity, and excluding negation
might lead to significant savings (unfortunately, perhaps not in data
complexity).

Evaluation
----------

The results of this paper complement and extend previous results on
temporal and inconsistency-tolerant query answering and are a
significant contribution to the literature. While the focus of the
paper is on temporal reasoning as such and less on stream reasoning in
the realm of dropping data, they provide a solid foundation as a
starting point for the development of suitable algorithms.

The derivation of the results is based on common techniques and
methods, which however are as in the case of the proof of Proposition
8.3 (type elimination to obtain a polynomial time algorithm) show
mastery in its use.

The paper is well-written and provides an ample exposition of
background and context, and puts the contributions in perspective. It
is clearly a research and theory oriented article, but I believe that
the results are accessible to a broader audience.

The technical results are formally stated and furnished with detailed
proofs or proof sketches; it appears from them that the results as
such are correct.

Overall, this is very informative and good article that should be
published. Below are some minor questions and comments which the
author mat consider in a final version.

Minor comments and corrections
-------------------------------

page 2: "[19,20]. Ontology-based approaches for
stream reasoning often" Starting a new paragraph here would be
suggestive.

page 3: with respect of three -> with respect to three

the achieved complexity results --> the complexity results obtained.

page 4: the notation [[ 0,n]] for an interval is unusual; why no use
the common notation [0,n] ?

page 5: "We impose the constraint that the past operators..." please
say at this point why, or give a pointer to where this will be discussed.

page 7: "the impact rigid predicates can have on repairs." Adding
*which* might make the reading more natural.

page 8, proof of Prop. 3.5: Towards the end, you conclude that a^{I_i}
= a^{I_j} holds. However, it is not clear that the unique name
assumption by itself ensures that a is interpreted canonically (unless
you use Herbrand interpretation; perhaps I miss something.

page 8, "...TCQs that are rooted." Please briefly explain what this
means.

page 10, "Entailment under IAR semantics." Please explain or give a
hint why this procedure is useful, as this becomes apparent to the non-expert only
later; as it stands, one would think it is not a good idea to compute
all the minimal inconsistent sets, as in general there can be clearly
exponentially many such sets.

page 11: axiom that involve two non-rigid -> axiom that involves two non-rigid

"checking of L KBs" it would be better to use hyphen and
write "L-KBs" here and in the rest of the paper.

page 13: proof of Proposition 6.9.: Please explain the intuition
behind the construction in the hardness parts, as this will help the
reader to grasp the idea faster. The formal argument that the
construction works may then be entirely left for the appendix; or
claimes are formulated that are proved in the appendix. By the way, it
is unclear what B_{3i} is.

page 14: proof of Proposition 6.10: "such that each x_j has either
only Pos or only Neg incoming edges" please explain what you mean with
"edges" (this might not be clear to non-experts).

regarding the open case where no rigid concepts and roles do exist:
perhaps you can express an intuition or your expectation whether
this is still NP-hard or gets tractable. It might be worthwhile to
link this to the case where only negation-free queries are
considered (where you show tractability) and explain, e.g. on an example,
why a similar method could not be extended to the case with
negation.

page 15, 16: Theorem 7.4, Proposition 7.6: Perhaps the details of the
proof an be outsourced to the appendix, and only the case discussed
that would not be routine. This may increase readability.

page 17: before Proposition 7.7: "." should be added.

Theorem 7.8: the two different cases should be separated using
"respectively," not to raise the impression that both properties
have to apply at the same time.

page 19: non-determistic -> non-deterministic

it would be helpful to illustrate the notion of propositional
abstraction on a simple example, so that a broader range of readers
can better grasp it.

page 20: "verification of Condition 3 or Condition 2 is linear in n"
it is not clear why this is linear in n, the number of time steps;
rather, the number of possible sets (there are polynomially many) is
relevant? What kind of data structures would be used?

page 22: the original algorithms omits -> the original algorithm omits

"We say a PI is applied" if I am not mistaken, "PI" has not
been formally introduced.

page 23:

"then let a = x_{a'P_1,...,P_l}^{i_1...i_l}" this is confusing,
rather one should say that a must be of this form, and continue
with this

page 24: we break the proof --> we split the proof (or "we divide
the proof")

"we obtain that \exists P^- (resp. \exists P^-_2 ) is unsatisfiable"
please detail what you mean (as this might be not known the
non-experts).

page 26: after Proposition 9.12, you write: "This is a remarkable
result, ..." However, the results seems not to be difficult to prove,
and it is also intuitive. So why should one call this "remarkable"?

page 28: for brave semantics which are relevant --> for brave
semantics which is relevant

page 29: In [3], "C.F. Beckmann" should be "C. Folie"

page 30: Perhaps "omitted proofs" should be reformulated, as in some
cases you present complete versions of proofs for which
already detailed sketches were given (e.g., for Proposition
6.9). In order to avoid replication, you may consider to add
in the appendix the missing parts or make the informal claims
formal and prove them.