UCQ-rewritings with Disjunctive Existential Rules and Conjunctive Queries with Negated Atoms

Tracking #: 2371-3585

Authors: 
Enrique Matos Alfonso
Giorgos Stamou
Alexandros Chortaras

Responsible editor: 
Bernardo Cuenca Grau

Submission type: 
Full Paper
Abstract: 
We focus on the problem of query rewriting for the framework of disjunctive existential rules. It is a well known approach for query answering on knowledge bases with incomplete data. We propose a rewriting technique that uses negative constraints and conjunctive queries to remove components from the disjunction in disjunctive existential rules. This process will eventually generate new existential rules. The generated rules can then be used to produce new rewritings considering the existing rewriting approaches for the existential rules framework. The algorithm is implemented in COMPLETO in order to provide complete rewritings for unions of conjunctive queries with negation. Additionally, we report some experiments to evaluate the viability of the proposed solution.
Full PDF Version: 
Tags: 
Reviewed

Decision/Status: 
Minor Revision

Solicited Reviews:
Click to Expand/Collapse
Review #1
Anonymous submitted on 25/Nov/2019
Suggestion:
Minor Revision
Review Comment:

Even though the submission has improved, I find that there are still some issues with the current version. Because I would like to review the paper again before it is published, I am going to give it a "Minor Revision" score. Below, I list the main issues that I find with the submission.

1. The write-up needs to be improved. Apart from all of the minor comments that I list below, I find the following issues.
- The authors include content that (IMO) is not really necessary (e.g., Example 2 or the explanation of De Morgan's Law) and then do not define notions that are central (e.g., substitutions, unifiers, graph of rule dependencies...). As a consequence of this, it is difficult to understand some of the notions presented in the paper (e.g. Definition 2.5).
- I find that the notation used is at times a bit confusing. For instance, the notation for rules is not very clear. Instead, I would have preferred if the authors would use \wedge and \vee instead of , and []. As a result of this choice, e.g., Example 3.6 is quite unclear.
- As I pointed out in the previous version, the definitions of some notions could be unified. For instance, you can define disjunctive existential rules and then define existential rules as a restricted form of disjunctive existential rules.
- Some of the provided theorems are really difficult to read and understand. E.g., Theorem 5.

2. Presentation and motivation for the technical results.
- It is not clear from the write-up what is previous work and what is a novel contribution of the authors. For instance, are the results from Section 3 novel? What about theorems 3.2 and 3.3? This needs to be properly clarified.
- Some of the provided formal arguments are not proper proofs. E.g., the proof Theorem 3.9 needs more work.
- Doesn't Theorem 3.8 subsume 3.7? That is, if Algorithm 1 stops, then that means that all queries have rewritings. Hence, the knowledge base in Theorem 3.8 is a fus.
- Even though the authors propose many results that can be used to determine whether a knowledge base is fus, I do not believe that any of these are very useful. To validate the usefulness of these results, the authors could (1) study real-world rule sets, (2) find some rule sets not captured by any well-known fragment that guarantees query rewritability, and (3) show that at least some of these can be characterised as fus by applying the results in this paper.

3. Issues with the evaluation
- Even though there are no other systems that can deal with negated atoms you could have still compared performance for queries without negation.
- In the paper you mention that "The OWL 2 ER fragment of both ontologies was translated into existential rules". Do you mean to say "OWL EL" instead of "OWL ER"? Furthermore, I have looked at the ontologies online and have seen that they contain inverses and functional properties. Therefore, these ontologies are not in EL. Can you explain this situation? Are the ontologies that you put online not the ontologies that you used in the evaluation?
- As far as I can tell, your implementation was capable of rewriting all the queries. Why is this the case? Are both ontologies fus? Or is it because you only consider a restricted set of queries?

Not clear what is novel versus what is not:

Theorem 3.1

Minor Comments:
- Shorten the title so there is not a line at the end with a single word.
- If you can put the "footnote asterisk" after the comma.

0. Abstract
- "for the framework of disjunctive existential rules" -> "for disjunctive existential rules".

1. Introduction
- The write-up of the first paragraph is not very precise
- Line 18: do not use a line for such a short word.
- Add a ~ after \wedge in line 18 so it is not so close to the closing parenthesis.
- It appears that there is way too much empty vertical space in Example 1.1.
- "although the size of the rewriting can be exponential" -> add a citation here.
- Having no restriction on the expressivity of the rules already makes the query entailment undecidable. If you mention negated atoms here, the reader will think that you need this feature for the language to be undecidable.
- It would be nice to find shortcuts so the rules given in the examples fit in a single rule. Also, you could ignore universally quantified variables, as it is usually done.
- "Conjunctive query entailment is undecidable even in the case of existential rules without disjunction" -> You have mentioned this before.
- Page 2, line 24: AFAIK, CQs do not allow for the use of universal quantifiers.
- Page 2, line 30: this example does not make much sense, does it? It implies that every person is either married or is the parent of someone? Also, add ~ after the \vee operator at the end of the line.
- Page 2, line 28: modify the text so you don't have a line with just 4 letters.
- Page 2, line 40: move the citation to the end of the sentence.
- Page 2, line 51: why do you uppercase "union"?
- Page 2, column 2, line 7: what is a "rewriting operator"?"
- Page 3, column 1, line 12: remove these extremely short lines.

2. Preliminaries
- Page 3, column 2, line 11: maybe use FOL to be consistent.
- P3, C2, L25: you do not need to uppercase "Formulas". Same for the next sentence.
- P3, C2, L45: do not use the future tense.
- Definition 2.1: wouldn't it be easier to just use disjunctions and/or conjunctions instead of introducing this notation that is probably new for most readers?
- P4, C1, L45: it would be easier to say that a literal is positive or negative. In the next sentence, you can say that a formula is ground if it does not contain variables.
- Figure 1 is not a very good representation of a hypergraph because the edges are undirected. That is, by looking at the picture, I do not know whether parent(X, Z) or parent(Z, X).
- When you say that a CSF of atoms can represent a hypergraph: specify that this is a directed hypergraph.
- P4, C2, L23: connected -> \emph{connected}
- P4, C2, L40: you do not need to use 3 lines here.
- P5, C1, L21: do not use the future tense. Same for the rest of the proof.
- I would not include Example 2.2.
- P5, C1, L39: do not use the future tense.
- P5, C2, L5: what is the definition of unifiable and unifier?
- P5, C2, L13: what is a most general unifier?
- In Definition 2.3: use the \emph{} environment when youintropduce a new notion.
- Definition 2.4: do not use the future tense.
- P6, C1, L51: remove that line.
- P6, C2, L1: what is a ground instance?
- Theorem 2.2: What are C1 and C2?
- P7, C1, L35: these are Boolean conjunctive queries (since you do not have universally quantified variables). Also, write "(CQ)" with parenthesis when you introduce a new shortcut.
- P7, C1, L40: "all of the variables in a query are present in positive literals"
- P7, C2, L13: comma before "i.e."
- Definition of existential rule: by your definition, the variables in \vec{Y} may appear in the body of the rule.
- "The quantification of variables follows the same rules described in the definition of an existential rule" -> this does not make much sense...
- Add a citation for the graph of rule dependencies.
- It would be easier to define a knowledge base as a binary tuple consisting of a set of facts and a set of rules (disjunctive and constraints are also rules).
- Example 2.4: "Let's" -> "Let us"
- P8, C1, L29: which existential rule? Also, use numbers in your equations and point to the corresponding formulas in the text.
- Usually, the notion of a UCQ-rewriting implies completeness. The presented notation is confusing.
- If possible, place the captions of the tables below the actual table.
- Example 3.2: the first sentence in this example is grammatically incorrect.
- Definition 3.4: the notion of a unifier should be formally introduced in this paper. Also, quantify H', q', and v. Also, include the unifier in the expression rew(r, q).
- Theorem 3.5: what is a UCQ component? The sentence after "iff" is not readable.
- In Alg. 1, you do not need to introduce so many variables.
- "In general, it is undecidable to know if there exists a finite UCQ-rewriting" -> add citation
- P15, C1, L32: what is the purpose of this sentence?
- Next sentence: you should say "A cut for R is a partition..."
- P15, C2, L1: What is \mathcal{R} here?
- P15, C2, L26: remark that you can only use finitely many predicates, variables and constants.
- P15, C2, L29: what does this sentence mean?
- Theorem 3.9: "of of", "o"
- P17, C1, L41: the last sentence is not necessary; answers by definition only include constants.
- P17, C2, L5: don't you have to existentially quantify the variable Y? Also, write "pedro" instead of "perdo" in the previous line.
- P17, C2, L33: a variable would be assigned to a Skolem term, but not to a Skolem function.
- P17, C2, L37: what does it mean for an existential rule to produce a "disconnected existential rule"?
- Example 3.6: align the formulas that result from the constraint refutation in a different way.
- P20, C2, L1: OWL 2 ER -> OWL 2 EL

- P15, C1, L12: uppercase the word "algorithms". Same with the word "algorithm" in Theorem 3.6.
- P15, C2, L40: remove lines with 1 word.
- P15, C2, L41: you have already defined disconnected rules before.
- P20, C1, L47: what are siblings concepts

Review #2
Anonymous submitted on 10/Jan/2020
Suggestion:
Accept
Review Comment:

The authors addressed properly my original comments. Indeed after the changes the contribution of the paper is better positioned with respect to existing related work. The overall writing of the paper was also improved - however, I believe it can be further improved -.

The results are not technically very challenging (some are consequence of existing results), but provide new interesting insights. I thus believe these results have enough merit to be published in SWJ. In conclusion, I recommend this paper for acceptance.


If the paper gets accepted I recommend the authors to revisit the writing of the paper to make it less “chatty” (as described by another reviewer), and to make it more cohesive.

Review #3
By Michael Morak submitted on 13/Feb/2020
Suggestion:
Reject
Review Comment:

In the revised version, the authors have added a related work section as part of
the introduction, as well as some new observations about their proposed
rewriting algorithms in Section 3. Except for typos and minor changes, the rest
of the paper remains relatively unchanged.

I would like to start by saying that, after revisiting the paper, I still
believe that this topic is very interesting. I would like to see this work
published soon, appropriately restructured as discussed in the reviews.

Regarding the current version of paper, however, I still believe that it is not
yet in a form that is ready for publication. Since the editors asked for an
accept/reject rating, this leaves me with only once choice: the latter. I have
carefully read the authors' responses to the comments by myself and the other
reviewers. While the points mentioned in the first paragraph above have been
integrated into the paper, many issues raised in the reviews were either not
addressed or just explained in the rebuttal but not incorporated into the paper.

All reviewers raised issues with the organization of the paper, with incomplete,
inconsistent, redundant, or changing definitions, and several writing issues,
including the absence of formal details, that make the paper unnecessarily
difficult to follow. In addition, all reviewers expressed that they are not
convinced by the somewhat unconventional definition of CQs, combined with the
very restricted classes of disjunctive existential rules considered in the
paper.

Both of these issues could have been remedied by following the, in my view,
reasonable suggestions by the reviewers; e.g. fixing the latter issue by giving
a (concrete) argument that motivates why these languages are interesting and
relevant. But this is still missing in the current revision of the paper.

To illustrate the point above, I will give some examples, but many more can be
found when comparing the revised version with the original and taking into
account the reviewers' suggestions.

1) Review 1 identifies redundant and somewhat lengthy and unnecessary parts in
the paper (which the other reviewers seem to agree with). The authors did not
respond to this point and did not change the structure of the paper.

2) In my previous review I pointed out that the result of Theorem 3.6 directly
follow from results by Bourhis et al. (2016). In fact, in that paper, a
*backward-chaining* procedure is devised to rewrite linear DTGDs into a UCQ.
While the authors argue in the rebuttal that they have now included this
citation in the paper and that offering an alternative proof is interesting (I
agree), in the revised version they still say that "To the best of our
knowledge, existing research for existential rules with disjunction is only
based on forward chaining algorithms." Also Datalog rewritings are based on
backward chaining methods.

3) In my previous reviews, comments for P9C2L11 and P9C2L20, the authors
responded with an exhaustive explanation, which now makes these points somewhat
clearer. However, the paper itself did not incorporate these changes. I maintain
that these steps should be formalized and made explicit in the paper.

4) Reviewer 3 suggests that the structure of the paper could be improved. This
mainly concerns thinking about what to include in the preliminaries and a
restructuring of Section 3. Reviewer 1 also makes a suggestion for restructuring
(regarding defining the general framework for disjunctive rules first, and then
dealing with the non-disjunctive version as a special case). These remarks have
not been addressed in the rebuttal and no changes have been made in the paper,
despite all reviewers raising these as major points that need improvement.

The above list is by no means complete. However, because of these issues, I
would think that it would be better for this paper to undergo a substantial
rework before being ready to be published. My suggestions to improve the paper
would be as follows:

- Make sure that every concept introduced in the preliminaries is used later on.
If it is only used to introduce some further concept in the preliminaries,
think about whether this could be avoided.

- Split up Section 3 into several separate sections. Here are some natural
splits where this could be done: the two different versions of resolution; the
break between the theoretical investigation and proposing concrete algorithms;
disjunctive and non-disjunctive rules; investigating theoretical properties of
the resolution formalism and investigating what they can be "used" for (that
is, where does the resolution procedure terminate).

- Have a deeper comparison between your approach and other, related or similar
approaches by identifying where other approaches fail and where your approach
works.

- Motivate the languages you are using (e.g. you could give a concrete example
that shows why disconnected rules are useful and interesting).

- Be consistent with your notation and definitions. If you define a concept one
way (e.g. CQs), stick to it, and don't redefine it later to mean something
else (e.g. in the beginning CQs don't have answer variables, but later they are
redefined to now allow them).

I believe that with these changes the paper would communicate its results much
better and make it much more joyful to read. I want to reiterate that I would be
more than happy to see an improved version accepted soon in the future.