A survey on SPARQL Query Relaxation Under the Lens of RDF Reification

Tracking #: 3621-4835

Authors: 
Ginwa Fakih
Patricia Serrano-Alvarado

Responsible editor: 
Marta Sabou

Submission type: 
Survey Article
Abstract: 
Query relaxation has been proposed to cope with the problem of queries that produce no answers or too few answers. The goal is to modify these queries so that they produce alternative results close to those expected from the original query. Existing approaches querying RDF datasets generally relax the SPARQL query constraints based on logical relaxations through RDFS entailment and RDFS ontologies. Techniques also exist that use the similarity of instances based on resource descriptions. These relaxation approaches defined for SPARQL queries over RDF triples have proved their efficiency. Nevertheless, significant challenges arise for query relaxation techniques in the presence of statement-level annotations, i.e., RDF reification. In this survey, we overview SPARQL query relaxation works with a particular focus on issues and challenges posed by representative RDF reification models, namely, standard reification, named graphs, n-ary relations, singleton properties, and RDF-Star.
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Review #1
By Louise Parkin submitted on 19/Feb/2024
Suggestion:
Accept
Review Comment:

This paper is a survey of relaxation methods for SPARQL queries focusing on reification. The authors have addressed the comments and improved the paper significantly by adapting its structure and using a motivating example throughout to illustrate. The main issue I had with this paper was the positioning of the focus on reification, which has been corrected in this version. I therefore suggest this paper be accepted for publication.

This paper meets the criteria for a survey paper: (1) it is suitable as an introductory text, (2) the literature review is comprehensive and balanced, (3) the paper is well written, (4) the subject is relevant to the field of semantic web.

There are a few minor formalization and spelling issues that I missed in the first review or that have been introduced in the revision:
1 how much relevant -> how relevant
3.2.1 if one or more relaxation rules are applied to tp' -> if one or more relaxation rules are applied to tp in order to produce tp'
3.2.2 if one or more relaxation rules are applied to Q -> if one or more relaxation rules are applied to Q in order to produce Q'
3.2.1 by replacing any element e ∈ tp by e′: tp′ = tp \ e ∪ e′ where e′ ∈ {≺sp, ≺sc, ≺s} & 3.2.2 by replacing any element e ∈ P by e′: Q′ = X ← (P \ e ∪ e′) where e′ ∈ {≺sp, ≺sc, ≺s}
I find this formalization unhelpful: there is confusion over the nature of e', which is treated as both an element of a triple pattern (subject, predicate or object) and a relaxation rule (≺sp, ≺sc, ≺s) in the first instance, then as both a triple pattern and a relaxation rule. I suggest removing these entirely, as this formalization is not reused in the paper and the nature of a relaxation step is clearly explained in the text before.
3.2.3 the similarity of the original query Q′ to the original query Q, -> the similarity of the relaxed query Q' to the original query Q,
5.4 all relaxation works are able to relax composite querie -> all relaxation works are able to relax composite queries
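
For readers of this page, the relaxation step discussed in the comments on 3.2.1 (a rule is applied to tp in order to produce tp') can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's formalization: the `ex:` class and property names, the two lookup tables, and the function `relax_once` are all hypothetical, and only the subproperty (≺sp) and subclass (≺sc) rules are covered, since the third rule (≺s) is not defined on this page.

```python
from typing import Dict, List, Tuple

# A triple pattern as (subject, predicate, object); variables start with "?".
TriplePattern = Tuple[str, str, str]

# Hypothetical RDFS hierarchy, invented purely for illustration.
SUBCLASS_OF: Dict[str, str] = {
    "ex:PhDStudent": "ex:Student",
    "ex:Student": "ex:Person",
}
SUBPROPERTY_OF: Dict[str, str] = {
    "ex:supervises": "ex:knows",
}


def relax_once(tp: TriplePattern) -> List[TriplePattern]:
    """Return every tp' obtained from tp by applying exactly one rule."""
    s, p, o = tp
    relaxed: List[TriplePattern] = []
    # Subproperty rule: replace the predicate by its direct superproperty.
    if p in SUBPROPERTY_OF:
        relaxed.append((s, SUBPROPERTY_OF[p], o))
    # Subclass rule: in an rdf:type pattern, replace the class by its
    # direct superclass.
    if p == "rdf:type" and o in SUBCLASS_OF:
        relaxed.append((s, p, SUBCLASS_OF[o]))
    return relaxed


if __name__ == "__main__":
    print(relax_once(("?x", "rdf:type", "ex:PhDStudent")))
    print(relax_once(("?x", "ex:supervises", "?y")))
```

In this sketch the rule and the replaced element are kept apart: the rule decides which element of tp changes, and the replacement element comes from the hierarchy, which is the distinction the comment above asks for.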

Review #2
Anonymous submitted on 23/Feb/2024
Suggestion:
Accept
Review Comment:

In this revised version, my remarks about the previous version have
been satisfactorily addressed. I thank the authors for their effort in
improving the organization of the manuscript. I therefore recommend to
accept the paper, provided that the minor changes listed below are
done (mostly typos).

## Introduction (Section 1)
- spurious comma after "The original proposed idea", and after "The evolution of the web"
- exploitng --> exploiting
- a is-a taxonomy --> an is-a ...
- Such distance --> Such a distance
- considering the precised context --> ... precise ...

## Background (Section 3)
- (Turtle syntax)(on the left) --> (Turtle syntax, on the left)
- Figure 2, right: align with left part
- type --> rdf:type
- ex:knows --> foaf:knows
- the AND item in the subsection about SPARQL is actually part of BGPs defined above, and there is no 'AND' operator in SPARQL, to my knowledge. It's a bit confusing. Introduce one piece at a time: triple pattern, then FILTER, then BGP, then GGP, then OPTIONAL, UNION...
- Table2, Figure 4:
- subProperty --> subPropertyOf, subClass --> subClassOf
- change arrow direction in Figure 4
=> To agree with RDFS, and also because otherwise this suggests the wrong direction ( reads ).
- Equation 1: superclasses --> ... or superproperties
- Notice that same as --> Notice that, like
- a series OF contributions
- Section 3.3.2 (named graphs): the cost is 1 triple + n "quadruples", to be exact
- Section 3.3.3 (n-ary relations): the cost per instance is 3+n triples, not 4+n, because the triple like in Listing 1.(c) is shared by all instances. I think that it shouldn't count in your comparison.
- Section 3.3.5 (RDF-star): the quoted triple should be asserted, it's the main triple, the other triples annotate this triple. So the cost is 1+n triples. Otherwise, if there is no annotation on a triple, the cost would be zero, which does not make sense.
- moreover, it's not correct that the query has only one triple pattern. The semi-colon is just syntactic sugar for the dot, so there are two triple patterns (in one Turtle-like sentence) in Listing 2.(e), and there should be three with the asserted triple.
- propagate change when discussing "Number of triples below"
- Figure 5: ex:ernolldate --> ex:enrolldate
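
The per-model costs stated in the comments on Sections 3.3.2, 3.3.3 and 3.3.5 can be summarized in a small Python sketch, where n is the number of annotations attached to one statement. The function name `annotation_cost` and the model labels are hypothetical, and only the three models whose costs are given in this review are included.

```python
from typing import Dict


def annotation_cost(model: str, n: int) -> Dict[str, int]:
    """Statements needed to represent one fact with n annotations.

    The counts follow the figures given in this review:
    named graphs  -> 1 triple + n quadruples
    n-ary         -> 3 + n triples (the shared class declaration is not counted)
    RDF-star      -> 1 + n triples (the annotated triple is also asserted)
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if model == "named_graphs":
        return {"triples": 1, "quads": n}
    if model == "n_ary":
        return {"triples": 3 + n, "quads": 0}
    if model == "rdf_star":
        return {"triples": 1 + n, "quads": 0}
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    for name in ("named_graphs", "n_ary", "rdf_star"):
        print(name, annotation_cost(name, 2))
```

With n = 0 the RDF-star cost is one triple rather than zero, which matches the argument that the annotated triple itself must be asserted.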

## Remainder
- p.21: t_4, t_7... --> tp_4, tp_7..
- p.33: experimentd --> experimented
- p.34, line 42-43: superfluous commas
- p.35: [14,16] stand out ... --> name them for readability (in addition to the refs)

Review #3
By Daniel Hernandez submitted on 06/Apr/2024
Suggestion:
Accept
Review Comment:

I thank the authors for addressing the comments on the previous versions. I recommend the paper for acceptance.


Comments

In addition to the modifications outlined in the cover letter, we have also included in the modified version of the survey a new work “Query relaxation for portable brick-based applications” (reference 19 in the paper).

Dear reviewers,

We sincerely appreciate your valuable feedback and the acceptance of our survey paper focusing on SPARQL query relaxation techniques and their impact when applied to datasets with RDF reification. We deeply appreciate the time and effort you invested in reviewing our work. We considered your comments in our camera-ready version of the paper and corrected the formalization and spelling issues as pointed out.

In addition, we polished our references and the coloring of the RDF and SPARQL code. Also, as the RDF 1.2 draft now introduces Triple Terms, we changed our sentences on Page 15, Lines 3-6.

Review #1
Submitted by Louise Parkin

All minor spelling issues are resolved.

Detailed Comments:

Section 3.2.1 by replacing any element e ∈ tp by e′: tp′ = tp \ e ∪ e′ where e′ ∈ {≺sp, ≺sc, ≺s} & 3.2.2 by replacing any element e ∈ P by e′: Q′ = X ← (P \ e ∪ e′) where e′ ∈ {≺sp, ≺sc, ≺s}. I find this formalization unhelpful: there is confusion over the nature of e', which is treated as both an element of a triple pattern (subject, predicate or object) and a relaxation rule (≺sp, ≺sc, ≺s) in the first instance, then as both a triple pattern and a relaxation rule. I suggest removing these entirely, as this formalization is not reused in the paper and the nature of a relaxation step is clearly explained in the text before.

Authors: We removed this definition (page 8, Section 3.2.2).

Review #2
Submitted by Anonymous

All typos are fixed.
Minor modifications were made to the figures based on the reviewer’s comments.

Detailed Comments:

1.2.4. the AND item in the subsection about SPARQL is actually part of BGPs defined above, and there is no 'AND' operator in SPARQL, to my knowledge. It's a bit confusing. Introduce one piece at a time: triple pattern, then FILTER, then BGP, then GGP, then OPTIONAL, UNION…

Authors: Every definition is now introduced one at a time, starting with triple patterns, then FILTER, then BGP, then GGP, and so on (Page 5, Section 3.1.2). The semantics of the AND operator is detailed in [21], so we added this reference.

1.2.10. Section 3.3.2 (named graphs): the cost is 1 triple + n "quadruples", to be exact

Authors: We took this comment into consideration and modified the related statement.

1.2.11. Section 3.3.3 (n-ary relations): the cost per instance is 3+n triples, not 4+n, because the triple like in Listing 1.(c) is shared by all instances. I think that it shouldn't count in your comparison.

Authors: That’s right. As the class declaration does not count, we made it clear that the number of triple patterns for n-ary is 3+n and not 4+n. (Page 15 Lines 14-15)

1.2.12. Section 3.3.5 (RDF-star): the quoted triple should be asserted, it's the main triple, the other triples annotate this triple. So the cost is 1+n triples. Otherwise, if there is no annotation on a triple, the cost would be zero, which does not make sense.

Authors: We took this comment into consideration and modified Listing 1.e so that the annotated triple is also asserted (see Listing 1.e on Page 13).

Review #3
Submitted by Daniel Hernandez

I thank the authors for addressing the comments on the previous versions.

Authors: Thank you.