Review Comment:
I confirm that most of the weaknesses mentioned in my review of the previous version of this manuscript has been addressed now. However, there are still a few issues for which I would like to see another minor revision (after which, I am sure, the manuscript will be ready to be accepted).
1/ There are issues with the definition of the notion of "induced sub-graph" that was added. The notion is defined for a given pair consisting of a star BGP S and a solution mapping μ over an RDF graph G. The first issue is that the notion of "a solution mapping over an RDF graph" is undefined, and also not a notion commonly considered in the SPARQL literature (instead, the notion of a solution mapping as commonly defined is independent of any particular RDF graph). The second issue is that, by defining "induced sub-graph" based on such a pair, it should be necessary to mention both the corresponding star BGP and the solution mapping when referring to an induced sub-graph. However, the paper refers to induced sub-graphs by mentioning a star BGP and an RDF graph instead; by doing that, it is not clear anymore, which solution mapping would be used to construct the corresponding subgraph and, thus, it is not clear which subgraph exactly it is. This problem becomes evident in Example 3.2 where the subgraph g2 contains two phone triples. There is no way to produce this subgraph from the star BGP S1 and a single solution mapping! Subgraph g4 in the example has the same problem. Another smaller issue is that the authors start in the example to use some notation for induced sub-graphs, namely S1(|G|), which has not been introduced.
2/ The definition of the center of a triple pattern (page 9, LHS, lines 31-37) still has the following issue. The definition of this notion assumes a given dp-schema but, later in the paper, the notion is used as if it is independent of any dp-schema. In other words, for a given triple pattern tp, it is not sufficient to write center(tp) but, instead, it must be center(tp,P) where P is a dp-schema. This change to the definition must then be reflected in the rest of the paper.
3/ Point 7 from my previous review has not been addressed. I am repeating this point here (but with the updated line numbers): There is an incorrect claim in Sec.5.2 on page 11, lines 3-4 on the RHS. The claim is that "these frequencies can be pre-computed as they are independent of any query." This claim is incorrect because, by definition, these frequencies depend on the BGP of the given user queries. An indication for that is that such a frequency is denoted by mpv(?x,B,G), see line 21, where B is the BGP of the given user query. Also, the suggested approach to determine the frequency, as described in lines 23-30, is based on a query that uses the BGP of the user query in its WHERE clause.
--> I wonder why this point has not been addressed. Perhaps there is a misunderstanding about what exactly are the frequencies that the claim refers to? If so, then this part of the claim should be made more clear. Otherwise, I would expect that the claim is elaborated on further because I still don't see how this claim can be correct.
4/ Many small typos in the parts of the manuscript that have been changed or added for this revision leave the impression that this revision was done in a haste.
* "ralational model" on line 12, RHS, page 2
* "Johnson et al. approach’" -> "Johnson et al.'s approach" on line 19, RHS, page 2
* "start" -> "star" on line 38, RHS, page 6
* "triple patter" on line 40, RHS, page 6
|