Review Comment:
Overall, the authors have provided a satisfactory response to the comments. They have addressed each comment individually and provided clear explanations for their choices. Here are some possible suggestions for further improvement:
Regarding the first comment about the definition of knowledge graphs not signifying that the systems work as a normally distributed graph, the response is appropriate. The authors have acknowledged that their work focuses specifically on knowledge graphs and have added a clarification in Section 1 to avoid any misunderstandings. They have also indicated that they will consider general distributed graphs in future work, which shows that they are open to exploring other types of graphs beyond knowledge graphs.
In response to the second comment about the lack of coverage of knowledge graph querying literature, the authors have acknowledged that they purposefully focused their discussion on the most related querying literature, such as the LDF framework, federated query processing, and Peer-to-Peer systems. They have also indicated their willingness to expand the discussion if the reviewer provides more specific details about what aspects are missing.
The authors have added overview figures to two sections of the paper, expanded Table 3, updated Section 4.2, and added a paragraph to the end of Section 4.3 to incorporate the discussion on the impact of graph complexity on the system. They have also explained that they did not aim to beat centralized systems in terms of performance, but rather focused on making query processing in the decentralized setup feasible with high query scalability. Finally, they have simplified some sentences and clarified others to make the paper easier to follow.
The authors explained that non-relevant data may be obtained through the query fragment, but that they try to avoid this by pruning non-relevant fragments in the source selection step. They distinguish the compatibility graph from the fragmented KG by noting that a compatibility graph captures which pairs of fragments are compatible for a given query, and they have simplified the definition of a compatibility graph to avoid confusion. They have updated the introduction to Section 4.3 to motivate SPBF indexes more clearly, and they have added a clear example to the beginning of Section 5.1, with a visual element in Figure 7 that shows exactly what compatibility means. They have added Definition 14, which defines relevantFragment(P,f), and updated the text throughout Section 5 with paragraphs stating what relevance means in this context. The relevance of a fragment is a binary value (yes or no) rather than a computed relevance score. The authors also explained that the execution plans in Table 2 contain all the relevant fragments found in the query.
Page 6: Definition 3
The definition appears to be correct and well-defined. However, a few potential issues could arise in practice, and it would be helpful if the authors could address them:
The size of the set of solution mappings [[P]]G can be very large, making it computationally expensive to enumerate all possible mappings. In practice, some optimization techniques may be needed to efficiently compute the set of solutions.
The definition assumes that the underlying knowledge graph G is static, and does not change over time. However, many real-world knowledge graphs are dynamic, meaning that new triples can be added or existing triples can be deleted or modified. The definition does not specify how to handle dynamic changes to the knowledge graph, which could affect the correctness and completeness of the results.
The definition does not specify how to handle inconsistencies or contradictions in the knowledge graph. In practice, it is common for knowledge graphs to contain conflicting or inconsistent information, which could lead to unexpected or erroneous results. Additional techniques may be needed to handle such cases.
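To illustrate the first point about the size of [[P]]G, the following is a minimal sketch of naive basic graph pattern evaluation, assuming triples are plain tuples of strings and variables start with "?". The helper names (is_var, match_triple, evaluate_bgp) and the toy data are hypothetical and are not taken from the paper. It shows that two triple patterns sharing no variable yield a number of solution mappings equal to the product of their individual match counts.

```python
def is_var(term):
    """A term is a variable if it starts with '?' (assumed convention)."""
    return term.startswith("?")


def match_triple(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or reject if it is already bound differently.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return extended


def evaluate_bgp(patterns, graph):
    """Naively enumerate all solution mappings of a list of triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            ext
            for binding in solutions
            for triple in graph
            if (ext := match_triple(pattern, triple, binding)) is not None
        ]
    return solutions


# Toy graph: n "knows" triples and n "likes" triples over disjoint entities.
n = 10
graph = [(f"p{i}", "knows", f"q{i}") for i in range(n)]
graph += [(f"r{i}", "likes", f"s{i}") for i in range(n)]

# The two patterns share no variable, so the result is a Cartesian product.
patterns = [("?a", "knows", "?b"), ("?c", "likes", "?d")]
mappings = evaluate_bgp(patterns, graph)
print(len(mappings))
```

With n matches per pattern the result holds n * n mappings, which is why some discussion of how the full set is avoided or bounded in practice would be useful.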
Page 12: Line 20-33
The statement suggests that merging small fragments can improve lookup time for optimizing join order and estimating cardinalities, but it is not clear how significant this improvement is or if there are any trade-offs (such as increased memory usage for larger fragments).
Page 13: Line 19-34
The complexity and potential performance impact of the proposed Semantically Partitioned Bloom Filters (SPBFs) indexing schema are a concern and may need to be evaluated further in real-world scenarios to ensure that the schema is practical and efficient. Additionally, because the SPBF indexes have to match entire star patterns to fragments rather than triple patterns, unlike the PPBF indexes from [19], some adjustment in the query optimization process may be required.
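As background for this concern, here is a minimal sketch of a generic Bloom filter. It is not the SPBF or PPBF structure from the paper; the class name, the parameters num_bits and num_hashes, and the example IRIs are assumptions chosen for illustration. It shows the property that matters for source selection: a lookup never misses an inserted item, but it may report an item that was never inserted, and the rate of such false positives depends on the filter size relative to the number of inserted items.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter sketch (illustrative only, not the paper's SPBF)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit position, for simplicity rather than compactness.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for entity in ["ex:alice", "ex:bob", "ex:carol"]:
    bf.add(entity)

print(bf.might_contain("ex:alice"))  # inserted items are always reported
```

An evaluation that reports the observed false-positive rate and index size for realistic fragment sizes would help readers judge the practical cost.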
Page 16: Line 1-20
The authors assume that query execution plans are always left-deep, which may limit the potential for optimization in certain cases. This limitation could also be addressed in future work.
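To give a rough sense of how much of the plan space a left-deep restriction excludes, the sketch below counts join trees under simplifying assumptions: cross products are allowed, and the left and right inputs of a join are distinguished. The function names are hypothetical. Under these assumptions there are n! left-deep orderings of n inputs, and n! times the (n-1)-th Catalan number binary join trees in total (left-deep trees included).

```python
from math import comb, factorial


def left_deep_plans(n):
    """Number of left-deep join orders over n inputs (one per permutation)."""
    return factorial(n)


def all_binary_plans(n):
    """Number of binary join trees over n inputs, bushy trees included."""
    # Catalan(n - 1) counts tree shapes; n! assigns inputs to the leaves.
    catalan = comb(2 * (n - 1), n - 1) // n
    return factorial(n) * catalan


for n in range(2, 7):
    print(n, left_deep_plans(n), all_binary_plans(n))
```

For four inputs this gives 24 left-deep orders out of 120 trees overall, so the share of plans considered shrinks quickly as queries grow. Whether the excluded bushy plans would actually be cheaper in the decentralized setting is an empirical question for the authors.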