Materialisation approaches for Façade-based data access with SPARQL

Tracking #: 3331-4545

Authors: 
Luigi Asprino
Enrico Daga
Justin Dowdy
Aldo Gangemi
Paul Mulholland

Responsible editor: 
Raghava Mutharaju

Submission type: 
Full Paper

Abstract: 
The Knowledge Graph concept is gaining momentum as an ideal approach to data integration. Therefore, it is of paramount importance to equip knowledge engineers with tools for accessing data from multiple, heterogeneous resources. The successful W3C standard SPARQL is the reference language for interacting with RDF knowledge graphs. For this reason, several approaches extend SPARQL to access data in non-RDF formats. Recent research proposes relying on an intermediate RDF model, named Façade-X, whose components can be transparently mapped to various file formats. However, although Façade-X specifies how its components map to many different formats (CSV, JSON, HTML, Markdown, and others), it is still unclear how to implement a SPARQL execution engine that relies on it. In other words, what are the possible strategies for executing Façade-X queries? This article explores materialisation approaches for executing Façade-X queries. Specifically, we study two in-memory strategies for performing Façade-X data access with SPARQL. A complete materialised view strategy fully transforms the data source into RDF. In contrast, a sliced materialised view strategy segments the data source and generates an RDF view on each part. Both strategies can be optimised by materialising only the part of the RDF graph that has potential matches with triple patterns in the query (triple-filtering). In addition, we compare these approaches with an on-disk alternative, which relies on a temporary database instance. We analyse the characteristics of these methods and perform extensive experiments, reporting on the benefits and limitations of each approach. Finally, we contribute guidelines and best practices derived from the findings.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
Anonymous submitted on 01/Aug/2023
Suggestion:
Major Revision
Review Comment:

In this work, the authors discuss a number of on-the-fly materialization approaches for executing SPARQL queries in a facade-based fashion, where a facade is essentially an abstract interface to underlying data in heterogeneous formats.

The strategies tested by the authors are all on-the-fly strategies, in the sense that the materialization of triples for answering queries is not performed upfront, but rather at query execution time. This is the first aspect of the work that I found unclear. The authors should clarify why one would prefer such an approach as opposed, for instance, to one where all triples are simply materialized upfront, in an offline stage, as is usually done in classical direct-mapping or ETL settings. Is it really reasonable to fully re-materialize the whole dataset at each query execution, as done in one of the analyzed scenarios? By the way, to me, the whole facade-based data access story can be summarized as a sort of "direct mapping" approach to non-relational sources. Therefore, the W3C recommendation of direct mapping should at least be cited.

Another point that is not clear to me is the rationale behind facade-based data access. I understand that the focus of this work is to discuss implementation strategies for facade-based data access, and that the facade-based data access approach itself has been introduced and justified in another publication ([15]). However, as a reader not familiar with the approach, I must admit that reading this work left me with many unanswered questions, which is an indication of a lack of self-containment in the discussion. I believe it is particularly critical to give a clear and strong motivation for facade-based data access, for several reasons: the approach is not well established (yet), as the only other work about it is [15], published at a B-ranked conference according to LiveSHINE, or C-ranked according to Microsoft Academic (I used this tool for checking the rankings: https://scie.lcc.uma.es/ratingSearch.jsf). In particular, it is not self-explanatory what the actual advantage of the approach is, nor its purpose, apart from the tautological argument about the application of a design pattern. I believe that such motivation should at the very least clarify the aspects I am going to list below.

The idea behind facade-based data access is quite straightforward: consider a single, very general data structure, the list, and interpret all sources as instances of this structure. The correspondence between data in various formats and instances of lists is established in a direct-mapping fashion, that is, each "tuple" (if CSV is considered) corresponds to an object in the RDF graph and is identified by a blank node. Now, in a typical knowledge-graph construction setting, objects would usually correspond to URIs, not to blank nodes, and one of the main jobs performed by the mapping designer is indeed to provide a consistent way of constructing these URIs. Moreover, these URIs often have to adhere to certain strict policies about their format, following predefined vocabularies in order to enhance interoperability, as prescribed by well-known Linked Data principles. This is often the case, for instance, for applications in the biomedical domain, where standard ontologies are widely used to share information about genes, proteins, and so on. Therefore, in mapping-design activities, the creation of appropriate object identifiers is of critical importance. What I find bizarre about facade-based data access is that, unlike in W3C Direct Mapping, users are required AT QUERY TIME to "bind" data values to specific URIs, through quite convoluted string concatenation operations. Basically, a user writing a SPARQL query in a facade-based data access setting needs not only to fetch the desired information, but also to resolve certain issues that, in other settings, are typically resolved by a mapping designer (or even automatically, in W3C Direct Mapping). This aspect is clearly critical to me, and I would like an explanation of the advantages of moving mapping-design burdens to the end users.
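To make the point concrete, the following sketch shows the kind of query-time IRI minting being criticised here; the xyz: namespace, the column names, and the http://example.org/stop/ identifier scheme are illustrative assumptions, not taken from the paper.

```sparql
# Illustrative sketch only: query-time IRI minting over a facade view of a CSV file.
# The xyz: namespace, column names, and identifier scheme are assumptions.
PREFIX xyz: <http://sparql.xyz/facade-x/data/>

SELECT ?stop ?stopName WHERE {
  ?root ?slot ?row .                     # each CSV row surfaces as a blank-node container
  ?row xyz:stop_id   ?stopId ;
       xyz:stop_name ?stopName .
  # The query author, rather than a mapping designer, decides how identifiers are built:
  BIND(IRI(CONCAT("http://example.org/stop/", ENCODE_FOR_URI(?stopId))) AS ?stop)
}
```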

Another doubt I have is about the intuition of using lists in the first place. Essentially, the RDF graph "directly mapped" by the facade-based approach is a list of lists. There is no real structure among objects (e.g., there are no object properties), and although the list gives great flexibility (as it can capture essentially any kind of source, as shown by the authors), this is paid for at query time, where users need to write SPARQL queries that interrogate collections in a purely syntactical fashion. Note that this goes against the main argument for publishing data as RDF, which is to provide semantics to your data, often pairing them with ontologies. To me, a possibly meaningful application of facade-based data access would be in the construction of RDF graphs, through the CONSTRUCT keyword of SPARQL, so as to produce semantically enriched RDF graphs starting from the very coarse list abstraction. Note, though, that this is essentially a mapping-design effort, and one would then need to motivate why not to use dedicated tools such as RML, specifically designed for that job (as opposed to SPARQL).
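As an illustration of the CONSTRUCT-based usage suggested above, a query could lift the coarse list abstraction into a domain vocabulary; the vocabulary, namespaces, and column names below are assumptions chosen for the example, not taken from the paper.

```sparql
# Hypothetical example: lifting a facade view of a CSV into a GTFS-style vocabulary.
# The gtfs: and xyz: namespaces and the column names are assumptions for illustration.
PREFIX xyz:  <http://sparql.xyz/facade-x/data/>
PREFIX gtfs: <http://vocab.gtfs.org/terms#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

CONSTRUCT {
  ?stop a gtfs:Stop ;
        rdfs:label ?stopName .
} WHERE {
  ?root ?slot ?row .
  ?row xyz:stop_id   ?stopId ;
       xyz:stop_name ?stopName .
  BIND(IRI(CONCAT("http://example.org/stop/", ENCODE_FOR_URI(?stopId))) AS ?stop)
}
```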

Apart from these general considerations about facade-based data access, which, if clarified, would make the work more self-contained, I now list some other considerations on the rest of the contribution.

I have some concerns about the claimed contribution: "However, although Facade-X specifies how its components map to these formats [...], it is still unclear how to implement a SPARQL execution engine that relies on it". Reading this sentence, I got the impression that the actual contribution of this paper is to provide the first implementation for facade-based data access. However, when I looked at [15], I saw that the authors had already provided an implementation for the framework, as well as an empirical evaluation. Therefore, this is not the actual contribution of the paper, and in this sense the sentence is misleading (at least, it was to me). The actual contribution, as I said at the beginning of my review, is to study different on-the-fly materialization strategies for SPARQL execution over a facade-based data access setting. I find the contribution interesting, but actually quite limited. Moreover, considerations about the applicability of on-the-fly materialization strategies would actually apply also to conventional query answering settings, and are not specific to facade-based data access. These kinds of considerations are also not a novelty for SPARQL federation systems: for instance, the "ParSet" structure of Squerall (https://link.springer.com/chapter/10.1007/978-3-030-30796-7_15) looks very related to your in-memory setting. Adding this to the somewhat "not obvious" advantages of the framework, I believe that in its current state the contribution is too limited for a venue such as the Semantic Web journal.

The paper is generally well written; however, there are a number of presentation issues concerning the evaluation part. Specifically, cross-comparisons between the considered materialization settings are not immediately clear, as one needs to jump between different pages, tables, and figures. Considering that comparing the different strategies against each other is *the* actual contribution of the paper, this is clearly something to be improved. For instance, rather than having four gigantic figures, essentially representing the same information for each setting and basically wasting space, I suggest shrinking and arranging them in a grid, all within a single figure. Similarly, one might consider strategies for merging Tables 2-5 into a single table, given that these also provide the very same kind of information.

Another minor issue with the evaluation part is that, for the figures, the allocated heap size is never specified, despite being relevant information that should be included.

I appreciated the fact that the authors provided a repository with material for replication; replicability is a strong point of this paper.

Finally, I list some minor issues:

- In the problem formalization, point (ii), the actual relationship between the data source, the query, and G is not specified, nor is an intuition given. What should a facade specify?
- After point (iii) in the problem formalization, ds and r are not sets, therefore they cannot be in a strict subset relationship.
- The formal definitions provided in 2.2 are never used throughout the rest of the paper. Either remove them, or better show how the concrete examples instantiate these definitions.
- On page 6, first column, it is said that "A is the algorithm that matches a resource R with a facade function F to return an RDF dataset that can resolve the query Q". Is there any difference between your algorithm and the source-selection algorithms proper of SPARQL federation engines, such as, for instance, HiBISCus, PolyWeb, SAFE, etc.? There is a clear connection to me, yet this line of work is never cited.
- Page 7: "withing" -> within (typo)
- "we consider how resources can be interpreted as collections of data sources by applying a segmentation method" -> what method? Please, clarify.

Review #2
Anonymous submitted on 28/Aug/2023
Suggestion:
Major Revision
Review Comment:

This paper presents a set of different approaches for implementing the intermediate model Façade-X, which allows the transformation of heterogeneous data sources into RDF and answers queries by extending SPARQL. The implementation is made on top of a well-known engine (SPARQL-Anything) and tested over the GTFS-Madrid-Benchmark, a benchmark for testing performance and scalability widely used by the KG construction community. The authors present two main strategies for implementing Façade-X: a complete materialization view and a sliced view (which incrementally answers the input query). Additionally, they implement an optimization approach for materializing only the RDF data required by the query (what they call triple-filtering). They also include in the evaluation the management of the intermediate results using memory or writing them to disk. The paper is mainly well organized and easy to follow.

The paper is submitted as a “full paper” so I’ll use the journal guidelines to review it w.r.t. three aspects: originality, significance of results, and quality of writing.

originality:
The paper aims to compare different approaches to implementing Façade-X, treating this intermediate representation as a standard or recommendation widely used and accepted by the whole community. In the end, to the best of my knowledge, Façade-X is only implemented by SPARQL-Anything, and although it is used by several projects, it is not a standard such as R2RML or Direct Mapping (implemented by many vendors and stakeholders), so the potential interest in this kind of analysis by the SW community is quite limited, IMHO.

Secondly, the paper needs to be better positioned with respect to the state of the art. The authors only vaguely mention MapSDI and Morph-CSV, which are core contributions to take into consideration as they implement very similar approaches. Although they use RML as the mapping language, this is a technological decision, and the formalization and approach could be seen as almost the same. MapSDI takes the RML rules and projects into temporary files only the fields from the data source that are required to be mapped, providing a horizontal solution that can work with any RML mapping; it is already implemented by default in the SDM-RDFizer. This approach could be seen as a mapping-filtering very similar to the triple-filtering implemented in this paper. Morph-CSV goes beyond mappings and takes into account SPARQL queries together with RML mappings, so as to apply the data cleaning tasks it implements only over the triple patterns detailed in the SPARQL query. Although the approaches differ technologically, from a research point of view the problem and solutions are quite similar. Indeed, Morph-CSV has already tested its triple-filtering approach over the GTFS-Madrid-Bench, confirming that it enhances the performance of OBDA engines. The paper needs to clearly describe the differences between the proposed approach and the others already published and tested, not from a technological point of view, but from the research one.

The complete and sliced views implemented are not very original from the research point of view, as such approaches are already implemented in KGC systems. For example, Ontop implements a sliced view when it performs materialization, and in the configuration of the SDM-RDFizer one can specify the chunk size of the input sources to be used during the materialization. Focusing on the complete view, is there any reason why this approach is needed? I.e., if the input SPARQL query already describes the fields from each data source required to answer the query, why does the engine decide to first map the complete dataset into RDF and then run the query? IMO, if the complete view is removed after each query and not reused for other queries as a cache, the triple-filtering approach should always be enabled by default.

Finally, the engine runs a data integration process, and I would suggest the authors take a look at the literature to formalize their approach w.r.t. standard concepts and definitions already accepted (e.g., OBDA), so that the research contributions are comparable to previous work. A few interesting papers: [1, 2].

If there is an extension of the GTFS-Madrid-Bench, as the authors claim as a contribution, I would also suggest that the authors contribute to the main repository of the benchmark, including the created resources, so that they can be used by other users.

Significance of the results:
The authors tested the proposed approaches over the GTFS-Madrid-Benchmark and detailed the methodology, query construction, etc., which ensures reproducibility. The version of the engine used should be mentioned; the link to the releases is not enough. I would also suggest linking the GitHub repository with Zenodo, so that each release has a dedicated DOI, which is very useful for citation and for reproducing experiments. How was RMLMapper able to handle the generation of GTFS-1000? From previous experiments (see citation 7 of your paper), it is not able to handle such sizes in a reasonable time. Why not select an optimized engine such as Morph-KGC or SDM-RDFizer?

The obtained results are not very surprising: triple-filtering works better than or similarly to the complete view (depending on the query). The sliced view is a bit difficult to evaluate, since there is no information about the actual size of the slice (is it only one row/array? maybe a more efficient algorithm could be implemented?). In any case, it is a parameter that should be considered in the evaluation, as it will clearly impact the performance. The data format does not impact the performance, which is good for the implementation. The authors use the average execution time to have an overview of the behavior of each approach, but why not the geometric mean? This metric has already been used in other performance experimental evaluations and provides a better measure of the performance of each approach. I would like to see a table with those numbers. In Tables 2-5, what does "not executed" (black) mean? If it is not a memory limit or a timeout error, what happened? It is not clear to me after reading the explanations. The figures with average execution times for the queries could be smaller and occupy less space.
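For reference, the geometric mean of per-query execution times t_1, ..., t_n mentioned above is

```latex
\mathrm{GM}(t_1,\dots,t_n) = \Big(\prod_{i=1}^{n} t_i\Big)^{1/n} = \exp\!\Big(\tfrac{1}{n}\sum_{i=1}^{n} \ln t_i\Big)
```

which dampens the influence of a few very slow queries on the overall score compared with the arithmetic mean.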

The authors did not include any other engine from the state of the art in the evaluation, claiming that the aim of the paper is to study different implementation approaches for Façade-X. Although I find this claim fair, I also think that it directly affects the potential impact of the results. IMHO, to be published in this journal, other similar solutions need to be incorporated into the evaluation (Ontop, SPARQL-Generate, and maybe an RML engine) to provide the readers with a complete overview of the proposed approaches. For example, a similar paper already under evaluation in this journal follows a similar structure (presenting different implementations and operators over the same engine) but provides a comparison with the state of the art [3], although that paper is being evaluated as a system report and not as a research paper. I do not think this paper can be published if these results are not incorporated.

quality of writing:
In general, the paper is very well written and easy to follow. I was expecting more detailed and insightful discussion of the results (not just an analysis of the numbers obtained). Extending the bullet points at the end of Section 4 would help. Some acronyms are used but not introduced (e.g., OBDA), and the naming should be consistent (the GTFS-Madrid-Benchmark is referred to in many different ways). Captions of the tables should be placed above the tables.

For all these reasons, I think the paper really needs to go through a major revision, so that we can evaluate the contributions from a research perspective and understand the main differences w.r.t. previous and similar solutions. The experimental evaluation also needs to be extended by incorporating other engines that perform the same tasks.

[1] Poggi, A., Lembo, D., Calvanese, D., De Giacomo, G., Lenzerini, M., & Rosati, R. (2008). Linking data to ontologies. In Journal on data semantics X (pp. 133-173). Springer Berlin Heidelberg.
[2] Xiao, G., Calvanese, D., Kontchakov, R., Lembo, D., Poggi, A., Rosati, R., & Zakharyaschev, M. (2018). Ontology-based data access: A survey. International Joint Conferences on Artificial Intelligence.
[3] https://www.semantic-web-journal.net/content/empowering-sdm-rdfizer-tool...

Review #3
Anonymous submitted on 19/Nov/2023
Suggestion:
Major Revision
Review Comment:

This paper discusses how a SPARQL execution engine can be implemented on top of Façade-X to answer Façade-X queries. Façade-X specifies how to map data in different formats (e.g., CSV, JSON, HTML, etc.) as if they were in RDF. The paper studies two strategies: a complete materialization strategy, where the data is fully transformed to RDF to answer a query, and a sliced materialized view strategy, where the data is segmented and RDF views are generated for each segment. Both strategies are optimized by filtering out the triples which do not match the triple patterns in the query. The two strategies and their optimizations are compared to an on-disk alternative where the data is temporarily stored.

The paper still needs significant improvement before it gets published. The related work needs to be reworked to better cover the relevant state of the art. The evaluation needs to be extended to position the system and its results with respect to the state of the art. The strategies section needs to be curated to better describe the different algorithms. The paper also needs thorough proofreading due to an excessive number of typos and grammar errors. Last, it would be good if the authors created Zenodo entries for the resources they used for the system they developed and the evaluation they performed.

Regarding the evaluation, in particular, I would suggest running further evaluations to cover the following aspects. The rationale behind each of these points is explained in my more detailed comments under the relevant sections:
- Different Façade-X representations, to assess the impact of the Façade-X representation on the query implementation.
- More nested JSON data sources and XML data sources.
- More slice sizes, to assess the impact of the slice size on the query answers.
- More benchmarks to have a better view of the results.
- Compare with state-of-the-art systems, e.g., Ontop, SPARQL-Generate and SANSA.

This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing.
(1) The originality and significance of the evaluation results are questionable, as state-of-the-art solutions have already proven these results for other systems. The strategies are applied for the first time in the case of Façade-X, but it is not clear, based on the current version of the paper, what is novel in the way these strategies are implemented.

Long-term stable URL for resources
(A) whether the data file is well organized and in particular contains a README file which makes it easy for you to assess the data:

The URL for the resources points to the adjusted benchmark, but there is no URL that points to the SPARQL Anything system which was used to implement the different strategies. The repository with the benchmark contains a README file with some basic instructions. The data is available via a link to the repository of the original benchmark.

(B) whether the provided resources appear to be complete for replication of experiments, and if not, why,
At least the link to the release of SPARQL-Anything that implements the different strategies is still required to be able to reproduce the results.

Introduction
-----------------
The introduction is very focused on Façade-X, and it reflects neither on the bigger picture, i.e., what the actual problem is, nor on how other approaches address the problem, e.g., what strategies were followed by other, non-Façade-X approaches to address similar issues.

The different strategies are introduced in the introduction, as well as the data formats and the benchmarks that are considered. These comments will come back in the following sections as well, but I will already outline my concerns:
- The paper only looks at the Façade-X case, and the performed evaluation fails to position the Façade-X implementation strategies with respect to the state of the art. These strategies are interesting, but what if the best-performing strategy is still less performant than other approaches which perform the same task but without Façade-X?
- It is hard to assess how innovative these strategies are, as the paper does not discuss the state of the art. Were these strategies considered by other approaches? Were other strategies considered, and if yes, which ones? Why weren't existing approaches considered, and why are these strategies considered the most promising for Façade-X?
- JSON files are considered, but when it comes to hierarchical data, XML formats bring more challenges than JSON data sources. In particular, in the case of the slicing strategy, creating slices over XML data sources may be significantly more challenging than with JSON data sources, as XML data sources may have both multiple nested elements and attributes.
- Why was only GTFS considered and not other benchmarks, such as LUBM (https://github.com/oeg-upm/lubm4obda) and BSBM (http://wbsg.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/), the COSMIC testbeds (https://github.com/SDM-TIB/SDM-RDFizer-Experiments/tree/master/cikm2020/...), NPD benchmark (https://github.com/ontop/npd-benchmark) for which we have the original data as well?

Section 2: Façade-based data access
------------------------------------------------

It is mentioned on page 3, lines 20-21, that g_ds,q is the minimal, optimal one. Is there a proof of this statement? How do you define the minimal/optimal? Are the minimal set of triples and the optimal set of triples the same? Is there a proof for this?

On the same page, lines 31-32, it is mentioned that the answer to the query should not be the minimal/optimal set, but any superset. Would that mean that more triples could be returned than the triples that answer the query?

On page 4, lines 20-21, it is mentioned that a Façade-X engineer can use either IRIs or blank nodes for the containers. I'm wondering what the impact of this choice is on the evaluation of the queries, and thus whether it was considered in the evaluation.

On page 4, lines 23-24, it is mentioned that a Façade-X engineer can design connectors to an open-ended set of resource types. What resource types are meant here? And how does this affect the potential implementation?

On page 4, lines 27-28, it is mentioned that, instead of using container membership properties, the first row of the CSV file can be used to create named properties for the inner lists. I'm wondering what the impact of this choice is on the implementation of the SPARQL engine. Did the authors take this into consideration in their evaluation? If not, which version did they use and why? I'd suggest that both options be evaluated.
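To illustrate how the two options change the shape of a query, consider the following sketch; the xyz: namespace, the column position, and the column name are assumptions for illustration only.

```sparql
# Illustrative sketch of the two CSV representations discussed above.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xyz: <http://sparql.xyz/facade-x/data/>

SELECT ?stopName WHERE {
  ?root ?slot ?row .            # one container per CSV row
  # (a) container membership properties: cells addressed by position
  ?row rdf:_3 ?stopName .       # assumes the stop name is in the third column
  # (b) header-derived named properties: cells addressed by column name
  # ?row xyz:stop_name ?stopName .
}
```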

Section 3: Strategies for executing Façade-X queries
--------------------------------------------------------------------

Page 5, lines 29-30 and 34-35: It is mentioned that, in Figure 2, the components in gray are given, whereas the components in green are parts of the proposed system. But then, in lines 34-35, it is mentioned that the system creates the query plan. So, is the query plan part of this system or of another system? I'd expect it to be part of this one, given that the paper describes how SPARQL queries are answered over a Façade-X.

- I'd suggest the authors clearly indicate which components in the figure are part of this system and which are not, and, if they are not, discuss which systems the complete solution depends on.
- As the query plan is an integral part of the query answering, I'd suggest the authors include the algorithm that produces the query plan.
- I'd also suggest the authors provide the algorithm for the triple-filtering, which is not discussed at all in the text, as well as the algorithm for the streaming of the results and their assembly, in particular for SPARQL queries that contain, e.g., aggregations.

Page 5, lines 48-49: It is mentioned that the user can indicate how the CSV and JSON data sources can be segmented. However, it is not indicated how the user expresses this in the case of CSV data sources, as opposed to JSON data sources, where the user can indicate it with JSONPath expressions. It is also not specified at which point the users indicate the segmentation method (e.g., when the query is performed?), nor whether these users are the data consumers, i.e., the ones who query the Façade-X, or the data owners, i.e., the ones who own the original data. Then again, if it is the data consumers, that means that they know the structure of the original data. And what is the strategy of the system? Does the user choice overrule the system choice? What is the fall-back strategy if the user does not indicate it? How does the system decide on the segmentation strategy? All these questions should be answered in the text of the paper, and the corresponding algorithms should be provided.
--> This comment is partially answered in the evaluation section but the comment for the clarification here still holds.

Figure 4 barely extends Figure 3; thus, Figure 4 is enough and Figure 3 is not needed in the paper. The difference between the two can be indicated in the caption.

Section 3.3. This section makes some strong statements, some of which are debatable and others not fully correct.
- I'd suggest the authors refer to papers which prove some of the statements, e.g., that systems fail with large data sources if they load the data sources in memory.
- The authors should discuss the difference between their triple-filtering approach as opposed to solutions like Ontop which opt for query rewriting.
- Lines 33-34: The statement that filtering should be beneficial for reducing the resource requirements should either be supported by references or be turned into a research question. I question this statement, as the process of filtering may add overhead. Are the authors aware of what its impact is, e.g., on performance?
- Lines 42-43: The authors mention that the slicing strategy might be less efficient for answering queries as it adds overhead. That is true, but if this is a discussion section, the impact on the performance should also be discussed. As before, I'd suggest the authors turn this into a research question that needs to be answered. What is the impact of the slice size on the overhead?
- Lines 48-49: The authors claim that the memory footprint should favor the sliced approach. Again, I would suggest the authors add a reference for this claim or turn it into a research question to be answered by this paper. The memory footprint might still depend on the size of the slices: a large slice might again challenge the system's memory, while small slices might also impact the memory, given that intermediate results need to be maintained in memory.
- State-of-the-art solutions, such as Morph-KGC (https://doi.org/10.3233/SW-223135), RMLStreamer (https://doi.org/10.1007/978-3-031-19433-7_40) and SANSA (https://ceur-ws.org/Vol-3471/paper8.pdf), consider parallelization rather than slicing to improve performance. Have the authors thought of including a parallelization strategy? I would suggest that the authors include such a parallelization strategy in their evaluation and compare their system with the state-of-the-art solutions.

Section 4: Evaluation
----------------------------

What segmentation sizes were used for the evaluation? I would suggest the authors run the experiments with more segmentation sizes and present the results, so that we can assess the impact of the segmentation on the query results.

The results for size 1 are trivial. The columns for size 1 could be removed from all figures and replaced by one sentence saying that all were OK. This way, the tables could be grouped in pairs, which would make the comparison of the results easier.

Figure 6 shows a peak for sliced+triple-filtering which should be discussed in the paper. The same holds for the results of q12, q13, and q14. This comment applies to all outliers shown in the figures of the results.

Page 8, lines 28-29: It is mentioned that a benchmark was designed but in fact a benchmark was reused. That should be rephrased.

Page 8, lines 30-31: It is mentioned that queries of varying complexity were performed. What is meant by complexity should be clarified.

Page 8, line 49: It is mentioned that Python scripts were used to analyse the results of the benchmark. Why didn't you reuse the scripts of the benchmark?

Page 9, lines 29-30: It is mentioned that the strategy should be indicated in the SPARQL query. While this should be clarified earlier, where the strategy is presented, I find it odd that the user who wants to query some data should specify how the query will be resolved.

Page 9, line 28: It is mentioned that the results of the queries as produced by the system were compared with the results of the GTFS benchmark. However, I would suggest the authors clarify how this comparison took place. Was it the number of triples or the isomorphism of the graphs? Or another kind of comparison?

Page 10, lines 29-30: The GTFS sizes which are considered are low compared to other benchmarks. It is also discussed in the state of the art that other benchmarks provide different challenges, and by executing more of them, we can have a more holistic view of the situation.

Page 10, line 39: The section header is "Discussion" and the section starts by saying that the results were presented in the previous section, but no results were presented in the previous section.

Page 12, line 49: the queries which perform well with triple-filtering are the queries which have the simpler patterns. This needs to be further discussed and investigated.

Page 14, line 49: The paper concludes, based on the results, that the execution time is independent of the format. This cannot be concluded from the provided results. The authors did not try different levels of nesting of objects and arrays for JSON to assess their impact, nor did they try XML, which is typically a more challenging hierarchical structure than JSON, nor did they try other data formats, e.g., HTML.

Page 15, lines 11-12: I do not think there is enough evidence, based on the results of this evaluation, to claim that the results can be generalized to other formats, for the same reasons as in my previous comment.

Overall, the results are trivial. The paper does not present results which were not already proven before or were not expected. The paper eventually concludes that the complete materialization strategy is the most performant solution with respect to time, but the authors do not compare their system with state-of-the-art solutions. State-of-the-art solutions use different strategies, e.g., parallelization, which significantly improve the performance of the systems as opposed to complete materialization approaches. Moreover, state-of-the-art solutions rely on query rewriting, which has also been proven to be significantly more efficient on certain occasions, but the authors also did not compare with such systems. The fact that a certain strategy may be the fastest for a system does not make it the best strategy unless it outperforms the state of the art as well.

Section 5: Related Work
--------------------------------
A large part of the related work is not covered by the paper. Surveys on the domain covering both materialization and virtualization systems are not mentioned at all: doi.org/10.1162/dint_a_00011 and doi.org/10.1016/j.websem.2022.100753

Relevant benchmarks are not discussed in the related work section, so it remains vague why the authors have chosen this benchmark to assess their system.

The related work does not touch on papers published on querying SPARQL endpoints with different schemas, where the queries need to be rewritten, nor does the paper compare with such systems.

Typos and grammar errors
------------------------------------
Some typos are mentioned here but the paper needs to be thoroughly checked:

Page 4, line 20: two times of
Page 8, line 16: we don’t --> we do not
Page 8, lines 48-49: which it is reasonable to keep --> which is reasonable to be kept
Page 11, line 48: 69.8% of the use cases not supported --> are not supported
Page 13, line 44: %202 and %50 --> 202% and 50%