Review Comment:
This paper describes a framework to integrate, analyze and exploit quality metrics of geospatial data using a semantic web approach. The authors propose the use of several semantic web technologies and standards to integrate the large number of non-semantic data models in the geospatial domain. The main use case in which this framework has been implemented is the Ordnance Survey Ireland (OSi) mapping agency. The topic fits very well with the special issue. I follow the journal's recommendations for this kind of paper (application report) in my review.
(1) Quality, importance, and impact of the described application
The paper tackles a very important problem in the geospatial domain: the large number of different standards and ways of representing the quality of the data. The idea of using semantic web standards to integrate all of them, together with sustainable data pipelines, is very interesting. In addition, the authors mention that OSi employees find the application very useful for their work. I really liked the mapping of the different standards (Table 3), which I think is the most relevant contribution of the paper. My main concern here is the impact the tool can have beyond the OSi agency, i.e. it is neither demonstrated nor mentioned whether this framework can be integrated into other agencies or used with other datasets. There is no clear motivation for the use of dedicated software (1Spatial 1Integrate and Luzzu), which may negatively affect the adoption of the framework in other contexts.
I also did not understand the decision to use R2RML-F for materialization instead of a more declarative and sustainable approach based on RML+FnO[1], in which the transformation functions are described declaratively and other tools can be used, such as FunMap[2], Ontop[3] or Morph-KGC[4], which have already been shown to be more efficient[4,5]. The paper lacks a related work section comparing the proposed approach with others from different domains where similar problems are tackled, and I still do not understand why there is no reference to the GeoTriples work[6,7] or to Ontop-spatial[8].
Finally, I visited the resources provided by the authors on Gogs, but none of them follow good open science practices (no license, no description, no DOI, etc.), so reusability is not ensured.
(2) Clarity and readability of the describing paper
The paper is very difficult to follow: there are many long sentences that repeat concepts and in which the ideas are not clear. I think the authors wanted to demonstrate all the effort they put into the development of the project, but the paper does not clearly show it. The general structure is very difficult to follow, with many confused ideas and concepts. For example, it is not clear what the main contribution of the paper is, because the authors restate it several times throughout the introduction with different aims, questions, or vague ideas (governance of data, data pipelines, automatic approaches). Listing 8 different contributions is itself a symptom of this lack of clarity.
The readability of the paper is not very high, as there are sentences that do not make sense and concepts that are not introduced properly. For example, p1-c50-second column: “There is a common feature of all these systems that these applications need unification of high quality geospatial data, computer methods and domain knowledge to provide high quality results”. What does "high quality results" mean? Results about what? Queries? Analysis? I could extract similar examples from many other parts of the paper (see p17-c34-second column, where the sentence runs for almost 8 lines without commas or full stops). In addition, there are many typos and inconsistencies within the paper (mentioning only a few of them): Listing 3 is referenced as materialized data when it is actually a transformation function; Listing 4 contains RDF errors (no datatypes, no quotes for the plur:value property, etc.); Table 2 is actually a listing; and it is not clear why the features of the computer on which the experiments were run are relevant. I would encourage the authors to dedicate time to the details of the paper, as the work presented is very relevant and can have a high impact on the community, but the clarity of the ideas and the way the paper is written need to change.
Overall, I think the paper is not ready for acceptance in this journal and would need a complete rewrite, beyond what is expected of a major revision.
[1] Meester, B. D., Maroy, W., Dimou, A., Verborgh, R., & Mannens, E. (2017, May). Declarative data transformations for Linked Data generation: the case of DBpedia. In European Semantic Web Conference (pp. 33-48). Springer, Cham.
[2] Jozashoori, S., Chaves-Fraga, D., Iglesias, E., Vidal, M. E., & Corcho, O. (2020, November). FunMap: Efficient execution of functional mappings for knowledge graph creation. In International Semantic Web Conference (pp. 276-293). Springer, Cham.
[3] Calvanese, D., Cogrel, B., Komla-Ebri, S., Kontchakov, R., Lanti, D., Rezk, M., ... & Xiao, G. (2017). Ontop: Answering SPARQL queries over relational databases. Semantic Web, 8(3), 471-487.
[4] Arenas-Guerrero, J., Chaves-Fraga, D., Toledo, J., Pérez, M. S., & Corcho, O. (2022). Morph-KGC: Scalable knowledge graph materialization with mapping partitions. Semantic Web.
[5] Arenas-Guerrero, J., Scrocca, M., Iglesias-Molina, A., Toledo, J., Gilo, L. P., Dona, D., ... & Chaves-Fraga, D. (2021). Knowledge graph construction with R2RML and RML: an ETL system-based overview.
[6] Mandilaras, G., & Koubarakis, M. (2021, October). Scalable Transformation of Big Geospatial Data into Linked Data. In International Semantic Web Conference (pp. 480-495). Springer, Cham.
[7] Kyzirakos, K., Savva, D., Vlachopoulos, I., Vasileiou, A., Karalis, N., Koubarakis, M., & Manegold, S. (2018). GeoTriples: Transforming geospatial data into RDF graphs using R2RML and RML mappings. Journal of Web Semantics, 52, 16-32.
[8] Bereta, K., Xiao, G., & Koubarakis, M. (2019). Ontop-spatial: Ontop of geospatial databases. Journal of Web Semantics, 58, 100514.