Review Comment:
The article presents a three-step evolution (the steps are called iterations in the article) of the Kadaster Knowledge Graph (KG). Each iteration is described in terms of the model developed, a technological architecture illustrated by a figure, and an evaluation. Each evaluation presents some shortcomings.
Regarding the quality, importance, and impact of the described application (convincing evidence must be provided): the quality of the artefact is hard to evaluate due to its size; nevertheless, after some exploration of the Kadaster Knowledge Graph, it appears to be a high-quality application in technological terms. The importance and impact of the Kadaster Knowledge Graph, as with any other Kadaster, are clear. However, the article is about the evolution of this Knowledge Graph and the lessons learnt, and this must be taken into account. The quality, importance and impact of the lessons learnt, as well as of the iterations defined, are not very high:
• The Kadaster KG is available online; however, the article lacks links to the ontologies and other related artefacts (such as SHACL shapes) mentioned in the paper, which are needed to understand and follow the evolution through the different iterations.
• Each iteration presents models, but it is unclear how these models and the Knowledge Graph were developed.
• The evaluation sections present some shortcomings. However, it is not clear how these shortcomings are addressed in the subsequent iterations. Also, the article mentions an evaluation for each iteration but does not explain how this evaluation was carried out or what was evaluated.
• In general, the reviewer is not able to follow and understand the evolution of the different steps since their explanation is shallow and generic.
Regarding the clarity of the article, the reviewer considers this one of its main drawbacks. The article is not written in clear language, particularly in the first sections. Several sentences are repeated without providing more information about the contributions of the article. For instance, the evaluation and the iterations are mentioned in the abstract, in the introduction, and in the context. However, no information about the iterations or the evaluation is provided beyond the fact that there are three iterations and an evaluation following a design science methodology. The successive sections add no further insight or information. This gives the reviewer the impression of reading the same text several times without getting any further. A reader needs to reach Section 3 to understand what the authors mean by the iterations and the evaluation.
Following on from the previous comment, the contribution and the challenges are not clear, and they seem to change depending on the paragraph. They revolve around the iterations and the evaluation, but sometimes the contribution seems to be the definition of those iterations, at other times their implementation, and at other times their evaluation, as if the iterations were already defined. A reader needs to reach Section 3 to understand what these iterations stand for, since there is no prior explanation of them.
In addition, the article has many inaccuracies and sentences that need to be improved, for example:
- In the abstract, the acronym KKG appears without a previous definition; the abstract should state what this acronym stands for before using it.
- "Over the past decade, three distinct iterations of linked data creation, publication, and integration have emerged": iterations do not emerge; they are identified or performed.
- "The most recent iteration involves the KKG": all iterations involve linked data. Which one is the most recent: the first, the second, or the third iteration? Or is the most recent a fourth iteration? This is not clear.
- "A design science methodology is used to perform the evaluations and the findings reveal the importance of strategic alignment,": a punctuation mark is needed after the word "evaluations".
- "The impact of the paper presented in this work is twofold": this sentence is somewhat redundant.
- "The transformation of a Web Feature Service (WFS) model to a linked data model involves the alignment of the underlying data schema with semantic web standards. Initially, the WFS model, which is typically structured in a format optimised for geospatial data exchange, is mapped. This mapping process involves translating the elements of the WFS schema—such as feature types, attributes, and relationships—into corresponding classes, properties, and relationships in ontologies based on standards like RDFS, OWL, SHACL and SKOS" --> The WFS schema, which is a model, should be developed as an ontology or mapped to an existing one, so that data expressed with WFS can be expressed in RDF according to that ontology. SHACL is for validation, so how is WFS mapped into SHACL? This paragraph needs to be explained in more detail, as it currently contains misleading and seemingly incorrect information.
- "The ETL begins with the extraction of relational data from the WFS service followed by loading it into a spatially enabled database. This step ensures that geographic features available via the WFS services are standardised into a structure which supports efficient querying and manipulation in the following steps of the ETL" How can this be ensured? Do the database and the fact that the data is stored in it ensure that geographic features are standardised into a structure compatible with the ETL? Which standard is that?
- "The relational data is then mapped to the model defined and the resultant triples are loaded into an instance of GraphDB during which the SHACL validation step is taken to ensure that the resultant linked data adheres to the linked data model" Which data model, i.e., which ontology? How were those SHACL shapes developed?
- "This architecture is illustrated in Figure 1." Figure 1 shows an RDF4J database, but the role of this component is not explained.
- "While this construction iteration resulted in the availability of a high volume of linked data" What does this mean? How much is a high volume? Only one relational database appears in Figure 1.
- "The Information Model details specific dataset information using the Shapes Constraint Language (SHACL) to ensure internal consistency and maintain recognizability for domain experts. Conversely, the Knowledge Model captures generic, shareable knowledge, facilitating integration with external linked data models through RDF(S), OWL, and SKOS vocabularies for improved reusability and interoperability. A model for each key register and required external source was manually defined." Something similar was done in the first iteration, so what is the difference now? Also, SHACL defines constraints, so it is a bit odd to call it an Information Model. How were the SHACL shapes developed? Are they available?
- In Figure 2, where are the original relational database and the WFS services? The reviewer expected an evolution of the previous iteration, but Figure 2 seems to show a new scenario.
In general, the article needs to be improved to explain the evolution of the Kadaster Knowledge Graph accurately. As currently written, the article contains many inaccuracies, and the impact of the contribution is not clear. It would also be a good addition to explain how the lessons learnt can be adopted by third parties.