Review Comment:
This paper introduces a meta model to semantically enrich and integrate precision agriculture and livestock farming data. The model reuses several widely adopted standards, including DCAT, QB, and PROV-O. The main contribution is the creative linking of DCAT and QB through a SHACL shape, so that data measures are associated with datasets. The paper is generally well structured and relatively easy to follow. Below are my major comments:
(1). The meta model is proposed for the domain of precision agriculture and livestock farming. However, although there is some discussion of the specifics of agricultural metadata requirements, I do not see how these ‘specifications’ differ from those of other types of domain-specific data, or how they are implemented in the model. The discussion in Section 4 seems very general. Consequently, I do not see much novelty in the proposed meta model for precision agriculture and livestock farming.
(2). Following my first comment, I am skeptical about how the authors categorize agriculture and livestock farming data. First, I think the categorization works for virtually all types of data, e.g., environmental data, urban data, etc. So how different would the proposed meta model be for non-agricultural data? Second, won’t sensor data overlap with earth observations? Won’t maps and earth observations both include location data? If so, the proposed categories are not mutually exclusive. Next, how does this categorization help the design of the meta model? Will it also be a class in the model (I do not see it in Figure 2 though)? How would such a categorization help end users answer their competency questions? Finally, I think maps (if you mean vector-based polygons, polylines, or points) are structured data, as they are stored in relational databases in most GIS systems. Also, I am curious why maps and images cannot have a structure (see the statement in the second bullet point in Section 5.1). For example, a geographic entity can have a spatial relation with another, which should be captured by a schema (data structure).
(3). In Section 4.3, the authors summarize three functional requirements based on interviews and a survey with stakeholders. What kinds of questions were asked in the interviews and the survey? How many stakeholders were interviewed? Without these details, the summary seems arbitrary. Additionally, as in my comment (1), I do not see how the non-functional requirements (Table 1) would differ for non-farm/non-agriculture domains.
(4). In Section 6: Application of the Model, I suggest replacing the listings with figures (for the data and structures in RDF), which would be more readable and would also save a lot of space. More importantly, I do not see much significance in these demonstration examples. For example, I believe using SOSA together with DCAT might offer similar capability, if not better, as temporal and spatial information is already modeled in SOSA. A comprehensive comparison between different ways of designing the model is therefore needed here to show the significance of the work. Alternatively, the authors should show the capability of this model to address more complex competency questions, e.g., semantically integrating data from various sources. The current queries shown are trivial IMHO (i.e., they can be done using other models).
(5). It would also be worthwhile for the authors to discuss whether it would be better to semantically annotate individual data records using RDF, rather than annotating only at the meta level. One advantage, I think, is that users would not need to know both SPARQL and SQL in order to query useful data. It might be beneficial to have either a relational database or linked data in a project, but not both. All these questions are fundamental to this work and worth discussing.
(6). The model is served at a long-term maintained URL. However, the README in the provided GitHub repository is missing, so replicating the model/data might be difficult.
(7). The paper must be proofread carefully, as there are many typos. E.g., Page 2, paragraph 1: “… as well as the identification the best harvesting period” --> “identification of the best”. Page 3, right column, paragraph 1: “… as the definition of the proposed mode in this is paper” --> delete 'is'.
In summary, the topic discussed in this paper is trending, and the proposed SHACL shape for linking the two ontologies is creative. However, for a journal paper, the manuscript should be substantially improved in terms of its methodological originality and the significance of its results.