Review Comment:
The article proposes an approach to building an I4.0 benchmark dataset using a KG, called I4.0 KG, to integrate data from sensors attached to machines in a production line for the manufacture of soccer balls.
As a general comment, the article lacks a concise and clear description of the key characteristics of the dataset, which makes it harder to use for different purposes. No information about dataset maintenance, reported usage, or known shortcomings and limitations is provided. Furthermore, the article does not demonstrate what the advantage or contribution of using a KG is, beyond the ontological models that already exist in the literature for integrating and accessing heterogeneous data. I think the authors should emphasize this to make it clear to readers. Also, the authors talk about Industry 4.0 in general, but the KG they present is specific to soccer ball manufacturing. I strongly recommend that the authors dedicate a section to discussing how their approach and the proposed KG could be adapted to other manufacturing activities.
Here are some more specific comments:
The title of the article is not clear. I don't understand why KG appears at the end.
In the abstract:
"Our research helps the stakeholders to take timely decisions by exploting the information embedded in the KG." - I do not see this verified in the article.
In the introduction:
Mass production was achieved in previous industrial revolutions, not in I4.0.
I4.0 is more about the use of various technologies and also the use of artificial intelligence to make better use of resources to optimize production.
The paragraph between lines 12 and 24 on page 2 is not clear. It might be better to give the necessary definitions first and then describe the differences and how they complement each other or how they are linked. Ontology and KG are not defined; please add the definitions here with the corresponding references.
The authors enumerate the contributions of the article; however, I do not think they are all contributions. For example, the last one is the validation of RGOM, a model that is not sufficiently described in the article. I do not think the validation of this model is a contribution of this article. The point should rather be to show that the KG built from RGOM is in fact a contribution.
In the related work:
"... Internet of Things, Internet of Services, Cyber-Physical Systems, Digital twins, ..." are not technologies. Please rephrase.
The state of the art is difficult to understand. I think the authors should rework this section to make clear how the existing works relate to each other and what their limitations are, so that it is clear how the proposed model addresses those limitations and to what extent.
In section 3:
"... acquisition and generation of the dataset." This phrase is not clear to me. Maybe "data acquisition and dataset construction"?
There is no transition between the opening paragraphs of this section and subsections 3.1 ... 3.9. At the beginning, two types of attributes, static and variable, are mentioned, and then the text goes on to describe each machine without any transition. Adding one here would make the text more readable and aid understanding.
Regarding the random creation of the variable attribute values, the authors give a reference [17] that explains how these values are generated; however, I think it would be worth giving more details about this in the article so that readers can judge how accurate these values are. It would also be interesting to know whether the creation of these values takes into account the relationships that exist between certain properties (for example, the temperatures of the different components of a machine). Is it possible to represent these relationships (perhaps physical ones) in the proposed KG? I think this is a key point and would allow the data in the model to be enriched further.
In section 4:
"IT silos" is a very specific term, maybe give the definition or use another term like data storage.
"... usability of this data for, e.g., subsequent analyses and reasoning." - I do not understand this phrase.
Linked Open Data was never mentioned in the article before. Maybe briefly explain what it is about.
"This goal can be ..." - Which goal?
"The following describes the steps ..." -> Maybe the layers and the interaction of the different components instead of "the steps".
"... at a certain timestamps ..." -> at different timestamps.
"... unconnected data" - What does it mean? I think it is a very general statement; it would be better for the authors to be more specific and make clear what they mean by "unconnected data".
"... interaction of production staff with unconnected data is very difficult." - I do not understand this phrase. Does it refer to access to information, interpretation, ...?
The authors state that the RGOM model is inspired by the standards adopted by RAMI4.0. RGOM is also based on the model proposed in "Giustozzi, F.; Saunier, J.; Zanni-Merk, C. Context modeling for industry 4.0: An ontology-based proposal. Procedia Comput. Sci. 2018, 126, 675–684" and reuses other ontologies such as the Time ontology and SSN, among others. The reference [7] cites these models. The construction of the KG is not well described: it is difficult to see the link between the KG and these ontological models, and how the KG is constructed from these models and the data. This is linked to my general comment that it is not easy to see what the contribution of the KG is with respect to using these ontological models.
I think this whole section should be rewritten and restructured to make it clear how the KG is constructed and why it is useful and necessary.
In section 5:
It is difficult to see the adaptability of RGOM through this example. The use case is very simple and does not demonstrate the usefulness of the KG. I think that the queries are too simple; consider adding more complex queries that show the real usefulness of the KG and the advantages it offers in terms of integrating heterogeneous data with different temporal resolutions.
In the query of listing 3 it would be helpful to give more details about the status of the engine of a machine. I do not see the concept Status in the model; maybe it should be described how this status is obtained or what it represents (for example, whether it can be seen as an abnormal behavior). This would show that the KG offers the kind of semantic information that an operator could exploit, for instance to obtain more information associated with this abnormal behavior and determine its causes.
Furthermore, maybe add third-party uses to provide evidence of the usefulness of the KG dataset, for example a possible application (and not just methods and tools) that makes use of the KG to demonstrate its usefulness. One such example would be how the KG could help to create suitable datasets for building machine learning models for predictive maintenance.
Can this KG be adapted to a case other than the manufacture of soccer balls? I think that the authors could discuss this and give some hints on how to do it, even if they do not fully validate this adaptability.