Review Comment:
1 Overview comments
This is an interesting proof of concept paper that tackles an existing problem for industry. It is a practical and pragmatic use of formal industrial ontology.
The framework presented captures the temporal aspects of sensor data and uses stream reasoning coupled with classical reasoning as a mechanism to achieve this. The unification of different semantic technologies to solve a real-world problem is impressive and we expect this to generate interest in the industrial ontology community.
We have a number of suggestions to improve the paper. We hope the authors find them useful.
2 Suggestions on the context and framing
The following suggestions in this section are based on one reviewer's 30+ years of experience in condition monitoring and maintenance. They are intended to assist the authors in placing this work in the context of where industry is today and to help industry readers appreciate where this work will add value.
Process and condition monitoring of discrete and continuously operated machines has been used for decades. Most modern machines have a large number of sensors installed by the Original Equipment Manufacturer (OEM), as well as additional sensors added by the manufacturing train operator. These sensors are wired into a DCS/PLC, and the data are made available to operators and engineers through SCADA, OSI PI and other interfaces. Sophisticated time-based models for early fault detection are available, and quality control is sophisticated. The work proposed here is not moving into a vacuum where nothing exists. Industry 4.0 ideas, as exciting as they are for academics and industry consortia, are proving very slow in adoption, for a number of reasons documented in [1]. With this in mind, we miss any comparison with current methods of condition and performance monitoring that identify fault states and provide suggestions for action, such as you are proposing.
The key goals you set for the semantic model developed in this paper are (page 3):
• The ability to integrate data from different sensors, including information about sensor values that indicate abnormal behaviour.
• Data is annotated with time of occurrence and validity.
• The streaming data must be able to be processed in a timely manner.
• Relationships between situations must be understood in order to understand the effect of proposed actions.
We suggest that some of the desired capability listed above is already present in modern manufacturing plants. It is already present, to varying degrees, in the plants of a number of international operators one of the reviewers works with. However, it has taken decades to achieve this, and every machine was set up one by one. Rules and limits proliferate and are difficult to keep track of. Is it possible that the ontology you are proposing would enable new machines to be set up more quickly, through replication and standardisation? Would this also lead to improved maintenance of these systems? Currently the set-up on each machine is unique, requiring considerable time and experience to a) set up, b) make changes, and c) troubleshoot. Would your ontology allow similar machines, from different manufacturers and with different sensor naming systems, to share one set of common decision logic? If so, we suggest there would be considerable interest from existing plant operators in what you are proposing.
That said, there are a number of things you describe that are an advance on current practice. For example, making updates to the relationship(s) between processes, entities and locations in the factory dynamic is very interesting.
3 Abstract
The abstract is easy to follow and motivates reading the paper.
Having said this, we ask you to consider the suggestions about context made earlier. While it is fashionable to talk about Industry 4.0 as something completely new, manufacturing machines and process plants have been instrumented with sensors for performance and condition monitoring for decades. Many of these are hard-wired into DCS systems, and the data are available through PLCs and SCADA, as mentioned earlier. It would make your paper relevant to a much wider audience if you could frame it as being useful for the manufacturing industry right now, with its existing sensor and communications set-ups, regardless of whether Industry 4.0 practices and protocols have been adopted.
4 Introduction
Can we suggest the authors incorporate some of the suggestions made about context and framing in this section? In addition we have the following comments.
• In the introduction, the authors outline the core goals for the framework (data integration, time representation, etc). However, the authors should refer to these (perhaps by numbering them) throughout the paper. This will show the reader where the real gaps in the literature are (according to these goals) and how these goals are addressed in the proposed framework.
• For example, "efficiency" is one of the goals of the framework. Upon reading this, our impression was that reasoning performance / efficiency would be a core feature addressed in the ontology (and your use of stream reasoning appears to suggest this). If this is the intention, we would like to see further evaluation of the reasoning performance of the ontology. For instance, does reasoning complete in the order of milliseconds, seconds, minutes or days (as is the case for some classical reasoning problems) for a large volume of seed data? Upon reading the rest of the paper, it appears the authors were in fact referring to the process efficiency of equipment diagnosis and recovery. Referring to your goals throughout the paper should help to eliminate this confusion.
5 Related work
• A quote from [1] might be useful: "Condition-monitoring data alone is often not sufficient for PHM; metadata about the asset, its operating environment, and the external covariates that influence its deterioration would also be required." Your proposal to include these operating-context features in your ontology is a key contribution of your paper. Can we suggest the authors make clearer in this section that, to do prediction well in industry, the model needs to be sensitive to changes in an asset's environmental variables (as these impact the response in the prediction time window). Few assets operate in an unchanging operating, maintenance and environmental context, so models based only on data history do not generalise well to unseen contexts.
• There are also a number of more recent survey papers than the 2010 one you mention [23] that might be worth including instead.
• Can we suggest a more comprehensive literature review on the state of the art in ontology processing of streaming data. While some work is ongoing by Siemens and Bosch, for instance, we are sure they and others have moved forward, and it would be good to reflect where they are compared to what you propose [2, 3, 4].
• Your approach to identifying 'situations' sounds similar to the work being done by ontologists in the autonomous driving world. As an example, Bosch have been developing requirements such as "Globally, if person [is detected] then in response brake [eventually initiated] within 5 time steps", which are then translated into temporal logic [5]. This is similar to the reasoning you are proposing: "if oil temp exceeds 40 deg for more than 20 seconds then ...". Their work also mentions the need to take the external world into consideration and demonstrates how they do so. We appreciate that the paper we refer to is in the Industry track of ISWC and so only two pages. Nevertheless, it indicates that relevant work is going on in this area, and we suggest other readers would like you to place your work in this context.
• While a clear gap in the literature from an industrial setting appears to have been identified, this section is also missing a review of semantic / ontological works on IoT / sensor technologies outside of the industrial domain. For example, there is much work on IoT ontologies for ubiquitous and pervasive computing (used in domains such as aged care).
• Examples of works in this area include:
• ONDAR: an Ontology for Home Automation (Achraf Lyazidi and Salma Mouline)
• SOUPA: Standard Ontology for Ubiquitous and Pervasive Applications (Harry Chen, Filip Perich, Tim Finin, Anupam Joshi)
• A review can be found in: A study of existing Ontologies in the IoT-domain (Garvita Bajaj, Rachit Agarwal, Pushpendra Singh, Nikolaos Georgantas, Valerie Issarny).
• It would also be good to see more background on stream reasoning and where it has been used in the past. This will give readers a clearer view of the technologies used in the framework.
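The kind of temporal rule mentioned above ("if oil temp exceeds 40 deg for more than 20 seconds then ...") can be made concrete without committing to any particular reasoner. A minimal sketch in plain Python follows; all names, thresholds and the one-sample-per-second stream are illustrative, not taken from the paper:

```python
from collections import deque

def make_threshold_rule(threshold, min_duration):
    """Fire when a sensor value stays above `threshold` for at least
    `min_duration` seconds of consecutive readings (illustrative only)."""
    window = deque()  # (timestamp, value) pairs since the condition began

    def check(timestamp, value):
        if value <= threshold:
            window.clear()  # condition broken: reset the window
            return False
        window.append((timestamp, value))
        # Fire once the condition has held continuously for min_duration
        return timestamp - window[0][0] >= min_duration

    return check

# Illustrative stream: oil temperature sampled once per second,
# crossing 40 deg at t = 30.
rule = make_threshold_rule(threshold=40.0, min_duration=20)
alerts = [t for t in range(60) if rule(t, 35.0 if t < 30 else 45.0)]
# The rule first fires 20 s after the temperature crosses the threshold.
```

A stream reasoner such as C-SPARQL expresses the same pattern declaratively over a sliding window rather than in imperative code, but the detection semantics are comparable.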
We suggest removing the passage "Other approaches use data mining and machine learning methods to extract diagnosis knowledge or mine rules from databases in a smart system. These approaches include the works of Martinez et. al. that uses an artificial neural network based expert system for detecting the status of several components in agroindustrial machines using a single vibration signal [28]; the works of Liu et. al. that use support vector machines and rule-based decision trees for fault diagnosis of water quality monitoring devices [29] and the works of Antomarioni et. al. which use association rule mining in maintenance [30] to minimize the probability of breakages in an oil refinery [31]". The rest of this section stands on its own without it. The reasons are below.
We are somewhat baffled as to why you chose these papers from the 20,000+ research papers on prognostics and predictive maintenance published each year. Why these three examples? Have they been implemented in industry? Were they ever validated in industry? How does their selection (over thousands of others) help your case?
There is a serious issue with labelling (annotating) data for predictive models for use in industry. Most of the research is done in the lab on benchtop rigs or using one of very few public datasets. This issue of labelled data for predictive models is now being more widely talked about and recognised as a key constraint for industry. It is one of the reasons rule-based methods, such as the one you are proposing, are likely to continue to be used [6, 7]. We suggest you include a section on this instead.
Vibration [28], as an example here, is problematic for examples such as the one you are proposing later on, for a number of reasons. Sample rates for the raw signal are 2000 Hz, whereas the data you are using in your example come from temperature, current, speed and power sensors, which all have sampling rates on the order of seconds or minutes apart. Engineers can use a vibration RMS value (aggregated to give similar second intervals), but that is not much used for predictive diagnostics, where we are interested in changes in spectral band energies and peak amplitudes at certain frequencies; hence vibration RMS is mainly used in protection. The vibration features that are useful have to be extracted using FFT and other spectral techniques, and while this is one of the promises of edge computing, deployment of these devices is still in its early stages.
If you are interested in the work of Grabot [30] and other similar groups on semantic extraction of the information necessary for predictive-model labels, such as failure modes and end-of-life events, you might consider searching for the works of Mike Brundage et al. at NIST on Technical Language Processing of Maintenance Work Orders, and also the work of Rajthapak at General Motors, who has been combining semantics and ontologies for warranty data for some years now.
6 A novel knowledge-based framework for condition monitoring
The system and technologies used are well described, and we commend the authors for their re-use of existing ontologies. Some suggestions for improvements follow.
• There is a problem in the definition of Resource. Resource is defined as a Manufacturing Facility or Staff or Product. This is not a good definition because, while it may work for this use case, it cannot be integrated with other ontologies.
Integration of data from different data sources is one of the core motivations for an ontology. We encourage the authors to think about how different ontologies use the term "resource". To retain the reasoning capability enabled by this definition, we suggest a subclass of Resource (i.e. Condition Monitoring Resource) that carries this axiom.
• The same goes for the DL axiom for Process. This is problematic because it says that a process is a subclass of Logistic Operation or Human Operation or Manufacturing Process. However, Figure 2 says that these process types are sub-classes of Process. We agree with the figure, but the DL definition should be rewritten.
• The authors should be careful at the bottom of page 11. The authors say that "the concept of a situation is formally defined by the following DL axiom" and present a subsumption axiom. A subsumption axiom is not a "definition", logically speaking.
• On page 13, the authors say "the modules involved in situation detection are Translation and Temporal Relations". The "Translation" module has not been introduced until now. Do the authors mean "Location"?
• How does the reasoner perform when the asset is deliberately switched off e.g. for maintenance, or if one of the sensors goes offline? Do you continue to get the “Not detected” status on each situation?
• Is getting the “Not detected” status every x minutes for every situation going to take up a lot of memory in the operating system? Especially as you scale over 100s of situations and machines.
• We are interested in the mechanics of how the ontology evolves and would appreciate some more details. Our understanding is that new rules will need to be added manually if a situation is encountered whereby a cause can’t be found. Is this correct?
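The restructuring suggested above for Resource could, for example, be written as follows (the class name CondMonitoringResource is our illustrative suggestion, not taken from the paper):

```latex
\begin{align*}
\mathit{CondMonitoringResource} &\sqsubseteq \mathit{Resource}\\
\mathit{CondMonitoringResource} &\equiv \mathit{ManufacturingFacility}
    \sqcup \mathit{Staff} \sqcup \mathit{Product}
\end{align*}
```

This keeps the top-level Resource class unconstrained, so it can be aligned with other ontologies, while the subclass retains the reasoning behaviour of the original axiom.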
7 Proof of concept
• It would be good for readers to understand more about how the Hierarchy of Situations in Figure 9 is used. It appears that this is not part of the ontology; rather, it is a visual model that is inspected by an expert to make a decision. Is this correct? This should be made very clear to the reader. If so, are there plans in future work to create an ontological representation of situations based on lattice theory?
• One of the main challenges with machine-to-machine based work is that industrial rule-based diagnostic systems are (context, asset and process)-dependent in the sense that they rely on specific characteristics of individual pieces of equipment in that part of the circuit. This dependence poses significant challenges for rule authoring, reuse and maintenance by engineers [8]. You seem to have this problem in your example as well. As you show in Tables 4 and 5, the constraints must first be developed by engineers who know what sensors are on the machine and what the sensor values should be. If you have hundreds of non-identical machines, or even identical machines with different ages / behaviours, then we do not see how what you are proposing is any more efficient than current approaches. For example, if we have to develop a model like the one in Figure 9 for each situation and for every machine, how is this any better than the functional models we currently have built into our SCADA systems? What is the value of putting all the data in Table 4 into an ontology when it is already captured in the SCADA? Why would anyone replace what is already working? Please can you address this.
• We are interested in how long it took for the authors to set up the rules for their case study.
• Would the rules, once developed for this machine, be transferable to other similar machines? If so, might this be a benefit of this approach?
8 Evaluation
The authors have presented a framework that solves a real-world problem. However, such a thoroughly considered framework deserves a more thorough evaluation.
• The authors have presented a verification activity: running the reasoner and evaluating using the OOPS guidelines. It would be good to see more validation activities performed as part of your evaluation.
• For example, it would be good to discuss how the ontology held up to the current use case, and what is missing? How does this ontology compare with other similar ontologies? What is different to existing ontologies at a concept level and why?
• Something that could be very interesting to readers is a performance evaluation, as mentioned earlier. Seeing how quickly reasoning is performed by the stream reasoner would be very interesting. You are dealing with high-volume sensor data. Can current reasoners hold up to these requirements? We appreciate that this is a large piece of work; if it is a subject of future work, the authors should say so.
• We suspect that some further ontological decisions will be made on an examination of the ontology’s performance. For example, is the partOf relationship in the ontology transitive? If not, was this a performance-related decision? If so, how does this affect performance?
9 GitLab Repository
The authors are to be congratulated for making their work available on a GitLab site (https://gitlab.insa-rouen.fr/fgiustozzi/STEaMINg-SR_SitDet).
• Please give your repository an Open Source license (as the authors claim that the ontology is open source) so that readers can use it freely.
• The GitLab repository requires more comprehensive run instructions (perhaps for the IDE used by the authors). We have tried in both VSCode and Eclipse (both with Maven installed) and have not been able to run without configuration errors. It appears a csparql dependency is missing from the repository, as we are getting the following error in Eclipse: "The POM for eu.larkc.csparql:csparql-core:jar:0.9.6 is missing." We could be missing something on our end, but the publication will benefit from some clear instructions to help readers run your code.
Here are the references we have used in our review. We hope you find them useful.
[1] D. Kwon, M. R. Hodkiewicz, J. Fan, T. Shibutani, and M. G. Pecht, "IoT-based prognostics and systems health management for industrial applications," IEEE Access, vol. 4, pp. 3659–3670, 2016.
[2] E. Kharlamov, T. Mailis, G. Mehdi, C. Neuenstadt, Ö. Özçep, M. Roshchin, N. Solomakhina, A. Soylu, C. Svingos, S. Brandt et al., "Semantic access to streaming and static data at Siemens," Journal of Web Semantics, vol. 44, pp. 54–74, 2017.
[3] E. Kharlamov, G. Mehdi, O. Savković, G. Xiao, E. G. Kalaycı, and M. Roshchin, "Semantically enhanced rule-based diagnostics for industrial internet of things: The SDRL language and case study for Siemens trains and turbines," Journal of Web Semantics, vol. 56, pp. 11–29, 2019.
[4] G. Mehdi, E. Kharlamov, O. Savković, G. Xiao, E. G. Kalayci, S. Brandt, I. Horrocks, M. Roshchin, and T. Runkler, "SemDia: Semantic rule-based equipment diagnostics tool," in Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017, pp. 2507–2510.
[5] A. P. Kaleeswaran, A. Nordmann, and A. ul Mehdi, "Towards integrating ontologies into verification for autonomous driving," in ISWC Satellites, 2019, pp. 319–320.
[6] A. Theissler, J. Pérez-Velázquez, M. Kettelgerdes, and G. Elger, "Predictive maintenance enabled by machine learning: Use cases and challenges in the automotive industry," Reliability Engineering & System Safety, vol. 215, p. 107864, 2021.
[7] D. Correa, A. Polpo, M. Small, S. Srikanth, K. Hollins, and M. Hodkiewicz, “Data-driven approach for labelling process plant event data,” International Journal of Prognostics and Health Management, vol. 13, no. 1, 2022.
[8] G. Mehdi, E. Kharlamov, O. Savković, G. Xiao, E. G. Kalaycı, S. Brandt, I. Horrocks, M. Roshchin, and T. Runkler, "Semantic rule-based equipment diagnostics," in International Semantic Web Conference. Springer, 2017, pp. 314–333.
[9] A. Mehdi, E. Kharlamov, D. Stepanova, F. Loesch, and I. Grangel-González, "Towards semantic integration of Bosch manufacturing data," in ISWC Satellites, 2019, pp. 303–304.
[10] M. Pech, J. Vrchota, and J. Bednář, "Predictive maintenance and intelligent sensors in smart factory," Sensors, vol. 21, no. 4, p. 1470, 2021.
Reviewers: Melinda Hodkiewicz and Caitlin Woods, University of Western Australia.