Review Comment:
The paper discusses how to improve the data quality of maintenance work orders (MWOs) by comparing the output of a (previously developed) NLP pipeline with the output of an inference process carried out using OWL and SWRL-rule reasoning in the context of an application ontology about maintenance activities.
This topic is appropriate for the journal “Semantic Web Journal”, as it fits the journal’s stated goal of publishing “descriptions of concrete ontologies and applications in all areas”, as well as the topics “Ontologies for knowledge representation and reasoning about topics relevant for industrial engineering” and “Experiences with research and application initiatives” of the Special Issue on Semantic Web for Industrial Engineering: Research and Applications.
* Main arguments in favor of accepting the paper
This is a nice paper with interesting observations and results. The authors show good knowledge of the domain and a clear understanding of the practical problems they aim to solve. The original results are relevant and useful to the community. Overall, the paper is clearly written and fairly well organized and presented. There are some typos and some repetitions, which are highlighted in the attached pdf document.
The paper is fairly well-positioned among previous literature on the same topic, and clarifies which parts are innovative: the contributions of the paper are, essentially,
• an ontology of maintenance activities based on ISO 14224, ISO 15926-14, and analysis and classification of verbs used in MWO records;
• a methodology, based on the comparison between NLP and OWL/SWRL inference outputs (the latter depending on the maintenance activities ontology), that identifies data quality issues with MWO records.
The paper begins by describing the current situation in the literature and explains the use case data. Then the paper details the derivation of the maintenance activity ontology, how it can be used to check the data quality of MWO records, and finally the results obtained by applying this methodology to the use case.
The paper focuses on data quality assurance, which is an important topic in industrial engineering, and it does so by managing a use case in the maintenance domain. The result is a methodology that, although studied in the paper with a limited scope (the “Centrifugal Pump-Motor System”) and applied to a small number of records, seems to be generalizable to a wider number of engineering systems and report records.
* Main arguments that require a revision
There are some issues regarding the functional breakdown of the centrifugal pump: the centrifugal pump (http://www.semanticweb.org/asset-list-ontology#Pump) is considered a material artifact, which has, at all times, the various subsystems (Control and Monitoring system, lubrication system, etc.) as parts. The subsystems themselves are still material entities, which have other material artifacts as parts, at some time. But,
• The hierarchy is not reported in the OWL ontology “functional-breakdown-pump-ontology”, as the smaller material artifacts are not part of the proper subsystems, but only of the centrifugal pump. This results in a weaker ontology than stated in the paper, where the authors write: “The subunits, modelled as Engineered System classes under the BFO object aggregate hierarchy, have continuant parts at sometime, e.g., the Control and Monitoring System having further continuant parts such as a Pressure Switch and other types of instrumentation.”
• In BFO, material entities must have at all times some amount of matter as part. Using the authors’ ontology we know that, say, the Driver and Electrical subsystem has a Motor, Variable drive, and Power supply as parts at some time. The fact that the motor etc. are parts only at some time is presumably due to the authors’ attempt to account for the possibility that motors etc. are routinely replaced and can be missing from their “functional location” for some time. Thus, the ‘at some time’ phrase is important and cannot be disposed of. Yet, material entities in BFO must have a material constituent at all times: if the motor etc. of the Driver subsystem is missing, what is the material constituent of the subsystem?
The authors are aware of this problem (they “acknowledge that the idea of replacement is a deeply philosophical problem and raises many questions about identity”) and that the problem of an asset’s functional breakdown is a “complex topic to address in BFO”, but they do not offer a viable solution, leaving the reader wondering whether the proposed ontology would remain consistent in real scenarios.
• The authors make heavy use of the concept of function, functional part, functional location, etc.. Formally, the authors take the concept of function from BFO, where it is a subcategory of dispositions. There are some issues with that, for example, “In BFO, there is no such concept [of functional object and functional part] and this makes it difficult to distinguish where the replaced asset sits in a functional breakdown (as opposed to a physical breakdown)”. That is, it seems that one cannot carry out a functional decomposition (“functional breakdown”) of an engineering system even though this is important for the approach to work.
• Another difficulty linked to an underdeveloped function theory is the non-distinction between subtypes of functions. For example, the natural language definition of functional role (http://www.semanticweb.org/functional-breakdown-pump-ontology#Functional...) says: “Fhe[sic] role that a material artifact bears if it is a critical part of the larger system. For example, a pump cannot operate without a motor. Therefore, the motor will play a critical component role in the pump system”. This definition is apparently speaking of essential or primary function roles rather than function roles in general. Indeed, in the functional-breakdown-pump-ontology the lubrication system has no functional role, which is counterintuitive. Maybe the authors aim to distinguish primary and secondary functions? The paper should provide some information on how this could be done.
In any case, it would be difficult for the authors to do better, since the development of an ontological understanding of functionality (etc.) is still an ongoing research line. Moreover, BFO today still has relevant limitations on this aspect compared to, say, YAMATO and DOLCE.
The check vs. inspect problem (pg. 10) can be seen as a classification tension. On the one hand, there is the classification of an activity according to the postconditions it generates – the goal of a check is that the system’s state is known (call it the goal view), and this conflates check and inspection into a single activity. On the other hand, there is the classification of activities according to what motivates them - a check is motivated by, say, a fault; an inspection by, say, a temporal flag. In the latter case check and inspect are distinct activities. The existence of the two views could be presented in the introduction to clarify and motivate the classification choice made in the paper.
This is also related to the overhaul vs. repair problem, where the authors choose the goal view (or look at the “bare” action, which here amounts to the same result).
To make the different choices clear and comparable (and possibly coherent), the authors could anticipate the flowchart of Fig. 2 at the beginning of the paper (perhaps simplified as at that point the different activities are not yet identified) and use it to list the criteria that emerge from the ideal use of the ontology.
“These 230 [actually 221, see pdf] terms were then clustered according to whether they describe similar activities.” Following the previous point, here one expects to know the adopted criteria for similarity. Are they the same across all the clusters? Since the decision was made by a subject matter expert, which standards was the subject matter expert familiar with? Did s/he use any? Did s/he participate in the term elucidation writing? Is this person a co-author? If yes, please identify her/him. If not, did the authors double check the quality (and coherence) of this clustering?
On defining “Inspect” (Table 7). We see two problems here.
Problem 1: the elucidation essentially says that the inspection has the goal to observe the state s of the item. The semi-formal def. talks about capability. Do you assume that states and capabilities are the same thing? If so, explain. If not, fix the definition.
Problem 2: A preventative strategy ps is a procedure that describes a series of activities among which some can be inspections. The strategy itself may prescribe a type of inspection that should be executed (and usually also when), but it does not prescribe the specific activity p (which is a token). In other words, the last condition in the elucidation should say something like: p is an activity of type i and the strategy ps prescribes to execute an activity of that very type.
[this type/token problem applies to the def. of Service as well.]
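To make the suggested fix concrete, here is a minimal sketch of the type/token distinction (all names are hypothetical, purely for illustration; this is not the authors' formalisation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActivityType:
    name: str              # e.g. "Inspect"

@dataclass
class Strategy:
    prescribed: frozenset  # the activity *types* the strategy ps prescribes

@dataclass
class ActivityToken:
    of_type: ActivityType  # the executed activity p instantiates one type

def prescribed_by(p: ActivityToken, ps: Strategy) -> bool:
    """p satisfies the corrected condition iff ps prescribes p's *type*:
    the token p did not yet exist when the strategy was written."""
    return p.of_type in ps.prescribed

inspect_type = ActivityType("Inspect")
ps = Strategy(prescribed=frozenset({inspect_type}))
p = ActivityToken(of_type=inspect_type)
print(prescribed_by(p, ps))  # True
```

The point is simply that the strategy's range is a set of types, never a set of tokens, which is what the elucidations should quantify over.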
On defining “Diagnose”:
“Able to regain function by…”: please double-check this expression against the original source.
The elucidation for diagnose does not seem to match the definition. In an electric machine, the company might perform a diagnose activity to find out why it is not working (by def. this is not an inspection since it is caused by a failure, the machine is not working). After checking a few components, the technician may realise that there is no electricity in the incoming line. So, the diagnose result is “no degraded state” and “no failed state” for the machine. Of course, there will be another diagnose activity for the electric system, but that is, indeed, another activity since it applies to another system. The activity run on the machine is a diagnosis according to the given semi-formal def. of diagnose (and the SME Def. as well) but not according to the elucidation. This should be fixed.
According to sect. 4.2, the definitions in Table 7 aim to be mutually exclusive, but the elucidations do not enforce this from the logical viewpoint. In particular, if a device has both a functional and a control role, can the same activity be both an adjust and a calibrate activity? Does one need to use the terminology only relative to a specific granularity at which one can distinguish a component devoted to control only from another devoted to function execution only? What if the system is an embedded system (meaning it cannot be separated)?
(Some of these issues are discussed in sect. 7 as if they were problems related to application aspects. Yet, in some cases the cause can be traced back to these definitions and elucidations which, as shown above, are the source of a few “grey zones”.)
Additionally,
• The elucidations of maintenance activities often mention the term “functioning process”. Is it a technical term in BFO? It should be clarified. The same applies to the other relevant terms recurring in the elucidations.
• The authors state “As such, they [the classes used for automatic classification] should be broadly applicable to categories of equipment or, at the very least, the classes of pump other than centrifugal pump”. This is convincing for, e.g., the class “Inspectable Unit”, but less convincing for the class “Not Pump Unit System”. In fact, the presence of rules depending on whether an object belongs or not to a given class (in this case the object is not a pump unit) seems an obstacle to scaling up the number of rules.
Attached Software
-The RDF serialisation of the paper's ontology is readily accessible on GitHub.
-The files and data present on GitHub are well-structured and documented.
-The steps necessary in order to replicate the paper results are clearly stated and easily carried out.
-We reproduced the population script. The script works fine, but on GitHub there is an additional file “populated-data.owl” that is not produced by the script. Additional comments below.
-We reproduced the reasoning script. The script works fine. Additional comments in “Suggestion for revision”.
Further issues
-See file “swj3067_comments_review.pdf” for typos.
-The abstract is too long.
-The term “artifact” in the paper is misleading; it should be “technical (or engineering) artifact”.
-In the maintenance activity ontology there is the following taxonomy excerpt:
'Maintenance Type'
'Corrective Maintenance Type'
'Preventative Maintenance Type'
where 'Maintenance Type' is the field of a MWO that takes as values either “preventive” or “corrective” depending on the Work Order Type. Presumably, when the authors inserted 'Corrective Maintenance Type' and 'Preventative Maintenance Type' subclasses they meant to say “all the Maintenance Type' fields that take as value only “corrective” ” and “all the Maintenance Type' fields that take as value only “preventive” ”. But, as it is, it seems that 'Corrective Maintenance Type' and 'Preventative Maintenance Type' are full-fledged fields themselves, on par with 'Maintenance Type', ‘Material Cost’, ‘Labour Cost’, etc..
We suggest simply removing the two subclasses.
-Add reference to the fact that an asset’s functional breakdown is a “complex topic to address in BFO”.
-typos/comments/errors for population_script/script.py:
• Line 12: “funtional_breakdown_onto” –> “functional_breakdown_onto”
• Line 17: delete
• Line 75,110; “# sub_unit_indiv = select_sub_unit(row['NLP Identified Subunit'])” –> delete
• Line 175: “# todo: figure out how to make a date type” –> We see that in the ontology the dates are labeled as xsd:dateTime, so maybe it’s ok.
• The script was originally conceived as a loop in which each of the 36 MWOs was to be read from the master datasheet and copied into an OWL ontology file. At some point the authors realised that there were problems with “owlready caching” and had to move the loop to another Python script (“population_script/runner.py”). The original loop is still in population_script/script.py, though, and a filter was added to it to skip all iterations except one. This is not ideal, and the authors could remove the old loop.
• The authors state “To achieve this, we store the NLP Identified Activity from Table 4 as an annotation property on the Maintenance Work Order Description”, then proceed to show a figure (Figure 1) where “NLP identified Activity”, “NLP identified Item”, “NLP identified Subunit” are annotation properties. The authors highlight this by saying “We represent this information in annotation properties because they represent assumed knowledge resulting from an entity recognition algorithm applied to the specific field. Thus, we cannot assert this knowledge as individuals in the ontology without a detailed ontological analysis”. But in the owl serialisation present on GitHub (e.g., in populated-data-1.owl) “NLP identified Item” and “NLP identified Subunit” are object properties and “NLP identified Activity” is a data property. This is, at least in our understanding, a contradiction between the paper and the owl serialisation, and should be corrected.
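One way to remove the vestigial loop in script.py, sketched under the assumption that per-record processing stays in script.py while runner.py owns the only loop (all names are hypothetical, not the authors' code):

```python
# Hypothetical restructuring: script.py exposes a single-record function,
# runner.py drives it, so no dead loop with a one-iteration filter remains.

def populate_record(row, out_path):
    """Populate one MWO record into its own OWL file (stub: the real body
    would create the owlready individuals and save the ontology)."""
    # ... create individuals for `row`, save ontology to `out_path` ...
    return out_path

def run_all(rows):
    # runner.py side: one loop, one output file per record; running each
    # record in a fresh world/process also sidesteps the caching problem.
    return [populate_record(row, f"populated-data-{i}.owl")
            for i, row in enumerate(rows, start=1)]

print(run_all([{"id": "MWO-1"}, {"id": "MWO-2"}]))
# ['populated-data-1.owl', 'populated-data-2.owl']
```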
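The mismatch between the paper's annotation properties and the property kinds in the serialisation can be verified mechanically. A stdlib-only sketch (the embedded OWL/RDF-XML snippet and the property IRIs are illustrative stand-ins, not the actual contents of populated-data-1.owl):

```python
import xml.etree.ElementTree as ET

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Illustrative snippet only; point ET.parse() at the real file instead.
OWL = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:ObjectProperty rdf:about="#NLPIdentifiedItem"/>
  <owl:DatatypeProperty rdf:about="#NLPIdentifiedActivity"/>
</rdf:RDF>"""

def property_kinds(xml_text):
    """Map each declared property IRI to its OWL property kind."""
    root = ET.fromstring(xml_text)
    kinds = {}
    for kind in ("AnnotationProperty", "ObjectProperty", "DatatypeProperty"):
        for el in root.findall(f"{{{OWL_NS}}}{kind}"):
            kinds[el.get(f"{{{RDF_NS}}}about")] = kind
    return kinds

print(property_kinds(OWL))
# {'#NLPIdentifiedItem': 'ObjectProperty', '#NLPIdentifiedActivity': 'DatatypeProperty'}
```

Running such a check against the real serialisation would immediately show any property that the paper describes as an annotation property but the OWL file declares otherwise.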
-typos/comments for reasoning_script/runner.py:
• Line 9: “PREFIX bfo:” –> prefix not needed by query
• Line 20: “ # FILTER NOT EXISTS { ?activity a/rdfs:subClassOf* macr:UncertainActivity }” –> remove comment of old query version
• Line 26: “ # a/rdfs:subClassOf* activity:replace .” –> remove comment of old query version
• Line 39: “PREFIX bfo:” –> prefix not needed by query
• In the first query the prefixes “macr” and “work” are used, while in the paper the same query uses only the prefix “rule”. This works fine, but it is a small inconsistency in notation.
• The script runs fine, but it is very slow (about one minute per record on my machine), and the reasoning is sometimes even done twice (“There is also some inconsistency in the output occasionally (exact cause is unknown) where the activity classifications are not inferred and so the script will run the pellet reasoner twice if it detects such an occurrence”, as per the authors’ explanation – which is correct; we also have no idea about the cause). This is not unexpected, but it has consequences at scale: suppose a company has, say, 500 machines, each with a history of 30 records; assuming one minute per record, verifying the 15,000 records would take more than 10 full days. Records added later can be verified as they arrive, so they are less of a problem; still, a company attempting to apply the reasoning algorithm massively to its records may face severe computational requirements. This fact should perhaps be mentioned and/or some possible solutions suggested.
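The back-of-the-envelope estimate above can be checked directly (the figures are the review's hypothetical ones, not measurements from the authors' deployment):

```python
# Scaling estimate for reasoning over a full MWO history (illustrative numbers).
machines = 500
records_per_machine = 30
minutes_per_record = 1.0   # observed single pellet run on our machine
reasoner_runs = 2          # worst case: the script reruns pellet once

records = machines * records_per_machine
days_best = records * minutes_per_record / (60 * 24)
days_worst = days_best * reasoner_runs
print(records, round(days_best, 1), round(days_worst, 1))
# 15000 10.4 20.8
```

So even before the occasional double reasoner run, a one-off verification of the backlog sits in the order of ten days of compute.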