Review Comment:
(1) originality,
***
The SOHO ontology was first presented in a previous publication by the same authors (see [13] A. Umbrico, A. Orlandini and A. Cesta, An Ontology for Human-Robot Collaboration, Procedia CIRP 93 (2020), 1097–1102). As far as I can see, the proposed extension of the SOHO ontology mostly (if not entirely) concerns the introduction of axioms defining the different identified collaborative modalities, that is, the "independent", the "simultaneous", the "supportive" and the "synchronous" modality (see pp. 7-8). Below I report my questions/comments about the presentation and the specification of the ontology, taking into account the results presented in [13]:
- In [13], the authors already mention that the four introduced human-robot interaction modalities have been defined. Considering that, I struggle to identify what is original in the present paper as far as the ontology modules are concerned.
- I noticed that some of the axioms originally introduced in [13] have been modified in the version of the ontology presented in this contribution. Examples are the axioms defining the concepts of "Function", "ComplexTask" and "SimpleTask". I suggest that the authors comment on these changes, possibly explaining why they were made: what was wrong or not satisfactory enough with the previous version? Moreover, there are a few details concerning the newly introduced axioms which puzzle me.
- In the new definition of "SimpleTask" ("ComplexTask"), for instance, the role "isConformTo" is used, but I was not able to find in the text a sentence explaining why it has been introduced, especially considering that it was not present in the definition of "SimpleTask" ("ComplexTask") in [13].
- The filler (using a DL-based terminology) of the role "isConformTo" is (as far as I can see in the paper) the union of "OperativeConstraint" and "InteractionModality", but the authors do not further discuss the details behind this kind of "norms", as they call them. In particular, I wonder whether an "IndependentTask" (and the same holds for the other "collaboration modalities" on pp. 7-8) has to be understood as a specialisation of the concept "InteractionModality".
- What are the differences, if any, between "ProductionNorm", as introduced here, and "ExecutionNorm", as introduced in [13]? I noticed that in the definition of the concept "Function" the occurrence of "ExecutionNorm" has been replaced by an occurrence of "ProductionNorm". Should we consider them as two syntactic variants with the very same meaning, or is there a more fundamental distinction between them?
Of course, I opened and analysed the ontology files and I found the correct answer by myself. However, I think that the paper should be self-contained in this respect and should clarify, justify and explain the internal structuring of the proposed ontology without asking the readers to look into the specification of the ontology modules by themselves.
- The discussion of the foundational counterpart of SOHO is basically the same as in [13] or, at least, I did not notice any major difference or improvement. This does not seem well justified to me. The authors decided not to reproduce in the current paper a substantial part of the original SOHO ontology as presented in [13] and, in my opinion, this negatively affects the overall understanding of the latest version of the ontology. Nonetheless, they also decided to report in full, with no substantial modification, the arguments about foundational ontologies. Honestly speaking, I don't see the rationale behind these decisions: my suggestion is to remove the sections about the foundational ontologies and use the space gained to provide a comprehensive introduction to the SOHO ontology.
- The bibliographic references for CORA and SSN can be found at the beginning of Section 2.2, but the two ontologies are first mentioned at the end of the previous Section 2.1.
- The very same content of Section 2.2 was already present in [13], Section 2.2 (by chance, the two sections share the same numbering in the two different papers but, not surprisingly, they have the same title "2.2. What is Missing for HRC?").
- I understood, from the definition of the algorithm on p. 10, that the role "hasConstituent" is not transitive in the introduced ontology. Is this correct? If yes, I would suggest (also) making this claim explicit in the sections describing the ontology axioms (which come earlier in the paper).
- My last comment about the ontology axioms refers to the usage of the term "synchronous", which I associate with 'simultaneous' and 'contemporary'. I then discovered that, actually, this is not the case, since a synchronous task in the CAPITAL-GOODS scenario requires human and robotic functions to follow "a strict temporal ordering" (synchronised on the very same target object). In the very same axiom, on p. 7, the synchronous task is defined as something conformant to one and only one "SequentialExec", which reminds me of the sequential order of, first, placing a bolt and, second, screwing it.
As I mentioned before, the fact that the role "isConformTo" and its possible fillers, i.e., the concepts "Independent", "Simultaneous", "SequentialExec", "ParallelExec", "Supportive" and "Synchronous", are not further specified makes a clear understanding of the axioms quite difficult, in my opinion.
It is probably because of this lack of understanding that I find the definition of an 'independent task' quite tautological from a purely conceptual point of view (obviously, the same comment applies to the other types of tasks on pp. 7-8). To clarify my doubts further: my intuition (which could easily be entirely wrong) tells me that the meaning of the concept "SimultaneousTask" strictly depends on the meaning of the concept "Simultaneous", but unfortunately the latter is neither introduced nor discussed in the paper.
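Coming back to my question about "hasConstituent", the following minimal Python sketch illustrates why its (non-)transitivity matters for a level-by-level decomposition. The task names and the dictionary encoding are my own assumptions, made only for illustration; they are not taken from the paper or from the ontology files.

```python
# Hypothetical decomposition: the names below are illustrative only
# and do not come from the SOHO ontology modules.
has_constituent = {
    "ComplexTask_A": {"SimpleTask_B", "ComplexTask_C"},
    "ComplexTask_C": {"SimpleTask_D"},
}


def direct_constituents(task):
    """Fillers of hasConstituent when the role is NOT transitive."""
    return set(has_constituent.get(task, set()))


def all_constituents(task):
    """Fillers one would obtain if hasConstituent were transitive."""
    seen = set()
    stack = [task]
    while stack:
        for child in has_constituent.get(stack.pop(), set()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen
```

If the role were transitive, a query for the constituents of "ComplexTask_A" would also return "SimpleTask_D", and an algorithm that builds one hierarchy level at a time would need an extra step to recover the direct children only. This is why I think the property deserves an explicit statement next to the axioms.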
The introduction of an algorithm for the ontology-based automatic extraction of the relevant knowledge and planning constraints from the manufacturing scenarios at hand was not present in [13] and represents a completely original result of the present paper. According to my understanding, and following the approach suggested by the authors, the algorithm plays a fundamental role in the synthesis of the final planning model. I have several concerns about this and about Sections 4 and 5, which I summarise in what follows:
- I found the description of the algorithm quite difficult to follow and I think that showing an instantiation of its execution would be of great help. I take the decomposition graphs for each of the introduced scenarios as part of the information generated by the algorithm. What I completely miss is how these knowledge bases/graphs are then used in combination with the planning constraints, to create the admissible planning models.
On p. 15, the authors say "this type of collaborative behavior is translated into a well-defined structure of temporal constraints into the task planning model", but the way this is done is left to the imagination, and I think that this does not help the reader in properly evaluating the added value provided by the ontology-based approach introduced by the authors. The decomposition graphs are interesting, but they represent only the starting point of the entire planning model synthesis, where the extracted knowledge is then fully exploited. My suggestion to the authors is to dedicate much more space to the details of this synthesis process, thereby stressing the relevance of, and the added value provided by, the knowledge that is automatically extracted and formally represented according to SOHO.
- As a minor comment, and still connected to my previous point, I think that it would be helpful for the reader to have an explicit explanation of the semantics of the green bubbles in Fig. 4 and of the green and the purple bubbles in Fig. 6. Naïvely enough, and following the natural language explanation in the text, one could think that the green bubbles "AND_#" assume different meanings depending on the level of the tree at which they appear (which is, by the way, counterintuitive!).
- I also have a question about the convergent arrows in the figure: does the convergence of these arrows to a single human/robot function mean that the tasks they originate from (obtained by decomposition of the complex task above) are achievable by means of the very same function? (E.g., "ProductionL1_h10" and "ProductionL1_h11" converge to the leaf "Worker_tightening_nuts_reardoor"). Clearly, I cannot answer this question by myself because I have no idea of what "ProductionL1_h10" and "ProductionL1_h11" are.
- Despite the fact that the first introduced scenario looks quite easy, I don't quite get the message of the last sentence on p. 12, line 24: what does it mean to 'execute a predicate'? Why has the temporal order between the executions of the different highlighted functions not been made explicit here? Is there any implicit meaning in the labels used that refers to their (timeline-based) arrangement?
- Each single graph in Fig. 13 is said to represent the hierarchical relationships between the state variables as they have been generated for the corresponding scenario. However, I don't see how the rectangles "Worker" and "Cobot" can be understood as "state variables", according to the definition on p. 8, lines 41-46, and I also suspect that the semantics of the arrow connecting "Goal" to "ProductionL0" is different from that of the arrow connecting "ProductionL1" to "Worker". Could the authors please clarify this point?
- Fig. 6 includes two rectangles under the so-called "ProductionL1" state variable which do not have any incoming connection. Could the authors explain why this is consistent with their idea that the independent tasks which compose that state variable come from the decomposition of a complex task? Which is the complex task whose decomposition gives rise to these independent tasks?
- My last comment here is about my almost complete inability to relate the arguments about the collaborative modalities and the way they are formally represented, as in the first part of the paper, to the arguments about the knowledge extraction algorithm and the planning model synthesis. Basically, I don't understand how to trace the information provided for each application scenario (the decomposition graph and the associated text) back to the axioms about the collaborative modalities. In the explanation the authors make claims about that but, formally, how does this work?
(2) significance of the results
***
I am not an expert in robotics but, from the knowledge representation perspective, which is where my main competences are, I consider the arguments discussed in the paper to be relevant for a wide audience of scholars. In addition to that, the more automation technology is able to put on the market robotic systems that show a high level of autonomy, that can sense the environment and autonomously elaborate the collected observations, and that are meant to safely work with humans at their side, the more I think we need formal knowledge representation frameworks which provide declarative representations of the (human/robot) behaviours, competences, skills, functions and abilities in place. For Human-Robot Collaboration scenarios and related systems, being able to automatically derive dynamic and reconfigurable planning models for the involved agents, and theoretically sound proofs of their future actions, which is fundamental for safety reasons, is crucial nowadays in my humble opinion.
Considering the attention that has been dedicated by both Robotics Engineering and Computer Science to the human-robot collaborative scenarios discussed by the authors, and the number of knowledge-based proposals that have been published, I miss a proper state-of-the-art section in the paper comparing the SOHO ontology with other similar artefacts. Related to that, here is a list of potentially interesting pointers from a survey I recently stumbled on, whose focus is indeed on "robotic knowledge base systems" (S. Manzoor et al., Ontology-Based Knowledge Representation in Robotic Systems: A Survey Oriented toward Applications. Appl. Sci. 2021, 11, 4324):
- KnowROB: Know rob 2.0: a 2nd-generation knowledge processing framework for cognition-enabled robotic agents
- OROSU: Knowledge representation applied to robotic orthopedic surgery
- CARESSES: The CARESSES EU-Japan project: making assistive robots culturally competent
- PMK: PMK: A knowledge processing framework for autonomous robotics perception and manipulation
- SARbot: High-level smart decision making of a robot based on an ontology in a search and rescue scenario
- IEQ: A Humanoid social robot-based approach for indoor environment quality monitoring and well-being improvement
- Smart Rules: An integrated semantic framework for designing context-aware Internet of Robotic Things systems
- ARBI: Ontology-based knowledge model for human-robot interactive services
- Worker-cobot: An ontology-based approach to enable knowledge representation and reasoning in worker-cobot agile manufacturing
- APRS: Implementation of an ontology-based approach to enable agility in kit building applications
With an additional state-of-the-art section in mind, it would also be interesting, in my opinion, to look for ontology-based knowledge representation proposals in fields of application which are not directly related to industry, but where human-robot collaborative modalities of working are central. A first promising application area that comes to my mind is smart health and, even more specifically, the design and deployment of the so-called robotic surgical assistants (the OROSU proposal, listed above, is just one among others).
Moreover, SOHO is declared to be built on top of CORA and SSN, but no further details are provided to clarify which axioms in SOHO strictly extend these two ontologies. As far as I have been able to verify, in [13] a paragraph is dedicated to introducing the scope and limits of both CORA and SSN but, again, no further details are provided to make explicit how SOHO formally builds on top of their integration and extends it with new concepts and roles.
Another point which is not completely clear to me is the one about the ALFUS framework, which is first mentioned in the paper on p. 3.
I learn from the authors that ALFUS is "a framework to characterize the autonomy levels of a robot" and that an extension of it to include human operators was already introduced in [13] in order to represent their ability to work autonomously: "We take into account the framework ALFUS [16] to represent the levels of autonomy of robots and extend this model to human operators". The problem I see is that the extension of ALFUS is not really documented in the present paper or, at least, I did not find it. Even more misleadingly, in the paper the authors say that "it would be interesting to extend the ALFUS model to human workers" but, as mentioned above, this extension is declared to have been already worked out in [13]. Could the authors clarify this?
(3) quality of writing.
***
My assessment in this respect is that the paper is in general well-written. In fact, most of the comments I listed above are of a conceptual nature and do not refer to the way the authors' arguments are presented. Nonetheless, I report below a few comments/questions concerning the writing of specific paragraphs, the readability of the figures, and the typos that I have been able to catch.
- At the beginning of Section 4, first two paragraphs, a sentence looks duplicated up to some slight rephrasing. I suggest removing one of the two occurrences. Moreover, at the end of the introductory text, my interpretation is that the first occurrence of "complies" should be "compiles" instead (i.e., "[...] a general procedure that complies production knowledge into").
- In the definition of the "state variable", Section 4.1, I don't understand the notion of "feature" and, in particular, I don't get what it means that a feature "can assume or perform" something. What is meant to be a feature in this context?
- Following the same block of definitions, it is not clear to me what a value v_x is. Assuming that, by definition, SV_i is always a tuple, in my understanding v_x, which is a value the variable SV_i can assume, also represents a tuple of values. Is this correct? If yes, wouldn't it be better to specify that v_x represents a tuple of values?
- The last sentence on p. 10 contains a typo (see "to assess the proposed the reasoning").
- P. 11, line 15: should this be "working-area"? Still on p. 11, I don't think that it is necessary to say that the "Join" function type is a specialisation of "Join".
- Figures 4, 6, 8, 10 and 12 are definitely not readable in the printed version of the paper. Moreover, I do not know whether this is a (minor) problem that only I experience, but I have not been able to print out pages 12, 17 and 19: the text under the figures on these pages simply disappears in the printed copy of the paper (I see the figure at the beginning of each page but not the text that is meant to stay below).
- P. 15, line 22: "requiring;" -> "requiring:".
- P. 15, line 23: "associated to function" -> "associated to the function" (I guess).
- P. 18, footnote: "with an rule-based" -> "with a rule-based".
- P. 18, footnote: If the authors want to keep this note, I would add a few additional words explicitly referring to the API methods of Jena, where "OWL-DL-MEM" and "OWL-MEM-MICRO-RULE-INF" are introduced. (Notice also that the URL is in a font size which is different from the one of the footnote itself). I wonder whether the authors could give some clarification about the differences between "OWL-MEM-MICRO-RULE-INF", whose associated OWL profile is OWL Full, and "OWL_DL_MEM_RULE_INF", whose associated OWL profile is OWL DL, according to the Jena documentation. I ask this especially because in the paper the authors claim that their ontological model falls within the OWL DL profile (not the "full" one).
Please also assess the data file provided by the authors under “Long-term stable URL for resources”. In particular, assess
(A) whether the data file is well organized and in particular contains a README file which makes it easy for you to assess the data,
***
The additional materials have been stored in Dropbox. There, the reader can find the ontology modules that are introduced in this contribution and also in [13]. No README file is provided, but the label of each file refers to one of the scenarios that are discussed in the paper (Metal, Railways, etc.), and this makes it quite evident which module is for what.
It is not clear to me why the repository contains different versions of the 'metal' module of the SOHO ontology, since there is no mention of that in the evaluation section of the paper. Honestly, I did not compare them to identify the differences between the files.
(B) whether the provided resources appear to be complete for replication of experiments, and if not, why,
***
The repository contains the source files of the ontology modules developed by the authors to deal with the different application scenarios introduced in the paper.
This is somehow the core of what the authors wanted to present in the paper, together with the algorithm to automatically extract instantiations of these modules (see ABoxes) and planning constraints, whose actual implementation is left to the interested reader (the article reports its pseudo-code, as usual).
I guess that a more detailed specification of the application scenarios would also be required in order to properly replicate the experiments presented in Sec. 5.2 ("Generation of Planning Specification from Knowledge"), but this is more a feeling I have than a firm claim. Anyway, I think that it would be useful to see the concrete models generated by the algorithm described in Sec. 4.2 to properly evaluate whether all the information needed is actually included in the ontology modules.
(C) whether the chosen repository, if it is not GitHub, Figshare or Zenodo, is appropriate for long-term repository discoverability, and
***
Honestly, I don't think that Dropbox is the best repository to choose for long-term discoverability. In particular, nothing is discoverable in Dropbox except what a user has been granted access to in advance. Without such access, I don't see how a potentially interested user could discover and get access to the material that has been provided there. Besides that, a number of web portals dedicated to exposing and sharing ontologies and ontology modules exist, and I strongly recommend that the authors move (at a certain point) their artefacts to one of them. This will definitely help the discoverability and the re-use of their results.
(4) whether the provided data artifacts are complete.
***
The ontology modules look complete to me. The only "data" in this case are in the form of individuals contained in the modules themselves. No other kind of data has been provided.