Review Comment:
This paper presents DIVIDE, a context-aware semantic platform that uses interconnected IoT devices to automatically generate and manage versatile stream processing queries. The authors place particular emphasis on a home care monitoring use case, demonstrating the flow of queries and the protection of sensitive content throughout the sequential IoT component layout. Their evaluation, based on a series of experiments, shows robust and stable performance compared to baseline approaches.
In addition, it is worth mentioning that the authors provide their data files, which consist of the extended ontology they used, the evaluation material and the EYE reasoner implementation files, along with the relevant README files. All of the aforementioned files are well organized in a GitHub repository.
Overall, it is clear that the authors invested effort in this work, combining existing tools into a methodology greater than the sum of its parts, on an interesting issue that could aid vulnerable groups of people living mostly remotely. The introductory chapter draws the reader into the article, the motivation behind the work and the description of the contribution are clear, the topic is relevant to the Semantic Web Journal, and the experimental methodology is persuasive. I liked the problem, the perspective and the presentation of this paper. However, there are some points that prevent me from suggesting a solid accept, and I will try to mention them below in order of importance:
* The Related Work chapter provides a series of stream processing and semantic reasoning approaches that warm up the reader for what comes next. However, the privacy-preserving segment, which is bulleted first in the list of research objectives of the Introductory section, is not reinforced by the appropriate literature in the field of privacy-preservation. I would suggest enriching this chapter with similar system security solutions that can support this work.
* In a similar spirit to the comment above, encrypting the information sent over the network to the main reasoner on the central server would strengthen the privacy-preservation part of this research study and would better protect sensitive content from outside threats that could exploit these data. For example, by taking advantage of these data, a malicious person could figure out whether the patient is in or out of the apartment.
* Although the presentation and general writeup are good, there are places where the readability of the article could be further improved. Examples are very long sentences (spanning many lines) that repeat identical terms over and over, as well as acronyms that are never explained, forcing the reader to consult the relevant literature to understand them. For instance, CEP (standing for Complex Event Processing), VKG (meaning, perhaps, Virtual Knowledge Graph), and even IRI (Internationalized Resource Identifier) are not defined. A reader without an ontology-oriented background would not understand what IRI stands for. I would recommend rewording or shortening long sentences and defining the unexplained acronyms.
* From my point of view, it is very important to avoid personal pronouns (e.g., he or she) when composing a formal document. One such case that appears in this paper is: "In the running example of the use case scenario described in Section 3, there is one RSP query that actively monitors the patient's location in the home, and one query that detects when the patient is showering if he or she is located in the bathroom." (page 14, lines 15-17). I would advise rephrasing such sentences. For the given example, a better expression might be: "In the running example of the use case scenario described in Section 3, there is one RSP query that actively monitors the location of the patient in the home, and one query that detects a showering condition when the patient is located in the bathroom."
* Regarding the employed dataset, although the data collection process is described (using HomeLab's and wearable sensors that generated around 670K observations), showing the structure of a sample of the acquired dataset would help readers understand the type of information obtained from the observed IoT devices.
* In the presented use case scenario, the "Intermediate queries" and the "Context enrichment mode" are not covered. It would be better to present an alternative application example that covers all types of queries that DIVIDE's query parser supports.
* The References chapter is divided into two parts, separated by the two boxplot figures of Appendix C. I would propose moving the entire reference list to the end of the paper (before the appendices).
* Minor typos and comments:
- [page 4, lines 38-39] only recent attempt has laid the first fundamentals on realizing the full vision of cascading reasoning with Streaming MASSIF [41]. -> only *a* recent attempt has laid the first fundamentals on realizing the full vision of cascading reasoning with Streaming MASSIF [41].
- [page 5, line 31] Z-Plus has helped us with designing the rules. -> Z-Plus helped us design the rules.
- [page 12, lines 49-50] and (ii) *and* the core of DIVIDE which is the query derivation. -> and (ii) the core of DIVIDE which is the query derivation.
- [page 17, lines 22-23] that it is instantiated by the semantic reasoner *reasoner* if the rule is applied in the proof during the query derivation. -> that it is instantiated by the semantic reasoner if the rule is applied in the proof during the query derivation.
- [page 18, lines 7-9] The direct consequences of a sensor observation matching the WHERE clause in lines 59–*59* would be the fact that an ongoing activity of the given type is detected for the given patient. -> The direct consequences of a sensor observation matching the WHERE clause in lines 59–69 would be the fact that an ongoing activity of the given type is detected for the given patient.
- [page 24, lines 5-6] This process can execute independently for each RSP engine and can therefore be parallelized by DIVIDE. -> This process can *be executed* independently for each RSP engine and can therefore be parallelized by DIVIDE.
- [page 24, lines 15-16] and will also be used as the running example in this section to illustrate the query derivation process *in this section*. -> and will also be used as the running example in this section to illustrate the query derivation process.
- [page 24, line 35] For every step, the inputs and *and* outputs are detailed on the figure. -> For every step, the inputs and outputs are detailed on the figure.
- [page 27, lines 49-50] depending on whether the substituted value is *a* IRI or a literal -> depending on whether the substituted value is an IRI or a literal
- [page 28, lines 5-6] This substitution is performed based on the generic RSP-QL query body that *is referred to in* the output of the query extraction in Listing 12. -> This substitution is performed based on the generic RSP-QL query body that refers to the output of the query extraction in Listing 12.
- [page 31, line 10] The tasks of this final step are the following: construction the actual RSP-QL query string, -> The tasks of this final step are the following: construction *of* the actual RSP-QL query string,
- [page 33, line 2] This is the preferred option when deploying new systems -> This is the preferred option when deploying new systems*.* [punctuation - full stop at the end of the sentence]
- [page 34, lines 21-22] In the other case, the queries are translated to N3 rules which are then applied on the set of triples and, if reasoning is enabled, ontology rules. -> In the other case, the queries are translated to N3 rules which are then applied *to* the set of triples and, if reasoning is enabled, *to* ontology rules.
- [page 34, lines 36-37] General information about the collected data, the ontology and context, and activity rules used for these evaluations is presented in Section 8.1. -> General information about the collected data, the ontology and context, and activity rules used for these evaluations *are* presented in Section 8.1.
- [page 35, line 21] the state of windows *and* doors and blinds, and others. -> the state of windows, doors and blinds, and others.
- [page 36, lines 29-30] This section evaluates the real-time performance of evaluating these DIVIDE queries on the C-SPARQL RSP engine [15]. -> This section compares the real-time performance of evaluating these DIVIDE queries on the C-SPARQL RSP engine [15].
- [page 37, line 2] RFDox -> RDFox
- [page 38, lines 27-28] Figure 8 shows similar results of the comparison of the real-time evaluation *with* DIVIDE with the real-time reasoning approaches, but for the *toileting* query. -> Figure 8 shows similar results of the comparison of the real-time evaluation of DIVIDE with the real-time reasoning approaches, for the brushing teeth query.
- [page 38, lines 28-29] The properties of the graph are similar to those of the graph presenting the results for the *brushing teeth* query. -> The properties of the graph are similar to those of the graph presenting the results for the toileting query.
- [page 44, lines 11-12, Figure 9 caption] The results show the total execution time distribution over the engine’s runtime and multiple runs, for *both the toileting and brushing teeth* DIVIDE queries. -> The results show the total execution time distribution over the engine’s runtime and multiple runs, for both the toileting and showering queries, as well as for the brushing teeth DIVIDE query.
- [page 47, lines 15-16] where the activity can be detected by a single independent sensor in the room that *crosse* a defined value threshold. -> where the activity can be detected by a single independent sensor in the room that crosses a defined value threshold.