Semantically-enriched Pervasive Sensor-driven Systems

Tracking #: 441-1608

Authors: 
Juan Ye
Stamatia Dasiopoulou
Graeme Stevenson
Georgios Meditskos
Vasiliki Efstathiou
Ioannis Kompatsiaris
Simon Dobson

Responsible editor: 
Oscar Corcho

Submission type: 
Survey Article
Abstract: 
Pervasive and sensor-driven systems are by their nature open and extensible, both in terms of their inputs and the tasks they are required to perform. The data streams coming from sensors are inherently noisy, imprecise, and inaccurate, with differing sampling rates and complex correlations with each other: characteristics that challenge traditional approaches to storing, representing, exchanging, manipulating and programming with rich sensor data. Semantic Web technologies allow designers to capture these properties within a uniform framework. The powerful reasoning techniques with such a representation facility have proven to be attractive in addressing issues such as data and knowledge modelling, querying, reasoning, service discovery, privacy and provenance. In this paper we review the application of the Semantic Web to pervasive and sensor-driven systems. We analyse the strengths and weaknesses of current and projected approaches, and derive a roadmap for using the Semantic Web as a platform on which open, standard-based pervasive, adaptive, and sensor-driven systems can be constructed.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
By Jean Paul Calbimonte submitted on 05/Apr/2013
Suggestion:
Reject
Review Comment:

This survey paper presents an overview of the application of semantic technologies to sensor-driven systems. It analyses mainly 6 challenges and identifies Semantic Web technologies that address those challenges. Then it observes remaining open issues to be tackled.

This paper is ambitious in the sense that sensor systems may benefit from many SW technologies (from query processors to reasoners and models), and thus the authors cover an immense range of works. This is somewhat dangerous, as each of the 6 challenges is vast and each would need an individual survey to describe it fully (e.g. a whole survey on stream reasoning would be worth publishing). The reader is therefore left with the impression that this paper is too superficial in its analysis and comparison of the different approaches described. Moreover, some sections are more detailed than others (e.g. event modeling), which means that the authors have not used the same methodology to evaluate or analyze each of the 6 challenges.

Perhaps it would be necessary to focus on only some of these issues, in order to provide a complete and detailed survey. Narrowing the scope would not make this work 'incomplete'. On the contrary, it would help the authors to provide a deeper analysis of this area.

Another issue is that one of the challenges (in fact the 7th one in Section 2) is not really analyzed (there is only a brief, unsubstantial mention in the final comments of 4.11). This is surprising, given that it is a key issue for sensor systems with Big Data problems, including very high velocity and (especially for semantic-aware systems) high variety.

Nonetheless, when the authors take the time to summarize and analyze, the results do represent a contribution, such as the summary in Table 2. In fact, I would recommend using tables such as this throughout the paper to show how the different approaches match the 'added value' represented here, and providing more insightful comments on how the authors reached these conclusions. (For instance, 'scalability' is identified as one of the 'further research inquiries' in most cases, but this is not evident in the rest of the subsections of Section 4.)

Apart from these important issues, the paper is in general well written and represents an improvement over the previous version, especially in terms of the surveyed works and, most notably, the coherence of the paper's terminology and structure.

Detailed comments below.

Abstract
======
After reading the abstract it is expected to see:
1 application of SW to sensor driven systems
2 strengths and weaknesses of approaches
3 propose a roadmap

After reading the paper, point 2 is not really clear. In many cases the paper provides only a description, not a complete analysis and comparison of strengths and weaknesses.

Introduction
=========

What is the methodology used to perform this survey? What type of works are eligible? While there are many sensor-driven systems and approaches using Semantic Web technologies, it is not clear at this point what dimensions you will use to compare them.

From the introduction, it seems that only Section 4 is devoted to actually providing a survey of the different approaches.

Applications, Information and Research Challenges
========================================

This section is too broad in scope. The research challenges or key issues to be analyzed in the paper should be clear even from the introduction, in order to guide the reader and define what this paper is going to analyze in each approach or system.

The application examples provided (2.1) are too detailed, and I don't think they are very relevant to the rest of the discussion. For this type of survey paper it is more important to focus on the research challenges; these application descriptions are secondary and should be shortened.

The discussion of information in pervasive computing is missing a very important fact: raw sensor data differs substantially from other, traditional types of data in that it is intrinsically dynamic and is (usually) represented as data streams. This fact gives rise to many of the challenges with respect to sensor data, because it implies the need for continuous processing, management of data bursts, real-time evaluation, etc. (which is one of the challenges identified by the authors).

The research challenges identified are:
- Conceptual modeling
- Querying
- Reasoning
- Uncertainty
- Service discovery
- Privacy and Provenance
- Scalability and Performance

It remains unclear even after this point if this survey will analyze existing proposals taking into account all these dimensions. Therefore the methodology of this survey is not clear.

Semantic Web Technologies
=======================

For the audience of the SWJ this section is most likely not necessary, or at least not as a full section. The details on DL, OWL and OWL2 do not add much to the discussion, and they are not needed to follow the rest of the paper. A 'background' section in this survey can certainly introduce SW technologies (briefly), but it should also describe what sensor-driven systems provide without SW technologies. This would naturally lead the reader to ask why we need SW technologies in these systems.

Integrating Semantic Web with Pervasive Systems
=======================================

This long section (half of the paper) is the main contribution (the survey itself). However, the subsections are very unbalanced in the way they present the different approaches. For instance, 4.3 gives a very complete description of event models and compares them according to well-defined criteria. In the rest of the subsections the approaches, models and systems are described in much less detail, and there is no systematic comparison according to well-defined criteria. For most of the subsections, we are left with a brief description of each approach and little or no comparison among them (which would be expected in a high-quality survey). For instance, 4.4 doesn't have an analysis subsection at all.

Also, we would expect the challenges presented in the previous section to be addressed here. It is surprising that scalability is nowhere to be found (except for a brief mention in 4.11, which cites an evaluation using Jena and related tools; this is clearly not enough to speak seriously about performance in this context). The issues of scalability lie far beyond simple evaluations of OWL and RDF processing libraries. What about rapidly changing observation values coming from sensors? Continuous complex event processing? Scale-out with parallel processing of continuous streams of sensor data? The scalability issues are unacceptably disregarded in this survey.
The reader is left with the impression that the different subsections of Section 4 have been written with very different methodologies, in some cases providing mostly brief descriptions and in other cases a full comparison and deep analysis. This should be made uniform.

While the approaches in 4.1.1 are worth mentioning, they are syntactic representations (as the authors rightly note), and therefore I don't see why so many details are given about them. For instance, the model in Fig. 1 is too simplistic compared to other relevant ontologies such as SSN (which is described later). A better balance is needed in the level of detail provided in this section.
The analysis in 4.1.3 lacks a discussion of reusability, which is a key point for sensor ontology modeling. Ontologies such as SSN can be combined (and must be combined) with other domain ontologies, temporal ontologies, etc. in order to provide a full model covering all aspects of a pervasive application. This is only possible if the models are designed to be extensible, which is crucial for these ontologies.

Section 4.2 describes different approaches for representing context (time, location, etc.) and correctly points out that in most cases there is a need to integrate existing models. However, I am missing simple ontology models for geo-location (WGS84 OWL, the GeoNames ontology, the NeoGeo ontology), which are commonly used for sensor systems.

Section 4.3 is comprehensive and the different approaches are compared with well-defined dimensions.

Section 4.4 is just a description of approaches and lacks analysis and discussion.
This section could probably be merged with 4.5, as CEP is also closely related to querying, but at a higher level of abstraction.

4.5 correctly points out that most of the SPARQL-based query extensions focus mainly on temporal aspects. I don't understand the final comment in 4.5.1, which says that there are no widespread standard models. The models in Section 4.1, e.g. SSN, can be used here and are gaining adoption.
There is not even a broad comparison of the presented approaches: how do they compare to each other, what is missing from them, and are they useful for sensor-driven systems, and to what extent?

4.6 brings forward reasoning techniques including rule-based and hybrid approaches. However I do not see why this section is not merged with 4.7, which is also about reasoning but focusing on the rapidly changing nature of sensor streaming data. It would be advisable to have a more structured view of reasoning for sensor systems including all these possibilities, and explain the scope of each one and how these are combined in practice. Otherwise 4.7 is just an enumeration of some approaches with little relationship with the rest of the paper.

Sections 4.8 and 4.9 are again informative at best, but lack a deeper analysis and comparison. In the case of uncertainty this is understandable to some extent, because the area seems not to be very developed according to the authors' account.

In 4.10 the term provenance seems to be misleading. Provenance usually refers to where the data comes from, who originated it, and how it was derived from previous data. In this sense, existing models such as the W3C PROV family of specifications (e.g. PROV-O) are good examples, and are widely used by a growing community. The authors do not touch on this in this section, and they probably should, but they come back to it in 5.3. Some of the analysis in 5.3 would probably fit better here, in order to provide a better account of what is being done in terms of provenance.

5 Challenging Issues
=================

5.1 I agree on most of the issues about interval modeling and the need for standards in temporal modeling in RDF and SPARQL; this is definitely part of the current challenges.

The issues in 5.2 are tackled, in most cases, by complex event processors and stream processing engines and their semantic derivatives. It is true, however, that stream reasoning is a topic still in its infancy.

Review #2
By Josiane Parreira submitted on 04/Jul/2013
Suggestion:
Minor Revision
Review Comment:

The manuscript presents a survey on semantic technologies applied to the pervasive computing domain. It describes the current state of the art in different aspects, e.g. knowledge representation, reasoning, data discovery and provenance, and highlights the remaining open challenges. It finishes with an outlook on possible research directions.

It is a very thorough survey and I really enjoyed reading it. I think it provides a very good summary and pointers to references in the area of Semantic Web technologies for sensors. I would definitely recommend it for publication, but first I would suggest a few minor improvements to the text. See details below.

*** 1 ***

I appreciated the extensive reference list. The authors did very good work on it! Still, there are a few sentences in the text that need to be backed up by references. These sentences are listed below:

"To be effective, pervasive computing must be supported by an open and standard-based representation so as to facilitate integrating information of heterogeneous types and modalities, as well as communicating and exchanging information between devices and components."

"In traffic control, a system can be immediately informed of incidences of congestion or the occurrence of accidents and notify all approaching drivers."

" studying active volcanoes with collected seismic and infrasonic (low-frequency acoustic) signals."

"At least three possible strategies for overcoming this limitation are available to data modellers: using RDF’s reification vocabulary, externally generating identifiers for statements, or use a non-standard extension to the RDF model."

*** 2 ***

The authors are very detailed in most of the definitions, but I felt that some terms were "taken for granted". For instance, while OWL and DL are very well described, RDF is never actually introduced. For the sake of non-Semantic Web users (e.g. pervasive computing people) I would recommend clarifying common SW terms.

*** 3 ***

The smart home application in Section 2.1.1 is not really a smart home but more like smart home care. A smart home would encompass smart meters, smart fridges, etc. Please add those applications or change the title of the section.

In Section 2.1.2 the title is also not adequate. The text describes more than applications in the transport domain.

*** 4 ***

The research challenges in Section 2.3 are already somewhat tailored to be fulfilled by the Semantic Web. It would be nice to have a more "generic" list of challenges in pervasive systems and then highlight those where SW can help.

*** 5 ***

The paper's motivation and section headers talk about the Semantic Web in general, but the text is heavily biased towards ontologies, OWL, and reasoning. The description of CEP and querying is fairly short compared to the other sections. The CEP section does not even provide an analysis, unlike all the other sections. I think the authors need to work a bit more on the shorter sections to make the survey even more comprehensive. There is a tutorial from the Reasoning Web summer school that might help as a starting point:

Danh Le Phuoc, Josiane Xavier Parreira, Manfred Hauswirth: Linked Stream Data Processing. Reasoning Web 2012: 245-289

*** 6 ***

At the end of page 21 there is a mention of "our sensor data model". Which model is this?

*** 7 ***

Section 4.8 has a different writing style, compared to the rest of the manuscript. It is the only part where the authors explicitly point the readers to further references.

*** 8 ***

Figure 8 needs more explanation.

*** 9 ***

The summary section (4.11) doesn't really address what has been presented so far. Table 2 deserves more attention in the text.

*** 10 ***

In Section 5, it's not clear why those particular 4 items were selected for discussion.
The programming section sounds very negative. Sure, one has to learn the SW technologies, but that is still better than learning to program every different sensor available and trying to make them talk to each other, right?

*** 11 ***

In the introduction, authors wrote: "Open issues include capturing the temporal semantics of data; reasoning in the presence of extensive uncertainty;… "

There are still a lot of research challenges in those areas, but "open issues" sounds as if nobody has looked at them yet. I would rephrase this part.

*** 12 ***

Final comment: I am not a native English speaker (though some of the authors are), and I am usually in favour of short sentences. While reading the manuscript, I found a lot of very long sentences, which I find hard to process. But that's more a matter of writing style, and this is just a personal remark.