Review Comment:
This paper describes a system framework to address the use case of automated user notifications about relevant public transit operation disruptions. The proposed system relies on Semantic Web technologies to provide an integrated domain knowledge model, entity identification and inference capabilities.
In general, this work provides a very good example of a use case where Semantic Web technologies are used in combination with NLP technologies. The addressed use case is very interesting and highly relevant for the transportation domain. Also, the experimental design seems to be well founded and supported by real user-based evaluations. However, I see some limitations that prevent me from recommending it for publication in its current state:
- From what I can see, the described system framework was designed and implemented mostly during previous work, namely [11], [12] and [27]. It therefore remains unclear which contributions are introduced in this paper as opposed to the authors' previous work. For example, "the knowledge model for transportation information" is mentioned as a contribution of this paper, but the same model was already introduced in [11]. If there are new technical contributions, they should be made explicit in the paper. What was designed/created before? What is new in this paper?
In the current state of the paper, the main contribution lies in the experimental setup and the performed evaluation, which should be highlighted as such throughout the paper.
- In line with the previous comment, I miss the definition of clear research question(s) and hypotheses for this work. The evaluation hints at what the authors aim to study, but I would rather see the research questions explicitly stated in the paper. They should also guide the conclusions, since the research questions should be answered/validated based on the results of the evaluation.
- The system architecture was implemented at least 6 years ago (as seen from GitHub activity), which led to certain architectural design choices that could be considered outdated or even deprecated nowadays. Examples include (i) the transit ontology, which was superseded by the linkedGTFS ontology (http://vocab.gtfs.org/) over 6 years ago; (ii) the use of SPIN for defining inference rules, which now has SHACL as its standard successor (https://spinrdf.org/spin-shacl.html); (iii) the reliance on D2RQ for Linked Data generation over relational databases, which was replaced by the R2RML standard; and (iv) the Ontotext KIM system, which is a core module of this system and does not seem to exist anymore. I would suggest that the authors at least discuss why their implementation, despite being outdated, may still remain relevant, and provide perspectives on how current technologies and standards would impact their system design and which technologies could replace those that no longer exist.
- This work spans multiple technical domains that are not sufficiently covered in the related work. The related work focuses only on event detection from social media. I would suggest at least elaborating on related work about the use of semantics in the transport domain, for example recent semantic models based on TransModel (https://oeg-upm.github.io/snap-docs/). Transport-related works are briefly mentioned in [11], but this paper needs a more recent review. Related work on semantic inference is also missing, as is a discussion of how your approach compares to it. Covering these areas would help to present a more complete and up-to-date technological landscape.
- The time performance experiment presented in section 8.2 is used to show that the system is able to perform its whole process in a reasonable time, and it breaks down the time taken by each individual sub-process. However, I think your work would be more complete if the scalability limitations of the system were assessed and discussed. If it is supposed to be deployed in a city with possibly thousands of users and complex transport systems, how many Twitter accounts can it monitor, for how many users, and for how many transportation routes?
Minor remarks:
- Typo in the abstract: "to determine if they are are" -> "to determine if they are".
- Link in footnote 6 is broken.
- Typo in section 3: "providing appropriate privacy" -> "provided appropriate privacy".
- The TravelBot ontology is not linked in the paper?
- Link in footnote 31 is broken.