Review Comment:
The paper illustrates an effort to publish public transport data as linked fragments and an evaluation of a sort of "quickest path" route planning query with a set of data coming from disparate sources around the world. Although the subject is interesting, the paper in my opinion has limited validity and the authors' claims are not always supported. It is a valuable extension of the previous work by the same research group, but it does not live up to the promise of a "cost-efficient" solution to any public transport use case that the introduction and conclusions try to suggest.
The main limitations I see in the current submission are as follows:
- there is no comparison with a non-semantic web baseline, failing to convince the reader that the solution is really worth the effort with respect to a more traditional architecture
- the evaluation seems to cover only the scheduled transport data, with a random selection of queries, so it gives no idea of the representativeness of the tested cases
- the "normalization" operated on the experimental results doesn't show the actual outcome of the evaluation, leaving the impression of "artificially adjusted" numbers
- regarding the live and historical transport data, the authors only provide their claim that the proposed architecture "works": they simply present *one* possible solution
- the addressed use case is only related to a traditional route planning scenario (and not any generic elaboration/query on public transport data), and it is also limited to the earliest arrival time search; a dedicated API would perfectly do and would also be much easier on the client side, so the authors fail to convince the reader of the need for such a linked fragments approach for the specific case
- the discussion section includes some unsupported claims on the advantages of the presented solution
I suggest a major revision of the paper with the following goals:
- clarifying the actual scope and limiting the claims to those that can be indeed demonstrated
- providing a comparison with a non-semantic web baseline
- (maybe also) providing some evaluation on the live/historical data and not only on the scheduled data
My review on the usual dimensions for research contributions is as follows:
- originality: the content is indeed original, even if it is an incremental improvement with respect to the previous work of the same authors
- significance of the results: the evaluation is interesting but of limited validity, because of the complete lack of a comparison baseline based on a traditional approach; some of the results appear to be straightforward and not specifically surprising; the applicability of the presented solution in a real setting is not completely convincing (as the same authors partially admit)
- quality of writing: the paper is in general readable, with a few unclear sentences and a number of typos; the structure is reasonable, even if I'd suggest that the authors reorganize it a bit, by moving the information about the datasets, their metrics and the metric values before the evaluation section (rather than partly in the evaluation and partly in the result sections)
My detailed comments and suggestions are offered hereafter:
- introduction: the authors start with general considerations about public transport data and services, but then they focus only on route planning: while this is of course ok, I think that they should make this focus and scope clearer from the beginning (and probably also in the paper title). Indeed the authors criticize the use of Web APIs (page 2) because they "limit data accessibility" and they advocate for solutions that overcome such limitations: "an API that calculates only the fastest route in a PT network may not be useful when trying to find routes that are wheelchair-friendly or for different purposes than route planning", but then they address a very specific route planning case only, failing to show the superiority of their solution in this respect
- related work, 2.1: the authors make an interesting overview of standards and vocabularies in the transport domain, but then they mainly use Linked GTFS, thus all the following claims on "interoperability" are based on the use of a single model (even if it is a reasonable choice, yet not covering all potential needs for public transport data representation)
- related work, 2.2: the authors "criticize" the existing dumps and APIs in use by the transport operators, because of restricted access and server-side costs; indeed, the operators may *want* to limit access to their data because of business choices; moreover, moving the burden to the client side doesn't seem an advantage from the client point of view
- related work, 2.4: the authors make a strong claim on "route planning is the most prominent use case over PT data" which may be true from the traveller's point of view, but not necessarily from the operator's one; moreover, they choose a specific algorithm (CSA) and a specific kind of query (earliest arrival time): this does not clarify why the proposed approach should be better than a dedicated API for this specific scenario
- related work, 2.5: "unfortunately, historical PT data is not easy to come by": this may be true if you limit the scope to open data, but the transport operators do have and maintain historical information! The fact that the data is not openly available doesn't mean that it doesn't exist, and the operators have understandable business reasons to keep that data private
- linked connections: "we showed that indeed LC achieves a better cost-efficiency by consuming considerably less computational resources on the server side": and what about the client side? The authors always seem to ignore the burden on the client side and, as a consequence, on the server-client system in its entirety
- linked connections, 3.1: the term "vehicle" is used but not introduced; however, since it is never reused in the rest of the paper, I'd suggest making the explanation more abstract and leaving vehicles out of it
- linked connections, figure 4: I'm not sure I got the explanation: are the connections simply in temporal order or does the structure take into account the network topology? or does each document refer to a single specific connection? The textual explanation also does not introduce dt_k and references the figure only at the end of the subsection
- linked connections, 3.4: "access to this data could support analytical studies..." but this is not route planning! What kind of queries do you expect on historical data? "historical data queries are not as performance-critical as live data queries for route planning purposes...": this is true *only if* you want to have a route planning query! What about a query like: give me all trips that were delayed by more than 10 minutes at least 50% of the time during the last 2 months?
- evaluation: as said in the summary at the beginning, the entire evaluation is LC vs LC; there is no comparison with a custom Web API approach: since in the end the authors address only *one* query type using *one* algorithm, it would be fair to compare the proposed architecture with a traditional solution, to show that it is not only an academic exercise
- evaluation, intro: in H1 the authors introduce the term "performance" but don't specify what it refers to; in the rest of the paper it is clear that they only refer to query response time, but they never address completeness or correctness of results (which are other interesting parts of "performance"). Moreover, H2 seems to be expressed in a reverse form: "it is possible to find PT networks whose topological characteristics improve route planning query performance when published over LC interfaces". I would have expected the contrary, i.e. that LC interfaces may improve response time in case of specific PT topology characteristics... In this subsection, the authors introduce the data, but the details only come much later: I would suggest a dedicated "data" section, followed by the "evaluation protocol", then "results" and "discussion"
- evaluation, 4.2: the authors introduce walking transfers but it is not clear if those are also modeled as connections. As said, performance is only query response time: do the authors always know the correct/complete solution for each EAT query?
- evaluation, table 2: are those numbers computed only on planned connections? how are they computed (e.g. multiple executions of each query)? I don't understand what the sparkline represents, because it does not look like a frequency distribution: what is on the x axis? time or query repetition? how long is the average duration of the query? is this computed over the entire fragmentation size space?
- evaluation 4.3 and results 5.1: this is what I'd move earlier as part of a "data" section
- results, figure 5: it is completely obscure to me what is represented in the pictures and it is also unclear why it is important; the visualization choice with the black background is very questionable (especially when printed or viewed in black and white), but I also don't find this picture particularly relevant
- results, 5.2: "we use a per-connection result to remove the influence of route query length" but isn't it also influenced by the query "complexity"? Since the queries were randomly generated, we cannot understand their "difficulty" (e.g. long path but fewer alternative options vs. short path with a lot of alternatives). The "normalization" seems a way to hide the actual response time
- results, figure 7: why did the authors choose the 90th percentile?!? the (normalized) response time depends on the query complexity!
- results, 5.3: the authors could simply provide the correlation between the response time and the various metrics (with the respective statistical significance) instead of trying to infer it from the plots in figure 8; moreover, I'd suggest that the authors provide several numerical outcomes (e.g. correlation, covariance, R^2, etc.) and display more meaningful plots, for example with linear (or other) interpolation, confidence intervals and similar, to visually display the "strength" of the relation between variables
- discussion, 6.1: "our approach performs such integration in an efficient way on the API server-side...": how can you say it's efficient? what about client-side? "...freeing data reuser applications from expensive data reconciliation tasks": the authors always test the *same* specific route planning use case, which could be implemented by a single API and they didn't compare to it! "live data updates are never recorded and get lost after being briefly published through traditional APIs": the authors can't say that! They can only say it is not left as OPEN historical data. There are a million ways to publish (or store) historical data and the authors offer only *one* way to do it! "More importantly, it also defines a query interface for this data": depending on the way to publish historical data, there can be other query interfaces!
- discussion, 6.2: if I got it correctly, all your evaluation was done on scheduled data only: what about live and historical? This subsection is the only place where the authors honestly admit that "for several networks this approach seems impractical for real scenarios". Here, they also introduce the discussion on caching: the absence of caching in your evaluation should have been mentioned before. Anyway, since there is no comparison to an alternative architecture, it is hard to tell whether the proposed solution (even with a cache) is practical or viable for real usage scenarios
- discussion, 6.3: a sentence may be missing on page 20, second column, line 39. Extra processing due to walking distance computation: the authors never explained how they considered or computed walking distance (which could be modeled as a connection itself)
- conclusions: "...as a cost-efficient data publishing": unsupported claim. "...facilitates data reuse for client applications": unsupported claim. "...establishes a framework for data interoperability": this is completely independent from the use of linked fragments and out of scope w.r.t. the presented evaluation; interoperability in this case emerges from the reuse of a reference set of vocabularies. Moreover, the authors did not explore further semantic possibilities, by attaching additional "semantics" to connections: the authors used only connection time, but they could have used length (shortest path) or traffic condition (less congested path) as in [1], or "beauty" or "happiness" of the route as in [2], and they could have tried to demonstrate the potential of the approach to be applied to several aspects of the data, even keeping the investigation on route planning only. "...careful design of API data structure can affect and improve the performance...": isn't this obvious? "...the approach performs well enough to be used in practical scenarios...": unsupported claim, indeed in section 6.2 the authors admitted the exact opposite
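To make the suggestion on section 5.3 more concrete, here is a minimal sketch of the kind of numerical outcome I have in mind: a Pearson correlation between one topology metric and the response time, its R^2, and a significance estimate. The numbers are HYPOTHETICAL and not taken from the paper; the function names `pearson_r` and `permutation_p_value` are my own, and the permutation test is just one possible choice of significance test (the authors may prefer a different one).

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, rounds=2000, seed=0):
    """Two-sided p-value: share of shuffles whose |r| reaches the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(shuffled)
        # Small tolerance so floating-point noise does not hide exact ties.
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)


# HYPOTHETICAL values, for illustration only (not results from the paper):
# one entry per network, a topology metric and a response time in ms.
avg_degree = [2.1, 2.4, 3.0, 3.3, 3.9, 4.2, 4.8, 5.5]
response_ms = [410.0, 455.0, 520.0, 610.0, 640.0, 700.0, 790.0, 905.0]

r = pearson_r(avg_degree, response_ms)
p = permutation_p_value(avg_degree, response_ms)
print(f"r = {r:.3f}, R^2 = {r * r:.3f}, permutation p = {p:.4f}")
```

Reporting one such triple per metric, next to the plots of figure 8, would let the reader judge the strength of each relation directly instead of inferring it visually.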
[1] I. Celino, E. Della Valle, D. Dell'Aglio, F. Steinke, R. Grothmann, V. Tresp: "Semantic Traffic-Aware Routing for the City of Milano using the LarKC Platform", IEEE Internet Computing, DOI: 10.1109/MIC.2011.107, 2011
[2] D. Quercia, R. Schifanella, L. M. Aiello: "The shortest path to happiness: recommending beautiful, quiet, and happy routes in the city", Proceedings of the 25th ACM conference on Hypertext and social media, 2014
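As a side note to my comment on section 3.4, the analytical query I mention (trips delayed by more than 10 minutes at least 50% of the time in a given period) could look like the following minimal sketch. It assumes the historical data has already been flattened into `(trip_id, service_day, delay_minutes)` tuples; this record layout, the function name `frequently_delayed_trips` and the sample records are all HYPOTHETICAL and are not part of the paper or of the Linked Connections specification.

```python
from collections import defaultdict
from datetime import date


def frequently_delayed_trips(records, since, min_delay_min=10, min_share=0.5):
    """Return trip ids delayed by more than min_delay_min minutes in at
    least min_share of their runs on or after the date `since`."""
    totals = defaultdict(int)  # observed runs per trip in the period
    late = defaultdict(int)    # runs exceeding the delay threshold
    for trip_id, service_day, delay_min in records:
        if service_day < since:
            continue  # outside the observation period
        totals[trip_id] += 1
        if delay_min > min_delay_min:
            late[trip_id] += 1
    return sorted(t for t, n in totals.items() if late[t] / n >= min_share)


# HYPOTHETICAL records, for illustration only.
sample_records = [
    ("A", date(2020, 1, 5), 12),
    ("A", date(2020, 1, 6), 15),
    ("A", date(2020, 1, 7), 3),
    ("B", date(2020, 1, 5), 11),
    ("B", date(2020, 1, 6), 0),
    ("B", date(2020, 1, 7), 0),
    ("B", date(2020, 1, 8), 2),
    ("C", date(2019, 10, 1), 30),  # before the period, ignored
]

delayed = frequently_delayed_trips(sample_records, since=date(2020, 1, 1))
print(delayed)
```

The point is that such a query aggregates over trips and days rather than following connections in departure-time order, so it is not obvious that a time-ordered fragment interface serves it well; the authors should clarify which historical queries they intend to support.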