Review Comment:
The paper presents SYQL, a platform for linked data wrappers, built on top of Yahoo YQL and Open Data Tables. The paper's contribution is a) a set of requirements for linked-data wrapper platforms, b) a proof-of-concept implementation, and c) an evaluation of the platform's effectiveness.
The work is well motivated: linked data wrappers (LDWs), as translators between API data sources and RDF, are an important component of the Web of Data, and the paper observes a number of issues common to LDWs.
The paper has several weaknesses that need to be addressed: i) requirements and their evaluation, ii) inclusion of non-archival material, and iii) various aspects of editorial quality.
First, the paper lists 7 requirements for LDW platforms, but it doesn't discuss how those requirements were arrived at – what is the source of the requirements, and what methodology was followed in eliciting them. The Requirements section should also quickly introduce the whole list of the requirements and then, perhaps, discuss them. The paper should include an evaluation of the requirements, based on lessons learned from building your proof-of-concept platform SYQL.
A crucial requirement is that LDWs should be provided as services, not as by-products of application development. However, the authors ignore the economics of the situation: a service needs to make business sense or be run as a charity (such as Wikipedia). In many applications, the effort of externalizing the LDWs is not justified. The authors repeatedly claim that they don't expect altruism, but they do not point to any mechanism that wouldn't require it.
Secondly, the authors should consider *when* the paper will be read: initially, it will serve to let its readers know that you are working on the SYQL system; but later, it will mainly be a reference for the lasting contributions to human knowledge: the set of requirements (if it is methodically produced), your evaluation of the requirements, and the lessons learned from your proof-of-concept work. Conferences are a good venue for current work that may be obsolete when the next technology comes out, but journals are better for lasting contributions and generalizable results.
From this point of view, you should remove all the "demo time" content, and ruthlessly remove text about implementation details aimed at your potential (short-term) users. You may include footnotes with pointers to online materials (demos, tutorials) and perhaps an online appendix about the technical details.
Thirdly, a good paper does not only require solid content, but also a good level of polish: readability, clarity, and accuracy. Below are selected suggestions for improving these qualities.
* The paper's grammar needs improvement. Please seek help with that.
* It needs to improve the references (e.g. the word "strony" in [68] and elsewhere is entirely inappropriate, [18] needs authors, [45] needs more bibliographic information, and these comments apply to many of the references)
* Figures need attention: often they try to show too much and that harms their usefulness. E.g. Figures 2, 4 and 16 show a sea of code that doesn't truly illustrate the point. Figures 5, 6 and 12 try to cram too many boxes into a small space.
* Section 2 is not entirely useful: a typical reader of this article should know this material. It might be useful simply to list the terms used further in the paper, with references for more information about them.
* Section 3 can be part of the introduction as it motivates and defines the problem addressed by the work.
* The paper feels repetitive at times; e.g. the quote from [24] just before MR3 on page 5 is already used in the introduction.
* MR5 is not defined very well: the description for "allow for resource lookup" is mostly about API keys, and a bit about provenance.
* In section 5.3, it is not clear why we would assign weather reports to pictures; indeed the picture should be about a place, and a place could have a weather report, and this information should be discovered from the LOD cloud, not within the scope of a single LDW.
* The paper should explore the "unskilled target audience" angle mentioned in section 7.2. What are the benefits of semantic technologies to an unskilled audience, and how can such an audience really contribute?
* The paper should explain how the platform can ensure backwards compatibility in LDW evolution.
* Section 12 should summarise the results in [43].
* Section 14 could use a reminder of what the requirements are, and it should explain why it skips requirements 1 and 2.
* The paper seems to conflate PaaS with running LDWs as a service. Requirement MR1 is about LDWs, while the text before, on PaaS, is about platforms being provided as a service. An app built on PaaS does not itself have to be seen as a service.
* "taping" -> "tapping"
* "petitions" -> "requests"
Altogether, the paper may contain a solid contribution; it just needs to bring it out and make sure it is consistent and well grounded.