Review Comment:
This paper presents a dataset about Web APIs that was generated by screen-scraping the directory website ProgrammableWeb.com and interlinked with a few existing linked datasets. Like the previous revision, the paper …
* clearly motivates the need for such a dataset,
* explains the data source reasonably well,
* explains the ontology, which has been designed for this purpose, very well,
* explains the URI naming scheme and some statistics about the dataset,
* covers the interlinking, and
* presents as many as five (5) use cases, whose practical relevance is pointed out clearly.
The latest revision features the following main enhancements: it
* provides evidence for the usefulness of the data, by mentioning in-links from DBpedia (thus proving at least the beginning of third-party use) and a survey of a small group of users w.r.t. the subjective usefulness of the dataset.
* discusses the quality of the dataset (largely by following the 5-star open data scheme – although _even_ more could be done here, e.g. discussing more specific quality metrics such as those presented in http://www.semantic-web-journal.net/content/quality-assessment-linked-da...) and the stability of the dataset (briefly, by explaining how it is, and will be, maintained)
* discusses related work.
The latest revision thus meets, not perfectly but sufficiently, the three criteria for dataset papers. Most of my more specific concerns have also been addressed. Moreover, the dataset is feature-complete and appears to be the result of solid work. I recommend acceptance with the following minor revisions:
* The grammar is still not perfect (e.g. w.r.t. the use of articles); please have a native speaker review the paper.
* section 7.1 "use cases": I wonder whether the queries that use prov:generatedAtTime make sense. If ProgrammableWeb does not record the version history of an API/mashup, then this property probably has the effective semantics of "last updated on <date>". Also, your ontology does not cover version histories. I would appreciate a discussion of these aspects.
* section 7.2 "survey": A broader user base would be helpful, plus some more information on the participants' background, i.e. being more specific than "all of the participants […] have searched or used an API, while 19 […] also provide an API". E.g. in what _ways_ are they using APIs, and in what situations in their work do they consider your dataset helpful? (The distinction between the perspectives of consumer vs. provider is already a good step in this direction!) And finally, have these users used ProgrammableWeb, and if so, do they find ProgrammableWeb or your dataset more useful?