Review Comment:
This paper presents a dataset about Web APIs, which has been generated from the directory website ProgrammableWeb.com by screen-scraping, and furthermore interlinked with a few existing linked datasets. Like its previous revisions, the paper …
* clearly motivates the need for such a dataset,
* explains the data source reasonably well,
* explains the ontology, which has been designed for this purpose, very well,
* explains the URI naming scheme and some statistics about the dataset,
* covers the interlinking, and
* presents as many as five (5) use cases, whose practical relevance is pointed out clearly.
The latest revision features the following main enhancements w.r.t. the criteria for dataset papers:
* Regarding the quality and stability of the dataset, it now provides additional information about the process of generating the dataset (Section 4.2) and is more explicit w.r.t. the quality criteria assessed (Section 6.1).
* Regarding the usefulness of the dataset, the observations from the user survey are now explained in slightly more detail (Section 7.2).
* Regarding the clarity and completeness of the descriptions, the paper has generally improved.
I recommend acceptance; however, the authors should take care to update all figures on interlinking. Not only is the number of out-links likely to grow during the subsequent maintenance of the dataset, but on top of the in-links received from DBpedia in October 2015, I would also expect more in-links to be created.
The two _minor_ concerns from my previous review were actually not fully addressed in this revision. Let me re-state them, trusting that you will address them in the final version.
* Section 7.1 "use cases": I wonder whether the queries that use prov:generatedAtTime make sense. If ProgrammableWeb does not record the history of versions of an API/mashup, then this property probably has, in effect, the semantics of "last updated on". Also, your ontology does not cover version histories. I would appreciate a brief discussion of these aspects (2–3 sentences).
* Section 7.2 "survey": My question about whether your survey participants had used ProgrammableWeb is now answered, but the following issue has not yet been addressed: some more information on the background of the users would be helpful, i.e. being more specific than "all of the participants […] have searched or used an API, while 19 […] also provide an API". E.g. in what _ways_ are they using APIs, and in what situations of their work do they consider your dataset helpful? (The distinction between the perspectives of consumer vs. provider is already a good step in this direction!)
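To illustrate the first of these two concerns, here is a minimal sketch in Python. It assumes that each resource keeps a single timestamp that is overwritten whenever the resource is re-scraped. The store, the function names, the URIs and the dates are all hypothetical and not taken from your dataset.

```python
from datetime import date

# Hypothetical store: one timestamp per resource, standing in for a single
# prov:generatedAtTime value (assumption: no version history is kept).
snapshot = {}

def record_scrape(store, uri, seen_on):
    """Overwrite the single timestamp kept for a resource."""
    store[uri] = seen_on

def generated_since(store, cutoff):
    """Return the resources whose only timestamp is on or after the cutoff."""
    return sorted(uri for uri, ts in store.items() if ts >= cutoff)

# Illustrative dates only.
record_scrape(snapshot, "api/example-a", date(2012, 3, 1))  # first seen
record_scrape(snapshot, "api/example-b", date(2015, 6, 1))  # first seen
record_scrape(snapshot, "api/example-a", date(2015, 9, 1))  # update overwrites 2012

result = generated_since(snapshot, date(2015, 1, 1))
print(result)
```

Under this assumption, a query for resources "generated" since 2015 also returns the resource that first appeared in 2012, because only its last update is recorded.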
There are also a few places in which the grammar still needs fixing, e.g. in Section 7.2: "<*> majority of the participants" (missing article; see http://ell.stackexchange.com/questions/38244/what-is-the-difference-in-m... as a guide on whether to choose "a" or "the"). Another example of a sentence with poor _style_ is in Section 8.2: "A currently ongoing effort (up to this point rephrasing will help) is on integration of the (up to here, rephrase once more) […]".
In the references, there is a UTF-8 encoding problem in [10], and the metadata for [11] is not up to date; use the following:
@article{Zaveri2012:LODQ,
author = {Zaveri, Amrapali and Rula, Anisa and Maurino, Andrea and Pietrobon, Ricardo and Lehmann, Jens and Auer, S{\"o}ren},
journal = {Semantic Web Journal},
number = 1,
title = {Quality Assessment for Linked Data},
url = {http://www.semantic-web-journal.net/content/quality-assessment-linked-da...},
volume = 7,
pages = {63--93},
year = {2016},
}