Review Comment:
This survey presents an overview of tools for LD consumption, which, given the widespread preference for non-LD data, remains a crucial topic to cover. The authors go beyond the traditional survey format by identifying an LD consumption process and a pipeline/platform (LDCP) together with 36 requirements and 93 evaluation criteria. Altogether, 110 tools are considered, from which 9 are picked for a more thorough assessment based on the identified criteria.
The paper is very comprehensive and the topic is certainly of relevance.
While I generally appreciate the more original approach of this survey, its impact on the manuscript is both positive and negative. Most notably, the authors base their survey on certain assumptions, e.g. by positing the need for a "platform" or by defining requirements for such a platform, which are often debatable and limit the transferability of the results. In other words, such decisions constrain the survey (and the surveyed tools) in a way that not everyone may find agreeable. While "subjective" assessments are usually hard to avoid in any survey, the approach chosen by the authors even constrains the selection of the surveyed tools themselves.
Along these lines: the authors' approach is to filter existing tools by applying all of their criteria at once. This essentially means that the final set of tools is meant to provide as many features as possible (while tools which do a single task very well are left out). This, again, does not seem to match the practices we find in software development. In particular, for the wide range of features mentioned (loading, discovery, linking, mapping, visualisation), it is not convincing to ask for one tool that is good at everything.
Also, the distinction between criteria and requirements is not very clear (and the two seem somewhat redundant). I would suggest sticking to one of them.
The process described in Section 2 paints an idealistic and somewhat naive picture of the state of Linked Data, where general issues of LD (accessibility/availability, quality, currency) seem to be ignored in favour of LD-centric functionalities. However, it is well established that LD is usually superior on the format/vocabulary side but lacking in general usability/quality. The same applies to metadata in public data catalogs, which is often outdated or too sparse to be of use. Hence, independently of the right tool support for querying DCAT et al., the underlying dataset quality seems to be the main obstacle. The same applies to the problem of schema matching (and the example referred to in the paper): sure, schema.org types and WGS84 types are described in LOV, but there are again the usual practical issues w.r.t. schema evolution, e.g. types/definitions and mappings in LOV not being consistent with either (a) the actual schemas at a given point in time or (b) the schema terms actually used in a particular dataset. It is these practical issues, not the lack of tool support, that hinder the kind of idealistic consumption process proposed here.
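To illustrate the point: even with perfect tool support, a catalog query only surfaces what providers actually maintain. A minimal sketch in Python with rdflib (the file name is a placeholder and the library choice is my own, not something taken from the paper):

```python
# Minimal sketch: querying DCAT metadata from a (hypothetical) local catalog dump.
# Datasets with missing or stale dct:modified values are precisely the
# metadata-quality problem described above; no amount of tool support fixes that.
from rdflib import Graph

g = Graph()
g.parse("catalog.ttl", format="turtle")  # placeholder file name

q = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct:  <http://purl.org/dc/terms/>
SELECT ?dataset ?modified WHERE {
  ?dataset a dcat:Dataset .
  OPTIONAL { ?dataset dct:modified ?modified }
}
"""
for row in g.query(q):
    print(row.dataset, row.modified)  # ?modified is often simply absent
```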
The criteria and requirements in Section 4 come across as somewhat arbitrary and, once again, very LD-centric. In particular, if the goal is to improve LD consumption by non-LD experts, these requirements and criteria should ideally be developed with non-LD experts: what do non-LD experts miss when interacting with LD? Again, my feeling is that there may be an element of missing tools, but the challenges usually lie less in tool support and more in data quality.
A more specific (non-exhaustive) set of questions/suggestions:
- Introduction, when stating "it is possible to install locally or instance is available on the Web". Wouldn't that apply to anything? What are tools to which this does not apply?
- Criterion 2.1: do you expect a tool to come up with profiles for datasets? Isn't that more of a data provider side issue?
- Criteria 3.2 and 3.2 do not seem very clear to me (in particular, how they differ).
- Example 5.1 / Criterion 5.1: I am not sure whether data consumers really need this. Generally, criteria 5 and 6 seem to refer to traditional IR approaches to query expansion.
- Another observation is that the problem of dataset discovery receives a lot of attention, while the issue of actual data retrieval (given a particular dataset) is less well covered.
- The notion of "Context-based" ... is not very clear throughout. As it is described in the paper, it sounds more like query expansion/enrichment.
- Criteria 7.1 - 7.3: Isn't the ranking always dependent on both content and intent? How can one be true but not the other?
- Criterion 10.1 seems rather broad.
- Criterion 11.2: is crawling a requirement too? In what way/to what depth?
- Criterion 12.1: What is a "built-in query"?
- In part, the criteria seem to replicate LD principles and standards ("SPARQL named graphs", "IRI dereferencing", "loading from Turtle, RDF/XML etc."). Wouldn't it be easier to refer to these in a less bloated way ("support LD principles and associated W3C standards")? A minimal illustrative sketch of what such support boils down to in practice follows after this list.
- Criterion 13.5: do you mean "loading triples from RDFa-annotated XHTML document" rather than "loading RDFa file"?
- The further one goes through the criteria, the more they come across as a near-exhaustive list of LD tasks, not all of which are equally related to the "consumption" of LD; e.g. Link Discovery (Criterion 18.1), Vocabulary Mapping (19.1), and Ontology Alignment (20.1) seem to be more data-provider tasks.
- How realistic is the "License Management" criterion (26), given that this kind of information is often missing or expressed in non-standard ways?
- Tables 3 and 4 ultimately demonstrate that the hunt for "the" platform is unlikely to succeed.
- When showing scores in Tables 1-4, it would be good to add totals as well.
- Generally, the actual evaluation (Section 5) seems rather brief, and it is not quite clear what the lessons learnt are. It would have been more informative to evaluate strong tools for each individual category/requirement, rather than picking a selected few with the largest range of functionalities.
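As promised above, a minimal sketch of what "support LD principles and associated W3C standards" amounts to in practice, again in Python with rdflib (the file name and IRI are placeholders of my own choosing, purely for illustration):

```python
# Minimal sketch: the two basic operations behind many of the listed criteria,
# parsing a standard serialisation and dereferencing an IRI.
from rdflib import Graph

g = Graph()
g.parse("dataset.ttl", format="turtle")        # loading a local Turtle file
g.parse("http://dbpedia.org/resource/Berlin")  # IRI dereferencing via HTTP content negotiation
print(len(g), "triples loaded")
```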
The English requires thorough proofreading. Sentences often read awkwardly and the language is poor in many places.
In summary, I would recommend the following:
- Restructure and streamline the paper: use either requirements or criteria, not both. Don't mix the criteria with the actual tools/survey (as is currently the case).
- Survey tools for individual tasks ("LD federated search", "LD dataset recommendation", "LD profiling", "LD query interfaces", etc.) rather than software with as many features as possible.
- Improve clarity and revise language.
Note that these changes go beyond the scope of a major revision, i.e. they require setting up the survey in a different way and considerably changing both the evaluation process and the selection of surveyed tools.
Minor:
- the paper does not use gender-neutral language (it always refers to the male form, i.e. "he")
- intro: "Each requirement is either back by an existing W3C Recommendation" ("back by"?)
- once and for all: it's "DBpedia" (not "DBPedia") ;-)