Review Comment:
This paper describes the implementation and evaluation of a GUI tool that simplifies interaction with Linked Data and RDF for users with little prior knowledge of Linked Data technologies, terminology and technical intricacies.
This is a revised submission of the paper, and I am reviewing it as such. I appreciate the authors' effort, in particular in responding to the high-quality comments from Reviewer 1. I believe that with minor changes, the paper can be accepted for publication in the category of Tools and Services.
(1) Quality, importance, and impact of the described tool or system (convincing evidence must be provided).
The paper (and tool) shows evidence of being of interest to a range of stakeholders (although maybe not yet useful - but that may be a reflection on the LD technology, not the tool itself) and may lead to interesting discussions and discovery of LD technologies by a range of application specialists.
(2) Clarity, illustration, and readability of the describing paper, which shall convey to the reader both the capabilities and the limitations of the tool.
I think that the paper is well written and reasonably clear. There are, however, a small number of remaining issues which, if addressed, would in my view improve the clarity of the paper and the tool:
Major comments:
- I am unsure how the two types of filters shown in the demo differ. One is an object property, the other a data type property. But for instance, for the concept "Place", municipality code is a string data type property - yet it is a nominal value that must come from a list. This should be offered/discovered by such a tool (and really behave as an object property). If the tool is to add value for people who explore unknown schemas, indicating the value range is important (at least through some sort of comments or metadata, if real instances cannot be shown). I would expect that a mechanism reminiscent of, say, OpenRefine (openrefine.org) or other data wrangling tools could be of assistance here.
- The nomenclature in the tool changed, but not in the paper: on p2, PAM is still mentioned, but it does not appear in the interface (and the video is not updated, but that is a minor comment). I think that all mentions of PAM could be deleted (I do not think that this is real provenance access); it is merely a stored-query browser, or query-history browser. One question that could be addressed in this respect is whether the query browser only stores the queries (so that the results are always recalculated, and may differ from the original results if the source data changed) or whether there is an option to store historical results as well. Both are important functionalities, but the control should be up to the user. Why? Because one may publish an analysis based on such an interface, and the readers may want to scrutinise the results. If they obtain different results, the provenance of the original results should be assured, and the information retained. This is a common concern addressed by developers of eResearch tools.
- The way to compose more complex queries with logical operators seems to be through clicking on the filter icon. This was not entirely intuitive. I am unsure how to write more complex queries with operators other than AND (OR, NOT, etc.). I suspect that some of the users will be familiar with approaches to analysing data using SQL, but may need handholding with respect to translating familiar query composition to LD. How can I specify queries such as cities with a population of > 10000 that are twin cities of a birth city of soccer player XY, or that have a river running through with length < 100000 m (convert to km - as an added advantage of LD)? I realise that some of these may be beyond the capabilities of the tool, but a discussion of non-trivial queries would be appreciated. An example of an equivalent of a SQL join, in particular, would help new users of LD appreciate the capabilities.
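To make the last comment concrete, the following is a minimal sketch of the kind of query I have in mind, written against a toy in-memory set of triples rather than the tool or any real endpoint. All identifiers (PlayerXY, CityA, twinnedWith, hasRiver, lengthMetres and so on) are hypothetical illustration data, not taken from the paper or from any actual vocabulary. I read the example as "population > 10000 AND (twin city of the player's birth city OR has a river shorter than 100000 m)"; other readings are possible.

```python
# Toy triple set; every subject, predicate and value here is hypothetical.
TRIPLES = {
    ("PlayerXY", "birthPlace", "CityA"),
    ("CityB", "twinnedWith", "CityA"),
    ("CityC", "twinnedWith", "CityA"),
    ("CityD", "hasRiver", "RiverR"),
    ("CityE", "hasRiver", "RiverS"),
    ("CityB", "population", 50000),
    ("CityC", "population", 8000),
    ("CityD", "population", 20000),
    ("CityE", "population", 30000),
    ("RiverR", "lengthMetres", 42000),
    ("RiverS", "lengthMetres", 250000),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """All subjects s such that (s, predicate, obj) is in the data."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def matching_cities(player, min_population=10000, max_river_m=100000):
    # Branch 1, a two-step join: player -> birthPlace -> city <- twinnedWith <- twin.
    twins = set()
    for birth_city in objects(player, "birthPlace"):
        twins |= subjects("twinnedWith", birth_city)

    # Branch 2, another join: city -> hasRiver -> river -> lengthMetres.
    short_river = {
        city
        for city, p, river in TRIPLES
        if p == "hasRiver"
        and any(length < max_river_m for length in objects(river, "lengthMetres"))
    }

    # OR between the two branches, then AND with the population filter.
    candidates = twins | short_river
    return sorted(
        city
        for city in candidates
        if any(pop > min_population for pop in objects(city, "population"))
    )


print(matching_cities("PlayerXY"))
```

Each branch chains two triple patterns through a shared variable, which is the LD counterpart of a SQL join; the set union plays the role of OR. Showing how such a query would be assembled in the interface, or stating that it cannot be, would help readers coming from SQL.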
Additional minor editorial comments:
- Abstract: "and and even edited" - remove excessive "and"; delete "hence"
- Introduction, para 1: it is better style to place the reference where it does not interrupt the flow of reading. I suggest moving [3] to the end of this sentence.
- Introduction, lod-cloud footnote: this is a statistic from April 2014. Is there any more up-to-date information you can cite? This is dated given the advances in the field (at least, one would hope so).
- Introduction: I have an issue with the sentence "Unfortunately, the emergence of a wide number of tools supporting people to publish their data as Linked (Open) Data, has not been complemented by approaches supporting them to consume existing Linked Data in formats other than RDF [3]". Isn't RDF THE format for LD? I think that the issue addressed here is one of presentation, interfacing and terminology, rather than the encoding behind the scenes. Compare, for example, Excel: the xlsx format encodes the data as XML, but the data are presented through the familiar metaphor of a spreadsheet. This sentence may need editing to reflect this (provided I understand properly what the authors mean).