Survey of Tools for Linked Data Consumption

Tracking #: 1774-2986

Authors: 
Jakub Klimek
Petr Skoda
Martin Necasky

Responsible editor: 
Oscar Corcho

Submission type: 
Survey Article
Abstract: 
There is a lot of data published as Linked (Open) Data (LOD/LD). At the same time, there is also a multitude of tools for the publication of LD. However, potential LD consumers still have difficulty discovering, accessing and exploiting LD. This is because, compared to the consumption of traditional data formats such as XML and CSV files, there is a distinct lack of tools for the consumption of LD. The promoters of LD use the well-known 5-star Open Data deployment scheme to suggest that consumption of LD is a better experience once the consumer knows RDF and related technologies. This suggestion, however, falls short when consumers search for appropriate tooling support for LD consumption. In this paper we define an LD consumption process. Based on this process and the current literature, we define a set of 36 requirements that a hypothetical Linked Data Consumption Platform (LDCP) should ideally fulfill. We cover these requirements with a set of 93 evaluation criteria. We survey 110 tools identified as potential candidates for an LDCP, eliminating them in 4 rounds until 9 candidates remain. We evaluate the 9 candidates using our 93 criteria. Based on this evaluation, we show which parts of the LD consumption process are covered by the 9 candidates. We also show that there are important LD consumption steps which are not sufficiently covered by existing tools. The authors of LDCP implementations may use our paper to decide on directions for the future development of their tools. The paper can also be used as an introductory text to LD consumption.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
By Stefan Dietze submitted on 07/Dec/2017
Suggestion:
Major Revision
Review Comment:

This survey presents an overview of tools for LD consumption, which, given the widespread preference for non-LD data, is still a crucial topic to cover. The authors go beyond the traditional survey format by identifying an LD consumption process and a pipeline/platform (LDCP) together with 36 requirements and 93 evaluation criteria. Altogether, 110 tools are considered, from which 9 are picked for a more thorough assessment based on the identified criteria.

The paper is very comprehensive and the topic certainly of relevance.

While I generally appreciate the more original approach of this survey, its impact on the manuscript is both positive and negative. Most notably, the authors base their survey on certain assumptions, e.g. by defining the need for a "platform" or by defining requirements for such a platform, which are often debatable and limit the transferability of the results. In other words, such decisions constrain the survey (and the surveyed tools) in a way which might not be agreeable to everyone. While "subjective" assessments are usually hard to avoid in any survey, the approach picked by the authors even constrains the selection of the surveyed tools themselves.

In this line: the approach of the authors is to filter existing tools by applying all of their criteria at once. This basically means that the final set of tools is meant to provide as many features as possible (while tools which do a single task very well are left out). This, again, does not seem to match the practices we find in software development. In particular, for the wide range of features which are mentioned (loading, discovery, linking, mapping, visualisation), it does not seem convincing to ask for one tool which is good at everything.

Also, the distinction between criteria and requirements does not seem very clear (and both seem somewhat redundant). I would suggest sticking to one of the two.

The process described in Section 2 paints an idealistic and somewhat naive picture of the state of Linked Data, where general issues of LD (accessibility/availability, quality, timeliness) seem to be ignored in favor of LD-centric functionalities. However, it is a well-established fact that LD is usually superior on the format/vocabulary side but lacking in general usability/quality. The same applies to metadata in public data catalogs, which are often outdated or too sparse to be of use. Hence, independent of the right tool support for querying DCAT et al., the underlying dataset quality seems to be the main obstacle. The same applies to the problem of schema matching (and the example referred to in the paper): sure, schema.org types and WGS84 types are described in LOV, but there are again the usual practical issues with respect to schema evolution, e.g. types/definitions and mappings in LOV not being consistent with either (a) the actual schemas at a given point in time or (b) the schema terms actually used in a particular dataset. It is these practical issues that hinder the kind of idealistic consumption process proposed here, not the lack of tool support.

The criteria and requirements in Section 4 come across as somewhat arbitrary and, once again, very LD-centric. In particular, if the goal is to improve LD consumption by non-LD experts, these requirements and criteria should ideally be developed by non-LD experts: what do non-LD experts miss when interacting with LD? Again, my feeling is that there might be an element of lacking tools, but the challenges usually lie less in tool support and more in data quality.

A more specific (non-exhaustive) set of questions/suggestions:

- Introduction, when stating "it is possible to install locally or instance is available on the Web". Wouldn't that apply to anything? What are tools to which this does not apply?

- Criterion 2.1: do you expect a tool to come up with profiles for datasets? Isn't that more of a data provider side issue?

- Criteria 3.2 and 3.2 are not very clear to me (in particular the difference between them).

- Example 5.1 / Criterion 5.1: I am not sure that data consumers really need this. Generally, 5 and 6 seem to refer to traditional IR approaches for query expansion.

- Another observation is that the problem of dataset discovery receives a lot of attention, while the issue of actual data retrieval (given a particular dataset) is covered less well.

- The notion of "Context-based" ... is not very clear throughout. As it is described in the paper, it sounds more like query expansion/enrichment.

- Criteria 7.1 - 7.3: Isn't the ranking always dependent on both content and intent? How can one be true but not the other?

- Criterion 10.1 seems rather broad.

- Criterion 11.2: is crawling a requirement too? In what way/to what depth?

- Criterion 12.1: What is a "built-in query"?

- Partially, the criteria seem to replicate LD principles and standards ("SPARQL named graphs", "IRI dereferencing", "loading from Turtle, RDF/XML etc."). Wouldn't it be easier to refer to these in a less bloated way ("supports LD principles and associated W3C standards")?

- Criterion 13.5: do you mean "loading triples from RDFa-annotated XHTML document" rather than "loading RDFa file"?

- The further one goes through the criteria, the more they come across as a near-exhaustive list of LD tasks, not all of which are equally related to the "consumption" of LD; some (e.g. Link Discovery/Criterion 18.1, Vocabulary Mapping/19.1, Ontology Alignment/20.1) seem to be more data provider tasks.

- How realistic is the "License Management" criterion (26), given that this kind of information is often missing or expressed in non-standard ways?

- Tables 3 and 4 ultimately demonstrate that the hunt for "the" platform is unlikely to succeed.

- When showing scores in Tables 1-4, it would be good to add totals as well.

- Generally, the actual evaluation (Section 5) seems rather brief, and it is not quite clear what the lessons learnt are. It would have been more informative to evaluate strong tools for each individual category/requirement, rather than picking a selected few which have the largest range of functionalities.

The English requires thorough proof-reading. Sentences often read a bit bumpily, and the language is poor in many places.

In summary, I would recommend the following:
- Restructure and streamline the paper: don't use both requirements and criteria but only one of the two. Don't mix criteria with the actual tools/survey (as is currently the case).
- Survey tools for individual tasks ("LD federated search"; "LD dataset recommendation", "LD profiling", "LD query interfaces" etc) rather than software with as many as possible features.
- Improve clarity and revise language.

Note that these changes go beyond the scope of a major revision, i.e. they require setting up the survey in a different way and considerably changing both the evaluation process and the selection of surveyed tools.

Minor:
- The paper should use gender-neutral language (it always refers to the male form, i.e. "he")
- intro: "Each requirement is either back by an existing W3C Recommendation" ("back by"?)
- once and for all: it's "DBpedia" (not "DBPedia") ;-)

Review #2
By Ruben Taelman submitted on 22/Jan/2018
Suggestion:
Minor Revision
Review Comment:

# Survey of Tools for Linked Data Consumption

The authors introduce the requirements of a Linked Data Consumption Platform (LDCP),
which is a hypothetical platform for conveniently consuming Linked Data for a wide range of use cases.
These requirements correspond to criteria with which the authors evaluate the LDCP-applicability of 9 existing tools.
These 9 tools were selected from 110 tools through multiple elimination rounds.

The requirements of an LDCP as proposed by the authors are interesting and valuable for Linked Data tool researchers and developers.
While an LDCP that conforms to _all_ these requirements may sound nice in theory,
it would be very unlikely to accomplish this in practice.
As the authors have mentioned, several tools could indeed be combined to build one big tool that does it all,
possibly using Web APIs to resolve the issue of tools being implemented in different programming languages.
The complexity of such a combination of APIs, however, makes this seem unlikely to be accomplished.
Nevertheless, there is definitely a lot of value in being able to measure a tool's compliance with subsets of these requirements,
such as the five requirement groups introduced by the authors.

I have only one concern with this work, and that is how the evaluation criteria are defined.
Most of these criteria are very vague, which makes conformance to these criteria sometimes quite subjective.
Ideally, these criteria should be defined very concretely, and be testable in an objective way.
Perhaps listing all detailed criterion requirements in this paper is not beneficial to the paper's readability,
so an appendix for researchers wanting to repeat this evaluation could be considered.
For example, Criterion 1.3 (Metadata in SPARQL endpoint) is defined as:
"The evaluated tool can exploit VoID, DCAT or DCAT-AP dataset metadata stored in SPARQL endpoints."
It is not clear when a tool sufficiently "exploits" this metadata.
Is showing a checkmark indicating that such metadata exists sufficient?
Or should it be able to extract and/or derive certain knowledge from this metadata and present it somehow?

Apart from this one concern, the paper is very well written and structured.
Below, you can find several minor comments:

## 1. Introduction

The authors start by talking about the 5 stars of open data, but they never say _what_ these stars are.
A listing of them would be beneficial to the reader.

## 2. Motivating example

If the journalist from the example wants to use LD because of the 5 stars,
what is the motivation for still publishing his/her results as non-RDF (CSV) data?
What are the benefits of converting from RDF to non-RDF and using non-RDF tools to process the data
over just processing the data directly with RDF tools?
There are obviously benefits to both sides, but the example does not seem motivating enough for the non-RDF direction.

The authors seem to assume that any data processing will happen in the non-RDF domain,
which is why they require an LDCP to export to non-RDF formats.
In some cases, however, there is value in staying within the RDF domain for processing data (like [32] does).
Later, the authors do define criteria for RDF output, but from this motivating example,
it is not clear to me that non-RDF output is actually required in all cases.

## 4. Requirements and evaluation criteria

I liked the consistent requirement -> example -> criteria structure;
it makes this long list of requirements easier to follow.

## 4.2. Data manipulation

Requirement 16 (Version management)
In cases where a publisher supports some kind of mechanism to expose multiple versions of the same dataset next to each other,
such as Memento [1], an LDCP should be able to support this as well.
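For illustration, Memento-style version negotiation on the consumption side amounts to little more than datetime content negotiation over HTTP; a minimal sketch, assuming a hypothetical TimeGate URI:

```python
# Minimal sketch of Memento (RFC 7089) datetime negotiation.
# The TimeGate URI below is hypothetical.
import requests

timegate = "https://example.org/timegate/https://example.org/dataset/cities"
response = requests.get(
    timegate,
    headers={"Accept-Datetime": "Thu, 01 Jun 2017 00:00:00 GMT"},
    allow_redirects=True,
)

# The server redirects to the memento (prior version) closest to the
# requested datetime and reports that version's datetime in a header.
print(response.url)
print(response.headers.get("Memento-Datetime"))
```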

Page 17: "Therefore, he expects that the datasets are semantically related"
The user should be gender-neutral.

## 4.5. Developer and community support

Page 23: "The API should again be standardized, i.e. REST or SOAP based"
An API being standardized does not imply that it is REST- or SOAP-based, or vice versa.
Both aspects are important, but orthogonal.

Page 24: Criterion 35.1 and Criterion 35.2
The difference between a plugin and a project is not clear to me from the text.

## 5. Tools evaluation and description

Some of these tools were already mentioned in the examples of section 4.
I'm wondering if it would be easier for the reader if these tools were introduced before section 4,
so that they can more easily be mentioned in section 4, and section 5 would then explain the conformance to the requirements.

Page 28: typo: "SPARQL CONSTUCT"

## 6. Related work

The authors claim that they cover more recent tools than in [11].
However, one could also say that [11] focuses on more established tools, because they have been around much longer.
So I would not necessarily see this recency of tools as an advantage (which the text seems to hint at), but merely a difference.

[1] Van de Sompel, Herbert, et al. "Memento: Time travel for the web." arXiv preprint arXiv:0911.1112 (2009).

Review #3
By Daniel Garijo submitted on 24/Jan/2018
Suggestion:
Major Revision
Review Comment:

This paper introduces a set of requirements for consuming Linked Data, as well as a comparison of existing tools that partially address them. The authors claim that the text could also be treated as an introduction to Linked Data consumption.

The paper is well written and easy to follow. I also find it highly relevant for the Semantic Web Journal. The work presented is ambitious and a little unclear in scope, although I think the authors have done a good job unifying a set of requirements for different activities related to Linked Data consumption. In summary, I think the paper may be relevant to a broader Semantic Web community, but in its current state it presents several weaknesses which should be addressed before accepting it for the journal. I briefly describe them below:

The novelty and contribution of the work are not clear. The title indicates that the paper is a survey of tools for Linked Data consumption, and the paper has been submitted as a "survey paper" to SWJ. Yet, in the introduction the authors state that "This paper is not a detailed survey for existing research literature related to Linked Data consumption". The reader may then think that the set of requirements gathered, as an evaluation framework, could itself be a contribution. However, the authors also state that "The individual proposed requirements are not novel". By the time the reader completes the introduction, it is unclear what this work is proposing. My impression is that the paper focuses more on defining an evaluation framework for Linked Data consumption tools than on a survey.

The term "Linked Data consumption" is loosely defined, without a clear scope. An LDCP apparently has to support all the semantic web stack, from data catalog registries to query federation, data visualization, summarization, versioning, etc. The target user described in the original example has an initial simple objective in mind, i.e., collect data about cities in Europe and display the information on a map. However, this is suddenly expanded unrealistically assuming that the user not only knows semantic web technologies to a very granular level (e.g., doing SPARQL queries), but also suddenly cares about totally unrelated activities such as vocabulary mappings. The motivation scenario should be broken into different scenarios with different user roles, distinguishing domain users who just want to retrieve and visualize data from developers who want to integrate databases together, from curators who want to fix and enhance datasets and from knowledge engineers who care more about ontology mappings. In addition, the paper would be more sound if the authors provided evidence of the usefulness of the motivating example, either by citing user surveys or real-world projects that are currently benefiting from this kind of interaction.

Similarly, Table 2 would be more readable if it were organized by different research areas (visualization, mapping, linking, loading, etc.) and user roles.

I found it strange that there are no requirements in terms of efficiency and the expected time to receive answers from an LDCP. If a journalist wants to retrieve results, I guess that she/he will want to receive an answer in a reasonable amount of time. However, this aspect seems to be absent from the paper.

The methodology for identifying requirements seems a little biased. While I agree that many of them are very important for consuming Linked Data at different levels of granularity, the authors basically state that they included requirements they considered crucial for an LDCP, using their motivating example as a basis. There is no evidence in the paper on whether these requirements are demanded by users or by a community, apart from those extracted from related work (described in Section 3.2.1). Maybe adding a source for each requirement in Table 2 would help address this issue. Also, the authors have decided to add requirements based on their interpretation of the papers they read. Has there been a consensus discussion on the papers? Were there any conflicts between the authors that needed to be resolved? If so, how were they resolved?

Section 3.2.1 states that no workshops were investigated to gather requirements. However, 2 workshops (LDOW and COLD) are listed as part of the investigated venues.

The criteria used to filter candidates seem to be a little restrictive. Have the authors tried to contact the authors of the target system when the code seemed to be failing or was not available without registration/login?

In terms of scalability, it is important to take into account the scope of the actions to be carried out by the user. If you try to visualize 610 million triples in a platform, the tool will probably fail. Some systems may not work because of assumptions made about how they are run.

When describing some of the requirements, the authors sometimes provide extra, unrelated details. For example, I don't know why I need to see the CKAN Action API examples, or the SPARQL queries in Example 9.2, or why, when talking about catalog support, there is so much emphasis at the end on dataset profiling.

Before requirement 2, the authors introduce some services useful for building dataset profiles. Maybe it would be worth mentioning that some systems now expose this kind of information using Schema.org JSON-LD snippets as well.
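For illustration, such a snippet might look roughly as follows (a hypothetical example of my own, not taken from any particular system):

```python
# Hypothetical schema.org Dataset description of the kind embedded in
# <script type="application/ld+json"> blocks on dataset landing pages.
import json

snippet = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Population of European cities",
    "description": "Yearly population counts for cities in the EU.",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "distribution": {
        "@type": "DataDownload",
        "encodingFormat": "text/turtle",
        "contentUrl": "https://example.org/data/cities.ttl",
    },
}

print(json.dumps(snippet, indent=2))
```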

There are some requirements that should be treated with care, as they may harm the user experience. For example, if you recommend all the datasets with "hasVersion" metadata and there are 100 versions of a dataset, the user may be overwhelmed with related items.

I also think there might be an issue with Requirement 35. It is fantastic to have repositories of plugins, but if only one developer contributes to them, this does not demonstrate adoption by a community. Showing whether a system has participated in, or is part of, an open source initiative such as Apache would better demonstrate that the adoption and maintenance of the system are ensured. Another alternative would be to require a number of contributors above a certain threshold, which would indicate adoption.

Some tools and W3C specifications are not properly cited (just footnotes).

After reading the discussion and conclusions, I don't understand how much overlap there is with all the other surveys and efforts described in the paper. Is this paper just an aggregation of those? Do the authors propose new requirements? If requirements have been left out of the paper, why?

The authors end the paper stating that the survey "shows that there are no technical barriers to making LD as usable to non-experts as other open data formats such as CSV". However, the authors also note throughout the paper that using most of the analyzed technologies requires being a Semantic Web expert with a solid knowledge of SPARQL. This is a clear technical barrier for consumers of data.