An Approach for Interoperability Assessment of RDF Data

Tracking #: 2819-4033

Authors: 
Shuxin Zhang
Nirupama Benis
Nicolette de Keizer
Ronald Cornet

Responsible editor: 
Aidan Hogan

Submission type: 
Full Paper
Abstract: 
The Semantic Web community provides a common Resource Description Framework (RDF) that allows data to be shared and reused. High-quality RDF data is needed, but current approaches for assessing data quality focus on local usefulness, with little attention to re-usefulness across organizations. In this paper, we introduce a novel approach that adds the aspect of interoperability to the quality assessment of RDF data. We identified eleven interoperability dimensions by aligning a list of standardized data quality dimensions to the "I" principles of the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. We related these dimensions to existing knowledge from ontologies about data quality, problem lists from previous work, and interoperability-related articles, resulting in twenty-four metrics. Dimensions and metrics were represented in RDF and formed the theoretical basis of the assessment approach. As a proof of concept, a tool based on the proposed approach was used to assess an RDF dataset. It successfully identified interoperability problems and provided suggestions for resolving them. The approach supports the generation of high-quality RDF data and potentially reduces the effort of addressing interoperability issues. Standardized dimensions and metrics can facilitate the establishment of a community that further develops and applies interoperability assessment to improve the quality of RDF data.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
Anonymous submitted on 29/Aug/2021
Suggestion:
Reject
Review Comment:

This paper presents an approach to quantify the interoperability of datasets using metrics that relate to existing frameworks, notably FAIR and related quality measurement (and representation) initiatives.

The approach is interesting. In particular, the authors relate their work thoroughly to existing work, making their proposal fit nicely into a series of other proposals.

However, I am convinced that the work is not yet mature enough for a journal publication, especially in SWJ.

First, as a general comment, I must say that I do not entirely buy the parlance in the introduction about "fitness for use" vs. "fitness for exchange". In the end, a lot of the interoperability metrics proposed in Table 3 are about general data issues, sometimes very low-level ones (like misused types, or the existence of contradictions), so it is hard to identify whether interoperability is something really specific. From the paper it seems that every quality dimension can contribute to interoperability, going beyond what is in scope for I1, I2, I3. This is perhaps true, but then it makes the ground of the paper (especially Table 1) a bit shaky. I guess it is the 'dissection' in 3.3 that would need more substance in the paper.

In a similar vein, the authors' work is not easily accessible/visible. Where is the RDF representation of the dimensions announced in 3.3 ("The dimensions were represented in RDF")? Is it in Section 4.1? Maybe, but then I do not see the point in making such a separation in the paper. The generic title of Section 4 ('Overview of the approach') does not help. In fact, the paper is apparently not written very consistently, in the sense that it announces (relevant) pieces of work in some parts but they are hard to identify in others. For example, Section 3 announces that some metrics were re-used from Zaveri and DQM, but this is not visible later, e.g., in Table 3.

It is then a bit frustrating to find that this variety of dimensions is not backed by implementation work. Notably, dimensions that are notoriously hard to assess (semantic accuracy and trustworthiness) are not implemented in the current state. Readers should see more evidence that these areas can indeed be implemented, to justify the authors' inclusion of them in their framework.

More worrying is the lack of large-scale experiments on real data. It is good that the authors apply their approach to three datasets, but only one is real. I am not sure why a journal paper should have only one real application, especially when it seems quite easy to apply the developed framework to other real datasets.

Finally, the resources built around the paper are full of RDF issues that indicate a lack of maturity and review. These may not be formal errors, but they will not help the readers see the proposal in a good light. Basically, these are things that would make the proposal more interoperable!
Notably, it would be really good if the authors could consistently follow good practices for naming classes (starting with upper case) and instances (starting with lower case). In the IFC vocabulary, ifc:failureCase is a lower-cased class and all its instances are upper-cased - well, one is not (ifc:mailformedLiterals; by the way, there is quite a typo there!), which looks even more odd.
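Purely for illustration, a minimal sketch of the convention I mean (ifc: taken here as a placeholder namespace, and with the instance name's spelling corrected):

@prefix ifc:  <http://example.org/ifc#> .    # placeholder IRI, not necessarily the authors' namespace
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ifc:FailureCase a rdfs:Class .               # class: upper-case initial
ifc:malformedLiterals a ifc:FailureCase .    # instance: lower-case initial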
Giving good thought to these conventions would perhaps help avoid issues with the use of classes instead of instances. In particular, in IQM, all metrics are formally defined as classes, which is not in line with DQV. That is,
iqm:objectiveMetric a dqv:Metric
is a correct statement, making iqm:objectiveMetric an instance. But then
dqm:MisplacedClassesOrPropertiesMetric rdfs:subClass iqm:objectiveMetric
renders both dqm:MisplacedClassesOrPropertiesMetric and iqm:objectiveMetric RDFS classes, which is not in line with DQV and actually does not make much sense in general.
Plus, this file does not use the right RDFS property (it is rdfs:subPropertyOf).
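To make this concrete, here is a minimal sketch of the pattern DQV expects, keeping both metrics as instances; the iqm: and dqm: prefix IRIs are placeholders, and skos:broader is just one possible way to express the grouping:

@prefix dqv:  <http://www.w3.org/ns/dqv#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix iqm:  <http://example.org/iqm#> .    # placeholder
@prefix dqm:  <http://example.org/dqm#> .    # placeholder

iqm:objectiveMetric a dqv:Metric .                        # a metric is an instance of dqv:Metric
dqm:MisplacedClassesOrPropertiesMetric a dqv:Metric ;     # also an instance, not a subclass
    skos:broader iqm:objectiveMetric .                    # grouping stated at the instance level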
In the same line, dqv:isMeasuredOf is not a DQV property (https://github.com/sxzhang1201/Interoperable-Supportive-Tool/blob/main/s...).
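For reference, the property that DQV does define is dqv:isMeasurementOf; a minimal sketch of a measurement, with ex: and dqm: as placeholder namespaces:

@prefix dqv: <http://www.w3.org/ns/dqv#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .          # placeholder
@prefix dqm: <http://example.org/dqm#> .      # placeholder

ex:measurement1 a dqv:QualityMeasurement ;
    dqv:isMeasurementOf dqm:MisplacedClassesOrPropertiesMetric ;   # links the measurement to its metric
    dqv:computedOn ex:someDataset ;                                # the resource that was assessed
    dqv:value "0.98"^^xsd:double .                                 # illustrative value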
And in the IFC file (https://github.com/sxzhang1201/Interoperable-Supportive-Tool/blob/main/v...), the title of the vocabulary is "Interoperable Failure Case", not "Interoperability Failure Case" as it should be.
I also do not get why the "interpretations" in Table 2 are expressed with dct:description in IKP, skos:definition in IQD, and rdfs:comment in IQM. Interoperability would probably require using one property here, or the deviations should be explained.
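For example, a minimal sketch of the consistency I would expect, picking skos:definition throughout (one possible choice; the ikp:, iqd: and iqm: prefix IRIs and term names are placeholders):

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ikp:  <http://example.org/ikp#> .    # placeholder
@prefix iqd:  <http://example.org/iqd#> .    # placeholder
@prefix iqm:  <http://example.org/iqm#> .    # placeholder

ikp:somePrinciple skos:definition "Interpretation of the principle."@en .
iqd:someDimension skos:definition "Interpretation of the dimension."@en .
iqm:someMetric    skos:definition "Interpretation of the metric."@en .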
It could be that there are other technical problems. Frankly, I have stopped checking at this stage, because I feel that this is work the authors should do, or reviewers of smaller chunks of the authors' work submitted to workshops or conferences, allowing the whole framework to mature at a more regular rhythm!

Minor comments:
- in 3.1 the paper hints that the notions of quality dimensions and metrics originate in Zaveri et al., but these have also been formalized in the daQ work (Debattista et al.), which I think predates it.
- "materials" is a strange title for 3.2
- 3.1 could be presented before the related work.
- in 3.2 I am not sure it is right to call LDQ the "linked data version of DQV". Granted, it introduces some linked data quality aspects, but a lot of its contribution regards assessment processes, which are not specific to Linked Data.
- I do not understand how RDFS can be used for "representing annotations" in 3.4. RDFS is very generic...
- I do not understand the real value of using ShEx in 3.6. It looks like the authors use it to control the data that makes up a report. But because the authors control the way the reports are generated, I don't see how the report data could fail to meet their expectations.
- in 37, one occurrence of 'synthetic' is written 'synetic'

Review #2
Anonymous submitted on 29/Aug/2021
Suggestion:
Major Revision
Review Comment:

In this work, the authors propose an approach for quality assessment that complies with the "I" (Interoperable) of the FAIR principles, meaning that quality is understood not only as fitness for use but also as fitness for exchange. In the latter case, new quality metrics need to be considered that also capture interoperability problems. The authors gather and systematically represent 11 quality dimensions with 24 quality metrics that capture interoperability issues.

The paper is clearly on topic for this journal and tackles a very important and non-trivial issue for the Semantic Web/Linked Data community. The systematic review is, to the best of my knowledge, novel in its breadth and depth. While the paper is easy to read and follow, the contribution itself seems weak, loosely specified, and unconvincing for a full research paper to be published in the SWJ. I'll continue with the formal criteria for reviewing a research paper as per the CfP.

My main concerns regard the originality and significance of the results.

* If we consider this as a research work, then the research question(s) is (are) missing. You need to specify them explicitly.
* Interoperability metrics
According to the sentences "We aligned Zaveri’s quality dimensions to the “I” principles to have quality dimensions encompass interoperability. Second, defining novel metrics by extending the scope from measuring only data to also measuring the underlying data model and used vocabularies; Third, grouping metrics that address similar quality requirements and re-defining these as a new single metric.", I think that the authors should provide paragraphs grouped according to the quality dimensions. Each paragraph should define a quality metric and include a short discussion of how it was defined previously, so that the change between the new definition and the old one becomes visible.
* Results obtained
Section 4.1: I think that just having the table is a bit limited. The output of this section should be a vocabulary that integrates all the existing elements and the newly added ones. In other words, I would expect to see an image of all the schema elements and their relationships.
- Any kind of evaluation or user feedback is missing. E.g., was the approach used in practice?
- A running example would be very useful (and important), which could show how these metrics can be applied in a situation where interoperability is fundamental.
- An experimental section should include an assessment of your approach conducted on more than one KB.
- In the evaluation, you should be able to report the results of both i) a quantitative and ii) a qualitative validation.

Overall, the paper reads like something between a vision paper and a demo paper rather than a full research paper to be published in a journal.

More major issues:
- Section 3.4, Provision of Interoperability Metrics: You mention that new metrics are derived as a consequence of aggregation, or of adoption not only to the data but also to the data model usage. It would be nice to see these changes, e.g., by putting the previous and new definitions side by side, since it is not clear what was done.
- Section 3.5, Defining Interoperability Failure Types: It is not clear to me how the two-step method is done in practice. How do you search for the works describing interoperability failure types?
- Section 4.1: The former point refers to the data model while the latter refers to the qualified linking. Here we are talking about "data model, vocabulary, and reference"; in the latter case, which of the three scopes are you referring to?
- Section 4.2: Which are the 8 metrics chosen from DQM and which are the new ones? In a paper such as this one, an example should be provided for each metric.

I would suggest considering the following works:
- "Quality Assessment for Linked Data: A Survey" for the definitions and, more generally, the systematic conceptualization
- "A Quality Assessment Approach for Evolving Knowledge Bases" for the experimental part

Review #3
Anonymous submitted on 01/Sep/2021
Suggestion:
Major Revision
Review Comment:

In this paper, the authors introduce a novel approach that adds the aspect of interoperability to the quality assessment of RDF data, identifying eleven interoperability dimensions by aligning a list of standardized data quality dimensions to the “I” principles of the FAIR data principles.

The authors also present a tool, claiming that it successfully identifies interoperability problems and can provide suggestions for resolving them.

It is a well-written article and easy to read, but the following points should be considered:

· There is no discussion around the validation and evaluation of the approach. It would be great to see the precision and recall of the defined approach, especially in terms of complexity analysis and accuracy as the size of the data grows, i.e., for larger datasets.

· As a proof of concept, the authors performed the quality assessment on a registry dataset about Addison’s disease and on two synthetic datasets, respectively about personal information and diagnosis, from the European Joint Programme on Rare Diseases (EJP RD). The Addison’s disease dataset was converted from a synthetic tabular dataset to an RDF dataset, with the conversion scenario claimed to be typical/useful for quality assessment.

Typically, in most cases, converting relational/tabular databases to RDF requires a good amount of domain knowledge, especially when it comes to medical/biomedical data sources. Considering the inherent feedback mechanism from domain experts during the conversion process, the quality assessment and its results may differ on a case-by-case basis.

How realistic would it be to comment on the quality dimensions based on the metrics defined in this paper at a generalised level, i.e., on other RDF datasets? Would the same approach, dimensions, and metrics work for any RDF resource?

Nevertheless, it would be great to see results in real-world settings (on existing RDF data sources), irrespective of domain, complexity, size, and coverage, in order to consider the quality metrics defined in this paper for the interoperability dimensions.

· Most data sources are dynamic in nature these days. Do any of the quality metrics defined for the interoperability dimensions consider the assessment of such scenarios?

· Twenty-nine failure types were defined in the Interoperability Failure Case (IFC) vocabulary, which seems comprehensive, but it is not explicitly clear whether IFC is one of the contributions by the authors or the re-use of an existing vocabulary named “IFC”.

· It would be good to see the false-positive results when considering the failure matches based on the defined technique. Any analysis based on the computed results would be great to see.

· Some possible future directions are highlighted in the last paragraph of Section 5; it would be good to present them as separate sections/subsections for better visibility.

· Clear statements regarding the limitations of the approach, the data used, and the developed tool should also be included.