Path-based and triplification approaches to mapping data into RDF: usability analysis and recommendations

Tracking #: 3348-4562

This paper is currently under review
Paul Warren
Paul Mulholland
Enrico Daga
Luigi Asprino

Responsible editor: 
Armin Haller

Submission type: 
Full Paper
Mapping complex structured data to RDF requires a clear understanding not only of the data but also of the paradigm used by the mapping tool. We illustrate this with an empirical study comparing two mapping paradigms from the perspective of usability. One paradigm uses path descriptions, e.g. JSONPath or XPath, to access data elements; the other uses a default triplification which can be queried, e.g. with SPARQL. As an example of the former, the study used YARRRML to map from CSV, JSON and XML to RDF. As an example of the latter, the study used an extension of SPARQL, SPARQL Anything, to query the same data and CONSTRUCT a set of triples. Whilst some difficulties are common to the two paradigms, each paradigm also has implications of its own that users find hard to grasp fully. For each paradigm, we present recommendations which help ensure that the mapping code is consistent with the data and the desired RDF. We also propose future developments to reduce the difficulty users experience with YARRRML and SPARQL Anything. Finally, we make some general recommendations about the future development of mapping tools and techniques.
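The contrast between the two paradigms can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: it uses neither YARRRML nor SPARQL Anything, the sample data is invented, and the helpers `follow_path`, `triplify` and `match`, as well as the node identifiers (`root`, `n1`, ...) and slot names (`_1`, `_2`, ...), are hypothetical conventions chosen for this sketch rather than the output of either tool.

```python
import json
from itertools import count

# Invented sample data; not taken from the study.
DATA = json.loads(
    '{"people": [{"name": "Ada", "city": "London"},'
    ' {"name": "Alan", "city": "Wilmslow"}]}'
)


def follow_path(doc, steps):
    """Path paradigm: walk explicit keys and indices down to a value."""
    for step in steps:
        doc = doc[step]
    return doc


def triplify(doc):
    """Triplification paradigm: flatten a nested dict/list into triples.

    Assumes the root is a dict or list. Every container gets a
    hypothetical node id; list items get slot names "_1", "_2", ...
    """
    triples = []
    ids = count(1)

    def visit(node_id, value):
        if isinstance(value, dict):
            children = value.items()
        else:
            children = ((f"_{i}", v) for i, v in enumerate(value, start=1))
        for key, child in children:
            if isinstance(child, (dict, list)):
                child_id = f"n{next(ids)}"
                triples.append((node_id, key, child_id))
                visit(child_id, child)
            else:
                triples.append((node_id, key, child))

    visit("root", doc)
    return triples


def match(triples, s=None, p=None, o=None):
    """Tiny triple-pattern match; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Path paradigm: the mapping author spells out where each value lives.
path_names = [
    follow_path(DATA, ["people", i, "name"])
    for i in range(len(DATA["people"]))
]

# Triplification paradigm: the author queries a generic graph by pattern.
triples = triplify(DATA)
triple_names = [o for (_, _, o) in match(triples, p="name")]
```

Both routes recover the same names here, but they ask different things of the author: the path route requires knowing the exact location of each value, while the pattern route requires understanding how the default triplification represents containers and list positions.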