Linked Data Completeness: A Systematic Literature Review

Tracking #: 2478-3692

This paper is currently under review
Subhi Issa
Onaopepo Adekunle
Fayçal Hamdi
Samira Si-said Cherfi
Michel Dumontier
Amrapali Zaveri

Responsible editor: 
Agnieszka Lawrynowicz

Submission type: 
Survey Article
The quality of Linked Data is an important indicator of its fitness for use in an application. Several quality dimensions, such as accuracy, completeness, timeliness, provenance, and accessibility, have been identified and are used to assess that quality. While many prior studies offer a landscape view of data quality dimensions, here we present a systematic literature review focused on assessing the completeness of Linked Data. We gather existing approaches from the literature and analyze them qualitatively and quantitatively. In particular, we unify and formalize commonly used terminology across 56 articles related to the completeness dimension of data quality and provide a comprehensive list of the methodologies and metrics used to evaluate the different types of completeness. We identify seven types of completeness, including three that were not identified in earlier surveys. We also analyze nine tools capable of assessing Linked Data completeness. The aim of this systematic literature review is to give researchers and data curators a comprehensive and deeper understanding of existing work on completeness and its properties, thereby encouraging further experimentation and the development of new approaches focused on completeness as a data quality dimension of Linked Data.