Evidence of Large-Scale Conceptual Disarray in Multi-Level Taxonomies in Wikidata

Tracking #: 3337-4551

This paper is currently under review
Atílio A. Dadalto
João Paulo A. Almeida
Claudenir M. Fonseca
Giancarlo Guizzardi

Responsible editor: 
Stefano Borgo

Submission type: 
Full Paper
The distinction between types and individuals is key to most conceptual modeling techniques and knowledge representation languages. Nevertheless, in a number of situations modelers handle this distinction inadequately, which leads to problematic models. We show evidence of a large number of representation mistakes in the Wikidata knowledge graph that are associated with the failure to employ this distinction. These mistakes can be identified through the incorrect use of instantiation, a relation between an individual and a type, and of specialization (or subtyping), a relation between two types. The prevalence of these problems in Wikidata's taxonomies suggests that methodological and computational tools are required to mitigate them, since they occur in many settings in which individuals, types, and their metatypes are included in the domain of interest. We conduct a conceptual analysis of the entities involved in recurrent erroneous cases identified in this empirical data, and present a tool that supports users in avoiding some of these mistakes.
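To make the kind of mistake described above more concrete, the following is a minimal sketch of one check that could flag a suspicious mix of instantiation and specialization: an entity that is declared an instance of a class while also being a (transitive) subclass of that same class. Whether this matches the exact patterns analyzed in the paper is an assumption, and the function names and toy data (`Rex`, `dog`, `mammal`, `animal`, `species`) are hypothetical illustrations, not Wikidata content.

```python
def transitive_superclasses(entity, subclass_of):
    """Return every class reachable from `entity` via subclass-of edges."""
    seen = set()
    stack = list(subclass_of.get(entity, ()))
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.add(cls)
            stack.extend(subclass_of.get(cls, ()))
    return seen


def find_instance_and_subclass_conflicts(instance_of, subclass_of):
    """Return sorted (entity, cls) pairs where `entity` is both an
    instance of `cls` and a direct or transitive subclass of `cls`."""
    conflicts = []
    for entity, classes in instance_of.items():
        supers = transitive_superclasses(entity, subclass_of)
        for cls in classes:
            if cls in supers:
                conflicts.append((entity, cls))
    return sorted(conflicts)


# Hypothetical toy taxonomy (not taken from Wikidata).
toy_instance_of = {
    "Rex": {"dog"},                # an individual instantiating a type
    "dog": {"species", "animal"},  # "dog" instance of "animal" is the suspicious claim
}
toy_subclass_of = {
    "dog": {"mammal"},
    "mammal": {"animal"},
}

toy_conflicts = find_instance_and_subclass_conflicts(toy_instance_of, toy_subclass_of)
print(toy_conflicts)
```

In this toy data, `dog` is flagged against `animal` because it is stated to be an instance of `animal` and is also a subclass of it through `mammal`. The statement that `dog` is an instance of `species` is not flagged, since `species` acts as a metatype here and `dog` does not specialize it.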