Abstract:
Bias in Wikimedia projects can undermine the fairness of Artificial Intelligence technologies trained on this source of knowledge. However, research on this topic is fragmented and fails to address the complexity of a phenomenon that affects minorities in different ways. In this paper we present WikiBias, a framework for exploring bias in the Wikimedia ecosystem through biographical event extraction. WikiBias is designed to jointly study underrepresentation and representational bias, providing a multi-dimensional overview of the sources of harm against people vulnerable to discrimination. We test WikiBias on a case study of writers in Wikidata and Wikipedia, given the crucial role of literature in defining identity and otherness in our society. We adopt an intersectional perspective, considering the joint impact of writers' gender and origin on the bias perpetrated against them. Our results show that biographical event extraction can be effective in reducing the underrepresentation of writers with a non-Western origin, but at the same time it may induce representational bias, especially against women. This knowledge augmentation dramatically increases the connections of writers to other people in Wikidata, potentially facilitating the discovery of underrepresented writers in the augmented knowledge base.