Reducing the Underrepresentation of Transnational Writers through Biographical Event Extraction

Tracking #: 3385-4599

This paper is currently under review
Marco Stranisci
Viviana Patti
Rossana Damiano

Responsible editor: 
Guest Editors Wikidata 2022

Submission type: 
Full Paper
Wikidata represents an important source of literary knowledge, which is collaboratively created and curated by a large community of users. In this archive, it is possible to find hundreds of thousands of pages about writers and their works. However, as recently demonstrated, Wikidata underrepresents Transnational authors. The issue is present at different levels: not only are Transnational writers fewer in number, but their pages also contain less biographical information. In this paper we present an approach for reducing this form of underrepresentation by automatically extracting biographical information from Wikipedia through transformers and lexico-semantic patterns, and encoding it into the Wikidata semantic model. Results show that our approach increases the number of biographical triples on Wikidata for all writers while rebalancing the knowledge base in favour of Transnational writers.