Multilingual resources: discourse relations in English TED talks and their translation into Lithuanian, Portuguese and Turkish

Tracking #: 2669-3883

This paper is currently under review
Deniz Zeyrek
Giedre Valunaite Oleskeviciene
Amalia Mendes

Responsible editor: Guest Editors Advancements in Linguistics Linked Data 2021

Submission type: Full Paper
Abstract. The creation of multilingual resources is of key importance for crosslinguistic research, and making such resources accessible to the Linked Data paradigm is a pressing need. This exploratory paper aims to reveal the potential of an annotated corpus, the TED-Multilingual Discourse Bank, for the Linked Data paradigm. It focuses on a sample of two TED talk transcripts selected from this corpus in English, the source language, and on their translations into three less-studied languages: Lithuanian, Portuguese, and Turkish. It examines how low-level coherence is established in English versus the target languages, focusing mainly on the comparison of discourse connective classes, on explicit versus implicit relations, and on the matches between English and the target languages in conveying discourse relations.