Transformer-Based Architectures versus Large Language Models in Semantic Event Extraction: Evaluating Strengths and Limitations

Tracking #: 3673-4887

This paper is currently under review
Tin Kuculo
Sara Abdollahi
Simon Gottschalk

Responsible editor: Guest Editors KG Gen 2023

Submission type: Full Paper
Understanding complex societal events reported on the Web, such as military conflicts and political elections, is crucial in digital humanities, computational social science, and news analysis. While event extraction is a well-studied problem in Natural Language Processing, there remains a gap in semantic event extraction methods that leverage event ontologies to capture multifaceted events in knowledge graphs: existing event extraction methods often fall short in semantic depth or lack the flexibility required for comprehensive event extraction. In this article, we compare two paradigms for addressing the task of semantic event extraction: the fine-tuning of traditional transformer-based models and the use of Large Language Models (LLMs). We exemplify these paradigms with two newly developed approaches: T-SEE for transformer-based and L-SEE for LLM-based semantic event extraction. We present and evaluate these two approaches and discuss their complementary strengths and shortcomings to understand the needs and solutions required for semantic event extraction. To enable a direct comparison, both approaches employ the same dual-stage architecture: the first stage performs multilabel event classification, and the second performs relation extraction. While our first approach utilises a span prediction transformer model, our second approach prompts an LLM for event classification and relation extraction, providing it with the potential event classes and properties. For evaluation, we first assess the performance of T-SEE and L-SEE on two novel datasets sourced from Wikipedia, Wikidata, and DBpedia, containing over 80,000 sentences and semantic event representations. Then, we perform an extensive analysis of the different types of errors made by these two approaches to discuss a set of phenomena relevant to semantic event extraction.
Our work makes substantial contributions to (i) the integration of Semantic Web technologies and NLP, particularly in the underexplored domain of semantic event extraction, and (ii) the understanding of how LLMs can further enhance semantic event extraction and what challenges need to be considered in comparison to traditional approaches.
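The dual-stage, prompt-based setup described in the abstract can be illustrated with a minimal sketch. It is not the L-SEE implementation: the ontology, the prompt wording, the function names (`classification_prompt`, `relation_prompt`, `extract_events`) and the stand-in model `fake_llm` are all assumptions made for illustration. The sketch shows only the general idea of first asking a model for event classes from a given list and then asking for property values restricted to the properties of each predicted class.

```python
from typing import Callable, Dict, List

# Hypothetical mini-ontology (illustrative only): event class -> allowed properties.
ONTOLOGY: Dict[str, List[str]] = {
    "Election": ["country", "date", "candidate"],
    "MilitaryConflict": ["location", "date", "participant"],
}


def classification_prompt(sentence: str, classes: List[str]) -> str:
    """Stage 1 prompt: offer the candidate event classes to the model."""
    return (
        "Task: event classification\n"
        f"Candidate classes: {', '.join(classes)}\n"
        f"Sentence: {sentence}\n"
        "Answer with a comma-separated list of matching classes."
    )


def relation_prompt(sentence: str, event_class: str, properties: List[str]) -> str:
    """Stage 2 prompt: offer the properties of one predicted class."""
    return (
        "Task: relation extraction\n"
        f"Event class: {event_class}\n"
        f"Candidate properties: {', '.join(properties)}\n"
        f"Sentence: {sentence}\n"
        "Answer with one 'property: value' pair per line."
    )


def extract_events(
    sentence: str,
    ontology: Dict[str, List[str]],
    llm: Callable[[str], str],
) -> List[dict]:
    """Run both stages; `llm` is any function mapping a prompt to a text reply."""
    # Stage 1: multilabel classification, keeping only labels known to the ontology.
    reply = llm(classification_prompt(sentence, list(ontology)))
    predicted = [label.strip() for label in reply.split(",")]
    classes = [label for label in predicted if label in ontology]

    # Stage 2: relation extraction, once per predicted class.
    events = []
    for event_class in classes:
        reply = llm(relation_prompt(sentence, event_class, ontology[event_class]))
        relations = {}
        for line in reply.splitlines():
            if ":" not in line:
                continue  # skip lines that are not 'property: value' pairs
            prop, value = (part.strip() for part in line.split(":", 1))
            # Discard properties that the ontology does not define for this class.
            if prop in ontology[event_class] and value:
                relations[prop] = value
        events.append({"event_class": event_class, "relations": relations})
    return events


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline (assumption)."""
    if prompt.startswith("Task: event classification"):
        return "Election, Earthquake"
    return "country: France\ndate: 2017\nweather: sunny"


events = extract_events(
    "France held a presidential election in 2017.", ONTOLOGY, fake_llm
)
```

Filtering both the predicted classes and the returned properties against the ontology is one simple way to keep model output within the target schema. How the actual approaches handle out-of-ontology predictions is not stated in the abstract.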