Editorial Board

Editors-in-Chief
Krzysztof Janowicz

Managing Editors
Cogan Shimizu
Eva Blomqvist

Editorial Board
Mehwish Alam
Claudia d’Amato
Stefano Borgo
Boyan Brodaric
Philipp Cimiano
Oscar Corcho
Bernardo Cuenca-Grau
Elena Demidova
Jerome Euzenat
Mark Gahegan
Aldo Gangemi
Anna Lisa Gentile
Rafael Goncalves
Dagmar Gromann
Armin Haller
Aidan Hogan
Katja Hose
Eero Hyvönen
Sabrina Kirrane
Agnieszka Lawrynowicz
Freddy Lecue
Maria Maleshkova
Raghava Mutharaju
Axel Polleres
Guilin Qi
Marta Sabou
Harald Sack
Christoph Schlieder
Stefan Schlobach
Oshani Seneviratne
Cogan Shimizu
Ruben Verborgh
GQ Zhang

Former Editors-in-Chief
Pascal Hitzler

Editorial Assistants
Sanaz Saki Norouzi

SemPubFlow: a novel Scientific Publishing Workflow using Knowledge Graphs, Wikidata and LLMs – the CEUR-WS use case

Submitted by Wolfgang Fahl on 02/29/2024 - 09:44

Tracking #: 3657-4871

This paper is currently under review

Authors:

Wolfgang Fahl

Tim Holzheim

Christoph Lange

Stefan Decker

Responsible editor:

Guest Editors KG Gen from Text 2023

Submission type:

Full Paper

Abstract:

The CEUR Workshop Proceedings (CEUR-WS) platform has been pivotal in disseminating scientific workshop and conference proceedings since 1995. This paper introduces a paradigm shift towards a semantified, consistent, and FAIR (Findable, Accessible, Interoperable, and Reusable) knowledge graph, emphasizing the critical role of Single Source of Truth (SSoT) and Single Point of Truth (SPoT) in scholarly publishing and reducing the data quality responsibility burden on CEUR-WS editors. Our SemPubFlow approach modernizes the legacy pipeline of manual HTML and PDF content curation by expecting the metadata to be supplied first. It enables the public open source collection of necessary data for event series, events, proceedings, papers, editors, authors, and affiliated institutions directly by the stakeholders of a scientific event as early as possible. The traditional Extract, Transform, Load (ETL) processes that convert existing artifacts into a comprehensive knowledge graph are only needed during the transition to this workflow. The novel approach leverages Large Language Models (LLMs) and the Wikidata knowledge graph, generating the SPoT representing CEUR-WS as the SSoT. This way, our methodology not only streamlines the recreation of legacy artifacts but also addresses the "long tail" problem inherent in CEUR-WS's diverse and evolving data. This paper outlines the transition strategy, avoiding a "big bang" approach, to ensure the continuity and integrity of scholarly communication. The resulting solution is efficient in attaining the necessary level of coverage, accuracy and scalability. Data protection issues can easily be overcome in this context since even the personal data is intended to be public. The advancements presented promise to enhance publication processes across various contexts, offering a blueprint for future scholarly publishing infrastructures.

Full PDF Version:

swj3657.pdf

Tags:

Under Review
