Abstract:
With the adoption of the Findable, Accessible, Interoperable, and Reusable (FAIR) principles for data by researchers, an increasing number of datasets have been made available online, supporting research investigations. To ease dataset interoperability, the scientific community has proposed the I-ADOPT framework as a means to capture the subtleties and nuances of scientific variables in a structured manner. However, creating machine-readable variable representations requires significant expertise and manual effort, given the wealth of variable types in use by different communities. In this paper, we explore the use of Large Language Models (LLMs) to assist with this manual step. We propose the I-ADOPT benchmark, an expert-annotated corpus and task designed to measure the performance of LLMs at the different stages of automatically creating a machine-readable scientific variable. Our corpus includes more than 100 scientific variables represented as structured knowledge graphs, and our results show that even large models (32B parameters) struggle to create these representations accurately (F1 score below 50%).