Editorial Board

Editor-in-Chief
Krzysztof Janowicz

Managing Editors
Cogan Shimizu
Eva Blomqvist

Editorial Board
Mehwish Alam
Claudia d’Amato
Stefano Borgo
Boyan Brodaric
Philipp Cimiano
Michael Cochez
Oscar Corcho
Bernardo Cuenca-Grau
Elena Demidova
Jerome Euzenat
Mark Gahegan
Aldo Gangemi
Dagmar Gromann
Armin Haller
Pascal Hitzler
Aidan Hogan
Katja Hose
Eero Hyvönen
Sabrina Kirrane
Agnieszka Lawrynowicz
Freddy Lecue
Maria Maleshkova
Raghava Mutharaju
Axel Polleres
Guilin Qi
Marta Sabou
Harald Sack
Angelo Salatino
Christoph Schlieder
Stefan Schlobach
Cogan Shimizu
Blerina Spahiu
GQ Zhang
Rui Zhu

Former/Founding Editors-in-Chief
Pascal Hitzler

Editorial Assistants
Michael McCain

Schema-Miner Pro: Agentic AI for Ontology Grounding over LLM-Discovered Scientific Schemas in a Human-in-the-Loop Workflow

Submitted by Sameer Sadruddin on 10/02/2025 - 07:24

Tracking #: 3953-5167

A new version of this paper is available

Authors:

Sameer Sadruddin

Jennifer D'Souza

Eleni Poupaki

Alex Watkins

Bora Karasulu

Sören Auer

Adrie Mackus

Erwin Kessels

Responsible editor:

Guest Editors 2025 LLM GenAI KGs

Submission type:

Full Paper

Abstract:

Scientific processes are often described in free text, making it difficult to represent and reason over them computationally. We present schema-miner pro, a human-in-the-loop framework that automatically extracts and grounds structured schemas from scientific literature. Our approach combines large language models for schema extraction with an agent-based system that aligns extracted elements to external ontologies through interpretable, multi-step reasoning. The agent leverages lexical heuristics, semantic similarity, and expert feedback to ensure accurate grounding. We demonstrate the framework on two semiconductor manufacturing workflows, Atomic Layer Deposition (ALD) and Atomic Layer Etching (ALE), mapping process parameters and outputs to the QUDT (Quantities, Units, Dimensions, and Types) ontology. By producing ontology-aligned, semantically precise schemas, schema-miner pro lays the groundwork for machine-actionable scientific knowledge and automated reasoning across disciplines.

Full PDF Version:

swj3953.pdf

Revised Version:

Schema-Miner Pro: Agentic AI for Ontology Grounding over LLM-Discovered Scientific Schemas in a Human-in-the-Loop Workflow

Previous Version:

Schema-Miner Pro: Agentic AI for Ontology Grounding over LLM-Discovered Scientific Schemas in a Human-in-the-Loop Workflow

Tags:

Reviewed

Long-term Stable Link to Resources:

https://github.com/sciknoworg/schema-miner

Decision/Status:

Minor Revision

Solicited Reviews:

Review #1

By Andrea Mannocci submitted on 27/Oct/2025

Suggestion:
Accept

Review Comment:

Having reviewed the paper in its previous iteration, I reckon that the manuscript has remarkably improved, as all comments provided by the reviewers have been carefully addressed by the authors.
The flow is much better now and many details have been polished.
The SW repository on GitHub is now well-organised, and therefore it will be easier for the community to reuse the framework and its components.
Therefore, I do not have any reservations in accepting the paper for publication in SWJ.
Well done.

Review #2

By Antonello Meloni submitted on 02/Nov/2025

Suggestion:
Minor Revision

Review Comment:

The authors have addressed most of the previous review points thoroughly. The manuscript is now much clearer, with improved repository organization, workflow explanation, consistent terminology, and better figure/table presentation. The documentation and tutorials make the framework accessible and usable.

I remain concerned about the claims regarding domain-agnostic generalization of SCHEMA-MINERpro. The manuscript provides a detailed conceptual discussion and illustrative examples from biomedical, chemical, and engineering domains, but these examples involve relatively structured texts and do not provide empirical evidence of successful application to heterogeneous or less-structured scientific documents. I recommend that these sections be explicitly framed as potential extensions or prospective applications, rather than as demonstrated generalization. A brief qualitative example or small-scale test from a less-structured domain would strengthen the argument if feasible.

Originality:
The multi-agent, human-in-the-loop ontology grounding framework remains a novel and practically relevant extension of prior schema extraction pipelines.

Significance of Results:
The empirical results in the ALD/ALE domains are solid and clearly demonstrate the workflow’s effectiveness. Claims of domain-agnostic generalization should be presented as potential rather than proven applicability.

Quality of Writing:
The manuscript is well-written, clearly structured, and easy to follow. Figures and tables have been improved and terminology standardized.

Data and Resources:
The GitHub repository and documentation are comprehensive, well-organized, and accessible.

Review #3

By Angelo Salatino submitted on 25/Nov/2025

Suggestion:
Accept

Review Comment:

The authors have responded positively to my feedback and incorporated my suggestions in the new version of the manuscript.
