Empowering the SDM-RDFizer Tool for Scaling Up to Complex Knowledge Graph Creation Pipelines

Submitted by Enrique Iglesias on 11/15/2023 - 14:13

Tracking #: 3580-4794

Authors:

Enrique Iglesias

Maria-Esther Vidal

Diego Collarana Vargas

David Chaves-Fraga

Responsible editor:

Guest Editors Tools Systems 2022

Submission type:

Tool/System Report

Abstract:

The significant increase in data volume in recent years has prompted the adoption of knowledge graphs as valuable data structures for integrating diverse data and metadata. However, this surge in data availability has brought to light challenges related to standardization, interoperability, and data quality. Knowledge graph creation faces complexities from large data volumes, data heterogeneity, and high duplicate rates. This work addresses these challenges and proposes data management techniques to scale up the creation of knowledge graphs specified using the RDF Mapping Language (RML). These techniques are integrated into SDM-RDFizer, transforming it into a two-fold solution designed to address the complexities of generating knowledge graphs. Firstly, we introduce a reordering approach for RML triples maps, prioritizing the evaluation of the most selective maps first to reduce memory usage. Secondly, we employ an RDF compression strategy, along with optimized data structures and novel operators, to prevent the generation of duplicate RDF triples and optimize the execution of RML operators. We assess the performance of SDM-RDFizer through established benchmarks. The evaluation showcases the effectiveness of SDM-RDFizer compared to state-of-the-art RML engines, emphasizing the benefits of our techniques. Furthermore, the paper presents real-world projects where SDM-RDFizer has been utilized, providing insights into the advantages of declaratively defining knowledge graphs and efficiently executing these specifications using this engine.

Full PDF Version:

swj3580.pdf

Previous Version:

Empowering the SDM-RDFizer Tool for Scaling Up to Complex Knowledge Graph Creation Pipelines

Tags:

Reviewed

Long-term Stable Link to Resources:

https://github.com/SDM-TIB/SDM-RDFizer

Decision/Status:

Solicited Reviews:

Review #1

By Dominik Tomaszuk submitted on 25/Nov/2023

Suggestion:
Accept

Review Comment:

I appreciate the authors' efforts in addressing my comments by making necessary adjustments to the paper. Having reviewed the revised version, I find no additional comments to make, and I recommend accepting the paper.

Review #2

By Anonymous submitted on 14/Jan/2024

Suggestion:
Minor Revision

Review Comment:

The revised version has addressed concerns from my previous review, and is generally acceptable.

Please check that every entry in the reference list has complete bibliographic information.
Also, several groups of references come from the same authors and are very similar in content, or possibly identical (e.g., are references 15 and 67 the same?); only the unique and necessary ones should be kept.

Section 6 also does not appear to be directly related to the main research topic of the paper (it seems to be about the tool in general?). While it is useful for getting a general idea of how the tool is used, it could be moved to an appendix.
