Review Comment:
I would like to thank the authors for their response letter and clarifications. I believe they have addressed the comments of all three reviewers. The scope of the paper, which was one of the main negative points I highlighted in my original review, is clear now, so the paper can be reviewed from a different angle. The paper presents an interesting application of GCNs to learn KG embeddings in a specific setting.
The learned embeddings seem to be meaningful and consistent with the type of similarity.
The application domain is also clear now, as is the need to align individuals within a single KG: new alignments must continually be discovered because this KG integrates knowledge from different sources.
Additional comments:
Regarding Figure 2: does it mean that the "combination" of CYP2C9 and the drug warfarin causes the phenotype vascular_disorders? Depending on the logic of causes, this may not be correct. Is the property causes transitive, or do specific rules take care of this? For example: if causes(CYP2C9, x) and causes(warfarin, x), then causes(x, vascular_disorders)?
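To make the question concrete, the two readings I can think of can be written as first-order rules; this formalization is my own reconstruction, not notation from the paper, and the combined predicate in the second rule is hypothetical:

```latex
% Reading 1: the property "causes" is transitive.
\forall x, y, z \;
  \big( \mathit{causes}(x, y) \land \mathit{causes}(y, z)
        \rightarrow \mathit{causes}(x, z) \big)

% Reading 2: a dedicated combination rule, where "combined" is a
% hypothetical predicate for the joint effect of a gene and a drug.
\forall g, d, p \;
  \big( \mathit{combined}(g, d) \land \mathit{causes}(d, p)
        \rightarrow \mathit{causes}(g, p) \big)
```

It would help if the paper stated explicitly which of these (or which other rule) licenses the inferred edge in Figure 2.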
I now understand the motivation for using a custom inference engine to have more control, although this could also be achieved by providing only the subset of rules to be triggered to a state-of-the-art inference engine. Creating a custom inference engine is valid, but one may also need to guarantee scalability, soundness, and completeness. Has this been taken into account?
As a supervised approach, the presented method relies on previously defined gold clusters. For the presented approach it is valid to assume that these exist, but in practice it may be convenient to automate the generation of these gold clusters, relying for example on (incomplete but) very precise alignments; in a different setting, we followed this idea in [a]. It may be worth adding some discussion about how the generation of gold clusters could be automated (see the sketch below).
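To illustrate the kind of automation I have in mind, here is a minimal sketch that derives gold clusters as connected components over a set of high-precision pairwise alignments. All names are hypothetical, and the code assumes the alignments are given as pairs of entity identifiers:

```python
from collections import defaultdict

def gold_clusters_from_alignments(alignment_pairs):
    """Group entities into gold clusters, given pairs of entities that a
    high-precision matcher has declared equivalent (e.g. owl:sameAs links)."""
    parent = {}  # union-find forest over entity identifiers

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for a, b in alignment_pairs:
        union(a, b)

    # Each connected component of the alignment graph is one gold cluster.
    clusters = defaultdict(set)
    for entity in parent:
        clusters[find(entity)].add(entity)
    return list(clusters.values())

# Example: three precise alignments yield two gold clusters.
pairs = [("kg:aspirin_1", "kg:aspirin_2"),
         ("kg:aspirin_2", "kg:aspirin_3"),
         ("kg:warfarin_1", "kg:warfarin_2")]
print(gold_clusters_from_alignments(pairs))
```

Of course, the precision of the input alignments is critical here: any false positive merges two clusters and contaminates the supervision.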
The paper in [12] provides interesting evidence that more is not always better for some embedding approaches, which is indeed useful for understanding how KGE approaches behave. Other works have also tried to inject the new inferences into the loss function instead of materializing them [b]. In OWL2Vec* [c], a system inspired by RDF2Vec, we showed the potential benefit of taking the ontology into account and performing reasoning/materialization when computing the embeddings.
Minor comments:
- "This initial of selection" → "This initial selection"
- "to handle a large number of clusters, potentially large" → consider a synonym to avoid repeating "large".
Suggested literature:
[a] Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks, Denvar Antonyrajah, Ali Hadian, Jaehun Lee. Augmenting Ontology Alignment by Semantic Embedding and Distant Supervision. ESWC 2021.
[b] Claudia d'Amato, Nicola Flavio Quatraro, Nicola Fanizzi. Injecting Background Knowledge into Embedding Models for Predictive Tasks on Knowledge Graphs. ESWC 2021: 441-457.
[c] Jiaoyan Chen, Pan Hu, Ernesto Jimenez-Ruiz, Ole Magnus Holter, Denvar Antonyrajah, Ian Horrocks. OWL2Vec*: Embedding of OWL Ontologies. Machine Learning, Springer, 2021.