Review Comment:
I would like to thank the authors for their response letter and clarifications. I believe they have addressed the comments of all three reviewers. The scope of the paper, which was one of the main negative points I highlighted in my original review, is clear now, so the paper can be reviewed from a different angle. The paper presents an interesting application of GCNs to learn KG embeddings in a specific setting.
The learned embeddings seem to be meaningful and consistent with the type of similarity.
The application domain is also clear now, as is the need to align individuals within a single KG: new alignments must continually be discovered because this KG integrates knowledge from different sources.
Additional comments:
Regarding Figure 2: does it mean that the "combination" of CYP2C9 and the drug warfarin causes the phenotype vascular_disorders? Depending on the logic of causes, this may not be correct. Is the property causes transitive, or do specific rules take care of this? For example: if causes(CYP2C9, x) and causes(warfarin, x), then causes(x, vascular_disorders)?
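To make the question concrete, the two readings I can think of can be written as first-order rules; this formalization is my own reconstruction, not notation from the paper, and the combined predicate in the second rule is hypothetical:

```latex
% Reading 1: the property "causes" is transitive.
\forall x, y, z \;
  \big( \mathit{causes}(x, y) \land \mathit{causes}(y, z)
        \rightarrow \mathit{causes}(x, z) \big)

% Reading 2: a dedicated combination rule, where "combined" is a
% hypothetical predicate for the joint effect of a gene and a drug.
\forall g, d, p \;
  \big( \mathit{combined}(g, d) \land \mathit{causes}(d, p)
        \rightarrow \mathit{causes}(g, p) \big)
```

It would help if the paper stated explicitly which of these (or which other rule) licenses the inferred edge in Figure 2.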
I now understand the motivation for using a custom inference engine to have more control, although this could also be achieved by providing only the subset of rules to be triggered to a state-of-the-art inference engine. Creating a custom inference engine is valid, but one may also need to guarantee scalability, soundness, and completeness. Has this been taken into account?
As a supervised approach, the presented method relies on previously defined gold clusters. For the presented approach it is valid to assume that these exist, but in practice it may be convenient to automate the generation of these gold clusters, relying for example on (incomplete but) very precise alignments; in a different setting, we followed this idea in [a]. It may be worth adding some discussion about how the generation of gold clusters could be automated (see the sketch below).
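To illustrate the kind of automation I have in mind, here is a minimal sketch that derives gold clusters as connected components over a set of high-precision pairwise alignments. All names are hypothetical, and the code assumes the alignments are given as pairs of entity identifiers:

```python
from collections import defaultdict

def gold_clusters_from_alignments(alignment_pairs):
    """Group entities into gold clusters, given pairs of entities that a
    high-precision matcher has declared equivalent (e.g. owl:sameAs links)."""
    parent = {}  # union-find forest over entity identifiers

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for a, b in alignment_pairs:
        union(a, b)

    # Each connected component of the alignment graph is one gold cluster.
    clusters = defaultdict(set)
    for entity in parent:
        clusters[find(entity)].add(entity)
    return list(clusters.values())

# Example: three precise alignments yield two gold clusters.
pairs = [("kg:aspirin_1", "kg:aspirin_2"),
         ("kg:aspirin_2", "kg:aspirin_3"),
         ("kg:warfarin_1", "kg:warfarin_2")]
print(gold_clusters_from_alignments(pairs))
```

Of course, the precision of the input alignments is critical here: any false positive merges two clusters and contaminates the supervision.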
The paper in [12] provides interesting evidence that more is not always better for some embedding approaches, which is indeed useful for understanding how KGE approaches behave. Other works have also tried to inject the new inferences into the loss function instead of materializing them [b]. In OWL2Vec* [c], a system inspired by RDF2Vec, we showed the potential benefit of taking the ontology into account and performing reasoning/materialization when computing the embeddings.
Minor comments:
- "This initial of selection" → "This initial selection"
- "to handle a large number of clusters, potentially large" → consider a synonym to avoid repeating "large".
Suggested literature:
[a] Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks, Denvar Antonyrajah, Ali Hadian, Jaehun Lee. Augmenting Ontology Alignment by Semantic Embedding and Distant Supervision. ESWC 2021.
[b] Claudia d'Amato, Nicola Flavio Quatraro, Nicola Fanizzi. Injecting Background Knowledge into Embedding Models for Predictive Tasks on Knowledge Graphs. ESWC 2021: 441-457.
[c] Jiaoyan Chen, Pan Hu, Ernesto Jimenez-Ruiz, Ole Magnus Holter, Denvar Antonyrajah, Ian Horrocks. OWL2Vec*: Embedding of OWL Ontologies. Machine Learning, Springer, 2021.