Benchmarking Embedding Techniques for Knowledge Graph Comparison

Tracking #: 2724-3938

This paper is currently under review
Pieter Bonte
Sander Vanden Hautte
Filip De Turck
Sofie Van Hoecke
Femke Ongenae

Responsible editor: 
Guest Editors DeepL4KGs 2021

Submission type: 
Full Paper

Abstract:
Knowledge graphs (KGs) are gaining popularity and are widely used in a plethora of applications. They owe this popularity to the fact that KGs are an ideal form for integrating and retrieving data originating from various sources. Using KGs as input for Machine Learning (ML) tasks allows predictions to be made over these popular graph structures. However, KGs cannot be used directly as ML input; they first need to be transformed to a vector space through embedding techniques. As ML techniques are data-driven, they can generalize over unseen input data that deviates to some extent from the data they were trained on. To fully exploit the generalization capabilities of ML algorithms, small changes in the KGs should result in small changes in the embedding space. Many graph embedding techniques exist; however, they have not been tailored towards KGs. We investigated how these whole-graph embedding techniques can be used for KG embedding. This paper evaluates whether existing techniques that embed the whole graph can represent the similarity between KGs in their embedding space. We compare similarities between KGs in terms of changes in size, entity labels, and KG schema. We found that most techniques were able to represent similarities in size and entity labels in their embedding space; however, none of the techniques were able to capture similarities in KG schema.
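The core evaluation idea described above can be illustrated with a minimal sketch: embed two whole graphs as vectors and check whether a small change to a KG yields a small change in embedding space. This is a hypothetical toy example, not the authors' actual pipeline; the degree-histogram "embedding" and the example triples are stand-ins for the whole-graph embedding techniques the paper benchmarks.

```python
import math
from collections import Counter

def degree_embedding(triples, dim=8):
    """Toy whole-graph embedding: a normalized histogram of node degrees.
    Stand-in for a real technique such as Graph2Vec; KG triples are treated
    as plain (subject, predicate, object) edges."""
    deg = Counter()
    for s, _p, o in triples:
        deg[s] += 1
        deg[o] += 1
    hist = Counter(min(d, dim - 1) for d in deg.values())
    n = len(deg) or 1
    return [hist.get(i, 0) / n for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# A small KG (chain of 10 entities) and a slightly perturbed copy (one extra edge).
chain = [(f"n{i}", "linkedTo", f"n{i+1}") for i in range(9)]
perturbed = chain + [("n0", "relatedTo", "n9")]
# A structurally very different KG (star around one hub entity).
star = [("hub", "linkedTo", f"leaf{i}") for i in range(9)]

e_chain = degree_embedding(chain)
sim_small = cosine(e_chain, degree_embedding(perturbed))
sim_far = cosine(e_chain, degree_embedding(star))
print(f"small change:     {sim_small:.3f}")
print(f"different graph:  {sim_far:.3f}")
```

Under the paper's criterion, a usable embedding technique should behave like this toy one does here: the slightly perturbed KG stays close to the original in embedding space, while the structurally different KG lands further away.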
Full PDF Version: 
Under Review