A Biaswalk Based RDF Entity Embeddings

Tracking #: 2401-3615

This paper is currently under review
Thi Thu Van Duong
Md Anwarul Islam
Young-Koo Lee

Responsible editor: 
Harald Sack

Submission type: 
Full Paper
The Resource Description Framework (RDF) graph has become an important data source for many knowledge discovery and data mining tasks. However, to enable complex analytics, most knowledge discovery algorithms require data in vector representation. Several recent works therefore aim to represent entities in an RDF graph as low-dimensional vectors by graph walking. However, sequences generated by graph walking capture only structure-related context; they cannot capture latent context such as semantically related information, which is an important property of RDF data. In this paper, we propose a new method that maps each entity in an RDF graph to a vector, using word2vec as the language model to learn the embeddings. To use word2vec, we perform biased random walks that generate sequences serving as node context. We introduce a new notion of similar entities that trades off between the labels of outgoing edges and the outgoing nodes. Using this entity similarity, we define a structural similarity that calculates the similarity of two entities in each case of the current sequence. Moreover, we propose latent sequences, which cannot be generated by traversing the graph but provide more semantic sentences. Experimental results and a case study on real graphs demonstrate that our method achieves better quality and efficiency.
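To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy set of triples with made-up `ex:` names, a hypothetical similarity that mixes Jaccard overlap of outgoing edge labels and outgoing nodes through a trade-off parameter `alpha`, and a hypothetical per-predicate weight table `edge_weight` as the source of bias in the walk. The abstract does not specify the actual similarity formula or bias, so every function and parameter name here is an assumption.

```python
import random
from collections import defaultdict

# Toy RDF triples (subject, predicate, object); all names are invented.
TRIPLES = [
    ("ex:Alice", "ex:worksAt", "ex:AcmeCorp"),
    ("ex:Bob", "ex:worksAt", "ex:AcmeCorp"),
    ("ex:Alice", "ex:knows", "ex:Bob"),
    ("ex:Bob", "ex:livesIn", "ex:Seoul"),
    ("ex:AcmeCorp", "ex:locatedIn", "ex:Seoul"),
]


def build_adjacency(triples):
    """Map each subject to its list of outgoing (predicate, object) pairs."""
    adj = defaultdict(list)
    for s, p, o in triples:
        adj[s].append((p, o))
    return adj


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def entity_similarity(adj, e1, e2, alpha=0.5):
    """Assumed similarity: alpha weights edge labels, (1 - alpha) weights nodes."""
    labels1 = {p for p, _ in adj.get(e1, [])}
    labels2 = {p for p, _ in adj.get(e2, [])}
    nodes1 = {o for _, o in adj.get(e1, [])}
    nodes2 = {o for _, o in adj.get(e2, [])}
    return alpha * jaccard(labels1, labels2) + (1 - alpha) * jaccard(nodes1, nodes2)


def biased_walk(adj, start, length, edge_weight, rng):
    """Walk up to `length` hops, picking edges in proportion to an assumed
    per-predicate weight (predicates missing from the table get weight 1.0)."""
    sequence = [start]
    current = start
    for _ in range(length):
        edges = adj.get(current, [])
        if not edges:
            break  # dead end: no outgoing edges
        weights = [edge_weight.get(p, 1.0) for p, _ in edges]
        p, o = rng.choices(edges, weights=weights, k=1)[0]
        sequence.extend([p, o])
        current = o
    return sequence


adj = build_adjacency(TRIPLES)
rng = random.Random(0)
walks = [biased_walk(adj, e, 2, {"ex:worksAt": 2.0}, rng) for e in list(adj)]
```

Each element of `walks` is a token sequence alternating entities and predicates. Such sequences could then be passed as sentences to a word2vec implementation to learn entity vectors; that training step is left out of the sketch.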