Editorial Board

Editor-in-Chief
Krzysztof Janowicz

Managing Editors
Cogan Shimizu
Eva Blomqvist

Editorial Board
Mehwish Alam
Claudia d’Amato
Stefano Borgo
Boyan Brodaric
Philipp Cimiano
Michael Cochez
Oscar Corcho
Bernardo Cuenca-Grau
Elena Demidova
Jerome Euzenat
Mark Gahegan
Aldo Gangemi
Dagmar Gromann
Armin Haller
Pascal Hitzler
Aidan Hogan
Katja Hose
Eero Hyvönen
Sabrina Kirrane
Agnieszka Lawrynowicz
Freddy Lecue
Maria Maleshkova
Raghava Mutharaju
Axel Polleres
Guilin Qi
Marta Sabou
Harald Sack
Angelo Salatino
Christoph Schlieder
Stefan Schlobach
Cogan Shimizu
Blerina Spahiu
GQ Zhang
Rui Zhu

Former/Founding Editors-in-Chief
Pascal Hitzler

Editorial Assistants
Michael McCain

A Study of Concept Similarity in Wikidata

Submitted by Filip Ilievski on 08/10/2023 - 21:09

Tracking #: 3520-4734

Authors:

Filip Ilievski

Kartik Shenoy

Hans Chalupsky

Nicholas Klein

Pedro Szekely

Responsible editor:

Harald Sack

Submission type:

Full Paper

Abstract:

Robust estimation of concept similarity is crucial for applications of AI in the commercial, biomedical, and publishing domains, among others. While the related task of word similarity has been extensively studied, resulting in a wide range of methods, estimating concept similarity between nodes in Wikidata has not been considered so far. In light of the adoption of Wikidata for increasingly complex tasks that rely on similarity, and its unique size, breadth, and crowdsourcing nature, we propose that conceptual similarity should be revisited for the case of Wikidata. In this paper, we study a wide range of representative similarity methods for Wikidata, organized into three categories, and leverage background information for knowledge injection via retrofitting. We measure the impact of retrofitting with different weighted subsets from Wikidata and ProBase. Experiments on three benchmarks show that the best performance is achieved by pairing language models with rich information, whereas the impact of injecting knowledge is most positive on methods that originally do not consider comprehensive information. The performance of retrofitting is conditioned on the selection of high-quality similarity knowledge. A key limitation of this study, similar to prior work, lies in the limited size and scope of the similarity benchmarks. While Wikidata provides an unprecedented possibility for a representative evaluation of concept similarity, effectively doing so remains a key challenge.

Full PDF Version:

swj3520.pdf

Previous Version:

A Study of Concept Similarity in Wikidata

Tags:

Reviewed

Long-term Stable Link to Resources:

https://github.com/usc-isi-i2/wd-similarity

Decision/Status:

Solicited Reviews:

Review #1

Anonymous submitted on 25/Aug/2023

Suggestion:
Accept

Review Comment:

In this version, the authors have addressed the concerns and remarks I mentioned previously. I think the paper is now improved.

- The reason for selecting the approaches TransE, ComplEx, and DeepWalk is now justified in the revised paper.

- My previous request to provide the statistics of the datasets used to train the KG embedding models and the node embedding methods has been addressed by providing Table 2. However, it would be good to also provide the splits for train and test.

Review #2

Anonymous submitted on 04/Sep/2023

Suggestion:
Accept

Review Comment:

The paper reads very well now; all reviewer comments have been properly and diligently addressed. Great work!

Review #3

Anonymous submitted on 05/Oct/2023

Suggestion:
Accept

Review Comment:

I would like to thank the authors for taking into account the issues raised in the prior review and for including the pertinent information.
They have thoughtfully addressed the concerns about how well the proposed approach generalises and how the outcomes of this study, namely the related concepts obtained for Wikidata, might be used in future research. Therefore, I would like to accept the paper.
