QALD-10 — The 10th Challenge on Question Answering over Linked Data

Tracking #: 3357-4571

Ricardo Usbeck
Xi Yan
Aleksandr Perevalov
Longquan Jiang
Julius Schulz
Angelie Kraft
Cedric Moeller
Junbo Huang
Jan Reineke
Axel-Cyrille Ngonga Ngomo
Muhammad Saleem
Andreas Both

Responsible editor: 
Guest Editors Wikidata 2022

Submission type: 
Dataset Description
Knowledge Graph Question Answering (KGQA) has gained attention from both industry and academia over the past decade. Researchers have proposed a substantial number of benchmarking datasets with different properties, pushing the development in this field forward. Many of these benchmarks depend on Freebase, DBpedia, or Wikidata. However, KGQA benchmarks that depend on Freebase and DBpedia are gradually less studied and used, because Freebase is defunct and DBpedia lacks the structural validity of Wikidata. Therefore, research is gravitating toward Wikidata-based benchmarks. That is, new KGQA benchmarks are created on the basis of Wikidata and existing ones are migrated. We present a new, multilingual, complex KGQA benchmarking dataset as the 10th part of the Question Answering over Linked Data (QALD) benchmark series. This corpus formerly depended on DBpedia. Since QALD serves as a base for many machine-generated benchmarks, we increased the size and adjusted the benchmark to Wikidata and its ranking mechanism of properties. These measures foster novel KGQA developments through more demanding benchmarks. Creating a benchmark from scratch or migrating it from DBpedia to Wikidata is non-trivial due to the complexity of the Wikidata knowledge graph, mapping issues between different languages, and the ranking mechanism of properties using qualifiers. We present our creation strategy and the challenges we faced, which will assist other researchers in their future work. Our case study, in the form of a conference challenge, is accompanied by an in-depth analysis of the created benchmark.

Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 08/Mar/2023
Review Comment:

In this paper, the authors present QALD-10, a dataset for Question Answering over Knowledge Graphs.
Now in its 10th edition, QALD is a well-known dataset in the field of Question Answering over structured data and serves as a benchmark for several works in the literature.

The dataset presents some novelty compared to the past. The main one is surely represented by the change of the underlying Knowledge Graph, i.e., from DBpedia to Wikidata.

Unfortunately, the number of questions included in the QALD-10 training set was slightly reduced due to some challenges and incompatibilities between the two Knowledge Graphs. The language coverage of the dataset was also reduced. The authors explained and addressed these problems in a specific section of the paper. Nevertheless, I encourage them to enlarge the dataset in future releases.

I suggest the authors explicitly state the final number of questions in the test set in Section 2 for better clarity for the readers. Other details about the dataset are covered throughout the paper.

QALD-10, like its predecessors, is a fully manually created dataset, which, as stated before, can affect the number of questions and their coverage. Still, it surely helps create more diverse questions. I appreciate the attention given by the authors to covering more complex questions in terms of queries and modifiers.

The paper's topic is a good fit for the journal, and this work represents an important contribution to the field of KGQA.

Review #2
Anonymous submitted on 13/Mar/2023
Minor Revision
Review Comment:

This is a dataset description paper that proposes a question answering dataset for benchmarking.
Different editions of the dataset, known as Question Answering over Linked Data (QALD), have already been published. This paper proposes QALD-10, which is based on Wikidata.

Overall, it is a valid and interesting dataset description paper, and this reviewer has the following suggestions:

General comment: Is it possible to provide an analysis of what types of questions are challenging to answer? This might help researchers to focus on unsolved problems. For the moment, most of the discussion is focused on the challenges faced while adapting to Wikidata. Are there some questions in general that are difficult for most question answering systems? This reviewer feels that such discussion is only partially present in the paper.

General comment: The readme at seems slightly short. Some more description can be provided. For example a few lines about could be provided.

p3, l33: "low complexity of the gold standard SPARQL queries": here the discussion was about several challenges, but the formulation above makes it sound as if it is less challenging. Is it possible to find another formulation for "low complexity"?

Moreover, the challenges are stated only briefly, as short one-liners. They should be explained in more detail at this point in the paper.

p6, l18: please check that all acronyms are defined when they first appear in the paper. For example, QQT is not defined here.

p6, l38: generate --> generates?

p7, l35: "However, the results clearly suggest that the proposed benchmark is way more complex than QALD-9-plus in terms of various important modifiers such as COUNT, FILTER, ASK, GROUP BY, OFFSET, and YEAR." This is actually not so clear just by looking at the table, because it does better in some metrics but worse in others. Perhaps reformulate to say that a detailed analysis of the complexity is done in the following text.

p7, l43 and l48: it is called joint vertex here, but in l10 it is called join vertex.

p10, l22: in many examples only entity IDs are provided, for example wd:Q28222602. Perhaps the paper would be more readable if the entity names were provided as well.

p11, l31: "We formulate our challenges and solutions during the SPARQL generation process to aid further research in KGQA dataset creation as well as Wikidata schema research."
This is said at the end. Perhaps it should have been said at the beginning of Section 5.