Abstract:
Multilingual Question Answering (QA) has the potential to make the knowledge published as linked data accessible across languages. Current state-of-the-art QA systems for linked data are nevertheless typically monolingual, in most cases supporting only English. As most state-of-the-art systems are based on machine learning techniques, porting such a system to a new language requires a training set for every language to be supported. Furthermore, most recent machine-learning-based QA systems lack controllability and extensibility, making the governance and incremental improvement of these systems challenging, not to mention the initial effort of collecting and providing training data. Towards QA systems that can be ported across languages in a principled manner without the need for training data, and that can be incrementally adapted and improved after deployment, we follow a model-based approach to QA that supports extending the lexical and multilingual coverage of a system in a declarative manner. The approach builds on a declarative model of the lexicon-ontology interface, OntoLex lemon, which enables the specification of the meaning of lexical entries with respect to the vocabulary of a particular dataset. From such a lexicon, our approach automatically generates a QA grammar that can be used to parse questions into SPARQL queries. We show that this approach outperforms current QA approaches on the QALD benchmarks. Furthermore, we demonstrate the extensibility of the approach to different languages by adapting it to German, Italian, and Spanish. We evaluate the approach on five editions of the QALD benchmarks (QALD-9, QALD-7, QALD-6, QALD-5, and QALD-3) and show that it outperforms the state of the art on all these datasets in an incremental evaluation mode in which additional lexical entries covering the test data are added.
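To illustrate the idea of generating query patterns from a lexicon, the following is a minimal sketch, not the authors' implementation: a toy lexicon, in the spirit of an OntoLex lemon mapping from lemmas to ontology properties, drives a single generated rule that parses questions of the form "Who VERB ENTITY?" into SPARQL. All concrete names (the lemmas, the DBpedia URIs, the function name) are illustrative assumptions; extending coverage to a new verb or language amounts to adding a lexical entry, not changing code.

```python
from typing import Optional

# Toy lexicon: lemma -> ontology property URI (illustrative entries only).
# In the actual approach, entries are OntoLex lemon lexical entries whose
# senses are linked to vocabulary elements of the target dataset.
LEXICON = {
    "wrote": "http://dbpedia.org/ontology/author",
    "directed": "http://dbpedia.org/ontology/director",
}


def question_to_sparql(question: str) -> Optional[str]:
    """Parse questions of the form 'Who <verb> <Entity>?' into SPARQL."""
    tokens = question.rstrip("?").split()
    if len(tokens) < 3 or tokens[0].lower() != "who":
        return None  # pattern not covered by this toy grammar
    verb, entity = tokens[1], " ".join(tokens[2:])
    prop = LEXICON.get(verb)
    if prop is None:
        return None  # coverage gap: add a lexical entry for this verb
    entity_uri = "http://dbpedia.org/resource/" + entity.replace(" ", "_")
    # 'Who wrote X?' asks for the value of the property on the entity,
    # e.g. the author of the book X.
    return f"SELECT ?x WHERE {{ <{entity_uri}> <{prop}> ?x . }}"
```

For example, `question_to_sparql("Who wrote Moby Dick?")` yields a query over `dbo:author`, while an uncovered verb such as "painted" returns `None` until a corresponding lexical entry is added, which mirrors the incremental extension mode used in the evaluation.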
For example, on QALD-9 our approach obtains F1 scores of 0.85 (English), 0.82 (German), 0.65 (Italian), and 0.83 (Spanish). To our knowledge, no system described in the literature works for at least four languages while reaching state-of-the-art performance on all of them. Finally, we demonstrate the low effort necessary to port the system to a new dataset and vocabulary.