Collaborative multilingual knowledge management based on controlled natural language

Tracking #: 524-1726

Authors: 
Kaarel Kaljurand
Tobias Kuhn
Laura Canedo

Responsible editor: 
Guest editors Semantic Web Interfaces

Submission type: 
Full Paper
Abstract: 
User interfaces are a critical aspect of semantic knowledge representation systems, as users have to understand and use a formal representation language to model a particular domain of interest, which is known to be a difficult task. Things are even more challenging in a multilingual setting, where users speaking different languages have to create a multilingual ontology. To address these problems, we introduce a semantic wiki system that is based on controlled natural language to provide an intuitive yet formal interface. We use a well-defined subset of Attempto Controlled English (ACE) implemented in Grammatical Framework. Our wiki system offers precise bidirectional automatic translations between ACE and language fragments of a number of other natural languages, making the wiki content accessible multilingually. Because ACE has a partial but deterministic mapping to the Web Ontology Language, our wiki engine can offer automatic reasoning and question answering over the wiki content. Users speaking different languages can therefore build, query, and view the same knowledge base in an intuitive and user-friendly interface based on the respective natural language. We present the results of a user evaluation where participants using different languages were asked to write and assess statements about European geography in our wiki environment. Our results show that users reach a high level of consensus, which is not negatively affected by the presence of automatic translation.
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Review #1
By S.G. Lukosch submitted on 04/Sep/2013
Suggestion:
Accept
Review Comment:

The article introduces a semantic wiki system that supports collaborative multilingual knowledge management based on controlled natural language. The authors used a subset of Attempto Controlled English (ACE) implemented in Grammatical Framework (GF) to support bidirectional automatic translations between ACE and language fragments of a number of other natural languages in their semantic wiki. With this approach users speaking different languages can collaboratively build and manage a knowledge base.

The semantic wiki system was evaluated in a study with 30 participants speaking 3 different languages. For each language, there were 10 participants. In the study users had two tasks: users had to create articles in their native language or in a language they were fluent in as well as to read automatically translated articles to evaluate the truth or falsehood of the translation. The evaluation shows that users reach a high level of consensus. The evaluation also shows that the automatic translation does not have a negative effect.

Overall, the article is well written and well grounded in the state of the art. The authors clearly identify their contribution to the state of the art, i.e. making a semantic wiki environment multilingual, and evaluate whether their contribution addresses the identified problems around creating a multilingual ontology in a semantic wiki system. Conclusions are thus validated and future work is well based on the given findings. In conclusion, I recommend accepting the article.

Review #2
By Eero Hyvonen submitted on 05/Sep/2013
Suggestion:
Accept
Review Comment:

The paper extends the authors' earlier work on using
controlled natural language (CNL) in semantic OWL-based wikis.
The novelty in this paper is to investigate this in the multi-lingual
case. CNL statements, transformed into OWL, can here not only be given
in different languages (here in particular in English, German, and Spanish)
but also translated across language boundaries, facilitating the use of wiki
CNL in different languages. The paper expands the authors' recent ESWC 2013
paper.

The topic is clearly suitable for the special issue.

The research problem and methods used for attacking it are clearly stated.
Related work is discussed in a separate section, which seems adequate, although
I am not an expert in this particular field.

The paper covers a great deal of work related to the underlying tools and new
experiments, with illustrative examples and pointers to further sources.
After presenting the framework, the quality of the translations across natural
languages is evaluated and the results are analysed in a careful way.
The language and presentation are exceptionally well polished.
In short, this looks like solid work worth publishing.

My main concern about the paper is related to the general idea of using CNL as a basis in wikis in general. What would be the *realistic* use case problem for a system like this, and how well would it then actually solve the problem of collaborative multilingual ontology creation? The paper concerns a toy example of countries, rivers etc. It is good to use such examples in a research setting, but it would be nice if the authors could briefly discuss this bigger question and e.g. motivate the reader with examples of more serious CNL-based wikis and OWL ontologies: are there useful systems already, and what are the challenges? It is a challenge when a group of people starts entering CNL OWL expressions in a wiki and the result should converge into something logically consistent and useful. Some challenges encountered in the evaluation section are discussed, e.g., different opinions people may have about geography, which lead to inconsistency. It is also said in the paper that 80% of the users could not express themselves as they liked in the experiment. In footnote 9 the authors point the reader to "demo wikis", but I could not find any realistic applications or datasets there. The video there was for some reason not operational.

Minor comments

p. 2 Provide the reference to GF when it is first mentioned.

Use mdash "---" without spaces at its ends. There are many occurrences of this.

"as already mentioned" -- Remove, it is not good style to use expressions like this.

"[10] discusses a multilingual ..." Using a reference as a word does not look nice. E.g. "Davis et al. [10] discuss ..." would be better. There are many occurrences of this.

In Fig. 6 the "proper name" column contains adjectives "Spanish" and "Swedish". Explain or correct this.

[25] Journal name "Semantic Web" is not complete.

[36] Pages missing.

Review #3
By Prateek Jain submitted on 08/Jun/2014
Suggestion:
Minor Revision
Review Comment:

The work 'Collaborative multilingual knowledge management based on controlled natural language' presents the description, architecture and implementation details of a Controlled Natural Language based knowledge engineering system in a semantic media wiki based environment. This system allows a Semantic Media Wiki to become a multilingual editing environment. The underlying technology relies on an ACE-based controlled vocabulary. The authors present a comprehensive evaluation and a portal from which the system can be downloaded and tried out.

I like the work as it (a) demonstrates capabilities that can be achieved just by using controlled language and (b) shows an actual system that can be used in multiple real-world settings.

Some minor remarks:

"The last years have shown great progress on the technical side towards the realization of what is called the Semantic Web" -> "The last few years"?

"Already in 2007" -> "In 2007"

proper names -> proper nouns


Comments

We would like to thank the reviewers for their helpful comments. Below are our responses to some of the raised questions and comments.

Prateek Jain: proper names -> proper nouns

Response: As far as we know, both "proper name" and "proper noun" are correct and common.

Eero Hyvonen: My main concern about the paper is related to the general idea of using CNL as a basis in wikis in general. What would be the *realistic* use case problem for a system like this, [...]

Response: We agree that at this point we cannot demonstrate that our approach scales up to more complex ontologies and fully realistic scenarios. We now state this explicitly in the future work section. However, it is never possible to answer all relevant questions at the same time, and we think our study tackled the most important ones, allowing future studies to build on this work.

Eero Hyvonen: The video there was for some reason not operational.

Response: Please check again, it seems to work for everybody else.

Eero Hyvonen: Use mdash "---" without spaces at its ends. There are many occurrences of this.

Response: We can change it if you insist, but there doesn't seem to be complete agreement on this issue. Dashes with spaces are found in many publications and are suggested by at least one guideline. Personally, we think that dashes without spaces look ugly and are hard to read.

Eero Hyvonen: In Fig. 6 the "proper name" column contains adjectives "Spanish" and "Swedish". Explain or correct this.

Response: "Spanish" is both a proper name and an adjective. We use it only as a proper name, as in "I want to learn Spanish."

Eero Hyvonen: [25] Journal name "Semantic Web" is not complete.

Response: The official title according to the publisher is just "Semantic Web". See http://iospress.metapress.com/content/1mx3651013408786/ and http://iospress.metapress.com/export.mpx?code=1MX3651013408786&mode=txt