Review Comment:
This short paper is a tool/system description paper that introduces Evoke, a new tool that provides a web-based user interface for linguistic linked data resources modelled with Ontolex lemon. This work focuses on topical thesauri in particular, and the tool provides mechanisms for viewing, navigating, extending, and analysing their content. Users can formulate queries to the tool and can extend the data with annotations and potentially with links to other datasets. The main aim of this work is to allow users to engage more fully with published lexicographic content without infringing on licenses or requiring additional hosting.
The article is clear and well written. It is well structured and easy to read and follow. This work is very timely and relevant and could have an impact on the adoption of linguistic linked data (LLD) techniques by the specific communities of terminologists and lexicographers. Further, it might serve as inspiration to develop similar tools (or future releases of this one) that attract other end user communities as well. In fact, although linked data is a mature field (as is its LLD subfield), there is a shortage of proper user interfaces that bridge the gap between final users and Semantic Web experts and developers, and this work constitutes a decisive step in that direction. The tool is accessible online, along with a demo. However, the GitHub repository (https://github.com/ssstolk/evoke) seems to be empty at the time of writing this review.
Despite the interest of the approach and the developed tool, there are a number of issues to be addressed in order to further increase the quality of this submission as a journal publication.
The main issue of this work is the lack of a user-centered evaluation. The paper's hypothesis is that the use of LLD techniques through the Evoke platform can largely benefit both lexicographic data publishers and users in their daily work and in attaining their research goals. However, this has not been confirmed through empirical evidence. Ideally, a study should be conducted with users in which clear metrics are defined and reported. For instance, such a study could measure the time reduction for some common tasks, gauge user experience through a survey, or measure any other aspect that might be relevant to validate the hypothesis.
Another hypothesis of this paper is that Evoke can bridge licensing barriers when users interact with lexicographic works available online. Although some qualitative justification is provided in the paper, it would greatly benefit from concrete examples (e.g., resource X cannot be freely accessed because of its license Y, but Evoke can channel the user query Z without violating copyright because of ...).
I also miss a detailed comparison with other frameworks devoted to offering proper user interfaces for linguistic linked data, such as VocBench (http://vocbench.uniroma2.it/) by Armando Stellato and his team at the University of Rome, or LexO, developed by Andrea Bellandi and colleagues at ILC in Pisa. See:
* A. Stellato et al. "VocBench 3: A collaborative Semantic Web editor for ontologies, thesauri and lexicons", Semantic Web, vol. 11, no. 5, pp. 855-881, 2020
* A. Stellato et al. "VocBench: A Web Application for Collaborative Development of Multilingual Thesauri", In Proc. of ESWC 2015, Lecture Notes in Computer Science, 9088, 38-53, Springer International Publishing, 2015
* A. Bellandi and E. Giovannetti. "Involving Lexicographers in the LLOD Cloud with LexO, an Easy-to-use Editor of Lemon Lexical Resources". In Proc. of the 7th Workshop on Linked Data in Linguistics (LDL-2020), pp 70-74, May 2020
* A. Bellandi et al. "Developing LexO: A Collaborative Editor of Multilingual Lexica and Termino-ontological Resources in the Humanities," in Proceedings of Language, Ontology, Terminology and Knowledge Structures Workshop (LOTKS 2017)
According to the SWJ guidelines, impact should be justified for system papers. In that regard, more metrics indicating impact and adoption of the tool should be provided besides its use in the TOE use case. Impact can be demonstrated, for instance, with the number of downloads of the tool, or the number of unique visitors to the web service. If any other project, initiative, institution, or researcher is using Evoke, this should also be reported. Also, current plans to enhance the uptake of Evoke could help to demonstrate potential impact.
I think that the paper would largely benefit from a figure with an architecture outline.
In section 3.2 it is stated that Evoke assumes Ontolex lemon as its data model, with its adaptation to topical thesauri (through 'lemon tree'). It is unclear, though, whether it supports general lemon lexicons (not only topical thesauri) as well as dictionaries represented with the Ontolex module for lexicography (lexicog). If not, it would be good to know about future plans to support lexicog, if any.
To make a stronger case for the benefits of linked data for lexicographic work, I would recommend consulting the work by Julia Bosque-Gil, for instance:
* J. Bosque-Gil et al. "Linked data in lexicography". Kernerman Dictionary News, 19–24. July 2016
Other minor remarks:
- I would define some notions such as "topical thesaurus" and "linguistic linked data" the very first time they appear in the introduction, for readers who are less familiar with them.
- Some spacing problems need to be fixed. For instance in section 2 "... COST Action Nexus Linguarum (2019-23).Tooling... " -> " COST Action NexusLinguarum (2019-23). Tooling" or "...tools LingHub[10], which offers " -> "...tools LingHub [10], which offers... ", "...The Historical Thesaurus of English[8],..." -> "The Historical Thesaurus of English [8],"
- The use of opening/closing single quotation marks (') needs reviewing.
- The example in Listing 1 needs a more detailed explanation.
- Figure 4 possibly needs a more detailed caption. What is the sense information specific to riddle47?
- In section 4 it is stated that "Examples of research done within this project are linking up words (or word senses, rather) from Old Frisian and Old Dutch to the thesaurus taxonomy." But the data sources (Old Frisian, Old Dutch) are not cited.
- The English is good but would benefit from a final check. For instance in section 3: "...an international standard specifically for expressing datasets..." -> "...an international standard specifically developed/designed/created for expressing datasets..." ; "allowing them to share it in the manner of their choosing" -> "... choice"