Review Comment:
This is an application report on Clover Quiz, a turn-based multiplayer trivia game for Android devices with more than 200K multiple-choice questions (in English and Spanish) covering different domains, generated from DBpedia.
Positive aspects of the app/report:
The application seems to be popular among users and has a good number of downloads. It is good to see a real mobile app based on DBpedia. The report largely covers the technical aspects from a development/game perspective, and the authors have done a good job of presenting them.
Improvements required:
The report is weak when it comes to reporting existing research and how that research was used to guide some of the design decisions in the development of the game.
The language of the text also takes, in many places, more liberty than strict research writing allows - for example, "DBpedia constitutes the main hub of the Semantic Web".
I am not too convinced by the latency argument presented by the authors - "DBpedia endpoint, hence latency is too high for an interactive trivia game, as reported in [13]" - since there are many benchmarks showing that triplestores can handle significantly larger knowledge graphs and still offer sub-second query response times.
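To illustrate the kind of evidence that would support or refute this claim, a minimal timing sketch is given below; the endpoint URL, the use of SPARQLWrapper, and the single trivia-style lookup are my own illustrative assumptions, not the authors' actual queries or setup.

# Minimal latency sketch (assumptions: the public DBpedia endpoint and a
# hypothetical single-fact lookup, not the authors' query).
import time
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setReturnFormat(JSON)
endpoint.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?capital WHERE { dbr:Spain dbo:capital ?capital }
""")

start = time.perf_counter()
results = endpoint.query().convert()
elapsed = time.perf_counter() - start

print("Answer:", results["results"]["bindings"][0]["capital"]["value"])
print(f"Round-trip latency: {elapsed:.3f} s")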
Data extraction process:
- There are a few unanswered questions here, including: 1. Mapping a textual keyword to a DBpedia concept - what if there are multiple matching options, or no option matches the keyword but a synonym or similar concept exists? 2. The domain-specific file - its format seems ad hoc; although DBpedia has irregular property patterns, there are at least some common ones, offering the opportunity to use a few templates across all the domains. 3. Wikipedia has category chains, and you use them in your work - however, how do you decide how far to go down the sub-categories (see the sketch below)?
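As a concrete reference point for question 3, the sketch below shows one possible way to bound the category walk; the public endpoint, the skos:broader traversal, and the depth limit of 4 are assumptions made for illustration, not the authors' implementation.

# Bounded category walk (illustrative; MAX_DEPTH is a hypothetical cut-off).
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://dbpedia.org/sparql"
MAX_DEPTH = 4  # how far down the sub-category chain to go -- the design choice in question

def subcategories(category_uri):
    """Return the direct sub-categories of a Wikipedia category."""
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(f"""
        PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
        SELECT ?sub WHERE {{ ?sub skos:broader <{category_uri}> }}
    """)
    rows = sparql.query().convert()["results"]["bindings"]
    return [row["sub"]["value"] for row in rows]

def collect(category_uri, depth=0, seen=None):
    """Collect categories reachable within MAX_DEPTH sub-category hops."""
    seen = set() if seen is None else seen
    if category_uri in seen or depth > MAX_DEPTH:
        return seen
    seen.add(category_uri)
    for sub in subcategories(category_uri):
        collect(sub, depth + 1, seen)
    return seen

cats = collect("http://dbpedia.org/resource/Category:Museums")
print(len(cats), "categories within", MAX_DEPTH, "levels")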
The comparison with previous work is quite broad, with the criticism being: "question generation schemes that are not able to produce varied, large, and entertaining questions." For example, there are many works on the topic of linked-data-based question answering, including a series of workshops (QALD). Although the submission is an application report, it needs to be positioned within these works. A few examples include:
https://qald.sebastianwalter.org/
https://www.sciencedirect.com/science/article/pii/S157082681300022X
https://dl.acm.org/citation.cfm?id=2557529
The existing research also needs to be reflected in the various decisions made in designing the game - for example, the data gathering phase, question generation, etc. - to show whether it was influenced by one approach over another. At the moment the report does not refer to previous works and uses quite a broad brush in its criticism of existing work, although parts of this work seem to follow design/research principles of other games/applications.
I would have expected at least some new lessons learned from a significant development exercise. Many of the lessons learned are not new - for example, the messiness of DBpedia. The authors can add further value by focusing on the ease of using DBpedia, performance issues when using it in a production environment, and other development issues specific to using semantics/linked data/DBpedia.
I also feel that there is/was an excellent opportunity to analyse qualitative feedback from your 5k users/downloads.
Comments
Public review
(To the editor: Major revision paper)
An ad hoc system that generates MCQs from the DBpedia dataset is detailed in the manuscript, with the prime focus on DECREASING THE LATENCY. A versatile template-based mechanism is employed for generating VARIED and ENTERTAINING questions.
The presented approach is interesting. The authors have given adequate design and implementation details of the system, qualifying the article as an application report. However, there are significant shortcomings that need to be rectified to warrant publication.
1. Include relevant references.
2. Theoretical aspects of the approach should be formally discussed.
3. Remove a few claims after including proper references.
Introduction: "creating questions from DBpedia can be significantly improved by splitting this process into a data extraction and a versatile question generation stage" is presented as the main HYPOTHESIS! "This approach can...declarative specifying the classes and question templates of the domains of interest." -- This statement is very obvious and makes one feel that an automated method should have been the prime goal of this paper.
I felt that references are missing in several places; for example, the approaches adopted to find the "popularity of concepts" and the "question difficulty estimator" are not referenced.
Can you give an account of the set of templates used and why you have chosen only those templates to generate the varied and entertaining questions? You may refer to [1] for an example set of templates and their significance.
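For illustration only, the kind of declarative template account I have in mind might look like the sketch below; the stem wording, the SPARQL query, and the field names are hypothetical assumptions, not templates taken from the manuscript.

# Hypothetical question template (illustrative only).
capital_template = {
    "stem": "Which country has {city} as its capital?",
    "query": """
        PREFIX dbo: <http://dbpedia.org/ontology/>
        SELECT ?country ?city WHERE { ?country dbo:capital ?city } LIMIT 100
    """,
    "answer_var": "country",   # variable holding the correct option
    "slot_vars": ["city"],     # variables substituted into the stem
}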
Associating images to questions was a good move!
Page 6, paragraph above Listing 6: the example used (a museum and a country connected using the property country) is confusing.
Do you have statistics on the number of properties that are declared "functional" (or inverse, or with other property characteristics) in DBpedia, to justify your argument?
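Such a statistic could, for instance, be obtained with a simple count over owl:FunctionalProperty declarations on the public endpoint; the query below is a sketch under that assumption.

# Count properties declared functional in DBpedia (illustrative sketch).
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    SELECT (COUNT(DISTINCT ?p) AS ?functionalProps)
    WHERE { ?p a owl:FunctionalProperty . }
""")
count = sparql.query().convert()["results"]["bindings"][0]["functionalProps"]["value"]
print("Properties declared functional:", count)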
I am convinced that the Android application based on the manuscript is well developed; however, I feel that the theoretical aspects of the problem addressed in the paper are limited, hindering its publication in the SWJ. The Android app is ad hoc in nature and limited to a few predetermined concepts; an automated method would have been more appreciated. An improvement that I can think of is: given a random subject, potential mappings to DBpedia concepts can be found using tools such as AIDA. Similarly, rather than using random templates, related properties can also be identified and ranked for generating question templates (as in [7]). However, since the focus of this paper is on decreasing the latency, the mentioned change is less relevant.
If I understood your approach correctly, the set of concepts obtained after broadening (up to 4 levels) is used to frame question templates. Do you consider the generated stems as having the same difficulty level?
A considerable amount of related work is missing. I am listing a few examples here.
[1] D. Liu and C. Lin. Sherlock: a semi-automatic quiz generation system using linked data. In M. Horridge, M. Rospocher, and J. van Ossenbruggen, editors, Proceedings of the ISWC 2014 Posters & Demonstrations Track, a track within the 13th International Semantic Web Conference, ISWC 2014, Riva del Garda, Italy, October 21, 2014, volume 1272 of CEUR Workshop Proceedings, pages 9-12. CEUR-WS.org, 2014. http://ceur-ws.org/Vol-1272/paper_7.pdf.
[2] Dominic Seyler, Mohamed Yahya, and Klaus Berberich. Generating quiz questions from knowledge graphs. In Proceedings of the 24th International Conference on World Wide Web, WWW '15 Companion, pages 113-114, New York, NY, USA, 2015. ACM.
[3] Dominic Seyler, Mohamed Yahya, and Klaus Berberich. Knowledge questions from knowledge graphs. CoRR, abs/1610.09935, 2016.
[4] Dominic Seyler, Mohamed Yahya, Klaus Berberich, and Omar Alonso. Automated question generation for quality control in human computation tasks. In Proceedings of the 8th ACM Conference on Web Science, WebSci 2016, Hannover, Germany, May 22-25, 2016, pages 360-362, 2016.
[5] E.V. Vinu and P. Sreenivasa Kumar. A novel approach to generate MCQs from domain ontology: Considering DL semantics and open-world assumption. Web Semantics: Science, Services and Agents on the World Wide Web, 34:40-54, 2015. http://dx.doi.org/10.1016/j.websem.2015.05.005.
[6] Ellampallil Venugopal Vinu and Puligundla Sreenivasa Kumar. Improving large-scale assessment tests by ontology based approach. In Proceedings of the Twenty-Eighth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2015, Hollywood, Florida, May 18-20, 2015, page 457, 2015.
[7] E.V. Vinu and P. Sreenivasa Kumar. Automated generation of assessment tests from domain ontologies. Semantic Web Journal, Vol 6, 1023-1047, 2016.
[8] E.V. Vinu, Tahani Alsubait, and P. Sreenivasa Kumar. Modeling of item-difficulty for ontology-based MCQs. arXiv:1607.00869, Technical Report, 2016.
[9] L. Bühmann, R. Usbeck, and A. N. Ngomo. ASSESS - automatic self-assessment using linked data. In M. Arenas, Ó. Corcho, E. Simperl, M. Strohmaier, M. d'Aquin, K. Srinivas, P. T. Groth, M. Dumontier, J. Heflin, K. Thirunarayan, and S. Staab, editors, The Semantic Web - ISWC 2015 - 14th International Semantic Web Conference, Bethlehem, PA, USA, October 11-15, 2015, Proceedings, Part II, volume 9367 of Lecture Notes in Computer Science, pages 76-89. Springer, 2015. 10.1007/978-3-319-25010-6_5.
Section 7: "Overall, the template-based...very well." // There are many other factors besides the popularity and similarity of distractors that decide the difficulty level of a question. Refer to [7, 8].
Section 8: "When designing multiple..., distractors...However, ...use random distractors." // Not all existing methods use random distractors.
The authors did not discuss anything about the natural language conversion of the generated stems.