InteractOA: Showcasing the representation of knowledge from scientific literature in Wikidata

Tracking #: 3513-4727

Authors: 
Muhammad Elhossary
Konrad Foerstner

Responsible editor: 
Guest Editors Wikidata 2022

Submission type: 
Tool/System Report
Abstract: 
Knowledge generated during the scientific process is still mostly stored in the form of scholarly articles. This lack of machine-readability hampers efforts to find, query, and reuse such findings efficiently and contributes to today’s information overload. While attempts have been made to semantify journal articles, widespread adoption of such approaches is still a long way off. One way to demonstrate the usefulness of such approaches to the scientific community is by showcasing the use of freely available, open-access knowledge graphs such as Wikidata as sustainable storage and representation solutions. Here we present an example from the life sciences in which knowledge items from scholarly literature are represented in Wikidata, linked to their exact position in open-access articles. In this way, they become part of a rich knowledge graph while maintaining clear ties to their origins. As example entities, we chose small regulatory RNAs (sRNAs) that play an important role in bacterial and archaeal gene regulation. These post-transcriptional regulators can influence the activities of multiple genes in various manners, forming complex interaction networks. We stored the information on sRNA molecule interaction taken from open-access articles in Wikidata and built an intuitive web interface called InteractOA, which makes it easy to visualize, edit, and query information. The tool also links information on small RNAs to their reference articles from PubMed Central on the statement level. InteractOA encourages researchers to contribute, save, and curate their own similar findings. InteractOA is hosted at https://tools.wmflabs.org/interactoa and its code is available under a permissive open source licence. In principle, the approach presented here can be applied to any other field of research.
Tags: 
Reviewed

Decision/Status: 
Minor Revision

Solicited Reviews:
Review #1
By Andra Waagmeester submitted on 11/Oct/2023
Suggestion:
Minor Revision
Review Comment:

I appreciate the authors' thoughtful and comprehensive response to the feedback from the previous review round. I would like to suggest accepting the contribution with some minor revisions.

While the paper has made commendable progress, I would like to express a lingering concern regarding the treatment of knowledge graphs and their applications, particularly concerning Wikidata. There is room for improvement in addressing knowledge graphs beyond Wikidata, ensuring a more inclusive description of significant contributions from outside Wikidata. The current description of knowledge graphs in the introduction appears somewhat lacking in detail and could benefit from additional information.

While the sections related to knowledge graphs could benefit from enhancement, it's crucial to emphasize that these aspects might not be necessary for the paper's overarching message. The paper effectively serves as a demonstrator, highlighting the valuable role of Wikidata in the linked data cloud for the life sciences.

Sections 1.1 and 1.2 still require substantial refinement, but there is an opportunity to streamline them into a concise description emphasising Wikidata as the linked-data resource of Wikipedia/the Wikimedia Foundation. While discussing knowledge graphs and Wikidata's role in the broader landscape, these sections could be more concise to align with the paper's primary topic, namely the authors' noteworthy contribution, InteractOA.

Additionally, I encourage the authors to moderate the promotional tone at specific points in the text. While advocating for Wikidata's value, it's essential to strike a balance and avoid presenting it as the only working solution. For instance, the use of superlatives, as seen on line 23 of page 4, may be reconsidered. Describing Wikidata as "valuable" is accurate, but terms like "superior" may be replaced with more neutral alternatives.
In short, I recommend accepting the paper with the remark that the authors could make revisions on sections 1.1 and 1.2, presenting a concise yet informative description of Wikidata, and adjusting the tone to ensure a more neutral and balanced representation.

Minor nitpicks:
Page 8, line 32: HTLM -> HTML

Review #2
Anonymous submitted on 30/Oct/2023
Suggestion:
Minor Revision
Review Comment:

This is a review of a revised version of the paper, which presents a novel user-friendly interface for users to interact with RNA knowledge based on Wikidata. I would first like to thank the authors for carefully taking my comments (and the other reviewers’) into account, as I consider that the paper has greatly improved from the previous version. The interaction with the tool feels smoother too, and all resources are openly available and accessible. Before publishing, however, I still find some issues that need addressing.

I maintain my argument about the structure of the paper. Section 2 has improved greatly, the title fits better and the descriptions are clearer, and the discussion in Section 3 introduces interesting points; but I think the paper could use a separation of Section 1 into Introduction and Related Work (or Background). It feels weird to start reading a paper without an introduction, but that may be a personal preference.

Regarding the authors’ reply about the statements without evidence (“Having demonstrated the usefulness of this approach to our own research field”, “…and very likely shortens the time needed to consult previous research on individual small RNAs”), I also maintain my previous argument after the rewriting: this is the kind of claim that can be made after a user study, which I understand will be carried out in the future as the work progresses. I would suggest rewriting in this direction (no need to be literal): “We believe this tool can serve as a valuable contribution to this research field” and “…with which we aim to shorten the time needed to consult previous research on individual small RNAs”.

Review #3
Anonymous submitted on 10/Nov/2023
Suggestion:
Minor Revision
Review Comment:

Thank you for taking the comments into consideration. Most of the issues I mentioned previously have been addressed. There are still some issues regarding the structure of the paper. Previously, I mentioned that the structure should be improved: the first section was titled 'Introduction', but no actual introduction was presented. Now the title 'Introduction' has been removed, which does not seem like the right solution. It would be better if a proper introduction were added to the article. In its current form, what this article is actually about only becomes clear after the first section. A more logical structure is to have an introduction and then discuss the background in more detail.

Also, it would have been nice if some sort of evaluation had been added. For example, the sentence "demonstrated the usefulness of this approach" is based solely on the authors' opinion. Even a small-scale qualitative study to assess the opinions of domain researchers about the approach would be a welcome addition.

The paper can benefit from more proofreading; the majority of the comments below relate to spelling and styling mistakes.

* Section 1 title missing capitals: "Background and related work" -> "Background and Related Work".
* Page 2, line 33: "in HTML, and XML": remove the comma.
* Page 4, line 20: "(Wikimedia Foundation) Wikidata", comma missing (also it is a rather long sentence).
* Page 4, line 38: format the links as footnotes.
* Page 5, line 49: oddly positioned sentence.
* Page 6, line 40: "RNA Database" (SRD)" wrongly positioned quote.
* In Section 2.4, Flask is not always capitalized. It is also mentioned twice that InteractOA is implemented in Flask (and the second time, a footnote is added). Mentioning it once is probably sufficient.
* Section 2.4: HTLM -> HTML?
* Is there a reference for "a web-based tool developed by Halder et al. will be used"?