A decade of Semantic Web research through the lenses of a mixed methods approach

Tracking #: 2058-3271

Authors: 
Sabrina Kirrane
Marta Sabou
Javier D. Fernandez
Francesco Osborne
Cécile Robin
Paul Buitelaar
Enrico Motta
Axel Polleres

Responsible editor: 
Christoph Schlieder

Submission type: 
Full Paper

Abstract: 
The identification of research topics and trends is an important scientometric activity, as it can help guide the direction of future research. In the Semantic Web area, topic and trend detection was initially performed primarily through qualitative, top-down style approaches that rely on expert knowledge. More recently, data-driven, bottom-up approaches have been proposed that offer a quantitative analysis of the evolution of a research domain. In this paper, we aim to provide a broader and more complete picture of Semantic Web topics and trends by adopting a mixed methods methodology, which allows for the combined use of both qualitative and quantitative approaches. Concretely, we build on a qualitative analysis of the main seminal papers, which adopt a top-down approach, and on quantitative results derived with three bottom-up data-driven approaches (Rexplore, Saffron, PoolParty) on a corpus of Semantic Web papers published between 2006 and 2015. In this process, we both use the latter for “fact-checking” on the former and also derive key findings in relation to the strengths and weaknesses of top-down and bottom-up approaches to research topic identification. Although we provide a detailed study of the past decade of Semantic Web research, the findings and the methodology are relevant not only for our community but also for other research fields beyond the Semantic Web.

Decision/Status: 
Minor Revision

Solicited Reviews:
Review #1
By Yingjie Hu submitted on 24/Feb/2019
Suggestion:
Accept
Review Comment:

The authors have successfully addressed my previous comments. The authors also created a resource page to share detailed information about the corpora used in this study to support reproducibility. Overall, I think the authors have done an excellent job in improving this paper, and I recommend it for publication!

Review #2
Anonymous submitted on 08/May/2019
Suggestion:
Accept
Review Comment:

I am not completely convinced, but I acknowledge that the authors made a real effort to address my points.

Review #3
Anonymous submitted on 15/May/2019
Suggestion:
Major Revision
Review Comment:

In this manuscript, a topic modeling of the Semantic Web research field is performed. To do this, the authors employed a set of three software tools and compared their results with the topics extracted from three seminal papers by the authors of this manuscript.

The main idea is interesting, and the paper is well organized.

In what follows, some comments and suggestions are listed:

- The authors mainly performed a science mapping analysis (a kind of scientometric analysis), but no comment on or reference to this research field is present.
- The corpus is not well defined. How many documents were retrieved?
- Why is the last year 2015? We are now in 2019.
- There are many science mapping software tools designed specifically for topic analysis on scientific corpora, for example SciMAT, VOSviewer, and CiteSpace. Why did the authors not use these tools?
- Why did the authors use three different software tools to perform topic modeling?
- Why does Rexplore use a different corpus? This does not make sense: a different corpus should uncover different topics.
- As the authors state, the 20 most important topics detected by each software tool do not match. So, as I argued above, if the authors employed different corpora, it is logical that the results will differ.
- The utility of this manuscript is not clear. That is, the authors identify the topics using expert opinion and then try to validate their results using automatic tools. Perhaps the focus of the paper is wrong: rather than trying to validate their results, the authors should try to extract the main concepts covered in the research field. The validation of the expert opinion against the results of automatic methods could be developed in a different paper.
- The authors should consider developing a science mapping analysis based on co-word analysis and, at least, compare its results with those obtained in this manuscript.