Review Comment:
The paper explores techniques for the retrieval and ranking of Semantic Associations from Knowledge Graphs (i.e. loop-free paths connecting two entities in the KG). This is a very topical and important research area. The challenge is the very large number of such paths that can be found between two given entities, hence requiring effective ranking, as well as personalization, of the list of SAs returned.
The paper presents a pay-as-you-go approach for achieving these aims, using a learning-to-rank algorithm and an active sampling method. The authors describe in detail experiments conducted over two datasets, comparing across a variety of baselines and algorithms. The results are promising with respect to both the ranking of SAs and support for the personalization hypothesis.
The paper is generally clearly motivated and presented, giving the necessary details for the various techniques.
Unfortunately, however, the paper bears a strong similarity to "Actively Learning to Rank Semantic Associations for Personalized Contextual Exploration of Knowledge Graphs" by the same authors, published in the ESWC 2017 conference proceedings (Springer):
- the Abstracts are identical
- the introduction, motivation and stated contributions are similar
- Figures 1 and 3 are the same in both papers
- Figure 2 is very similar between the papers
- the general approach and specific techniques described are the same in the two papers
- the datasets and experiments appear to be the same, although described in a little more detail in the paper submitted to SWJ
- the experimental results and discussion are essentially the same, apart from the addition of the Wilcoxon test.
Thus it is hard to see any additional substantive contribution by the paper submitted to SWJ compared with the authors' paper already published in the ESWC 2017 proceedings.
More detailed comments about the paper submitted to SWJ:
In the discussion on page 2, a bit more background is needed about the DaCENA application. Also, some forward-looking discussion is needed to explain concepts such as "k-most interesting", "ordered by serendipity", and whether "interest" is regarded as being the same as "serendipity".
On page 6, a short discussion is needed on how the parameter alpha is set. On page 9, some more discussion is needed on how p and lambda were determined.
Section 3 requires some more reflection on why certain choices were made. It is hard for the reader to follow the overall strategy within all the details presented. A short summary overview is needed, perhaps at the start of the section.
On page 11, the discussion of the LAFU dataset at the bottom of Column 2 is unclear and should be rephrased.
Page 1 col 2
receiving ccontent -> receives content
Page 4 col 1
Unfortunately, being Serendipity -> Unfortunately, Serendipity being
weather this -> whether this
Page 6 col 2
in a SAs -> in a SA
P7, c1
entity that are central -> entities that are central
features that considers -> features that consider
P8, c1
Two different dataset -> Two different datasets
collected these dataset -> collected these datasets
in Figure 5, we show -> in Figure 5: we show
P9, c1
for SAs, in addition often -> for SAs. In addition,
for each algorithms -> for each algorithm
algorithms that have are signed with the blue columns ->
algorithms that are marked with blue (dark grey) in the first column
compare it with the use of clustering algorithm ->
compare it with the use of clustering algorithms
thus it could be informative -> thus could be informative
selected one SAs -> selected one SA
uncertainty a Global Uncertainty -> uncertainty, a Global Uncertainty
P10, c1
a order -> an ordering
In the SAMU datasets Dirichlet -> In the SAMU datasets, Dirichlet
this two methods -> these two methods
this dataset it is bigger -> this dataset is bigger
P10, c2
One of our assumption was that the personalization was needed ->
One of our assumptions was that personalization was needed
which output -> whose output
P11, c1
has given rating to -> has given a rating to
better then methods -> better than methods
with each iterations -> with each iteration
in the sectopn above, the general -> in the section above. The general
not able to access to the -> not able to access the
P11, c2
algorithms requires longer -> algorithms require longer
can not -> cannot
to training -> to train
the plots 6 -> the plots in Figure 6
P12, c1
with the algorithms configurations -> with the algorithm configurations
Table 8, we signed with -> Table 8 - we mark with
P13, c1
serendipity heuristics is a -> serendipity heuristic is a
user are interested -> users are interested
P13, c2
that help a user -> that helps a user
to by known -> to be known
P14, c1
This approach use -> This approach uses
tailored on -> tailored to
split in a training -> split into a training
uses uncertainty measure -> uses an uncertainty measure
The sentence "(In Section 3 .....Section)." is unclear and should be rephrased.
The sentence starting "One approach that has been proposed ..." is unclear and should be rephrased.
P14, c2
requested to the user -> requested from the user [ twice ]
since user are interested -> since users are interested
An other -> Another