Similarity-based Knowledge Graph Queries for Recommendation Retrieval

Tracking #: 2031-3244

Lisa Wenige
Johannes Ruhland

Responsible editor: 
Guest Editors Knowledge Graphs 2018

Submission type: 
Full Paper
This paper investigates how similarity-based retrieval strategies can be combined with graph queries to enable users or system providers to explore repositories in the Linked Open Data (LOD) cloud more thoroughly. For this purpose, we developed a content-based recommender system (RS). It relies on concept annotations of Simple Knowledge Organization System (SKOS) vocabularies and a SPARQL-based query language that facilitates advanced and personalized requests for openly available and interlinked datasets. We have comprehensively evaluated the novel search strategies in several test cases and example application domains (i.e., travel search and multimedia retrieval). The results of the web-based online experiments showed that our approaches increase the recall and diversity of recommendations or at least provide a competitive alternative strategy of resource access when conventional methods do not provide helpful suggestions. The findings may be of use for Linked Data-enabled recommender systems (LDRS) as well as for semantic search engines that can consume LOD resources.

Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 12/Nov/2018
Minor Revision
Review Comment:

I would like to thank the authors for taking the reviewers' comments into account and considerably improving the readability of the paper. While I still believe that the level of innovation is somewhat limited, the paper provides interesting results by offering a very profound evaluation of the influence of combining similarity-based and different filtering options with knowledge-graph queries.

The overall functioning of the system has become much clearer with the restructuring. However, the difference in step numbering between the description on page 4 and the visualization in Fig. 1 on page 5 is still somewhat confusing: there are only 10 steps in the description, but 12 in the graph. Furthermore, there seems to be not a single connection to the optimizer, and step 5 in the description is rather unclear. Maybe you could improve on this part a bit more.

To make the contributions even more clear, the abstract and introduction should really talk more about query language and search strategies than system. For if this is really a system, it is entirely unclear to me on which basis the system decides when to switch between different types of queries and filtering steps.

There are also still some expressions that are neither intuitive nor explained in the paper. For instance, what is a "rollup" query pattern? The same goes for the sentence where it first occurs with "three rollup patterns": which three patterns? Another example is the "maximum frequency among relevant resources": frequency of what? It would also be nice to provide a succinct "these are the most important findings" of the paper at the end. I presume the only reason this is not provided is that the authors for some reason decided to omit the very much standard section "Conclusion". I strongly recommend adding both a succinct two-sentence summary and a Conclusion.

In the Results sections the authors start by indicating a percentage of participants without ever stating the number of participants. This is absolutely necessary in order to understand this section. Only providing it in a table later on is not sufficient for such an important piece of information.

In terms of formatting, the paper needs some attention. "Table" should never be abbreviated as "Tab" or "Tabs", whereas "Figure" is usually abbreviated as "Fig." in the running text. When an author name is provided, such as Adomavicius et al., it needs to be followed directly by the reference, as in "Adomavicius et al. [2]", instead of the reference being added to the end of the next sentence. This occurs several times in the paper. The reference in footnote 1 to the general DBpedia resource has absolutely no relation to the content of the sentence where it is placed in the text. As for abbreviations, please write out "Information Retrieval (IR)" somewhere before using the abbreviation. Definitions 12 and 13 are identical to Definitions 6 and 7; only the input is different. I suggest omitting those two definitions and instead stating that the IC is then calculated in the same way as before. As it stands, the text is incorrect, since Definition 12 states "Conditional SKOS similarity", but the similarity is not conditional and is in fact the same as before.

Minor comments in order of appearance.
p. 2 ff it is politically critical to refer to the user always as he - I suggest she or s/he or he/she
p. 2 SKORecommender => SKOSRecommender
p. 3 [12, 13, 22, 24, 29, 50, 64] => are all these references necessary for such a minor point as made by this sentence?
p. 9 Table 6 reference should be Table 5 since this is the one showing the results of Q4

Review #2
Anonymous submitted on 12/Dec/2018
Review Comment:

The authors satisfactorily answer all the comments in their new version.

Review #3
By Faizan Javed submitted on 22/Jan/2019
Minor Revision
Review Comment:

The paper gives a good overview of state-of-the-art techniques in LOD-enabled recommender systems and describes the design and implementation of a query-based SKOSrec engine, a recommendation system framework that leverages graph and similarity-based retrieval techniques. The paper is well organized and contains sufficient details on the implementation of the system as well as experimentation details.

The authors do mention that the SKOSrec system may not be superior for every domain and use case (e.g., page 22, TC3 discussion). It might be worth emphasizing the use cases for which the system may not perform well, as this can further help define the future work roadmap. The future work section does mention the potential use of RDF vocabularies, but it’s not clear whether this is due to deficiencies observed during experimentation. One possible area to experiment with is the ranking of results using some form of personalized LTR (Learning-to-Rank) techniques.

Other suggestions:

Fig. 10 is too small; it could be made larger for more clarity.


section 4.1: second para:
"metdata" --> "metadata"
"The experiments were carried out in the defined usage scenarios with metdata descriptions from DBpedia."

page 19, column 1:
replace semicolon with comma in "Suppose in his/her profile; a consumer has "