A Survey On Knowledge-Aware News Recommender Systems

Tracking #: 2889-4103

Andreea Iana
Mehwish Alam
Heiko Paulheim

Responsible editor: Dagmar Gromann

Submission type: Survey Article
News consumption has shifted over time from traditional media to online platforms, which use recommendation algorithms to help users navigate through the large incoming streams of daily news by suggesting relevant articles based on their preferences and reading behavior. In comparison to domains such as movies or e-commerce, where recommender systems have proved highly successful, the characteristics of the news domain pose additional challenges for the recommendation models. While some of these can be overcome by conventional recommendation techniques, injecting external knowledge into news recommender systems has been proposed in order to enhance recommendations by capturing information and patterns not contained in the text and metadata of articles, and hence, tackle shortcomings of traditional models. This survey provides a comprehensive review of knowledge-aware news recommender systems. We propose a taxonomy that divides the models into three categories: neural methods, non-neural entity-centric methods, and non-neural path-based methods. Moreover, the underlying recommendation algorithms, as well as their evaluations are analyzed. Lastly, open issues in the domain of knowledge-aware news recommendations are identified and potential research directions are proposed.

Minor Revision

Solicited Reviews:
Review #1
By Peter Bloem submitted on 27/Sep/2021
Review Comment:

I have read the updated paper and the authors' comments. I think the paper is much improved and I am happy to recommend that it be accepted.


The following point may be taken for what it's worth (the paper can still be accepted if it is entirely ignored), but I think it would be worthwhile to foreground the question of how suitable an approach the inclusion of knowledge actually is, and how much it actually helps with the main issues of news recommendation.

The authors noted in their rebuttal that Section 2 is not about the inclusion of knowledge and that my point is addressed instead in section 5.2.5, but the depth of the latter in the section hierarchy suggests that this matter is buried a little. The whole paper is about knowledge awareness in news recommendation. That suggests that a discussion of the general challenges of news recommendation should be followed up directly by if and how knowledge awareness can address these issues.

For instance, Section 5.2.5 suggests that text representations of news articles contain ambiguities that are difficult to unpick automatically and that knowledge representations can help. But this is only a solution if we assume that entity linking has already been done. The entity linker will presumably have as much trouble with the ambiguities as the knowledge-unaware recommender system.

Another example is the running case of linking Elon Musk to Robinhood. For a recommender system to make use of this knowledge, it first needs to be created somewhere, either automatically, or by humans. If it's extracted automatically, we're kicking the can down the road. Why can't recommender systems do this themselves, and cut out the middle man? If it's done by hand, the question becomes who will create all this structured knowledge in the face of the high churn of the domain?

While it is not the job of the paper to answer these questions, or to defend the idea of the inclusion of structured knowledge, I think it may be a good idea to reflect on whether these questions are answered at all in the literature, and perhaps to suggest them as avenues for future research. We are so often overly concerned with the performance achieved on a benchmark while entirely ignoring all the additional challenges of turning a high-performing system into an effective solution in the real world.

I think it's still a perfectly fine survey paper without this reflection, so the authors may ignore this point, but these are the main questions I'm left with after reading the paper.

Review #2
Anonymous submitted on 03/Oct/2021
Minor Revision
Review Comment:

This paper provides a comprehensive overview of knowledge-aware news recommender systems. The authors propose a taxonomy that divides the models into three categories: neural methods, non-neural entity-centric methods, and non-neural path-based methods. Representative recommendation algorithms from each category are thoroughly analyzed, followed by a discussion of evaluation methods and open challenges. The review is well motivated, timely and self-contained. It will make the topic accessible to people seeking guidance in getting familiar with knowledge-aware recommender systems in the news domain. I found the discussion of reproducibility and comparability especially impressive.

As far as I can judge, all the comments from the reviews in the first round have been successfully addressed.

I have spotted a few errors that need to be corrected for the final version of the article to be published:

1) Missing references on knowledge-aware news recommendation:
a. Sheu, H. S., & Li, S. (2020). Context-aware graph embedding for session-based news recommendation. In Fourteenth ACM conference on recommender systems, pp. 657-662.
b. Sheu, H. S., Chu, Z., Qi, D., & Li, S. (2021). Knowledge-Guided Article Embedding Refinement for Session-Based News Recommendation. IEEE Transactions on Neural Networks and Learning Systems.

2) Open Issues and Future Directions should be categorized for better readability.

Review #3
Anonymous submitted on 05/Nov/2021
Major Revision
Review Comment:

The survey gives an overview of the field of knowledge-aware news recommender systems, which differentiates itself from other surveys on news recommender systems through its focus on knowledge-based solutions. It proposes a simple (maybe too simple) taxonomy by means of which the various works from the literature are presented. In addition to the algorithms, evaluation methods and future research directions are also investigated. The paper is in general well-written and relatively easy to follow. Below, comments are given which could help the authors improve their paper.

It would be nice to present the various algorithms based on their commonalities and keep the presentation at a more abstract level, as sometimes the level of detail given is too specific but not enough for full comprehension. Also, a glossary of terms and/or acronyms could be useful given the extensive technical jargon used (similar to Table 4).

For the Hermes-related papers the Data Source is always Reuters (the same 100 news items) and not the Hermes News Portal or Unknown as claimed in Tables 6 and 7 (which are also inconsistent with one another with respect to this aspect). This is valid for CF-IDF, SF-IDF, SF-IDF+, Bing-SF-IDF, Bing-SF-IDF+, CF-IDF+, Bing-CF-IDF+, Bing-CSF-IDF+, and Bing-SS.

The authors touch on information networks as representations of knowledge graphs (outside the news domain), but there are works not referred to which deal with feature selection across various network paths and weighting these by considering node centrality:

Bart van Rossum, Flavius Frasincar: Augmenting LOD-Based Recommender Systems Using Graph Centrality Measures. ICWE 2019: 19-31

Thomas Wever, Flavius Frasincar: A Linked Open Data Schema-Driven Approach for Top-N Recommendations. SAC 2017: 656-663

Tommaso Di Noia, Vito Claudio Ostuni, Paolo Tomeo, Eugenio Di Sciascio: SPrank: Semantic Path-Based Ranking for Top-N Recommendations Using Linked Open Data. ACM Trans. Intell. Syst. Technol. 8(1): 9:1-9:34 (2016)

Vito Claudio Ostuni, Tommaso Di Noia, Eugenio Di Sciascio, Roberto Mirizzi: Top-N Recommendations from Implicit Feedback Leveraging Linked Open Data. RecSys 2013: 85-92

In the surveyed approaches it is not clear if there are knowledge-based solutions for sequential recommender systems for news. In the open issues, Hermes is able to update the knowledge base based on information from news:

Flavius Frasincar, Jethro Borsje, and Frederik Hogenboom: Personalizing News Services Using Semantic Web Technologies. E-Business Applications for Product Development and Competitive Growth: Emerging Technologies, In Lee (Ed.), Chapter 13, pages 261-289, IGI Global (2011)

and tOWL is able to store dynamic information in a knowledge base also coming from news:

Viorel Milea, Flavius Frasincar, Uzay Kaymak: tOWL: A Temporal Web Ontology Language. IEEE Trans. Syst. Man Cybern. Part B 42(1): 268-281 (2012)

Other comments:
-throughout the manuscript: do not use indentation before “where” when you explain on a new line what the terms of the previously mentioned equation mean
-page 1: in title “on” instead of “On”
-page 10: “models use” instead of “model uses”
-page 13: “two representations” instead of “two profiles”
-page 14: text goes outside borders
-page 20: “artificial intelligence.” instead of “artificial intelligence .” (delete extra space)
-page 21: “from the least common subsumer to” instead of “lowest to”
-page 25: “context (e)” instead of “context(e)” (insert space)
-page 25: “Eqs. (50)” instead of “Eqs.(50)” (insert space)
-page 26: e(i) missing in Eq. (57)
-page 29: explain the symbol with a dot in Eq. (68)
-page 28: “as well as” instead of “as well and”
-page 35: for the row starting with Bing-SF-IDF [60] following text needs to move one column to the right “Reuters, English, …” and end with “N/A”
-page 42: “comprise many” instead of “comprise of many”
-page 44: “user interests” instead of “user interest”
-page 47: “WordNet” instead of “Wordnet”