Neural Entity Linking: A Survey of Models Based on Deep Learning

Tracking #: 2699-3913

Authors: 
Özge Sevgili
Artem Shelmanov
Mikhail Arkhipov
Alexander Panchenko
Chris Biemann

Responsible editor: 
Guest Editors DeepL4KGs 2021

Submission type: 
Survey Article
Abstract: 
In this survey, we provide a comprehensive description of recent neural entity linking (EL) systems developed since 2015 as a result of the "deep learning revolution" in NLP. Our goal is to systematize the design features of neural entity linking systems and compare their performance to the best classic methods on common benchmarks. We distill the generic architectural components of a neural EL system, such as candidate generation and entity ranking, summarizing the prominent methods for each of them, e.g., approaches to mention encoding based on the self-attention architecture. The vast variety of modifications of this general neural entity linking architecture is grouped by several common themes: joint entity recognition and linking, models for global linking, domain-independent techniques including zero-shot and distant supervision methods, and cross-lingual approaches. Since many neural models take advantage of pre-trained entity embeddings to improve their generalization capabilities, we provide an overview of popular entity embedding techniques. Finally, we briefly discuss applications of entity linking, focusing on the recently emerged use case of enhancing deep pre-trained masked language models such as BERT.
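As a minimal illustration of the two-stage architecture named in the abstract, here is a toy sketch of candidate generation followed by entity ranking; the alias table, names, and word-overlap scorer are illustrative assumptions, not any system's actual code:

ALIAS_TABLE = {  # surface form -> candidate KB entities (assumed toy data)
    "paris": ["Paris_(France)", "Paris_(Texas)", "Paris_Hilton"],
}

def generate_candidates(mention):
    """Candidate generation: look the mention up in an alias table."""
    return ALIAS_TABLE.get(mention.lower(), [])

def score(context, candidate):
    """Entity ranking: a neural ranker would encode the mention context
    (e.g., with self-attention) and the candidate entity and score their
    compatibility; here, a trivial word-overlap stand-in."""
    ctx = set(context.lower().replace(",", " ").replace(".", " ").split())
    ent = set(candidate.lower().replace("_", " ").replace("(", "").replace(")", "").split())
    return len(ctx & ent)

def link(mention, context):
    """Return the top-scoring candidate, or None if there are none."""
    candidates = generate_candidates(mention)
    return max(candidates, key=lambda c: score(context, c), default=None)

print(link("Paris", "The Eiffel Tower is in Paris, France."))  # Paris_(France)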
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
Anonymous submitted on 15/Feb/2021
Suggestion:
Minor Revision
Review Comment:

The paper "Neural Entity Linking: A Survey of Models Based on Deep Learning", is a survey where the authors propose a comprehensive description of recent neural entity linking systems developed since 2015. Moreover, they provide an overview of popular entity embedding techniques since many neural models take advantage of pre-trained entity embeddings to improve their generalization capabilities.
The paper is well written, structured and technically sound.
Its contributions are a survey of state-of-the-art neural entity linking models, feature tables for neural entity linking, a survey of entity and context/mention embedding techniques, a discussion of recent domain-independent and cross-lingual entity linking approaches, and a survey of entity linking applications to modelling word representations. I appreciated Section 3.1, where the general architecture for entity linking based on neural networks is depicted. A lot of methods for entity linking have been analysed, discussed, commented on, and included in the description, experiments, and feature tables. Definitions are also given for the general reader. Table 3 was particularly appreciated for the architectural features it presents. Figure 8 and Table 5 show the performance of the best entity linking models and neural entity disambiguation on different state-of-the-art datasets. Applications of entity linking are also illustrated in Section 5.1, where different recent works are referenced.

There are several typos that the authors should take care of, and the English grammar sometimes needs fixing. Also, more examples of entity linking and entity disambiguation shown in the text (and not only in images) would help the reader enjoy the reading.

MINOR
At the end of section 1.3 "We also the first"
Section 2.2 "Formally, we use an ER function takes as input a textual context"
Section 2.3 "More or less the same technologies and models sometimes called differently in the literature"

The following paper should be included in the survey as one of the very recent knowledge graphs produced from scholarly data, on which Entity Recognition or Entity Disambiguation may be executed.
Danilo Dessì, Francesco Osborne, Diego Reforgiato Recupero, Davide Buscaldi, Enrico Motta:
Generating knowledge graphs by employing Natural Language Processing and Machine Learning techniques within the scholarly domain. Future Gener. Comput. Syst. 116: 253-264 (2021)

Section 4.1.2, "We start our presentation of results of from the disambiguation only models"
Section 5.2 "As they both just some neural networks"

Review #2
By Mojtaba Nayyeri submitted on 17/Mar/2021
Suggestion:
Major Revision
Review Comment:

General:

This research work surveys entity linking models based on deep learning. It looks into the relevant publications of the last five years (since 2015) and, as claimed by the authors, attempts to provide a comprehensive review of the existing approaches. Due to the influence of deep learning on entity linking (EL) approaches, there is a need for such a survey, which is covered in this work. Overall, entity linking approaches are divided into entity recognition and entity disambiguation, and each of these components is analyzed by looking into the already existing works. The survey continues by providing features of neural-based EL models for these two components. Finally, it classifies the models based on the introduced features. The survey also touches on the applications of EL approaches and closes the work by providing some future directions. Overall, it is a good start and focuses on an important topic, but it is not ready as a final version.

Weak points:
The survey suffers from an imbalance in the descriptions provided for different concepts, such as knowledge graphs and embeddings, and also across different sections, such as the evaluation. In terms of the topics, some are covered well and some parts are only touched on roughly. Most of the exploration of the existing EL approaches reads like a general statement and report of what they do, rather than a systematic evaluation including their weaknesses, strengths, gaps, and suggestions. In this way, the survey remains a report rather than an insightful research survey. Although one cannot completely cover everything in a survey, the most relevant parts should be properly covered. Knowledge graphs, knowledge graph embedding models, information networks, and many other relevant concepts are overlooked.
In terms of the description provided for different parts, Section 4 on evaluation could still be extended to match the strength of other parts of the paper. This can be done from two aspects: 1) the coverage of the evaluation drawn from the collected existing works could be presented better, because entity linking publications normally have a rich evaluation section, which is not reflected here; and 2) the authors could use interesting evaluation methods to explore the collected information further, and also use visualization tools to illustrate it.
There is only a brief treatment of the embedding models used in this context: ‘ERNIE [123] expands the BERT [22] architecture with a knowledgeable encoder (K-Encoder), which fuses contextualized word representations obtained from the underlying self-attention network with entity representations from a pre-trained TransE model [8].’ However, it does not discuss what other embedding models could be used. Even if they are not used by already existing approaches, the authors could summarize them as a future direction and give some insight into their effect as a component influencing the outcome of EL.
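To unpack the quoted sentence, here is a minimal sketch of the TransE scoring function with which such entity representations are pre-trained; the tensor sizes and variable names are illustrative assumptions, not ERNIE's actual code:

import torch

num_entities, num_relations, dim = 1000, 50, 100  # illustrative sizes
entity_emb = torch.nn.Embedding(num_entities, dim)
relation_emb = torch.nn.Embedding(num_relations, dim)

def transe_score(h_idx, r_idx, t_idx):
    """TransE models a fact (h, r, t) as h + r ≈ t; the score is the
    negative L2 distance ||h + r - t||, higher meaning more plausible."""
    h, r, t = entity_emb(h_idx), relation_emb(r_idx), entity_emb(t_idx)
    return -torch.norm(h + r - t, p=2, dim=-1)

# e.g., score the triple (entity 3, relation 7, entity 42):
print(transe_score(torch.tensor(3), torch.tensor(7), torch.tensor(42)))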

Section 5 is very brief and barely covers the recent applications of EL to domain-specific challenges, such as MedLinker. Also, it seems the authors do not present any future directions in this regard. I would have expected separate sections for collections of works grouped by application domain, bringing in tables with more analysis.
Another helpful aspect to mention would be a comparison of the time complexity and computational cost of the different models.

Questions to the authors:
- Why are knowledge graphs not mentioned at all in this survey? They are the most important technology of recent years and have an enormous effect on a large number of deep learning models. The authors stay with the notion of the knowledge base throughout the entire survey, and the knowledge graph appears only in the titles of some references.

- What is the time complexity of the different models, and which of them are preferred in this regard?

- What is the effect of the hyperparameter search and the settings used for each approach (e.g., the number of layers) on the presented accuracies?

Recommendations to the authors:

- The introduction starts with a subsection where Knowledge Bases (KBs) are immediately mentioned without a proper definition.

- The opening of the introduction section needs careful checking and editing. It is recommended to have an introductory paragraph before Section 1.1 starts.

- Section 1.2 then needs to introduce and consistently use KBs and KGs, drawing a clear distinction based on the previous definitions.

- Figure 4 is not clear: what are the x and y axes?

- Further extension of Section 5 is required.

- The starting point and the motivation behind the survey are very promising; however, the evaluations are not that impressive and do not completely live up to the promise.

Some missing related work:
- Shine+: A general framework for domain-specific entity linking with heterogeneous information networks
- REL: An entity linker standing on the shoulders of giants
- Medlinker: Medical entity linking with neural representations and dictionary matching
- Fast and accurate entity linking via graph embedding
- End-to-End Entity Linking and Disambiguation leveraging Word and Knowledge Graph Embeddings
- PNEL: Pointer Network based End-To-End Entity Linking over Knowledge Graphs
- 5* Knowledge Graph Embeddings with Projective Transformations

Review #3
By Daza Cruz submitted on 27/May/2021
Suggestion:
Minor Revision
Review Comment:

The authors present a review of deep learning methods for entity linking. The problem is well motivated, and the differences from previous surveys are clear. The authors make use of a modular architecture to describe the most common approaches towards entity linking, which proves effective at encompassing the vast literature on the topic and helps in understanding prior work. Other methods that are not entirely captured by this architecture are described as well. Methods are also compared in detail in a table listing different aspects. Lastly, the authors provide a qualitative assessment of the performance of such methods and a discussion of open challenges and directions for future work.

As such, the survey does a good job of collecting and describing existing work in entity linking and would be an extremely useful resource for researchers, PhD students, or practitioners interested in a comprehensive overview of the field. I believe the submission can be accepted for publication after a number of clarifications and changes that would improve its clarity and correct some assertions made in it.

1. In Page2-Line43-left the authors mention "deep" distributed representations. What does it mean for a representation to be deep?
2. The notation used for formal definitions is not entirely clear. It is introduced in Section 2.2 and refers to notation presented by Ganea et al., but it would be useful for the reader to make this clearer without having to refer to another paper, especially since the notation is used throughout the paper. Eq. 1 defines a function ER in terms of a context C and a set M, but M is not explicitly defined. Is it a set of words in the context? Can they also be spans of multiple words? The later definition of entity disambiguation, in line 30, could also benefit from emphasizing that E is the set of entities in the KB that we are trying to link to. (A possible reconstruction of the notation is sketched after this list.)
3. In P4-L47-left, local and global approaches towards EL are introduced, but it seems to me that this should not be part of the formal definition of EL but should rather belong to a section on approaches towards solving the problem (as described in Sec. 3.2.2).
4. P6-L24-left: "The candidates are also generated..." — with this sentence, are you still describing the method in the previous paragraph, or is this introducing another way of generating candidates?
5. Most of the methods described are supervised, which means that labeled entity linking data is required to train them. However, some of these methods are still described as not requiring annotated data, which might be misleading. In Tables 2 and 3, Wu et al. (2019) is marked as not requiring annotated text, even though it does. This work does zero-shot entity linking in the sense that it can link to entities not seen during training, but training is done with a labeled entity linking dataset, thus requiring annotated text. This should also be made clear in P13-L26-right: "Recently proposed zero-shot techniques pushed EL labeled data requirements to the minimum", and also in P17-L36-left. Some of these methods are even pretrained on Wikipedia, which can be seen as a large labeled dataset for entity linking.
6. P14-L1-right: Why is there such a gap between bi-encoders and cross-encoders? (A sketch contrasting the two architectures is given after this list.)
7. The way metrics are computed in entity linking is often not discussed in papers, and metrics like the F1 score can actually be computed in different ways depending on how, e.g., a false positive is defined. I think Section 4.1.1 could benefit from more details about this, which would be very relevant for a survey paper. (A sketch of one common convention is given after this list.)
8. Section 4.2.1 could emphasize that what is evaluated here specifically is the representation of entities in the KB (e.g., embeddings) rather than the other parts of the EL pipeline, such as entity recognition and disambiguation.
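For item 2, one possible reading of the notation, which the authors could state explicitly (this is a reconstruction from the review's description, not the paper's exact Eq. 1):

\[
  ER \colon C \to M, \qquad
  ED \colon M \times C \to E \cup \{\mathrm{NIL}\},
\]

where $C$ is a textual context, $M$ is the set of mentions found in it (single words or multi-word spans), and $E$ is the set of entities in the KB being linked to.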
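For item 6, a minimal, purely illustrative sketch of the architectural difference; a mean-pooled embedding stands in for a real self-attention encoder, and all names and sizes are made up:

import torch

emb = torch.nn.Embedding(30000, 64)  # stand-in for a BERT-like encoder

def encode(token_ids):
    """Mean-pool token embeddings into one vector (stand-in encoder)."""
    return emb(token_ids).mean(dim=0)

def bi_encoder_score(mention_ids, entity_ids):
    # The two sides are encoded independently, so all entity vectors can
    # be precomputed and retrieved with nearest-neighbor search: fast,
    # but mention and entity never interact during encoding.
    return torch.dot(encode(mention_ids), encode(entity_ids))

def cross_encoder_score(mention_ids, entity_ids):
    # The concatenated pair is encoded jointly; with a real self-attention
    # encoder, mention and entity tokens attend to each other, which is
    # typically more accurate but must be rerun for every candidate.
    return encode(torch.cat([mention_ids, entity_ids])).sum()

The accuracy gap largely comes from this early interaction in the cross-encoder, which the bi-encoder trades away for precomputable entity representations.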
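For item 7, a minimal sketch of one common convention, micro-averaged scores over (document, span, entity) triples; papers that instead score only the entity given gold spans obtain different numbers from the same predictions:

def el_micro_f1(gold, pred):
    """Micro-averaged precision/recall/F1 over (doc, span, entity) triples.
    Convention assumed here: a prediction is correct only if both the span
    and the linked entity match; a spurious span is a false positive and a
    missed gold mention is a false negative."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# One correct link, one wrong entity, one missed mention:
gold = [("d1", (0, 5), "Q1"), ("d1", (10, 14), "Q2"), ("d1", (20, 25), "Q3")]
pred = [("d1", (0, 5), "Q1"), ("d1", (10, 14), "Q9")]
print(el_micro_f1(gold, pred))  # (0.5, 0.333..., 0.4)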

Some minor issues in the text:

- P4-L24-right: "More or less the same technologies and models sometimes called differently in the literature" might be missing a word.
- Some of the footnotes containing URLs could use a descriptive text of what they point to.
- P9-L26-right: "Only entities are remained..."
- P10-L38-left: "Consider we have k candidates..."
- P21-L31-left: "As they both just some neural networks..."

Lastly, a minor suggestion would be to include GENRE [1], which was recently presented at ICLR 2021. I understand that this was published after the authors' submission to the journal, but it would be a great addition to the survey, as it is a state-of-the-art method that also proposes new ideas that depart from many of the methods described in the paper.

[1] Nicola De Cao, Gautier Izacard, Sebastian Riedel, and Fabio Petroni. "Autoregressive Entity Retrieval." In International Conference on Learning Representations, 2021.