Review Comment:
General:
This work surveys entity linking models that have been developed based on deep neural networks. It looks into the relevant publications of the last five years (since 2015) and, as the authors claim, attempts to provide a comprehensive review of the existing approaches. Given the influence of deep learning on entity linking (EL), there is a clear need for such a survey, which this work addresses. Overall, entity linking approaches are divided into two components: entity recognition and entity disambiguation. Each of these components is analyzed through the existing literature. The survey then derives a set of features of neural EL models along these two components and classifies the models according to those features. The survey also touches on the applications of EL approaches and closes with some future directions. Overall, it is a good start on an important topic, but it is not ready as a final version.
Weak points:
The survey suffers from an imbalance: some concepts (e.g., knowledge graphs and embeddings) and some sections (e.g., evaluation) are treated much more lightly than others. Some topics are covered well, while others are only touched on roughly. Most of the discussion of existing EL approaches reads as a general report of what each method does, rather than a systematic assessment of their weaknesses, strengths, gaps, and possible improvements. As a result, the survey remains a report rather than an insightful research survey. Although one cannot cover everything in a survey, the most relevant parts should be treated properly; knowledge graphs, knowledge graph embedding models, heterogeneous information networks, and several other relevant concepts are overlooked.
Regarding the depth of the individual parts, Section 4 on evaluation could still be extended to match the strength of the rest of the paper. This can be done from two aspects: 1) the evaluation results collected from the surveyed works could be presented more thoroughly, since entity linking publications typically contain rich evaluation sections that are not reflected here; 2) the authors could apply more interesting evaluation methods to analyze the collected information further and use visualization to illustrate it, for instance along the lines of the simple sketch below.
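A minimal example of the kind of comparison plot meant here (using matplotlib; the systems and scores are placeholders for illustration only, not numbers taken from the survey):

    # Purely illustrative: plot reported EL scores for a few systems.
    # System names and F1 values below are placeholders, not survey results.
    import matplotlib.pyplot as plt

    systems = ["System A", "System B", "System C"]   # hypothetical EL systems
    f1_scores = [0.80, 0.85, 0.88]                    # placeholder micro-F1 values

    plt.figure(figsize=(5, 3))
    plt.bar(systems, f1_scores)
    plt.ylabel("Micro-F1 (placeholder)")
    plt.title("Example: comparing reported EL scores across systems")
    plt.ylim(0, 1)
    plt.tight_layout()
    plt.savefig("el_comparison.png")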
The embedding models used in this context are only briefly touched upon: ‘ERNIE [123] expands the BERT [22] architecture with a knowledgeable encoder (K-Encoder), which fuses contextualized word representations obtained from the underlying self-attention network with entity representations from a pre-trained TransE model [8].’ However, the survey does not discuss which other embedding models could be used. Even if they are not employed by existing approaches, the authors could summarize them as a future direction and give some insight into their effect as a component influencing the outcome of EL. (An illustrative sketch of such word/entity fusion is given below.)
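For illustration only, a minimal PyTorch sketch of fusing contextual word representations with pre-trained entity embeddings via concatenation and a learned projection; the dimensions, the fusion operator, and the module name are assumptions, and this is not the actual ERNIE K-Encoder:

    # Sketch: fuse BERT-style token vectors with TransE-style entity vectors.
    import torch
    import torch.nn as nn

    class SimpleFusion(nn.Module):
        def __init__(self, word_dim=768, entity_dim=100, hidden_dim=768):
            super().__init__()
            # Project the concatenated word/entity vectors into a shared space.
            self.fuse = nn.Linear(word_dim + entity_dim, hidden_dim)

        def forward(self, word_repr, entity_repr):
            # word_repr:   (batch, seq_len, word_dim)   e.g. BERT token outputs
            # entity_repr: (batch, seq_len, entity_dim) e.g. TransE vectors aligned
            #              to the tokens that mention an entity (zeros elsewhere)
            fused = torch.cat([word_repr, entity_repr], dim=-1)
            return torch.tanh(self.fuse(fused))

    # Usage with random tensors standing in for real model outputs.
    words = torch.randn(2, 16, 768)
    entities = torch.randn(2, 16, 100)
    fused = SimpleFusion()(words, entities)   # shape: (2, 16, 768)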
Section 5 is very brief and barely covers recent applications of EL to domain-specific challenges, such as MedLinker in the medical domain. It also seems that the authors do not present any future directions in this regard. I would have expected separate subsections collecting works by application domain, together with tables providing more analysis.
Another helpful aspect to cover is a comparison of the time complexity and computational cost of the different models.
Questions to the authors:
- Why are knowledge graphs almost entirely skipped in this survey? They have become one of the most important technologies of recent years and have an enormous effect on a large number of deep learning models. The authors stick to the notion of a knowledge base throughout the survey, and the term knowledge graph only appears in the titles of some references.
- What is the time complexity of the different models, and which of them are preferable in this regard?
- What is the effect of hyperparameter search and the specific settings of each approach (e.g., the number of layers) on the reported accuracies?
Recommendations to the authors:
- The introduction starts with a subsection in which Knowledge Bases (KBs) are mentioned immediately, without a proper definition.
- The opening of the introduction needs careful checking and editing. It is recommended to add an introductory paragraph before Section 1.1 starts.
- Section 1.2 should then start from and continue with KBs and KGs, drawing a clear distinction based on the earlier definitions.
- Figure 4 is not clear; the x- and y-axes are not labeled or explained.
- Further extension of Section 5 is required.
- The starting point and the motivation behind the survey are very promising; however, the evaluations are not that impressive and do not fully live up to that promise.
Some missing related work:
- Shine+: A general framework for domain-specific entity linking with heterogeneous information networks
- REL: An entity linker standing on the shoulders of giants
- Medlinker: Medical entity linking with neural representations and dictionary matching
- Fast and accurate entity linking via graph embedding
- End-to-End Entity Linking and Disambiguation leveraging Word and Knowledge Graph Embeddings
- PNEL: Pointer Network based End-To-End Entity Linking over Knowledge Graphs
- 5* Knowledge Graph Embeddings with Projective Transformations