External Knowledge Integration in Large Language Models: A Survey on Methods, Challenges, and Future Directions

Tracking #: 3835-5049

This paper is currently under review
Authors: 
Itisha Yadav
Sirko Schindler
Diana Peters
Roman Klinger

Responsible editor: 
Guest Editors 2025 LLM GenAI KGs

Submission type: 
Survey Article
Abstract: 
Large language models (LLMs) have demonstrated strong performance on a wide range of natural language understanding (NLU) tasks. However, they face notable limitations, such as hallucinations, a lack of contextual knowledge, and outdated or incomplete knowledge, when applied to knowledge-intensive domains such as scientific research, biomedical sciences, finance, and law. These challenges commonly arise from the scarcity and under-representation of domain-specific data during the training and model alignment phases, the latter often realized through reinforcement learning from human feedback (RLHF). Furthermore, LLMs struggle to provide nuanced expertise, as their internal knowledge remains static and generalized, hindering their ability to reason accurately or deliver context-aware results in specialized tasks. This survey investigates the integration of external knowledge into LLMs as a means to address these limitations. By examining parametric and non-parametric approaches, it discusses methods to enhance model reasoning capabilities, factual accuracy, and adaptability for domain-specific and knowledge-intensive tasks. Additionally, it highlights the potential of external knowledge integration to improve explainability and ensure more trustworthy outputs. This survey supports software developers and natural language processing (NLP) researchers in designing NLU systems for specialized domains by leveraging pre-trained LLMs. It also provides a foundation for advancing LLM-based NLU systems with insights into future research areas.
Tags: 
Under Review