OntoMatcher: Leveraging Context-Aware Siamese Networks, LLMs and BioBERT for Enhanced Biomedical Ontology Alignment

Tracking #: 3477-4691

This paper is currently under review
Zakaria Hamane
Abdelhadi Fennan
Amina Samih

Responsible editor: 
Jérôme Euzenat

Submission type: 
Full Paper
Biomedical ontologies play a crucial role in knowledge representation and standardization within the biomedical domain. With the rapid growth of ontologies, efficient and accurate alignment techniques have become essential for ensuring interoperability between biomedical systems. Current ontology alignment methods often struggle with the complex and dynamic nature of biomedical terminologies, resulting in suboptimal performance. In this study, we introduce a novel supervised deep learning approach for aligning biomedical ontologies that combines Large Language Models (LLMs) with Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT), a one-dimensional convolutional neural network (1D-CNN), highway networks, a bidirectional long short-term memory (Bi-LSTM) network, and a Siamese network. The approach captures character-level and contextual information about entities and efficiently incorporates entity descriptions and context embeddings to improve alignment accuracy. Our method achieves an F1 score of 0.87 for match/not-match classification and 0.94 for level classification, outperforming several baselines on benchmark datasets. These results indicate the potential of our approach, which employs LLMs for data enrichment and Transformer models for embeddings, to facilitate more effective alignment of biomedical ontologies, ultimately enhancing data integration and interoperability across biomedical systems.
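To illustrate the Siamese idea at the heart of the abstract — one shared encoder applied to both candidate entity labels, with a similarity score driving the match/not-match decision — here is a minimal, self-contained sketch. It is not the paper's architecture: hashed character trigrams stand in for the learned BioBERT/1D-CNN/Bi-LSTM encoder, and the function names (`encode`, `match_score`) are illustrative.

```python
import zlib
import numpy as np

def char_ngrams(text, n=3):
    # Pad with boundary markers so edge characters contribute n-grams too.
    text = f"#{text.lower()}#"
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def encode(label, dim=256):
    # Shared encoder applied to BOTH inputs -- the defining trait of a
    # Siamese setup. A stable hash of character trigrams stands in here
    # for the paper's learned character-level and contextual encoders.
    vec = np.zeros(dim)
    for g in char_ngrams(label):
        vec[zlib.crc32(g.encode()) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def match_score(a, b):
    # Cosine similarity of the two shared-encoder embeddings; in the
    # full model a learned classifier head, rather than a fixed
    # threshold, would produce the match/not-match label.
    return float(encode(a) @ encode(b))

print(match_score("myocardial infarction", "infarct of myocardium"))
```

Because both labels pass through the same encoder, the model cannot learn one representation for "left" entities and another for "right" ones — a symmetry that is what makes Siamese architectures a natural fit for alignment tasks.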