Review Comment:
This paper addresses a relevant problem in the cultural heritage (CH) domain, namely the automatic extraction and creation of linked data descriptions under low-resource conditions. The authors evaluate information extraction (NER and RE) models on non-English retrodigitized documents with OCR noise, aligning the extracted information with ArchOnto. In this revised version, fine-tuned BiLSTM-CRF models are compared against zero-shot transformer-based models in different settings.
Strengths
The problem addressed is particularly relevant for GLAM institutions, which usually operate with limited computational resources and challenging documents (unannotated, non-English, noisy).
The paper is well-structured and the pipeline is clearly described. Compared to the previous version, this revision shows improvements: results are reported more extensively, the dataset creation process is transparent, and models of different architectures and sizes are evaluated.
Weaknesses
The main limitation of this paper (acknowledged by the authors) is the limited size and composition of the domain-specific evaluation data. The Ner.spec.OCR and Ner.spec.Human datasets comprise only 13 documents and a limited set of entity types and relations, which makes the results hard to generalize. Additionally, the RE task lacks a baseline, since only the GLiREL model is evaluated due to insufficient training data. As such, the RE-related aspects of RQ1, RQ4 and RQ5 are not fully explored, and the RQs might need to be reshaped. Finally, I would like to underscore that the choice of GLiNER’s entity labels (e.g., replacing Group with Organization) may have influenced the model’s predictions and results.
Minor concerns
- The list of contributions in the Introduction skips contribution 2).
- The future work section would benefit from greater specificity.
- Typos in Sec 6.2.1 and 6.2.2: words are missing a final 's' in a few places (e.g. 'we evaluated the model…', 'of extracting relation between…').