Ontology-Driven Linked Data for Hebrew Manuscripts

Tracking #: 4067-5281

This paper is currently under review
Authors: 
Alexander Goldberg
Gila Prebor
Avshalom Elmalech

Responsible editor: 
Cogan Shimizu

Submission type: 
Full Paper
Abstract: 
This paper presents the Hebrew Manuscripts Ontology (HMO), developed within the Mapping Hebrew Manuscripts (MHM) project, as an evaluated domain ontology and transformation workflow for representing Hebrew manuscripts as Linked Open Data. The problem addressed is that rich Hebrew-manuscript metadata in the National Library of Israel remain largely locked in MARC records, limiting entity-level querying, cross-record reasoning, and future interoperability with external knowledge graphs. HMO responds with a domain-specific model aligned with CIDOC CRM and LRMoo that combines three elements: structural granularity through Bibliographic Unit, Codicological Unit, and Paleographical Unit (BU-CU-PU); an event-centric representation of production, transfer, and ownership; and an explicit epistemological layer for source attribution and status, with certainty support defined at schema level. The paper evaluates this contribution at three levels: six fully instantiated pilot manuscripts chosen to cover key structural edge cases; schema-level validation over 37 SPARQL checks executed on the released ontology files and controlled vocabularies; and a larger-scale feasibility run of the conversion workflow over 10,000 catalog records. Within the pilot, HMO supports research questions that are difficult to ask in MARC alone, such as retrieving manuscripts with multiple hands, identifying codices with more than ten codicological units, tracing transfer-of-custody chains, and distinguishing textual witnesses from bibliographic works. The current release follows a MARC-only population policy for external identifiers: Wikidata, VIAF, GeoNames, and owl:sameAs slots are defined at schema level but are not yet populated in the pilot RDF, so interoperability is demonstrated here as alignment readiness rather than completed entity linking. 
The ontology (OWL/TTL), SHACL shapes, pilot RDF, crosswalk, validation materials, and conversion pipeline are included in the repository materials accompanying this paper for direct inspection. The contribution claimed here is therefore an evaluated and inspectable semantic framework, not a claim that full corpus-scale reconciliation or community-wide uptake has already been achieved.
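As an informal illustration of two of the research questions named in the abstract (manuscripts with multiple hands, and codices with more than ten codicological units), the following minimal Python sketch models the unit structure with plain data classes. It is not the HMO schema or the project's SPARQL: the class names, the `hand_label` field, the containment of paleographical units inside codicological units, and the shelfmarks `MS-A`, `MS-B`, and `MS-C` are all assumptions made for this example. Bibliographic Units are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PaleographicalUnit:
    # Hypothetical label identifying one scribal hand.
    hand_label: str


@dataclass
class CodicologicalUnit:
    label: str
    # Assumed containment: each codicological unit lists the hands found in it.
    paleographical_units: List[PaleographicalUnit] = field(default_factory=list)


@dataclass
class Manuscript:
    # Hypothetical identifier, not a real catalog shelfmark.
    shelfmark: str
    codicological_units: List[CodicologicalUnit] = field(default_factory=list)

    def hands(self) -> Set[str]:
        """Collect the distinct hand labels across all codicological units."""
        return {
            pu.hand_label
            for cu in self.codicological_units
            for pu in cu.paleographical_units
        }


def with_multiple_hands(manuscripts: List[Manuscript]) -> List[str]:
    """Return shelfmarks of manuscripts written in more than one hand."""
    return [m.shelfmark for m in manuscripts if len(m.hands()) > 1]


def with_more_than_n_units(manuscripts: List[Manuscript], n: int = 10) -> List[str]:
    """Return shelfmarks of manuscripts with more than n codicological units."""
    return [m.shelfmark for m in manuscripts if len(m.codicological_units) > n]


# Invented sample data covering the two edge cases.
ms_a = Manuscript(
    "MS-A",
    [
        CodicologicalUnit("CU1", [PaleographicalUnit("hand-1")]),
        CodicologicalUnit("CU2", [PaleographicalUnit("hand-2")]),
    ],
)
ms_b = Manuscript(
    "MS-B",
    [CodicologicalUnit(f"CU{i}", [PaleographicalUnit("hand-1")]) for i in range(1, 12)],
)
ms_c = Manuscript("MS-C", [CodicologicalUnit("CU1", [PaleographicalUnit("hand-1")])])

corpus = [ms_a, ms_b, ms_c]
multi_hand = with_multiple_hands(corpus)
large_codices = with_more_than_n_units(corpus, n=10)
```

In the released materials these questions would be asked as SPARQL over the pilot RDF. The sketch only shows why unit-level granularity makes them answerable, which a flat catalog record does not.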
Tags: 
Under Review