Transdisciplinary approach to archaeological investigations in a Semantic Web perspective

Vincenzo Lombardo
Tugce Karatas
Monica Gulmini
Laura Guidorzi
Debora Angelici

Special Issue Cultural Heritage 2021

Full Paper
In recent years, the transdisciplinarity of archaeological studies has greatly increased because of the mature interactions between archaeologists and scientists from different disciplines (called "archaeometers"). A number of diverse scientific disciplines collaborate to get an objective account of the archaeological records. A large amount of digital data supports the whole process, and there is great value in keeping the information and knowledge contributed by each intervening discipline coherent. Over the years, a number of representation models have been developed to account for the recording of the archaeological process in databases. Lately, a semantic model compliant with the CRMarchaeo reference model has been developed to link the institutional forms with the formal knowledge concerning the archaeological excavations and the related findings. In contrast, the archaeometric processes have not yet been addressed in the Semantic Web community, and only an upper reference model, called CRMsci, accounts for the representation of scientific investigations in general. This paper presents a modular computational ontology for the interlinked representation of all the facts related to the archaeological and archaeometric analyses and interpretations, also connected to the recording catalogues. The computational ontology is compliant with the CIDOC-CRM reference models CRMarchaeo and CRMsci and introduces a number of novel classes and properties to merge the two worlds in a joint representation. The ontology is in use in "Beyond Archaeology", a methodological project for establishing a transdisciplinary approach to archaeology and archaeometry, interlinked through a semantic model of processes and objects.
Review #1
By Xander Wilcke submitted on 19/Mar/2022
Minor Revision
In this work, the authors present an extension to the CIDOC CRM ontology to model the archaeometric data of an excavation. This model, called BeArchaeo, integrates with other archaeological ontologies, and is currently in use in an excavation project with the same name in Japan. The authors explain the model and modelling choices, place it into an archaeological context using many examples, and evaluate it during a workshop. While already in use, the model is mentioned several times as having shortcomings that will be acted on in the future.

The paper gives a complete overview of the BeArchaeo model and is generally well written, but it is still sometimes difficult to follow because of the verbosity of the text and the great many details and domain-specific examples. A Long-term Stable Link to Resources is provided, which links to GitHub. Overall, the paper is an improvement on the authors' previous version, but I still have a few comments I'd like to see addressed.

On pp.9, in figure 4, the authors show a hierarchical pyramid which depicts how the archaeological CIDOC modules fit together. Since the core CIDOC model is the basis on which these extensions are modelled, should the pyramid not be upside down, having the core CIDOC model at the bottom?

On various occasions, the authors dismiss the (different) shortcomings of their model and modelling decisions as something to be implemented or investigated in future work. However, many of these shortcomings would, once solved, require the archaeologist to update any information which has already been modelled using the BeArchaeo ontology, with little incentive to do so. Also, on pp.19, it is mentioned that the evaluation showed that the model currently misses several features. While deferring extensions to the future is nothing new, and is fine for nice-to-have but not necessary features, the way it comes across here feels like the model is still halfway through development, whereas the authors present it as ready to use. Since the model is aimed at data curation and preservation, I am struggling to understand why the authors want to publish the seemingly unfinished version, especially since the shortcomings affect future processing of past datasets. Consider, for example, the encoding of the metadata of a device in an unstructured form as a string literal (pp.14). The authors acknowledge the downside of this and promise future improvements, but once this information is modelled as such there will likely be little incentive for end-users to actually update it, nor would an automated tool be able to convert this unstructured piece of information.

## Minor comments

- pp.2 left, line 40: 'on the other' -> 'additionally' or similar word
- pp.2 right, line 4: 'consists in' -> 'consists of'
- pp.3 right, line 3: acronym DDC used, but only defined on pp. 6
- pp.14 left, line 35: add 'and' to last section of enumeration
- pp.14 left, line 47: add 'and' before 'scale'
- pp.14 right, line 49: add 'and' before 'metabarcoding'

Review #2
By Alessandro Adamou submitted on 19/Apr/2022
Review Comment:

This is the third version of a submitted paper presenting the BeArchaeo project, its ontology that bridges archaeology and archaeometry, and its underpinnings.

The paper has made significant progress over its successive iterations: it makes a clearer point with regard to the strength and novelty of its contribution, is organised in a better-flowing way, and delivers the preliminary evaluation with greater detail and effect. I have no further requests to make before recommending it for acceptance in the journal, other than a few minor corrections listed below (page:line):

3:17 "have not found their way"
3:48 "specialized" (as it looks like the authors have opted for US English throughout)
4:4 check that the capitalisation of "ARIADNEplus" is consistent
8:4 "interfaced by *an* Omeka-S-based web platform"
10:17 again for the sake of US English, replace "centred" with "centered" (also in other parts of the paragraph)
10:40 (right) "The goal *was* to employ a semantic database"
11:44 "aligned with *the* core CIDOC-CRM"
12:47 (footnote 25) "While the other terms in this list *come* from..."
17:45 "has *led* to discussions"
17:47 "Again, by relying on a web platform, ..."
19:39 "has been deemed particularly valuable" ("evaluated valuable" didn't sound too good)
20:5 IRIAE is the institute for Archaeology and Ethnology, not Ethology.
20:17 "*The* Omeka-S frontend"
Please double-check Reference [12], as "J.R. J" and "L.H" don't appear to be properly formatted names; also, "eds" appears twice.