Review Comment:
In this work, the authors present an extension to the CIDOC CRM ontology for modelling the archaeometric data of an excavation. This model, called BeArchaeo, integrates with other archaeological ontologies and is currently in use in an excavation project of the same name in Japan. The authors explain the model and modelling choices, place it in an archaeological context using many examples, and evaluate it during a workshop. Although the model is already in use, the authors mention several times that it has shortcomings that will be addressed in the future.
The paper gives a complete overview of the BeArchaeo model, and is generally well written but still sometimes difficult to follow because of the verbosity of the text and the great many details and domain-specific examples. A Long-term Stable Link to Resources is provided, which links to GitHub. Overall, the paper is an improvement on the authors' previous version, but I still have a few comments I'd like to see addressed.
On pp.9, in figure 4, the authors show a hierarchical pyramid that depicts how the archaeological CIDOC modules fit together. Since the core CIDOC model is the basis on which these extensions are built, should the pyramid not be upside down, with the core CIDOC model at the bottom?
On various occasions, the authors dismiss the (different) shortcomings of their model and modelling decisions as something to be implemented or investigated in future work. However, many of these shortcomings would, once solved, require archaeologists to update any information that has already been modelled using the BeArchaeo ontology, with little incentive to do so. Also, on pp.19, it is mentioned that the evaluation showed that the model currently lacks several features. Deferring extensions to the future is nothing new, and is fine for features that are nice to have but not necessary. Here, however, it gives the impression that the model is still halfway through development, whereas the authors present it as ready to use. Since the model is aimed at data curation and preservation, I am struggling to understand why the authors want to publish this seemingly unfinished version, especially since the shortcomings affect future processing of past datasets. Consider, for example, the encoding of the metadata of a device in unstructured form as a string literal (pp.14). The authors acknowledge the downside of this and promise future improvements, but once this information has been modelled in this way there will likely be little incentive for end users to update it, nor would an automated tool be able to convert this unstructured piece of information.
## Minor comments
- pp.2 left, line 40: 'on the other' -> 'additionally' or similar word
- pp.2 right, line 4: 'consists in' -> 'consists of'
- pp.3 right, line 3: acronym DDC used, but only defined on pp. 6
- pp.14 left, line 35: add 'and' to last section of enumeration
- pp.14 left, line 47: add 'and' before 'scale'
- pp.14 right, line 49: add 'and' before 'metabarcoding'