Review Comment:
The paper discusses a Semantic Web ontology for the representation of poetry. In particular, it addresses how the core module of the ontology, called OntoPoetry Core, is aligned to CIDOC-CRM and FRBRoo, given the wide use of the latter two in the digital humanities.
I think that the topic is interesting. I'm not an expert in literature or poetry, so my review considers the clarity of the presentation and the robustness of the research from an applied ontology perspective. From these perspectives, I'm not convinced that the paper can be published in its current state, for two main reasons.
First, it is not clear why the authors **align** their ontology to CRM/FRBRoo rather than **re-engineering** it on top of them. This is surprising considering that, browsing the OWL file of the ontology (which I found at: https://github.com/linhd-postdata/OntoPoetry/tree/master/Core), one finds natural language annotations, attached to classes declared to be logically equivalent to CRM or FRBRoo classes, explicitly saying that these classes have been "cloned" from CRM/FRBRoo.
Hence, the authors first developed their own ontology by "cloning" CRM/FRBRoo instead of simply reusing them; then they developed a formal alignment. This choice leads to the duplication of several modeling elements. For instance, the modeling pattern shown in Fig. 17 duplicates what already exists in CRM without adding any novelty. I think that, by simply reusing CRM/FRBRoo, the OntoPoetry Core module would be much cleaner and simpler, also from a logical perspective, because all the equivalence declarations would become unnecessary.
I see two options for the authors. The first is to re-engineer OntoPoetry by specializing CRM/FRBRoo with the elements relevant to the presented research, without duplicating classes. This is a standard approach when working with top-level ontologies and specializing them for specific domains. The second is to motivate the need for, and the benefits of, the adopted approach, namely, to explain why "cloning" existing ontologies and developing alignments is a better strategy than reusing them.
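To make the contrast between the two strategies concrete, here is a minimal sketch in Python that models axioms as plain (subject, predicate, object) tuples. All `op:` class names are hypothetical placeholders and are not taken from the paper or from the OWL file; the prefixes are illustrative only.

```python
# Axioms as plain (subject, predicate, object) tuples.
# All "op:" names are hypothetical placeholders, not the actual OntoPoetry classes.
TYPE = "rdf:type"
OWL_CLASS = "owl:Class"
EQUIV = "owl:equivalentClass"
SUBCLASS = "rdfs:subClassOf"

# Strategy A: clone upstream classes locally, then align the clones.
cloned = [
    ("op:Work", TYPE, OWL_CLASS),
    ("op:Work", EQUIV, "frbroo:F1_Work"),
    ("op:Person", TYPE, OWL_CLASS),
    ("op:Person", EQUIV, "crm:E21_Person"),
    ("op:PoeticWork", TYPE, OWL_CLASS),
    ("op:PoeticWork", SUBCLASS, "op:Work"),
]

# Strategy B: reuse upstream classes directly and declare only what is new.
specialized = [
    ("op:PoeticWork", TYPE, OWL_CLASS),
    ("op:PoeticWork", SUBCLASS, "frbroo:F1_Work"),
]


def local_classes(axioms):
    """Classes declared in the local (op:) namespace."""
    return {s for s, p, o in axioms
            if p == TYPE and o == OWL_CLASS and s.startswith("op:")}


def redundant_clones(axioms):
    """Local classes declared equivalent to a class from another namespace."""
    return {s for s, p, o in axioms
            if p == EQUIV and s.startswith("op:") and not o.startswith("op:")}
```

Under these assumptions, `redundant_clones(cloned)` returns the two cloned classes, while `redundant_clones(specialized)` is empty: the specialized variant declares only the genuinely new domain class.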
Second, there is no conceptual attempt at clarifying the core notions introduced, above all the notion of (literary) "work". There is a huge literature on this topic in the humanities and applied ontology (see references below). FRBRoo is widely used in the digital humanities; however, it remains highly ambiguous about what a work is. Indeed, on the one hand, a FRBRoo work seems to be an idea in the author's mind (see, e.g., FRBRoo v.2.4, 2015, p. 27); on the other hand, it seems to be an entity relevant for cataloging purposes (see, e.g., the notion of F15 Complex Work). I wonder whether the authors are aware of this ambiguity, which seems to apply to their proposal, too; e.g., they talk of works sometimes as ideas, sometimes as abstract concepts, but it is not clear what these terms mean.
Consider two simple sentences: "The cat is on the mat" and "El gato está en la alfombra" (treat them as two poetic verses). Would the authors claim that they are two expressions, in two different languages, of the same "work"? If so, isn't the notion of "work" related to that of "meaning"? I know that this discussion raises foundational questions and calls for a critical attitude towards FRBRoo. However, if the authors wish to make a research contribution to the ontological characterization of (poetic) works, something must be said about "what a work is", especially in light of a broader view of the state of the art that is not limited to CRM and FRBRoo. In the applied ontology literature, the authors can find references on this under the notion of "information entity" (see, e.g., Gangemi and Peroni 2016 for an approach based on ontology design patterns; related papers can also be found in the Applied Ontology journal).
Other remarks:
- Please clarify which version of CRM is used. Note that the latest release of the ontology includes relevant differences with respect to previous versions (e.g., P78, P87, etc. have been deprecated).
- At p. 6 the authors say that they reuse content ontology design patterns. However, it is not clear from the paper how these patterns were adopted.
- I think that the paper could be shortened without losing relevant information, which would make it easier to read. For instance, there are sections that, to the best of my knowledge, do not add new material with respect to the state of the art but simply say how some elements of CRM/FRBRoo have been used (e.g., 5.2.2, 5.3, 5.4, and the section about data properties for quantities). I recommend that the authors focus on those aspects of the ontology that are novel with respect to the state of the art and relevant to the application to poetry. This would help the reader better appreciate the authors' contribution. Some paragraphs are also redundant; e.g., footnote 15 is also part of the main text.
- The authors need to introduce the modeling elements of CRM/FRBRoo reused in the paper. Otherwise the reader cannot properly follow the discussion. E.g., what is the difference between individual and complex work in FRBRoo?
- The graphical notation used in figures 5-7 etc. is not clear. What do the arrows stand for? I strongly recommend that the authors use a well-known notation such as UML class diagrams.
- Section 5.2.3. The pattern for the representation of agent roles is not clear. If I understand correctly, the authors treat the relation "PC14 carried out by" as a class in order to state that a person participates in an event with a certain role; that is, the relation is reified. This is a common move to represent n-ary relations (n>2) in Semantic Web languages. However, what does it mean that the class AgentRole is a subclass of "PC14 carried out by" (incidentally, in the OWL file, AgentRole is *equivalent* to PC14)? Intuitively, PC14 is still a relation, only formally treated as a class; its instances should have three arguments, e.g., arg1 the event, arg2 the actor, and arg3 the actor's role. By contrast, instances of AgentRole are **not** relations; they stand for roles played by agents when they participate in events. Looking at Fig. 23, if I understand what the authors mean to do, AgentRole should be in the place of skos:Concept.
- Section "Datatype properties related to appellations". Looking at Fig. 29, it seems that "p102 has title" is both a data property (related to xsd:string) and an object property (related to E35 Title). This needs clarification.
- It is common practice, in the development of domain-specific or application-specific ontologies, to drive the development through experts' requirements (sometimes represented as competency questions). These make it possible to assess whether the resulting ontology fits the experts' needs. I think that it would be valuable to write a section in this direction; the authors may present a case study exploiting the ontology, possibly showing how it matches the experts' requirements.
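To illustrate the remark on Section 5.2.3 above, here is a minimal sketch in Python of a reified ternary relation, again with triples as plain tuples. The `ex:` identifiers and the three argument property names are assumptions made for illustration; they are not taken from the paper, from the OWL file, or from the CRM specification. The point is only that the reified node is an instance of the relation, while the role itself appears as the value of its third argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CarriedOutBy:
    """One instance of the reified relation: a relation instance, not a role."""
    event: str  # arg1: the event
    actor: str  # arg2: the agent participating in the event
    role: str   # arg3: the role played, e.g. a concept from a vocabulary


def to_triples(node_id, rel):
    """Serialize one relation instance as triples about the reified node."""
    return [
        (node_id, "rdf:type", "crm:PC14_carried_out_by"),
        (node_id, "ex:hasEvent", rel.event),      # hypothetical property name
        (node_id, "ex:hasActor", rel.actor),      # hypothetical property name
        (node_id, "ex:inTheRoleOf", rel.role),    # hypothetical property name
    ]


def roles_of(actor, triples):
    """Roles an actor plays, found through the reified relation nodes."""
    nodes = {s for s, p, o in triples if p == "ex:hasActor" and o == actor}
    return {o for s, p, o in triples if s in nodes and p == "ex:inTheRoleOf"}


# Hypothetical example data.
rel = CarriedOutBy(event="ex:creation_of_poem_1",
                   actor="ex:person_1",
                   role="ex:Translator")
triples = to_triples("ex:pc14_1", rel)
```

In this sketch, `ex:pc14_1` is typed by the reified relation class, and `ex:Translator` occupies the position that skos:Concept has in Fig. 23, which is where I would expect AgentRole to appear.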
I strongly encourage the authors to pursue this research and to present a new version of their paper. The authors may consider submitting the new version to a journal specialized in the digital humanities, where readers are likely to know more about CRM/FRBRoo and can better appreciate the presented contribution with respect to the state of the art. I would also suggest adopting a critical attitude towards CRM and FRBRoo and exploring the state of the art more thoroughly.
Some references
Eggert, P. (2019). The Work and the Reader in Literary Studies. Cambridge University Press.
Gangemi, A., & Peroni, S. (2016). The information realization pattern. In Ontology Engineering with Ontology Design Patterns (pp. 299-312). IOS Press.
Pierazzo, E. (2016). Digital scholarly editing: Theories, models and methods. Routledge.
Thomasson, A. L. (2015). The ontology of literary works. In The Routledge Companion to Philosophy of Literature (pp. 349-358). Routledge.