CURIOCITY: A Cultural Heritage Ontology for Urban Tourism

Tracking #: 2773-3987

Authors: 
Alexander Pinto
Yudith Cardinale
Irvin Dongo
Regina Ticona-Herrera

Responsible editor: 
Special Issue Cultural Heritage 2021

Submission type: 
Full Paper
Abstract: 
Urban tourism information available on the Internet has been of enormous relevance in motivating tourism in many countries. There exist many applications focused on promoting and preserving cultural heritage through urban tourism. However, there is still a lack of a well-defined and standard model for representing the whole knowledge of this domain, which would ensure interoperable applications. Current studies propose the use of ontologies to formally model such knowledge. Nonetheless, they only represent partial knowledge of cultural heritage and are restricted to an indoor perspective (i.e., museum ontologies). In this context, we have proposed the ontology CURIOCITY (Cultural Heritage for Urban Tourism in Indoor/Outdoor environments of the CITY), to represent the cultural heritage knowledge based on UNESCO’s definitions. The CURIOCITY ontology has a three-level architecture (Upper, Middle, and Lower ontologies) in accordance with a purpose of modularity and levels of specificity. In this paper, we describe in detail all modules of the CURIOCITY ontology and perform a comparative evaluation to show how it outperforms state-of-the-art ontologies. Additionally, to demonstrate the suitability of the CURIOCITY ontology, we have developed a first version of a framework, which allows transforming a museum data repository (in CSV format) to RDF triples of the CURIOCITY ontology. Thus, it is possible to automatically populate the CURIOCITY repository, from which a set of tourism applications and services can be developed.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
By Pasquale Lisena submitted on 26/Apr/2021
Suggestion:
Major Revision
Review Comment:

The paper introduces CURIOCITY, a CIDOC-CRM-based ontology for describing entities referring to the cultural heritage world. The proposed model is interesting and well structured, relevant to the community. The paper is clearly written. However, I have to point out some limitations.

There are several cultural heritage ontologies in the literature that have not been compared with CURIOCITY. Some of them are ArCo (http://wit.istc.cnr.it/arco/) and CrossCult (https://doi.org/10.1007/978-3-319-67162-8_35).

The "Music Ontology" has been proposed for describing the music information. It is the only reused model not based on CIDOC-CRM, while CIDOC-CRM-based alternatives are present (see https://dl.acm.org/doi/pdf/10.1145/3243907.3243910, https://qmro.qmul.ac.uk/xmlui/handle/123456789/68018 and https://www.eurecom.fr/fr/publication/5565/download/data-publi-5565.pdf ). The authors should better justify this choice.

The evaluation has been conducted using an approach coming from the literature. For the lexical and structural approach, CURIOCITY has been compared with CIDOC CRM. However, CURIOCITY itself is based on CIDOC CRM. This comparison can gain meaning only when involving competitors (see above). So I invite the authors to re-evaluate the ontology comparing it with another CH model, possibly with more than one.

About the experts' opinions, it is not clear where the list of possible answers (x-axes in both Fig. 11 and Fig. 12) comes from. From what I read, it is completely arbitrary, while it would have been more solid if taken from some UNESCO list (e.g. 2.1) or from the experts themselves (with free-text answers instead of checkboxes). Also in this case, why is a relevant field such as Languages not in CURIOCITY? Can we also have a comparison with competitors here?

Detailed comments:
- 2.2. Work [32] is not published yet, so I can't fully understand the motivations behind the presented categorisation. The authors may want to summarise them also here
- 2.2. "elements to extend the Site concept". Do you mean the Place concept? "Site" has never been defined
- Note 10: a curly bracket invalidates the link
- Fig.2 needs some legend about arrows (when do they mean subclasses? when link by property?)
- Fig.3. It is not clear why some properties (cit:Quality, crm:Dimension) are attached to cit:SiteCH instead of crm:site
- 4.2.5 While for Places and Events we have the "normal" version and the "CH" version (SiteCH and EventCH), we have not this for food. Given that cit:Food is also used for representing the ingredients (which are possibly not categorisable as heritage), the definition of a cit:FoodCH seems to be crucial.
- Fig.8 can be improved by marking the cit: classes in blue (such as in other figures)
- 5.1 Why those values of alpha and beta?
- 5.1 Are all CRM classes/properties considered part of the model for the Lexical Similarity? If so, I don't really understand why this comparison is being made.
- Fig. 12. The legend should show the field in order of strength (moderately necessary < necessary < very necessary)
- 6.2.2 Is the author's intervention required at every parsing of the CSV? Is any learning/caching strategy implemented?
- 6.2.2. How are the specialists' reviews conducted? Is there a tool for it? Can you give some stats about this task?
- 7. I think some future work is missing here. How do you want to apply and extend this model?

I additionally recommend proofreading the paper to correct typos, in particular in tables and figures.

Review #2
Anonymous submitted on 29/Jul/2021
Suggestion:
Major Revision
Review Comment:

The authors present their work on a new ontology to model cultural heritage for urban tourism. They provided an overview of the existing related work in the field, and their proposed ontology, supported by different evaluation approaches and a use-case that includes a framework with a web-based and desktop application.

The paper targets an interesting topic that is very relevant to the SWJ Special Issue on Cultural Heritage and Semantic Web.

With respect to the quality of writing, the paper is easy to read overall, but it contains language-related issues that should be fixed. I mention some examples of minor issues at the end of the review, but thorough editing is required before publication.

The paper can benefit from further improvements that I detail below.

With respect to the flow, I found it a bit strange that the authors' related work [32] is introduced in Section 2.2. This categorization scheme is the same as the one followed in the current paper, right? I checked [32], and it has a lot of similarities with the current paper, e.g., see Table 1 in [32] and the case study. If this is the case, I recommend moving this part into the description of the approach.

Concerning the proposed approach, the rationale behind the design of the ontology should be made explicit. It wasn't clear to me why you are proposing an extension of the CIDOC CRM ontology. Was it to fulfill the needs of a certain use case? Certain user requirements? Currently the paper directly proposes certain concepts to be extended, with certain properties, without any reasoning, support, or clear motivation.

From your related work in Table 1, it seems that Finto has a wider coverage of your target entities compared to CIDOC CRM. Why did you choose to extend CIDOC CRM and not Finto?

I found some inconsistencies between Figure 1 and Figure 2: Food, Music, etc. are placed in the Middle level in Figure 1, but they are defined in the upper ontology in Figure 2.

It is not clear why the ontology was designed in three layers (upper, middle, and low), nor why Music and Food were treated as middle modules while museum artworks are low level. Please elaborate on the design choices you made at this level.

With respect to the originality of the work, it seems to me that the bulk of the proposed CURIOCITY ontology is based on CRM, and CURIOCITY only extends it in a minor way. See for example Figure 8: five classes (performance, SiteCH, Food, Music, Performing arts) are the extended parts in the whole ER diagram. This is also reflected in the examples presented in Tables 6 and 7. I recommend having a clear table with the number of classes, relations, etc. that are provided mainly by CURIOCITY.
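
The kind of per-namespace count such a table would need can be sketched as follows. This is only an illustration in plain Python: the term list contains just the prefixed names that happen to be mentioned in these reviews, not an inventory of the ontology, and `count_by_prefix` is a hypothetical helper name.

```python
from collections import Counter

def count_by_prefix(qnames):
    """Count prefixed names (e.g. 'cit:Food') per namespace prefix."""
    return Counter(name.split(":", 1)[0] for name in qnames)

# Illustrative input only: terms named in the reviews, not the full ontology.
terms = [
    "cit:SiteCH", "cit:Food", "cit:Quality", "cit:Description",
    "cit:PreparationMethod", "crm:Dimension", "crm:Person",
]

counts = count_by_prefix(terms)
for prefix, n in sorted(counts.items()):
    print(f"{prefix}\t{n}")
```

Run over the complete list of classes and properties, a count like this would make the share of terms contributed by CURIOCITY itself explicit.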

It would be good to highlight whether you connected to ontologies beyond CRM. For example, you mention DBpedia in Figure 1; how did you manage such linkages?

Given that the paper is currently framed with the ontology as its main contribution, I recommend providing online access to the ontology elements with clear descriptions.

The major issue I found in the paper, and one that requires attention, is at the evaluation level. It is stated on Page 13, line 5: "In our case, the golden standard is defined by the knowledge categorization presented in Section 2.2 and CIDOC CRM Standard, available as ERLANGEN-CRM". But Section 2.2 is your proposed work in [32], which is part of your current methodology. It cannot be used as a gold standard. It's as if you are comparing your work to your work.

As stated above, CURIOCITY seems to only extend the existing CIDOC CRM in a minor way. And this is actually confirmed in the lexical and structural comparison results, where for example you mention on Page 14, Line 10 that "It means that these ontologies have more that 77% similar terms. We expected such close similarity value, since both ontologies are derived from the CIDOC CRM Standard". Of course this is expected, so what did you learn from the results? Same thing for Figure 10. If your findings are not that significant, I recommend dropping this evaluation measure from your study.
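
Why a high lexical overlap is expected in this situation can be illustrated with a minimal sketch. It assumes a plain Jaccard overlap between term labels, which is not necessarily the measure used in the paper, and the term lists are made up for illustration.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of term labels (case-insensitive)."""
    set_a = {t.lower() for t in a}
    set_b = {t.lower() for t in b}
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)

# Hypothetical term lists: the extension reuses every base term and adds two.
base_terms = ["Place", "Site", "Person", "Event",
              "Dimension", "Actor", "Activity", "Type"]
extension_terms = base_terms + ["SiteCH", "Food"]

overlap = jaccard(base_terms, extension_terms)
print(f"{overlap:.2f}")  # prints 0.80
```

Any ontology that imports most of another one will score high on such a measure, regardless of the quality of the terms it adds.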

In my opinion the case-study provides a good indication of the usability of the ontology. It gives a concrete example of how the ontology is used. Maybe this can be used as a starting point and motivation to guide your ontology design decisions that are currently missing? And it can be a good way to go deeper at the evaluation level for example by showing gaps between the data source from which the data is extracted, and the ontological entities. This may also provide additional qualitative evaluation of the applications and overall approach.

It would be good to provide the link to the developed web application to see how it works.

Minor issues:
- Section 1, last paragraph: you have Section 5 missing from the list.
- Page 2, Line 1: "Nowadays, cultural heritage is supported on communication" --> Nowadays, cultural heritage is supported by communication
- Page 2, Line 20: "urban tourism on Internet are" --> urban tourism on the Internet are
- Page 5, Column 2, Line 13: "publishing and use of vocabularies" --> publishing and using vocabularies
- Be careful when using different tenses, e.g. Page 6, paragraph 1, you refer to the related work in present tense, and then in the following paragraph you shift to the past tense.
- Page 6, last paragraph: "Marchenkov et al. [20] propose an ontology aimed at developing a digital environment oriented to visitors and museum service staff, that offers personal recommendations" --> What is offering personal recommendations? The ontology, environment, or staff?
- In Table 1, what is "Person Ext."?
- Figures 3 to 6 labels: You mean "Representation" instead of "reasoning"?
- Section 6 Title: "Towards CURIOCITY Framework: A Case of Study" --> Towards CURIOCITY Framework: A Case Study
- Page 18: "The mapping process consists on matching" --> The mapping process consists of matching
- Last line: "such as recommendation systems, virtual museums, catalogs." --> such as recommendation systems, virtual museums, and catalogs.

Ambiguous sentences:
- Page 9: "Additional extensions are needed to treat with digital media"
- Page 9: "Besides this, it is also necessary an extension of crm:Person class"

I hope you will find those comments useful to improve your paper.

Review #3
By Oscar Corcho submitted on 19/Aug/2021
Suggestion:
Major Revision
Review Comment:

My first comment on this paper is that it is submitted as a full paper, but in my opinion it should have been submitted as an ontology paper, since the main contribution that this work focuses on is the development of a network of ontologies in the area of cultural heritage (I would not really claim that it is for urban tourism, but mostly for cultural heritage). As such, the paper should have had a different structure, a different length, and even a bit different set of criteria to be evaluated.

Therefore, my review will mix aspects that are related typically to ontology papers and aspects that are related to research papers as well.

My very first comment is related to the fact that the authors mainly describe an ontology (a network of ontology modules) that is not publicly available following good practices on ontology publication. This is in my opinion a must for any work that describes an ontology, so as to avoid problems that the authors themselves identify in the state-of-the-art analysis, where they say that some ontologies described in papers are not available any more. Please make sure that the ontology is available so that it can be used by others (and also evaluated by reviewers like me).

I have several concerns on some of the claims made already in the abstract (and in several parts of the paper):
- There is a lack of a well-defined and standard model for representing the whole knowledge of this domain. OK, we may or may not agree with this statement, but how does the work presented in this paper solve this problem? Indeed, the authors use the CIDOC CRM ontology extensively, along with a few other ontologies. But how can you claim that your model would be such a standard model for the whole knowledge of this domain? Why haven't others succeeded?
- "it outperforms state-of-the-art ontologies". This is not proven anywhere in the paper.
- "it is possible to automatically populate the ontology". This is not clear from the description provided in the paper, where a lot of manual effort and Python scripts are used for such population.
- "from which a set of tourism applications and services can be developed". It is not clear what can really be done that would not be doable with other techniques, with other ontologies, without the use of ontologies, etc. What is shown is only a very small proof of concept (again, another reason to consider that this paper should really focus on the ontology development instead of being a full research paper).

The paper provides a very comprehensive overview of related work, in terms of ontologies used for POI representation and for museum-related knowledge. I cannot consider myself an expert on this area, and there may be aspects that I have missed, or relevant works that I have not considered, but reading the sections 2 and 3, the authors seem to show a very good understanding of this topic. I would have liked, though, to have more structured and systematic descriptions of all of these ontologies covered in sections 3.1 and 3.2, since sometimes they read like non-systematic reviews, but I also acknowledge that the summary table provided in the end of section 3 is useful for those interested in understanding the extent to which the domain is actually represented in these ontologies.

This said, we can now move into the core contribution of the paper, which is the CURIOCITY ontology:
- Why have you decided to follow GoodOD recommendations instead of using other more widely-used ontology engineering methods and methodologies? I am for instance missing in the description references to artefacts like ontology requirements (e.g., competency questions, use cases, etc.), which would be useful for ontology evaluation afterwards. Indeed, the description of these recommendations does not clearly show how the middle and low levels of the ontology network would be created.
- I would recommend adding property names/labels/IRIs to the figures describing excerpts of the ontology across section 4.
- Please unify terms like Site CH or SiteCH, as they appear with different names in different parts of the descriptions.
- I am surprised to see that the authors are including in the ontologies concepts like Park, Protected Area, etc., which are defined in many other ontologies elsewhere.
- I am also surprised to see cit:Description instead of using rdfs:comment.
- It is unclear to me why you need the concept of crm:Dimension to represent attributes, when you could have gone down into more details on the properties applicable to each concept in the ontology, instead of using such a generic/reified approach.
- There is no discussion on why there are 5 middle ontology modules and why they are divided like that.
- In the food middle ontology, the concept cit:PreparationMethod seems a bit generic to me. How do you instantiate it? How complex should it be? I do not think that a single class can encode the full complexity of preparing some food.
- The Low ontology is very unclear. Have you actually defined any property or concept or do you only rely on CIDOC CRM?
That said, I appreciate the image provided in figure 8, but it also reinforces my view that the ontology is not very complete, and it mainly relies on CIDOC CRM.

Similarly, I have very important concerns about the approach taken for the evaluation of the ontology, which relies on a previous proposal from the authors that is not necessarily widely adopted by the community, and there is no justification of why such an approach is used. Obviously, there are no competency questions against which the ontology could be validated, as per the method used, but I cannot understand why lexical analysis is relevant in this case for a proper evaluation of the ontologies in this network. This said, my main concern in the evaluation is the selection of CIDOC CRM as a gold standard (a term that is weird in ontology development terms), since the ontology network that is generated is heavily based on it, and consequently I can understand that there is a high degree of overlap.
I will focus most of my comments on the domain knowledge evaluation, since the lexical and structural evaluations sound a bit strange to me and I cannot really understand why they are relevant and what the metrics really mean there. In the domain knowledge evaluation, the authors do not explain clearly how the experts were selected (the selection seems very biased in my opinion). And the questions posed to the experts seem too naive to me for an ontology evaluation purpose (I would have preferred something that focuses more on the knowledge that is encoded and how different tasks can be accomplished with it).
In summary, the ontology evaluation does not follow usual principles for ontology evaluation (not a problem if they were well justified in general or for this domain) and hence the authors should not claim that they have done a proper evaluation of this ontology network.

My final set of comments is related to the content of section 6, which I understand well as a case study, but which has severe limitations both in the context of a full research paper and for showing the usefulness of the ontology that has been described, so it fulfils neither objective:
- The authors claim that other semantic repositories may be added, such as ontologies to represent tourists, but this is not further described or discussed, so not convincing.
- RDF is generated with scripts. Why don't you use declarative mappings (e.g., RML)?
- The RDF that is generated is too simple.
- The authors do not show how the use of the ontology generates new knowledge from the initial data. What are the characteristics of the RDF triples that are inferred? Are they materialised?
- I do not agree with the idea of adding a "Desconocido" value in some properties. This does not make sense under Open World Assumption and under the RDF model.
- You claim that you have several SPARQL queries, but they are not listed.
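
The point about "Desconocido" can be sketched as follows: under the Open World Assumption, a missing value is expressed by simply not asserting the triple. This is a minimal, hypothetical sketch in plain Python without any RDF library. The column names, the `http://example.org/` namespace, the sample rows, and the set of placeholder strings are all assumptions for illustration; it is not the authors' pipeline, and it is still a script rather than a declarative mapping.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace, not the CURIOCITY IRI
PLACEHOLDERS = {"", "desconocido", "unknown"}  # assumed markers for missing values

def escape_literal(value):
    """Escape characters that N-Triples string literals cannot contain raw."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

def rows_to_ntriples(csv_text, id_column="id"):
    """Yield one N-Triples line per known cell; missing values yield no triple."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Identifiers are assumed to be IRI-safe; real data would need encoding.
        subject = f"<{BASE}item/{row[id_column]}>"
        for column, value in row.items():
            if column is None or column == id_column or value is None:
                continue
            value = value.strip()
            if value.lower() in PLACEHOLDERS:
                continue  # leave the fact unstated instead of asserting "unknown"
            yield f'{subject} <{BASE}prop/{column}> "{escape_literal(value)}" .'

# Made-up sample data: the first row has an unknown author.
sample = "id,title,author\n1,Retrato,Desconocido\n2,Paisaje,Ana Ruiz\n"
triples = list(rows_to_ntriples(sample))
for line in triples:
    print(line)
```

With this sample, the first item gets a title triple but no author triple, so a query for its author returns nothing instead of a misleading literal.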

In summary, I think that a lot of additional documentation work has to be done in order to convince readers about the usefulness of this ontology network in contrast with existing ones, and for demonstrating its potential use, before this paper can be accepted.

Minor comments:
- Please replace the shortened URLs with permanent URLs.