Review Comment:
This article describes a novel method for automatic cross-lingual ontology alignment based on lexical, semantic, and structural aspects of the ontologies. It combines a number of similarity metrics, relying on Google Translate and BabelNet for the cross-lingual aspects. An analysis of the similarity of neighbouring concepts (at depth 1) is added when the initial similarity between terms is not conclusive enough. A preliminary evaluation on the MultiFarm track shows promising results.
The paper is, in general, well written and organised. The topic, cross-lingual ontology matching, is timely and interesting. In fact, this is a challenging research area in which there is still a lot of room for improvement. The main novelty of this work, not sufficiently emphasised by the authors, is the use of BabelNet as a source of background knowledge for cross-lingual ontology matching, through the use of NASARI vectors.
The main drawback of this approach is the evaluation setup. The authors tuned their system with the same dataset used for testing. Ideally, they should have split the evaluation data into a development part and a testing part, so as to carry out the evaluation with alignments not previously seen by the system. As a consequence, the comparison with other participant systems in MultiFarm is not completely valid, since MultiFarm participants were evaluated against blind test data. In particular, the authors chose the "Conference" ontology and 45 language pairs out of the 55 available in the evaluation dataset; they should explain the rationale behind this choice of languages.
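To make the suggestion concrete, here is a minimal sketch of such a split (the language list and the 80/20 proportion are illustrative assumptions on my part, not taken from the paper):

```python
import itertools
import random

# Illustrative subset of MultiFarm languages; the actual track covers more.
languages = ["cz", "de", "en", "es", "fr", "nl", "pt", "ru"]
language_pairs = list(itertools.combinations(languages, 2))

random.seed(42)  # fixed seed so the split is reproducible
random.shuffle(language_pairs)

split = int(0.8 * len(language_pairs))
dev_pairs = language_pairs[:split]   # used only for tuning thresholds and weights
test_pairs = language_pairs[split:]  # held out, never seen during tuning
```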
The results shown in Table 6 do not correspond to the ones published by the OAEI'18 organisers. Perhaps the authors filtered the participants' results down to the "Conference" ontology and the 45 language pairs examined, and then re-computed the metrics. However, these details are not given in the paper, and the source of those numbers is therefore unclear.
Another issue is the fact that MultiFarm considers two types of alignments: type (i), between different ontologies, and type (ii), between the same ontology in different languages. This work only addresses type (ii), which considerably reduces the interest of the evaluation. As stated by the OAEI organisers, "for the tasks of type (ii), good results are not only related to the use of specific techniques for dealing with cross-lingual ontologies, but also on the ability to exploit the identical structure of the ontologies". For this reason, purely monolingual methods that are good at computing structural similarities could achieve good results on type (ii) tasks but perform poorly on type (i). It would be important to check whether the proposed system also behaves well for type (i) alignments.
The related work analysis is also problematic. The reported SoA of systems participating in MultiFarm is outdated, since there have been two more campaigns after 2019. The table of OAEI MultiFarm results is that of 2018, disregarding the last three OAEI editions. Maybe there is a good reason to compare with 2018 only, but it is not given in the document. The table also includes a system not described in the SoA (XMAP).
The source code is not publicly available (it is behind a login/password in the authors' institutional repository, https://gitlab.ic.unicamp.br/jreis/evocros ); therefore, the reproducibility of the work could not be properly assessed.
In summary, this work is not yet ready for publication as a journal article, in my view. To address the aforementioned issues, the authors should extend the evaluation setup by considering type (i) alignments and by clearly separating development and test data. The related work also needs improvement, as do the other issues listed below. It would be a good idea to participate in the next OAEI campaign.
Other issues:
* The classification of the CL matching systems given in the background section is not satisfactory, since the "information retrieval" approaches also include translation-based systems (e.g., KEPLER)
* In the introduction, "As differences between the used alphabets hamper the use of simple string comparison techniques, similarity measures play a key role..." I would add "semantic" here: "...semantic similarity measures play..."
* The authors state, in the background section, that "Our approach differs from the above-mentioned proposals because we combine both semantic and syntactic similarities by computing the composed similarity assigning weights to each similarity measure". This is not completely true, since a weighted combination of syntactic and semantic similarities is extensively used in OM systems (the standard scheme is sketched after this list). I see the novelty in other aspects, such as the use of BabelNet and the fact that the authors do not rely solely on a translation system for the cross-lingual aspects, as most other systems do.
* In the literature review, the authors should make clear that the reported methods on background-knowledge-based ontology matching are just an illustrative sample, because the SoA on that matter is more extensive and the authors' analysis is far from exhaustive. The same applies to combined methods (lexical + structural + semantic), which are pervasive in the OM literature, yet the authors only mention one work (Nguyen and Conrad).
* In Definition 3.1: "Each relation r(c1, c2) in R" should be "Each relation r in R". The definition of ontology is also incomplete, since it does not consider individuals (a possible reformulation is sketched after this list).
* The neighbourhood of relations is not defined in Definition 3.2
* Definition 3.3 is not really formulated as a definition (I would rather expect "We define CL Ontology Alignment as..."). Similarity (s_ij) is not adequate in the definition; I would say "confidence degree". c1 = "Cabeça" and c2 = "Head" is also wrong in my view: there is no identity relation between a label and a concept (the latter is much more than a label). The authors could say, instead, "the concept c1 with associated label 'Cabeça'", etc. (see the suggested rewording after this list).
* In Definition 3.4 (mapping), the last sentence unnecessarily restricts the notion of mapping to string similarity. In fact, the authors' own method does not fulfil such a definition (it does not rely only on string similarities).
* Definition 3.5 uses similarity and relatedness interchangeably, although they are not the same concept. Apparently, 3.5 is a definition of similarity in general, but it ends up defining only syntactic similarity (syn).
* When describing the NASARI vectors, it is unclear what the authors mean by "contextual information". The description is somewhat obscure, and it is difficult to know which information is used to build a concept's vector.
* In Section 4: "These ontologies are converted to an object, preserving the relations and neighbourhood relationship between concepts." It is unclear which object is meant (an object in the OO implementation?).
* Comparisons are made among entities of the same class, and those mappings above a threshold are kept. What happens if there is more than one candidate mapping for the same entity? The paper does not mention any selection strategy (other OM approaches use the Hungarian algorithm, for instance; a sketch follows this list).
* It is unclear which information characterising the ontology entities (e1, e2) is passed to BabelNet apart from the natural language tags and labels, and how the system deals with polysemy when obtaining the translations (w1, w2).
* Table 4 is unnecessary. The authors can simply list the chosen thresholds and weights and state that all possible configurations were run.
* In the experimental section, a quantitative analysis (not only one based on examples) of the impact of using the neighbourhood would have been a nice addition, to better justify why it is not always used.
* In section 6 the authors state that "it might be useful considering semantic algorithms such as stop-words elimination and stemming, etc. [...]". Notice that these are NOT semantic algorithms but syntactic ones.
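Regarding the weighted-combination point above, the standard scheme I have in mind is simply the following (the notation is mine, not the paper's):

```latex
% Common weighted combination of syntactic and semantic similarities
sim(e_1, e_2) = w_{syn} \cdot sim_{syn}(e_1, e_2) + w_{sem} \cdot sim_{sem}(e_1, e_2),
\qquad w_{syn} + w_{sem} = 1,\; w_{syn}, w_{sem} \geq 0
```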
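For Definition 3.1, a possible reformulation covering individuals could read along these lines (the notation is a suggestion, to be adapted to the paper's own symbols):

```latex
% Suggested rewording of Definition 3.1 (illustrative notation)
\begin{definition}
An ontology is a tuple $O = (C, R, I)$, where $C$ is a set of concepts,
$R$ is a set of relations, and $I$ is a set of individuals instantiating
the concepts in $C$. Each relation $r \in R$ relates pairs of concepts,
written $r(c_1, c_2)$ with $c_1, c_2 \in C$.
\end{definition}
```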
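Similarly, Definition 3.3 could be phrased as an actual definition, replacing the similarity s_ij with a confidence degree and separating concepts from their labels, e.g.:

```latex
% Suggested rewording of Definition 3.3 (illustrative notation)
\begin{definition}[CL Ontology Alignment]
Given two ontologies $O_1$ and $O_2$ described in different natural
languages, we define a cross-lingual alignment as a set of correspondences
$(c_1, c_2, \equiv, n)$, where $c_1 \in O_1$ and $c_2 \in O_2$ are concepts,
$\equiv$ denotes the equivalence relation between them, and $n \in [0, 1]$
is the confidence degree of the correspondence. For instance, a concept
$c_1$ with associated label ``Cabeça'' may be aligned with a concept
$c_2$ with associated label ``Head''.
\end{definition}
```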
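Finally, on the missing selection strategy: a one-to-one selection over the candidate mappings can be obtained with the Hungarian algorithm applied to the similarity matrix. A minimal sketch follows (the matrix values, the threshold, and the use of scipy are my own assumptions for illustration, not the authors' implementation):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical similarity matrix: sim[i, j] is the composed similarity
# between entity i of ontology O1 and entity j of ontology O2.
sim = np.array([
    [0.91, 0.40, 0.10],
    [0.35, 0.88, 0.92],
    [0.05, 0.87, 0.30],
])
threshold = 0.8

# Hungarian algorithm: optimal one-to-one assignment maximising total similarity.
rows, cols = linear_sum_assignment(sim, maximize=True)

# Keep only the assigned pairs that also pass the threshold filter.
mappings = [(int(i), int(j), float(sim[i, j])) for i, j in zip(rows, cols)
            if sim[i, j] >= threshold]
print(mappings)  # [(0, 0, 0.91), (1, 2, 0.92), (2, 1, 0.87)]
```

Note how entity 1 is assigned to entity 2 of O2 (0.92) rather than to entity 1 (0.88), freeing the latter for entity 2 of O1; a greedy per-entity selection would miss this.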
English is good in general. Some suggested improvements:
* Section 1 "Our experiments suggest that the threshold, language in which the ontologies are described and translation tool play an important role" -> "...the language in which the ontologies are described, and the translation tool play an important role"
* There is a strange encoding problem when citing Dowling and Gallier in the background section
* The sentence starting "The threshold 0.95..." in the last paragraph of section 5.1 is not understandable.