Review Comment:
There are some minor grammar errors (in general, I would suggest reviewing the use of prepositions, which are not always correct) and unfortunate phrasings,
but overall, the paper is clearly written. However, I think the reproducibility of the method and the validation of the result are not fully discussed,
and I'm having a hard time trying to review the following recommended indicators:
- the design principles (throughout my comments below, you can see that I'm a bit confused about what the scope of the Conceptual Mapping is supposed to be, and thus question some design decisions)
- comparison with other ontologies on the same topic (other ontologies are extensively discussed, but a clear alignment between the Conceptual Mapping and these other ontologies is missing)
- and pointers to existing applications or use-case experiments (this seems to be future work)
More specifically:
- The comparison framework: it is currently unclear what this entails. I have the feeling you were "collecting stamps" rather than doing "physics" (I don't want to sound negative, both have merits): analyzing existing languages and extracting their features, rather than trying to come up with a complete set of features and mapping that to the languages.
Please be very clear about what you are doing, clarify, and provide argumentation to make your work more reproducible (e.g., how do you detect certain features: based on the specification, a set of test cases, verified expert opinion, ...?).
- The mapping language does not seem validated: please provide argumentation as to why certain modeling decisions were made, and provide some proof that the mapping language is validated. As this is an ontology paper, I currently have trouble finding the proof for "Quality and relevance of the described ontology (convincing evidence must be provided)".
- Some of the arguments of the discussion section come out of the blue for me, e.g. the provenance, applicable shapes, extensible character: I fail to understand how these have been extracted from existing mapping language features, and if not extracted, why they are added.
That said, the work is valuable, relevant, and timely. I believe the hardest work has been done, and many of the comments I have are requests for clarifications.
However, without a validation of the actual result (i.e., a comparison/alignment with the other languages, to showcase you indeed cover all bases),
I would not be inclined to accept it, hence I suggest a major revision.
Introduction
- I really like the conclusion of p2 left column line 26
- Please clarify whether you made an ontology to unify definitions across mapping languages, or a mapping language that is a superset of existing mapping languages. This is currently not clear. Given that this was submitted as an Ontology Description, I would expect a "short paper describing ontology modeling and creation efforts"; however, I have the feeling this is not the case here. If this is instead supposed to be a full paper, I would expect to review "originality, significance of the results, and quality of writing [...] and more specifically the evaluation sections in a style and level of detail that enables the replication of their results".
- I commend you for clarifying the usage rights, persistent identifier, and clear documentation of the ontology. However, I found the following errors:
- I'm quite surprised that a mixture of rdfs:comment and skos:definition is used for the ontology; that doesn't feel right.
- http://vocab.linkeddata.es/def/conceptual-mapping/protocols_list.ttl returns a 404
Related work
- For 2.1, I would appreciate some discussion on which criteria you used to select this set of mapping languages to compare with, and not, e.g., SPARQL-Anything, XRM, or SMS2.
- In 2.3, it is not clear to me why the spreadsheet in Mapeathor is language-independent. Mapeathor imposes a specific structure within the spreadsheet (is that not 'a language'?), very similar to how, e.g., YARRRML imposes a specific structure within a YAML document. Neither changes the underlying serialization (and for completeness, YARRRML also supports translation into R2RML). That said, this kind of discussion is quite interesting: what kind of distinction is there between e.g. Mapeathor and YARRRML vs RML and SPARQL-Generate? What changes if someone builds, e.g., a tool that directly works on YARRRML mappings instead of translated RML? In general, I'm not sure this distinction is relevant for this paper.
- For 2.3, I miss a concluding remark; for now, I don't understand why this section is included (translation is future work, AFAICT).
Comparison framework
- You state which languages you include, but there's no argumentation why those (similar to my first comment on the related work section). Can you provide a more rigorous argumentation as to why you include specifically those?
- The example is quite a limited RDF structure: no rdf:Lists or similar constructs, no graphs. I'm currently not convinced this is a complete example that touches your complete ontology/language.
- It is not clear to me whether this language/ontology is meant to be abstract (so it attempts completeness), or rather a superset of existing languages (so it tries to cover everything that existing languages cover). There is a distinction between these two, so clarifying that scope is important.
- For example: Data retrieval: the fact that there are 3 retrieval 'modes': is this complete, or is this extracted based on the features of existing mapping languages? [1] describes factors that influence an RDF graph generation algorithm, and makes the distinction between 'real-time' and 'on-demand' trigger, where I can see that your distinction 'Streams' maps to 'real-time', and 'Asynchronous' and 'Synchronous' are two types of 'on-demand' triggers (event-based could be a third type of on-demand trigger). I'm not saying one categorization is better than the other, but some argumentation would be good. If the point above is well tackled, this point should become a non-issue.
- For the data source description, I would expect a discussion on the extensibility of the language vs support of tools implementing the language, e.g. RML does support streaming data sources (https://github.com/RMLio/RMLStreamer#processing-a-stream), but this is indeed not specified in the original paper or specification: RML provides an extension point for data source descriptions, and streaming support is only implemented in the RMLStreamer. How could you compare such features on a language level?
- p8 right line 43: please clarify provenance, it is not clear. Given that you state "No language considers the specification of its provenance", I have the feeling you attempt to create a complete set of features. I would strongly suggest not going down this path and instead trying to create a superset that (only) contains the features that are currently supported by your set of mapping languages. Otherwise, I would need some argumentation (and preferably, theoretical grounding) why some features are taken into account, and why some aren't.
Mapping Language Ontology
- For reproducibility, I would expect that the requirements specification is (publicly) available
- p11 left line 9: so the set of functions a mapping may use is predefined, not extensible. How is Conceptual Mapping then an abstraction of existing features? (e.g. FunUL supports function extensions)
- I think the mapping validation is important, however, I could not find the results described or linked to in the paper. Based on what I read up till now, I would expect some validation document that states 'feature X is supported by construction Y (or a combination of constructions)', and as such, you can prove that you cover all features. Or, if impossible, argue why this is not provided. Otherwise, I cannot review "whether the provided resources appear to be complete for replication of experiments, and if not, why".
- For example, it is unclear to me which feature the CombinedFrame construct solves.
- How do you specify that an expression is an XPath, a JSONPath, or one of the 'among others'? Why is this not a SKOS ConceptScheme?
- By linking the datatype to the statement, don't you get in trouble when you want to create mixed-type rdf:Lists?
- From the features listed in Section 3, I don't understand why the Ontology or shapes constructs were required.
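
To illustrate the kind of validation document I have in mind for this section ('feature X is supported by construction Y'), here is a minimal sketch in Python. All feature and construct names are hypothetical placeholders that I made up for illustration; they are not taken from the paper or the ontology. The idea is only that a correspondence table makes coverage gaps mechanically checkable.

```python
from typing import Dict, List


def uncovered_features(required: List[str], coverage: Dict[str, List[str]]) -> List[str]:
    """Return the required features that no construct supports, in input order."""
    # A feature counts as uncovered if it is absent from the table
    # or listed with an empty set of supporting constructs.
    return [feature for feature in required if not coverage.get(feature)]


# Hypothetical features extracted from the compared mapping languages.
required_features = [
    "streaming data source",
    "function extension",
    "mixed-type rdf:List",
]

# Hypothetical correspondence table: feature -> supporting constructs.
coverage_table = {
    "streaming data source": ["ExampleSourceConstruct"],
    "function extension": [],
}

missing = uncovered_features(required_features, coverage_table)
print(missing)
```

With a table like this published alongside the ontology, a reader could verify the coverage claim instead of having to take it on trust.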
Discussion
- p15 left line 35: the problem of 'lack of information on valid combinations' has not been clearly explained up till now.
- p15 left line 45: the purpose of this language has, up till now, not been validated: there's no overview of how existing mapping language features are supported by the Conceptual Mapping language.
- p15 right line 3: please exemplify and better argue why provenance definition and shape applicability are relevant in this paper. For now, it is unclear. The same goes for extensibility: it is currently a bit unclear whether this is possible (I assume you mean you can add metadata triples to the existing mapping graph? Is that an extracted feature?)
- I kind of disagree that mapping governance has not been developed so far, e.g. http://events.linkeddata.org/ldow2016/papers/LDOW2016_paper_04.pdf (last page) showcases how author metadata could be added, using PROV-O statements.
- I find the maintenance guarantee a bit underpromising (who is "we" in this case?). Can this be made stronger, e.g., some public statement on GitHub, or an endorsement by an organization instead of a group of people?
Conclusion and Future Work
- "an ontology-based conceptual model that aims to gather the expressiveness of current mapping languages" --> so why also include other features (such as provenance) that, according to the comparison framework, are not considered by any language?
- "Finally, we want to specify the correspondence of concepts between the considered mapping languages and the Conceptual Mapping" --> I'm very much confused by this statement, then what is the Conceptual Mapping at this point? How can you be sure that it currently gathers the expressiveness of current mapping languages if you don't have this correspondence table?
-- spelling/grammar/details
I personally prefer using the Oxford comma consistently, for both 'and' and 'or'.
I tagged some phrases that were unclear in the PDF at link https://www.dropbox.com/s/4hr7lobx1fm006v/swj2913_bdm.pdf?dl=0