Review Comment:
SUMMARY
The paper presents a thorough survey and evaluation of RDF metadata "representation models" (patterns). The evaluation is carried out against a wide range of criteria (performance, number of RDF triples, usability, loading times, etc.) and parameters (input data representations, metadata patterns, triple stores, query types). The paper also introduces a novel metadata representation model (MRM for short), called Companion Properties, and evaluates it against other well-known MRMs. Results show a heavy dependence on the input parameters, so it is not generally possible to designate a "winning" MRM. Rather, a judicious choice of MRM can only be made as a function of the data representation (e.g., triples or quads), the triple store being used, and the complexity of the typical queries.
SCIENTIFIC CONTENT
The topic of the paper is a good fit for the SWJ Special Issue on benchmarking. The paper appears to offer the following contributions:
1) a survey on metadata usage in real-world data;
2) a thoroughly documented performance evaluation of RDF metadata usage: representation models, current practices, DB storage and SPARQL query support, query complexity, and standards conformance are all considered;
3) a novel MRM, Companion Properties, is introduced as an improvement to the Singleton Properties pattern, addressing common performance issues related to storing and querying huge numbers of triples with unique property names;
4) evaluations over large real-world datasets (collected from Wikidata and DBpedia).
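To make contribution 3 concrete for readers unfamiliar with the pattern: the Singleton Properties model attaches metadata to a statement by minting a fresh property IRI per annotated triple, which is the source of the "huge numbers of unique property names" the paper targets. A minimal Turtle sketch (all `ex:` entities are illustrative; Companion Properties itself is the paper's novel variant and is not reproduced here):

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .

# Plain triple:
ex:BarackObama ex:spouse ex:MichelleObama .

# Singleton Properties: each annotated statement gets its own
# property IRI (ex:spouse_1), to which metadata can then attach:
ex:BarackObama ex:spouse_1 ex:MichelleObama .
ex:spouse_1 rdf:singletonPropertyOf ex:spouse ;
            ex:startDate "1992-10-03"^^xsd:date .
```

With millions of annotated statements, this yields millions of distinct predicates, which many triple-store indexes handle poorly; this is the performance issue Companion Properties is claimed to address.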
The large number of contributions is simultaneously a strong and a weak point, as each individual contribution is minor and no single main novelty or message is highlighted. The stated goal of the paper is to evaluate metadata representations; however, it does so by reproducing the experiments described in two earlier papers not by these authors ([9] and [10]), a fact that the paper clearly states. On the positive side, with respect to these two articles, the coverage of the subject is more comprehensive, the evaluation is somewhat finer-grained, and the results are documented in more detail. From an engineering perspective, the results should be useful for readers looking for well-performing combinations of metadata representation patterns and back-ends. From a theoretical standpoint, however, the experimental results largely agree with [9] and [10] in the sense that there is no "single best" metadata representation or back-end, because performance depends on how each back-end was equipped to deal with various RDF constructs and SPARQL query types. This overall conclusion stands even though the detailed results are not always identical to those in the earlier works.
Overall, the question is whether the sum of these minor contributions reaches the critical mass needed for acceptance in the SWJ. I lean toward acceptance because the paper is very relevant to the Special Issue, the general problem of RDF metadata is covered from several angles, and there is inherent value in being comprehensive. However, in its current form the paper is too chaotic to be accepted (see below), so I recommend major revisions.
Specific questions that the paper should have answered:
(a) Why was CPPROP, one of the contributions of the paper, not evaluated in the quins experiment?
(b) Why are simple queries evaluated only on Wikidata and complex queries only on DBpedia? Why not evaluate all four combinations?
LENGTH AND STRUCTURE
The somewhat unclear message of the paper is not helped by its excessive length and weak structure. While its length (24 densely written pages) is partly due to being very detailed with respect to experimental design, setup, data, and results, which is fine, there is also a considerable amount of redundancy and verbosity that could have been eliminated. Some examples:
- p. 2, second column, on the principle of the reification of a triple not entailing the triple itself: the first three paragraphs could be replaced by a single short paragraph explaining this very straightforward principle.
- p. 3, first column: the introduction need not go this deep into details, just mention what is to come later in the paper.
- p. 3, second column, on nano-publications: this is again a lengthy presentation of a subject also covered in section 3.1.2.
- p. 6, section 3.2: granularity, which should be a separate dimension, is explained three times within three dimensions (3.2.1, 3.2.2, 3.2.3).
- p. 11, DBpedia: I feel that a full page on presenting metadata in DBpedia is unnecessarily long.
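On the first point above (p. 2): the principle really does fit in a few lines, since under RDF semantics reifying a statement describes it without asserting it. A minimal Turtle sketch (entities illustrative):

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex:  <http://example.org/> .

# Reification describes a statement without asserting it:
_:s a rdf:Statement ;
    rdf:subject   ex:BarackObama ;
    rdf:predicate ex:spouse ;
    rdf:object    ex:MichelleObama ;
    ex:source     ex:SomeSource .

# Nothing above entails the triple itself; if intended, it must be
# asserted separately:
ex:BarackObama ex:spouse ex:MichelleObama .
```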
The structure and section titles of the paper could also be greatly improved:
- A figure on MRM schemas appears in the introduction, while the MRMs themselves are only explained in section 4 (p. 8). There are several forward references to that section, and terms are used before being defined (e.g., singleton properties and n-ary relations on p. 5), typically a sign of poor paper structure.
- In section 3.2, the four presented evaluation dimensions are mixed up: 3.2.1 is entitled "purpose and types" but rather presents levels of granularity; 3.2.2 presents the dimension of MRMs but also includes granularity; and 3.2.3, entitled "dataset characteristics", seems to be a generic "other" category, again including granularity (!). 3.2.4 states that "we define three complexity classes" but then fails to say which ones.
- In section 4, the subsection titles are very confusing. I suggest calling 4.1 "RDF-compliant models" and 4.2 "Vendor-specific models"; otherwise it is not clear whether this section is about techniques, RDF stores, or models. Furthermore, the MRM descriptions mix definition with evaluation, adding to the confusion.
- Section 5 presents the evaluation datasets, but Wikidata is already presented earlier in section 3.1.
- Section 6 presents the evaluation setup, but it is not clear why the evaluation dimensions are anticipated in section 3.2 and the evaluation criteria in 3.3. Throughout section 6, evaluation results are anticipated that should be kept for section 7 (see sections 6.3.1 and 6.3.2).
The following paper structure would have been much clearer, as it separates the introductory and definitional part (sections 1-4) from the evaluation part (sections 5-8):
1. Introduction
2. Related Work
3. Metadata Representation Models
4. Survey on Metadata Use (formerly 3.1)
5. Evaluation Datasets
6. Evaluation Method (containing 3.2 and 3.3)
7. Evaluation Results
8. Conclusions and Future Work
LANGUAGE AND WRITING QUALITY
While the writing is generally understandable with no major comprehension issues, the English is approximate throughout the paper, with many grammatical mistakes (too many to enumerate). A thorough proofreading is necessary.