Review Comment:
The authors have made considerable efforts to improve the paper, including the addition of a number of references. Many of the issues I raised in my original review have been addressed and discussed in the response letter, including the various points of critique I had against quite a few of the metrics used in the paper.
A part of the paper I am still not happy with is the section where the PCA is conducted. The question in the box - i.e., which are the key features - is, imho, not answered. The PCA rules out three of the metrics which do not contribute much to the variance, but beyond that, the question remains open. The question the PCA actually answers should rather be understood as: which metrics can be safely removed, because they are already covered by others? For identifying the most promising features, feature-selection techniques would be more appropriate than a dimensionality-reduction technique such as PCA.
Along the same lines, I do not agree with the authors' statement that closely related metrics are grouped by PCA. This is not what PCA does. Metrics ending up in the same component are not necessarily related (or unrelated); they are simply combined linearly in a way that explains a large share of the variance in the data, regardless of whether they are semantically related or not.
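To make this point concrete, here is a minimal sketch on synthetic data (hypothetical metric names, not the paper's actual metrics): two semantically unrelated metrics dominate the same principal component merely because both co-vary with a latent factor such as dataset size, while an independent metric barely loads on it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
# Hypothetical metric scores: m1 and m2 are semantically unrelated,
# but both happen to track a latent factor (e.g., dataset size);
# m3 is independent of that factor.
latent = rng.normal(size=n)
m1 = latent + 0.1 * rng.normal(size=n)
m2 = -latent + 0.1 * rng.normal(size=n)
m3 = rng.normal(size=n)

X = np.column_stack([m1, m2, m3])
X -= X.mean(axis=0)                      # center before PCA
_, s, vt = np.linalg.svd(X, full_matrices=False)

loadings = vt[0]                          # first principal component
explained = s**2 / (s**2).sum()           # explained variance ratios
print("PC1 loadings:", np.round(loadings, 2))
print("explained variance ratio:", np.round(explained, 2))
```

PC1 loads strongly (with opposite signs) on m1 and m2 and near zero on m3 - the component groups them by shared variance, not by semantic relatedness.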
I strongly appreciate that the discussion of the metrics has been extended, including possible caveats. In some cases, responses from the response letter should be included in the paper (e.g., the time-dependency of accessibility metrics).
Some points that have *not* been addressed properly in my opinion include:
* IO1: I still do not believe that this metric measures what the authors claim. Looking up a vocabulary in LOV only analyzes whether this vocabulary has been registered there, which, in turn, requires some minimal effort in describing metadata of the vocabulary. However, if I create a dataset with a proprietary vocabulary, and register that vocabulary at LOV, I do *not* reuse an existing vocabulary, but the metrics suggests so.
* CS9: I do not think it is enough to materialize the subject's and object's type. For example, the DBpedia property dbo:starring has rdfs:range dbo:Actor . A triple where the object is of the (less specific) type dbo:Person would thus be marked as a range violation, although this is probably not desired.
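The IO1 concern above can be sketched as follows (all registry entries and helper names are hypothetical): mere presence of a vocabulary in LOV does not imply *reuse*, since a publisher can register their own proprietary vocabulary and still score perfectly; one possible refinement would additionally exclude vocabularies minted under the dataset publisher's own namespace.

```python
# Hypothetical LOV registry containing a widely reused vocabulary (FOAF)
# and a proprietary vocabulary registered by the dataset publisher.
LOV_REGISTRY = {"http://xmlns.com/foaf/0.1/", "http://example.org/my-vocab/"}

def io1_naive(vocab_ns):
    # What the metric, as described, effectively checks.
    return vocab_ns in LOV_REGISTRY

def io1_stricter(vocab_ns, publisher_ns):
    # Refinement: a vocabulary under the publisher's own namespace
    # is not *reuse* of an existing vocabulary.
    return vocab_ns in LOV_REGISTRY and not vocab_ns.startswith(publisher_ns)

print(io1_naive("http://example.org/my-vocab/"))                            # → True
print(io1_stricter("http://example.org/my-vocab/", "http://example.org/"))  # → False
```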
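For the CS9 point, a toy sketch of a subclass-aware check (the hierarchy is a minimal hypothetical stand-in for the dbo: ontology): instead of flagging every object whose asserted type differs from the declared range, one could accept types that lie on the same sub/superclass path as the range, so that a dbo:Person object of dbo:starring is not marked as a violation.

```python
# Toy class hierarchy: child -> parent (hypothetical fragment of dbo:).
SUBCLASS_OF = {
    "dbo:Actor": "dbo:Person",
    "dbo:Person": "owl:Thing",
    "dbo:Place": "owl:Thing",
}

def is_subclass_of(cls, target):
    """True if cls equals target or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

def compatible_with_range(object_type, declared_range):
    # Accept the triple if the object's type is the declared range,
    # a subclass of it, or a superclass of it (less specific typing).
    return (is_subclass_of(object_type, declared_range)
            or is_subclass_of(declared_range, object_type))

# dbo:starring has rdfs:range dbo:Actor
print(compatible_with_range("dbo:Person", "dbo:Actor"))  # → True (not a violation)
print(compatible_with_range("dbo:Place", "dbo:Actor"))   # → False (violation)
```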
A remark from my original review which should still be addressed:
* CS6: In my opinion, it would make more sense to use the overall size of the vocabulary as the denominator.
I am confident that the authors will be able to address these issues. However, since particularly the analysis of informative metrics needs careful rework, I would still recommend a major revision.