Review Comment:
The submitted revised article is a good step towards a final version. Many of the major comments on the first submission have been properly addressed, and the paper is now well structured, with clear content and a clear contribution.
The paper still needs a new round of revision, which should mainly focus on the following parts:
I would like to see a more insightful discussion of the “semantic quality” of the data. The biggest problem in FRBRized data sets is the occurrence of false positives. These are typically found for works that have more than one expression and for authors who have more than one work. This is also the part of the result that is most likely to be linked to/from (reused). The paper documents that 112 work groups have been inspected, but does not say how they were selected or what they represent. Checking a random selection of groups gives a number for correctness, but that number does not really say much about the actual quality of the data. 8 false positives out of 112 may sound like a low number, but if these errors fall within the 10% of the data that is most likely to be linked to/from, it could imply very low quality.
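To make the point concrete (with assumed figures, not numbers taken from the paper): if the multi-expression and multi-work groups make up roughly 10% of the data but attract most of the links, and the 8 reported false positives all fall within that subset, the error rate where it matters most is far higher than the overall figure. A minimal sketch of the calculation, in Python:

    # Back-of-the-envelope illustration; the 10% share is an assumption, not a figure from the paper.
    inspected = 112          # work groups inspected by the authors
    false_positives = 8      # false positives reported among them
    critical_share = 0.10    # assumed share of groups that are multi-expression/multi-work
                             # and therefore most likely to be linked to/from (reused)

    overall_rate = false_positives / inspected                      # ~7.1%
    # Worst case: every error falls inside the small, heavily reused subset.
    critical_rate = false_positives / (inspected * critical_share)  # ~71.4%

    print(f"overall error rate: {overall_rate:.1%}")
    print(f"worst-case rate in the critical 10%: {critical_rate:.1%}")

This is only an illustration; how the errors actually distribute over group types is exactly what the paper should report.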
The technical quality of the paper must be improved. There are still many errors and odd phrases.
The “em dash” is simply overused, and the spacing before and after it is rather inconsistent. Please replace a reasonable number of these dashes with commas, and rephrase accordingly where needed.
Part 1, first paragraph:
• “…RDF format own description…”: I do not understand this phrase.
Part 1, second paragraph:
• “… alternative replacement…”: RDA is a modernized version of AACR2; it is also a bit strange to state that something is an “alternative replacement”. It is sufficient to say “alternative” or “replacement”.
Part 1, third paragraph:
• “…provide easier navigation…”: RDA descriptions themselves do not necessarily provide easier navigation and retrieval….
• “…data expressed primarily in natural language text…”: I do not agree with this description of bibliographic records. They are highly structured; some elements contain natural language text, but most fields are value-like rather than free text.
• The last part of this paragraph is rather meaningless and odd.
Part 2, first paragraph:
• Is the publication of linked data one of the building blocks of the semantic web, and does the publication require normalization and adaptation?
• What is a web-oriented format?
Part 2, second paragraph:
• The TELplus project was an experiment on a selected set of records. Use words like experiment, case study, prototype, etc.
Part 2, fourth paragraph:
• “different approach to musical content”: different approach “based on” or “for” musical content?
Part 2, paragraph 9:
• FRBRoo is an elaborated version of FRBR, implemented as an extension of CIDOC CRM.
Part 2, paragraph 12:
• It is a bit simplistic to describe BIBFRAME just as an RDF-based alternative to MARC21; "replacement for" is maybe more appropriate.
Part 3, first paragraph:
• “see for instance….”: remove the comma.
• “, a common requirement …”: can be deleted.
Part 3.1:
• Consider using an alternative to the bulleted list. Bulleted lists are not particularly readable when each item is a longer piece of text.
Part 3.2, second paragraph:
• Why compare persistent storage technologies with semantic storage? Most triple stores are persistent too.
Part 3.3, second paragraph:
• The subject relationship between work and author is already described in FRBR; why present it as something you are introducing?
Part 3.3, third paragraph:
• The reference to figure 2 should be to figure 3.
Part 4:
• The defined constraints are documented only by a URL to a git repository. I looked in the repository for a readable listing of these constraints, but could not find one.
Reference [1] is missing information that is needed to retrieve this publication. Describe all conference proceedings with the same level of detail.