Review Comment:
Though the revised version of the paper is improved, and many of the detailed comments have been addressed, I still think more work is needed since the broader comments still hold. I'll try to give a more detailed idea of my remaining concerns in this review than I did in my previous one.
As before, I am still concerned that the paper does not meet the following two criteria from the CfP:
1) Suitability as introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic.
Again, although the paper certainly contains all of the raw material for a very good survey paper on a very important topic, it is still, for me, simply too difficult to read and too confusing to understand in its current form. Given that I'm already familiar with issues of Linked Data quality, and given how much I genuinely struggled to make sense of the paper, I cannot say it is yet suitable as an introductory text for researchers getting started on the topic. I hope the following comments can help the authors rethink parts of the paper.
Primarily the problems with readability are due to the classification, which forms the long core of the paper. I am glad that the authors chose to partially simplify the classification but I'm disappointed to see that, in my opinion, they didn't simplify it enough. I think the paper still draws unintuitive distinctions between quality dimensions that are not justified, which really hurts the readability and understandability of the paper. Given that the descriptions of dimensions are not concise or formal, trying to mentally distinguish them leads to major confusion. The discussion sometimes (not always) succeeds at saying how the dimensions are *different*, but not why they are interesting to consider as separate dimensions. I'm not really sure why the authors made the classification so complicated! I guess it might have to do with aligning with previous papers or with following something like the taxonomy by Bizer in his thesis. But again, a simpler classification with a more natural and intuitive explanation in a Linked Data setting is all that's needed! The metrics of the papers should then fit into this classification (otherwise the metric is not related to LOD quality). A simpler classification would also make the section much easier to read and to write (and to review!).
To ground this criticism, let me try to go into detail on three dimensions that are currently considered separate, and on how I tried to understand them as I was reading. I emphasise that this is just an example; it would take too much effort for me to do this for all the dimensions that seem redundant to me (these will be summarised again later; they were also mentioned in the last review and were addressed by adding new discussion rather than by simplifying the classification structure; unfortunately, I did not find the new discussion all that helpful).
#### Reputation:
#(R1) "Gil et al. ... proposed the tracking of reputation either through a centralized authority or via decentralized voting"
#(R2) "... a judgement made by a user to determine the integrity of a data source."
#(R3) "Reputation is usually a score ..." -- vs. (R2), a score suggests that reputation is a computed metric whereas the definition says that reputation is a user judgement
#(R4) "The (semi-)automated approach uses external links or page ranks to determine the reputation of a dataset." -- vs. (R2), likewise
#(R5) "Reputation is a social notion of trust"
#(R6) "It should be noted that credibility can be used as a synonym for reputation."
#### Then under Believability (which is an actual synonym for credibility, vs. (R6)):
#(B1) "Jacobi et al. termed believability as 'trustworthiness'" -- vs. (R5), also about trust.
#(B2) "they referred to believability as a subjective measure of a user's belief that the data is 'true'" -- vs. (R2), also dependent on the user's context.
#(B3) "Believability is measured by checking whether the contributor is contained in a list of trusted providers" -- vs. (R1), also about sources, could involve a centralised list of trusted sources; vs. (B2) subjective or not?
#(B4) "In our flight search engine use case, if the flight information is provided by trusted and well-known flight companies such as Lufthansa, British Airways, etc. then the user believes the information provided by their websites. She does not need to assess their credibility since these are well-known international flight companies." -- vs. (R2,R5), has she not already judged the credibility/integrity of these flight companies as being well-known and reputable, based on a "social notion of trust"? If this example does not refer to "Reputation", then I have no idea what "Reputation" is any more.
#### Then under "Objectivity".
#(O1) "The extent to which information is unbiased, unprejudiced and impartial." -- I cannot understand how this is not covered by the previous two? This is just a reason *why* a source might not be credible/a dataset believable. To be consistent, you would then have to list other dimensions as fine-grained as objectivity (i.e., reasons *why* a dataset is not reputable/believable or why a user might judge it not to have "integrity"), such as "Expertise" (how much the providers know about the topic, are only experts allowed to edit), "Verification" (is the dataset verified/curated/corrected by some quality-control process), etc.
Some other examples of statements that show confusion:
# "Low response time hinders the usability ..." -- surely it is *high* response time (i.e., a slow source) that hinders usability?
# "this is the same information is stored in different ways, this leads to high extensional conciseness ..."
# Section 4.2.4 doesn't mention Conciseness at all.
# "Reputation affects believability but the vice-versa does not hold true." ??
# "By fixing amount-of-data, completeness becomes a function of relevancy." If we substitute in the definitions of the terms, the resulting statement makes no sense. This is an example of why I feel the "Intra-relations" sections often don't really help with clarifying the dimensions.
# Another example: "Timeliness measures how up-to-date data is, relative to a specific task" ... "Although timeliness is part of the dataset dynamicity group, it can be also considered as part of intrinsic quality dimensions because it is indepenent of the users context" A contradiction: how can it be intrinsic and relative to a specific task?
# The example of Interpretability talks about human readable labels being missing for URIs, which is precisely what Understandability was just talking about: "Understandability is measured by detecting whether human-readable labels for classes, properties and entities ..."
# "Most web applications prefer timeliness as opposed to accurate, complete or consistent data" ??
# "a list of courses published on a university website must be timely, although there could be accuracy or consistency errors ..." ??
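As a side remark on the labels metric quoted above: if Interpretability and Understandability are kept separate, concrete, testable formulations would help distinguish them. A minimal sketch of the label-coverage check in plain Python (the sample triples and the way of representing them are illustrative only; a real assessment would parse the dataset with an RDF library):

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical triples as (subject, predicate, object) tuples;
# one entity is labelled, the other is not.
triples = [
    ("http://example.org/Berlin", RDFS_LABEL, "Berlin"),
    ("http://example.org/Paris", "http://example.org/population", "2148000"),
]

def labelled_fraction(triples):
    """Fraction of subjects that carry an rdfs:label -- a simple,
    unambiguous proxy for the quoted 'human-readable labels' metric."""
    subjects = {s for s, _, _ in triples}
    if not subjects:
        return 1.0  # vacuously fine: nothing to label
    labelled = {s for s, p, _ in triples if p == RDFS_LABEL}
    return len(labelled & subjects) / len(subjects)
```

With a formulation this explicit, the reader can see immediately which dimension the metric measures and for whom (human users, not machines).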
Here's a summary of my own thoughts on the dimensions on a high-level:
* Availability is fine
* Licensing is fine
* Interlinking is fine
* Security is thoroughly ambiguous in a *LOD* context. A LOD dataset behind a security firewall is not a LOD dataset. I would remove or otherwise just focus on signed content, not access control.
* Accuracy seems too broad in that its definition covers most of the dimensions that follow it. If the authors focused on something like "Syntactic Validity" here or some equivalent, and stay away from the semantic interpretation, I think it would make more sense.
* Consistency is fine.
* Conciseness is intuitively fine, but the definition/discussion is not great since it does not explain what the redundancy is.
* Reputation, Believability and Objectivity should be consolidated and simplified into one dimension, with the text drastically shortened from the sum of the parts. The metrics can be categorised under one dimension.
* Verifiability is fine.
* I think Currency, Volatility and Timeliness should be consolidated into one dimension. Again, volatility has nothing to do with Linked Data quality for me. If I have a dataset with the capitals of all the countries in the world, a low volatility says nothing about quality. What is important for quality is that data are up-to-date. In that case, I would call the dimension "Timeliness", which indicates the amount of time between changes in what is described by the data and changes in the data itself. Currency is not distinct from this. All of the metrics associated with the three dimensions can fit under one. I think the new Timeliness dimension could then go into Section 4.2 under intrinsic dimensions. (On that note, I don't like the name of that section, since quite a few dimensions outside of it are also intrinsic.)
* I still find Completeness, Amount-of-Data and Relevancy confusing. In the simplest case, I think only Relevancy is needed: the dataset has the content the user needs. Conciseness is already covered. Completeness could be folded into Relevancy.
* Representational-conciseness is fine
* Representational-consistency is useful, but could be folded into the previous dimension or perhaps renamed? Something like "Interoperability"? Consistency is a loaded term.
* Interpretability and Understandability could be compressed into one. Otherwise, it should be made much more consistently clear that one is to do with a human user understanding the data, and the other is to do with machines being able to process the data.
* Versatility is fine.
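To make the consolidated Timeliness suggestion above concrete: the dimension I have in mind could be operationalised as simply as the following sketch (the timestamps and the acceptable-lag threshold are hypothetical task inputs, not anything proposed in the paper):

```python
from datetime import datetime

def timeliness_score(real_world_change: datetime,
                     data_last_modified: datetime,
                     max_acceptable_lag_days: float) -> float:
    """Score in [0, 1]: 1.0 if the data was updated at or before the
    real-world change, decaying linearly to 0.0 at the maximum
    acceptable lag for the task at hand."""
    lag_days = (data_last_modified - real_world_change).total_seconds() / 86400
    if lag_days <= 0:
        return 1.0  # data already up to date at the time of the change
    return max(0.0, 1.0 - lag_days / max_acceptable_lag_days)

# e.g. a capital city changed on 2023-01-01; the dataset was updated
# 30 days later, with 100 days deemed the maximum acceptable lag
score = timeliness_score(datetime(2023, 1, 1), datetime(2023, 1, 31), 100)
```

Note that volatility never appears: only the lag between the world and the data matters, which is why I see no need for separate Currency and Volatility dimensions.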
I think that ideally, there should be about 13-15 dimensions. Whether or not the authors choose to simplify the classification that much is up to them, but in its current form, Section 4 needs to be written much more clearly and sharply to be a good introductory text! And I think greatly reducing the number of dimensions and simplifying/shortening the discussion is the easiest way to achieve this.
Finally, I still do not understand how the subjective/objective distinction is applied in the tables. The detailed explanation comes far too late, in Section 5, and even then I don't know why some metrics are considered one or the other. To take an example from Table 2, the "detection of the existence and usage of external URIs and owl:sameAs links": I have no idea why this metric is subjective. Again, I noticed that some of the papers are misattributed; for example, for the first and second entries in Table 2, reference [26] does not check SPARQL endpoints or data dumps. Again, I urge the authors to double-check that all of the referenced works are correctly attributed! In the tables that list metrics, it would also be good to know whether the metric indicates good or bad quality; for example, "no usage of slash-URIs": is this indicating high Linked Data quality or low Linked Data quality? The easiest way would be to label the metrics in such a way that they always indicate good Linked Data quality.
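On the last point about metric polarity, one simple convention would be to attach an explicit polarity flag to each metric and rescale so that 1.0 always means good quality. A hypothetical sketch (the metric names, values, and the judgement that slash-URIs are "bad" are all made up for illustration):

```python
def as_quality_indicator(value: float, higher_is_better: bool) -> float:
    """Rescale a raw metric in [0, 1] so that 1.0 always means high quality."""
    return value if higher_is_better else 1.0 - value

# Hypothetical readings paired with a polarity flag, so the reader
# never has to guess which direction indicates good quality.
raw_metrics = {
    "fraction of dereferenceable URIs": (0.9, True),
    "fraction of slash-URIs": (0.4, False),  # assuming slash-URIs are 'bad'
}
scores = {name: as_quality_indicator(v, good)
          for name, (v, good) in raw_metrics.items()}
```

With this convention the tables could report every metric in the same direction, which would resolve the ambiguity directly.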
(3) Readability and clarity of the presentation.
Section 5 is improved, though I would ask the authors to break up long paragraphs into logical chunks.
Although my minor comments with respect to the writing have been addressed (as I had noted, it was an incomplete list), again, parts of the paper are well written but parts of the paper are still poorly written. I will try to outline more minor comments at the end to address these problems but again, this can only be considered an incomplete list: *please* proof-read the paper more carefully before submission. It is time-consuming for me as a reviewer to draw attention to these issues, which I am sure could easily be fixed by the authors themselves (esp. given that parts of the paper are well-written and typo-free!).
In summary, I appreciate that the authors have worked hard to collect together a comprehensive list of literature in the area and the paper has the raw material for an excellent survey paper on an important subject. However, I again strongly encourage the authors to improve the writing throughout and to greatly simplify the classification until it is sufficiently intuitive and readable to serve as a good introductory text for a researcher new to the area (as per the criteria in the CfP). I hope this second batch of detailed comments will help in that direction.
MINOR COMMENTS: (Incomplete!!)
<> = delete
{} = add
Throughout:
* I said before that LOD refers to the "Linking Open Data" project. But when you say "published on the Web as Linking Open Data", this doesn't make sense since Linking Open Data is a project. You could simply say "Linked Data" but if preferred "Linked Open Data" could also be used since it's also used in the original Linked Data Design Issues document by Berners-Lee (sorry for that; I was incorrect to bring this up before).
* Sometimes RDF terms like owl:sameAs are given a \tt format and sometimes they're not. Please make consistent.
* The end of examples is not clearly marked. The next paragraph reads like it is still part of the example.
Abstract:
* "toward s data"
Section 1:
* "focus {on the} quality"
* "Thus, adopting existing approaches"
* "and {the} unbound{ed} dynamic"
* "focus {on}" again
Section 2:
* "as well as {identifying} open ..."
* "What kind{s} of tools"
* "The majority of the papers {were} published {in an} even distribution between ..."
Section 3:
* "'fitness for use' [31]."
* "The semantic metadata, for example ..." Not sure what the "semantic metadata" are here.
* "is used {a} quality indicator"
* "with the user's quality ..."
Section 4:
* "It obtains ..." What does?
* "There are five dimensions {that are} part of"
* " {I}nterlinking is"
* "between entities<,> {are} user or software agents able to"
* "as well as {the} accessibility"
* "is represent{ed} as A231"
* " {fewer} inconsistencies"
* "one of the dimensions<, which> that"
* owl:DatatypeProperty, owl:ObjectProperty, owl:DeprecatedProperty (no '-'), owl:InverseFunctionalProperty (no '-')
* "to the degree {to} which"
* "from malicious websites<.>{:} for instance, if a website"
* "is measured as "
* "a prior{i}"
* "states {that} information"
* "up-to-data data"
* "user{'}s context"
* "comprises {of} the following aspects"
* "enough data"
* data is `complete'
* The HDT guys would probably appreciate a formal reference to one of their papers if the work is to be discussed (as well as the footnote). For example, "Javier D. Fernández, Miguel A. Martínez-Prieto, Claudio Gutiérrez, Axel Polleres, Mario Arias: Binary RDF representation for publication and exchange (HDT). J. Web Sem. 19: 22-41 (2013)".
* "blank nodes {where} the blank node"
* Figure 2: Understand{a}bility
Section 5:
* "metrics belonging to dimensions such as objectivity" ... objectivity is a confusing example of a dimension when talking about the division of Objective and Subjective categories. Also, again, break up that paragraph a few times.
References:
* Reference [14] needs a proper "thesis" entry like [4] has. Looks like a web-page in current form.
And more besides! Please don't rely on just these comments but thoroughly proof-read the whole paper!
(Finally, on a side note, please find a better format for the response letter! I could not print or read the spreadsheet without setting word-wrap and resizing each column and row size individually. Please keep it simple and just do quotes and inline comments in plain text.)