Review Comment:
The article presents a framework for decentralised social resource sharing as well as an experiment using a recommender system.
Quality of writing:
===================
The article has a very weak presentation, as neither the research question nor the research contributions are clearly described and enumerated. The experiment and evaluation of the paper are disconnected from the proposed architecture.
Originality:
============
The originality is severely lacking, as the presented approach is not grounded in established terminology or existing publications. The authors do not compare their approach to existing industry standards such as OAuth 2. The first two sections (Introduction and Context) use only 3 citations, which shows that the motivation is not properly grounded. Most of the cited research is relatively old considering the speed at which research in this area moves (about half of the cited papers are from the early 2000s). In particular, the authors are not aware that FOAF+SSL was renamed to WebID.
In addition, two lists of requirements are used to inform the proposed approach in Sections 2.1 and 4. However, these requirements are not properly grounded, e.g. in a literature survey or in an (empirical) requirements analysis.
Significance of results:
=======================
The results are not significant in any way.
In addition, the abstract, introduction and conclusion make claims which the paper does not support.
The proposed architecture is derived from a so-called Authentication, Authorisation, Accountability (AAA) conceptual architecture from 2000. The only extensions presented by the authors are textual descriptions, in Section 5, of how to extend the functionality of the individual components. The extended functionality is not described in a formal or semi-formal way, and no implementation is presented.
The recommender system evaluation uses a subset of the Hetrec 2011 data set. The implementation uses Apache Mahout. The only functionality which is implemented by the authors in order to extend Apache Mahout is a so-called semantic similarity provider, which is mentioned for the first time in the middle of section 7. No details of this semantic similarity are given, either formally or informally.
It is not clear how exactly the proposed architecture enables personalisation while protecting the privacy of the user at the same time. What is the threat model the architecture addresses? What kind of personalisation algorithms are able to work in such an architecture?
Detailed feedback:
====================
1.) Introduction:
+++++++++++++++++++
* Why are neither privacy nor recommendation mentioned in the introduction?
* No research problem is introduced here.
* The contributions are not enumerated.
* Why is the introduction not grounded in more related work and other relevant citations?
2.) Context
++++++++++++++
* Why is privacy mentioned for the first time on page 3?
* Where do the "different methods for users to share resources" come from? Citation?
Or provide examples, e.g. is this how Facebook operates?
Or describe at least a use case.
None of these are provided.
* In particular, the "long, pseudo-undecipherable URIs" bullet suggests a lack of understanding of the subject matter. These URIs usually require cryptographic tokens to be accessed, either as a URL parameter or in the header, so it is not correct that the URI itself protects the resource by adding a layer of obfuscation. Even if such a URI is intercepted, it cannot be used to actually access the resource without the token, e.g. in OAuth 2.
* Where does the list of challenges from 2.1 come from? There are no citations and no requirement analysis.
* About 2.1.1, weak cross-domain security: Facebook provides this. For instance, every click on a like button uses strong cryptographic tokens to authenticate that it was indeed caused by the correct user. So a different grounding of this "challenge" needs to be provided.
* The goal described in section 2.2 is not formulated in a clear and concise way. In addition, it does not use concept names which point to other parts of the paper.
3.) Background knowledge
+++++++++++++++++++++++++
* In section 3.3 a categorisation of user awareness of resources is presented, which has 4 quadrants: known-unknowns, known-knowns, unknown-knowns, unknown-unknowns. The authors then state that "recommender systems are conceptually fit to help users perceive resources as useful known-knowns". That makes no sense. If the item is not known, then at least one of the two adjectives needs to be "unknown".
* In section 3.4, please cite a paper on your classification of recommender systems approaches.
* The paragraph with "there are different reasons for these perceptions, including ..." needs to be significantly expanded. It is totally unclear to this reviewer.
* How does business logic fit into the description of recommender systems? E.g. Amazon also tries to optimise sales.
* What is the impact of access policy restrictions on the recommender system?
This is only hinted at in this section. The explanation is much too short. Alternatively, provide a citation.
* The sentence following citation 30 seems to forget that the one facing information overload is usually the user. The recommender system / algorithm is not really the victim of information overload.
4.) Discussion
++++++++++++++
* Normally a discussion is presented after the contributions, towards the end of the paper.
* However, this is not a discussion but a list of requirements which supposedly are the basis for the presented architecture.
* Where do these requirements come from?
Why is there only one citation in this section?
* The authors have to derive the requirements from somewhere, and they need to describe this process. Literature survey, use case and requirements analysis, industry project, or something else.
* The discussion of multiple identities (4.1) requires more grounding. There is evidence that users actually prefer fragmented identities.
* In section 4.3 OAuth is mentioned for the only time in the paper. So the authors are actually aware of it. The proposed architecture has to be compared to OAuth. Also, the reason for stating this requirement is not clear.
5.) Proposed architecture
+++++++++++++++++++++++++
* It is not clear how exactly the proposed architecture enables personalisation while protecting the privacy of the user at the same time. What is the threat model the architecture addresses? What kind of personalisation algorithms are able to work in such an architecture?
* The authors refer to a "typical access control architecture". Please provide a citation or a detailed use case and requirements analysis.
* In 5.1 FOAF+SSL is referenced, which has been renamed to WebID.
* In section 5.6 an "access policy definition language" from the W3C is mentioned without calling it by its name.
6.) Experiments
+++++++++++++++
* How are these experiments related to privacy?
* How are the experiments related to the rest of the paper?
* The criteria for selecting the subset of the Hetrec 2011 data set have to be listed.
* The text refers to the "evaluation needs", these have to be clearly listed.
* Which tool / approach was used to reconcile the LastFM tags?
7.) Evaluation:
++++++++++++++++
* From the description in sections 6 and 7, the only extension which the authors implement on top of Apache Mahout is the semantic similarity provider. It is mentioned for the first time on page 17, where configuration C105 is described.
* What exactly does this semantic similarity do? Did you develop it by yourself?
Is it using an approach from existing research?
* How is the experiment described in this section relevant to privacy?
* Are these results statistically significant?
8.) Conclusions and future work
++++++++++++++++++++++++++++++++
* The authors claim: "... this work demonstrates that it is possible to achieve a balance between privacy and information recommendation, with minimal trade-off between both." As the presented experiment does not consider privacy at all, this claim cannot be supported.
Also, the trade-off is not quantified.
* The conclusion mentions that an implementation of the architecture exists. However, this implementation is not described anywhere in the paper.
* The conclusion tries to explain that the contributions of the paper enable balancing privacy and personalisation. This explanation needs to be expanded, and it needs a formalisation. The paper does not currently support this claim.
* The conclusion claims that "this system has been fully tested on a closed environment web server". However, no details of this implementation are actually described in this paper.
So the authors cannot support this claim.