Multi-application Profile Updates Propagation: a Semantic Layer to improve Mapping between Applications

Nadia Bennani, Max Chevalier, Elöd Egyed-Zsigmond, Gilles Hubert, Marco Viviani
In the field of multi-application personalization, several techniques have been proposed to support user modeling. None of them has sufficiently investigated the opportunity for a multi-application profile to evolve over time in order to avoid data inconsistency and the subsequent loss of income for web-site users and companies. In this paper, we propose a model addressing this issue and we focus in particular on the management of user profile data propagation, as a way to reduce the amount of inconsistent user profile information across several applications. A second goal of this paper is to illustrate, in this context, the benefit obtained by integrating a Semantic Layer that can help application designers to automatically identify potential attribute mappings between various applications. This paper thus presents work in progress in which two different approaches are integrated to pursue one main goal: managing multi-application user profiles in a semi-automatic manner.
Submission type: 
Full Paper
Responsible editor: 
Guest Editors

Solicited Review by Francesca Carmagnola:

The paper presents a multi-application user modeling system called G-Profile, which allows user profile information to be shared in a multi-application personalization context.
The topic presented in the paper, that is, the interoperability of user profiles among applications, is interesting. It is also a challenging one, since achieving interoperability in an open and dynamic environment like the Web requires a very high level of alignment among applications.
Despite the promising premises, in my opinion the paper fails to provide enough detail.
My comments follow.

The paragraph discussing the limitations of having user data scattered among non-communicating applications is not well structured. Specifically, one sentence spans 13 lines, which reduces the readability and comprehensibility of the content.

This is a weak part of the paper. The analysis of the existing literature and approaches for user model interoperability is rather cursory. The advantages and disadvantages of adopting standard ontologies rather than mediation techniques are only superficially discussed. Moreover, nothing is reported about open standard protocols or specific languages that are commonly used to model the interaction between providers and consumers of user model data. I suggest that the authors read "User model interoperability: a survey" (Carmagnola et al.) in order to have a complete overview of the related work in this field.

Section 3.
In my opinion, the formalization of the G-Profile model and the way user data are propagated are not clear. I suggest that the authors provide in this section an example use case to help the reader understand the data mapping and propagation process.

Section 4.
I have some concerns here about the definition and presentation of the Semantic Layer. What the authors define as "a high-level description of various dimensions characterizing users in a specific context" is what is commonly known as an ontological representation of concepts. Indeed, why not simply represent the user model as a shared ontology? This should be clarified and properly discussed in order to give value to the authors' approach. Moreover, no reference to existing ontology mapping approaches is discussed here or compared with the presented approach.
Section 4.2, which is the core part of this section, is not presented with a proper level of detail. For instance, no example is given of the inference rules used to check whether there are possible relations between the concepts associated with pairs of attributes. Another example of missing detail: how is the range of the equivalentClass of the concepts defined?

General considerations.
The paper lacks a discussion of the limitations of, or at least the restrictions on, the applicability of the presented approach.
Moreover, the paper does not address the issue of privacy management. In an interoperability context, however, privacy is an important and challenging issue, since it concerns the release of user model data to third-party systems. I wonder how the authors aim to manage it.
As a final consideration, there is no evaluation. This makes it impossible to understand whether the approach is valuable or not. I have the feeling that the research presented here is promising, but in my opinion it is not mature enough for a high-level journal paper.

Solicited Review by Eelco Herder:

As the title and abstract indicate, the paper proposes a model, G-Profile, for connecting application-specific user profiles. Possible mappings are identified in the semantic layer, using techniques such as SPARQL.

The paper starts with a concise, fair summary of the problem description and related work.

Section 3 contains a formal description of the graph-based G-Profile, which consists of attribute and function nodes that are connected by edges. Propagation attributes are used for conditioning propagation of updates, thus preventing issues such as cycles and parallelism.

First, it should be noted that the formalism is hard to read due to the rather extensive use of variables that are only briefly explained. However, even after having understood the meaning of the various variables, I am not convinced that the mere presence of propagation attributes prevents issues such as cycles - a formal proof is not given. Mappings may be complex, involving several source variables (for example, there might be a mapping from sysA.age and sysB.age to sysC.isadult, which remains undefined if the age values in sysA and sysB differ). Other mappings may determine whether a user is assumed to have knowledge of a topic, based on her age, level of education, tutorials read, and any other source. How propagation control behaves in these kinds of situations is not clear to me.
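
To make the multi-source case concrete, here is a minimal sketch of the sysA.age / sysB.age to sysC.isadult mapping mentioned above. It is not taken from the paper: the function name `is_adult`, the `ADULT_AGE` threshold of 18, and the use of `None` for "undefined" are all assumptions made for illustration.

```python
from typing import Optional

ADULT_AGE = 18  # assumed threshold, for illustration only


def is_adult(age_a: Optional[int], age_b: Optional[int]) -> Optional[bool]:
    """Derive sysC.isadult from sysA.age and sysB.age (hypothetical mapping).

    Returns None (undefined) when a source is missing or the sources disagree.
    """
    if age_a is None or age_b is None:
        return None  # a source value is missing
    if age_a != age_b:
        return None  # conflicting sources: the target stays undefined
    return age_a >= ADULT_AGE
```

An update to only one of the two sources can therefore turn a defined target into an undefined one, which is exactly the kind of situation the propagation control would need to handle.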

I understand that the formalism is just a model, but for practical use, implementation issues should be considered at this stage: when does a profile change propagate (at the moment that it takes place or at the moment that a dependent value is requested), will there be any sort of caching technique (I assume that is needed), who exactly is the owner of which values (the application, the user, ...) and who is allowed to change them? These are just some example questions that should be addressed in order to justify the existence of 'yet another model'.
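
The timing question can be illustrated with a small sketch contrasting propagation at update time with propagation at request time, the latter using a cached value. Everything here (the class `DerivedValue`, its fields, the `eager` flag) is hypothetical and only serves to show the trade-off; the paper does not describe such a mechanism.

```python
from typing import Callable, Dict, Optional


class DerivedValue:
    """A target attribute computed from one source attribute (hypothetical)."""

    def __init__(self, source: Dict[str, int], key: str,
                 mapping: Callable[[int], int], eager: bool) -> None:
        self.source = source
        self.key = key
        self.mapping = mapping
        self.eager = eager              # True: propagate when the source changes
        self.cache: Optional[int] = None
        self.dirty = True               # True while the cache is out of date
        self.computations = 0           # counts how often the mapping ran

    def on_source_update(self, value: int) -> None:
        """Record a source change; recompute immediately only in eager mode."""
        self.source[self.key] = value
        self.dirty = True
        if self.eager:
            self._recompute()

    def get(self) -> Optional[int]:
        """Return the target value, recomputing first if the cache is stale."""
        if self.dirty:
            self._recompute()
        return self.cache

    def _recompute(self) -> None:
        self.cache = self.mapping(self.source[self.key])
        self.dirty = False
        self.computations += 1
```

With two source updates followed by one read, the eager variant runs the mapping twice and the lazy variant once; the lazy variant, however, leaves the target stale until it is read.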

The semantic layer, as introduced in section 4, seems rather straightforward albeit quite simple and high-level (for example, apart from an example conversion method, no further details are given on this rather important issue).

To summarize, the paper introduces a graph-based model for user profile mapping. The main problem is that it is just a model with many definitions, but no real proof of its claimed properties. I personally consider the absence of a (prototype) implementation and evaluation to be the major shortcoming.

Solicited Review by Shlomo Berkovsky:

This submission presents a formal framework for multi-application mapping of user modeling data and discusses the use of a semantic layer aimed at enhancing this mapping. While the submission has some merit with respect to the formal definition of cross-application user modeling data exchange, it falls short of convincing the reader that the proposed mapping using the semantic layer is accurate and efficient. It provides neither technical implementation details nor experimental evaluation results supporting the proposed mapping method. As such, I regret to recommend that this submission be rejected.

Detailed comments:

* Section 1:

- Paragraph 3 discusses several reasons for data incoherence. Another important reason that was overlooked is the inconsistency in explicit user data. See recent papers by Xavier Amatriain on inconsistency of user ratings in recommender systems. The drawback referred to as the "lack of efficacy" is probably "data access restriction".
- Paragraph 3 deals with mono-application scenarios, while paragraph 4 suddenly talks about multi-application scenarios. There is a leap between these two, which should be covered by some introductory sentence.
- Besides the motivating example presented in the second-to-last paragraph, it is not clear to me what was actually done in this work and what lessons were learned. For example, how was the semantic layer developed and used, what was evaluated, what results were obtained, and which conclusions can be drawn from these results? All this information should be added to the introduction.

* Section 2:

- The last paragraph is somewhat misleading. It is not clear whether the focus of the work is on the user model evolution processes over time or on the propagation of newly obtained user modeling data across partial user models.

* Section 3:

- A set of user attributes should be denoted by {a_k^A}.
- Source and target attributes should be explained in more detail. While I can intuitively guess what these mean in the context of multi-application user model update propagation, the formal definition comes about half a page later. I suggest including some verbal description of these attributes.
- The active mapping is propagated *only* to t_h^Aj. Why only to this attribute and not to multiple applications and/or attributes?
- The procedure at the end of sub-section 3.4 details the update propagation process. Essentially, G-Profile plays a pivotal role in this process and dispatches the updates. How will G-Profile know all the existing links and dependencies? It seems to me that this is a huge question, comparable to the comprehensive domain ontology problem mentioned for the standardization-based user modeling method. The following section provides initial ideas on how to address this question, but still falls short of answering it.
- The recursive cases discussed in section 3.5 are nice and can be of importance in practical deployment of the proposed propagation model.

* Section 4:

- At the beginning, "manual detection of mappings" is discussed and it is hypothesized that the semantic layer can limit it. Can this task be automated or semi-automated (system suggestion to application designers)?
- The definition of the semantic layer is unclear to me. Was it actually defined? If so - how and using which tools? If not - GUMO can be taken as a good starting point.
- Figures 3 and 4 are hardly readable. Their quality needs to be improved.
- How are the links between the semantic layer and the application side identified in figure 4?
- How can the steps presented after figure 4 actually be implemented? A simple string matching is not likely to fit here, due to synonyms, hypernyms, and even language heterogeneity. This is another big problem, heavily investigated in schema matching and ontology reconciliation communities.
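
The limitation of plain string matching raised in the last point can be shown with a minimal sketch. The attribute names and the function `exact_name_matches` are invented for illustration and do not come from the paper; the matcher only normalizes case and separators, which is roughly the most a purely lexical approach can do.

```python
def exact_name_matches(attrs_a, attrs_b):
    """Return attribute pairs whose normalized names are identical."""
    def normalize(name):
        # Lowercase and drop separators; no synonym or language handling.
        return name.lower().replace("_", "").replace("-", "")

    index = {normalize(b): b for b in attrs_b}
    return [(a, index[normalize(a)]) for a in attrs_a if normalize(a) in index]


# Hypothetical attribute names of two applications.
app_a = ["birth_date", "surname", "E-mail"]
app_b = ["date_of_birth", "family_name", "email"]

matches = exact_name_matches(app_a, app_b)
```

Here only the e-mail attributes are paired; `birth_date` / `date_of_birth` and `surname` / `family_name` are missed, although a human designer would map them immediately.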

* Section 5:

- The first scenario discusses a tedious task, as G-Profile would consist of a substantial number of attributes. It is not clear what the benefits introduced by the semantic layer are.
- The third scenario over-simplifies the problem. For example, how is the semantic layer used to propose partnerships between applications, how does it identify potential mappings between user models, how can it easily associate attributes belonging to different applications (the latter does not seem easy to me at all)?

* Section 6:

- Even after reading the conclusions, I am still not confident I can point out the exact contribution of this work. The discussion is at a very high level, whereas low level and technical details of implementation and evaluation are overlooked.