Review Comment:
The authors present a study of different ways of capturing how a user interacts during ontology visualization, using a set of derived metrics to build a model that predicts whether the user is struggling with the task at hand. In particular, the selected task is evaluating a list of ontology matches, also asking the user to add any matches they consider missing from the presented list. Two interfaces are evaluated: a tree view (the one provided directly by Protégé) and an ad hoc graph-based visualization implemented in D3. The results show that cognitive workload metrics seem to be the most interesting ones in this particular context.
The paper reads well and, indeed, it tackles a problem that may have been overlooked by the ontology visualization community, namely using adaptive UI techniques to lower the mental burden on users of ontologies. That said, I must admit that I have mixed feelings about the contribution of the paper. It reads more like a CHI paper than a paper aimed at the SWeb community; this is not necessarily bad, but I think the take-away message should be clearer for the use case at hand. In particular, I have a series of comments about the current manuscript:
- The task selected for the experiments is far from being the most lightweight one, so the cognitive workload metrics could be expected a priori to be the most important. It is good to test them, but I miss other, more lightweight tasks that might be more oriented towards general users, such as ontology navigation (e.g., asking the users to answer some questions by analyzing the model, or to search for a particular fact). In this sense, the task at hand requires understanding two possibly different ontological models and, on top of that, aligning them with the user's own mental model in order to evaluate them. So, while the experiment is indeed interesting, the kind of user being analyzed is closer to a knowledge engineer than to an average user. I miss a detailed analysis of the tasks in ontology visualization that could be affected by adaptive UIs and of their user-profiling applicability (for example, which ontology visualization tasks might benefit from early predictions and which ones are longer and better suited to using the cumulative data). This would make the take-away message clearer and closer to an actionable suggestion for the community.
- Reference 61 seems to be quite related and important for the current contribution but has not yet been published. To what extent do the contents overlap?
- I miss a presentation of the prediction task being modelled. In order to give an actionable take-away message for the SWeb community, the classification task should be defined, rather than just stating that several example classifiers were used. That is, how is the model actually trained? The feature gathering is clearly explained, but the prediction task is not. Besides, in this setup, at which points in time would the model be invoked for prediction? Is it classifying each individual user interaction? Given that this is a necessary first step towards adaptive UIs in ontology visualization, a further explanation of how the techniques would be integrated would be welcome (in a similar way to the section about the gaze metrics, which I personally found very interesting); a minimal sketch of what I have in mind is given after this list of comments.
- Experimental setup:
- What are the sizes of the ontologies used in the experiments? Did they actually fit completely on the screen, so that the user could see all the information at a glance? Were both ontologies presented at the same time on the same screen (i.e., using two instances of Protégé side by side), or did the users have full freedom to interact as they wanted? Moving continuously from one model to the other might have imposed an extra load (the same would apply to the D3 visualization).
- When doing the Pizza tutorial, did the users use Protégé? If so, to what extent does this bias the experiment due to previous exposure to the tool? The paper mentions that they were instructed to use only the tree view to solve the task at hand, but this implies that they were already aware of the rest of the tool as well.
- Was the D3 visualization somehow focused only on the contexts of the mappings? I lack information about the interface to properly assess its potential effect.
- While the goal was to predict the user's behaviour, the tree view of the hierarchies alone might not be enough to precisely assess the validity of the matches (the information required for matching could lie in the set of properties that a particular concept has, or in the definitions and GCIs in the ontology). Did the authors take this into account in the analysis? Wouldn't this affect the predictive capabilities of the model?
- Experimental results:
- Splitting the graphs across two pages makes them a little difficult to follow; please consider placing them on the same page. Moreover, showing all the results at the same time in the graphs hampers readability; I would suggest summarizing the graphs in the main body and leaving the complete ones for the annexes. Finally, the graphs do not include the baseline (if I am not wrong).
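To make my earlier comment about the prediction task more concrete, below is a minimal sketch of the kind of task definition I would expect the paper to spell out: what a training instance is, what the label means, and how evaluation avoids leaking a participant's data between training and test sets. Everything in it (the window length, the feature names, the labelling scheme, and the choice of classifier) is a hypothetical assumption on my part, not something taken from the paper.

# Hypothetical sketch of the classification task; feature names, window
# length, and labels are my assumptions, not the authors' method.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

# One row per fixed-length interaction window; columns are derived metrics
# (e.g. mean fixation duration, pupil dilation, number of expand/collapse
# actions, scrolling distance, ...). Placeholder values are used here.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))            # 400 windows, 6 derived metrics
y = rng.integers(0, 2, size=400)         # 1 = "struggling" in this window
groups = np.repeat(np.arange(20), 20)    # 20 participants, 20 windows each

# Leave-participants-out evaluation, so the model is never tested on windows
# from a user whose data was seen during training.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, groups=groups, cv=GroupKFold(n_splits=5))
print(scores.mean())

# At run time, the same feature extraction would be applied to the most
# recent window, and the classifier's prediction would decide whether the
# UI should adapt at that point.

Making this kind of definition explicit would also answer my question about when the prediction is invoked (per interaction, per window, or once per session).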
To sum up, I really find the paper interesting, but I think that further effort is needed to bring it closer to the ontology visualization community (among others) by polishing the applicability of the study to the different possible visualization tasks.
Typos:
Figure 5a: "Conferene" => "Conference"
Page 11: "user's succuss" => "success" (also "Correctness succuss" in the table)