Digests, snapshots, events, or cumulative gaze - what is most informative of success and failure? A study of the foretelling signs of user performance during interaction with ontology visualization

Tracking #: 3271-4485

Bo Fu

Responsible editor: 
Cogan Shimizu

Submission type: 
Full Paper
The current research landscape in ontology visualization has largely focused on tool development, yielding an extensive array of visualization tools. Although many existing solutions provide multiple ontology visualization layouts, there is limited research on adapting to an individual user’s performance, despite successful applications of adaptive technologies in related fields including information visualization. In an effort to innovate beyond traditional one-size-fits-all ontology visualizations, this paper contributes one step towards realizing user-adaptive ontology visualization in the future by recognizing timely moments where users may potentially need intervention, as real-time adaptation can only occur if it is possible to correctly predict user success and failure during an interaction in the first place. Building on a wealth of research in eye tracking, this paper compares four approaches to predictive gaze analytics through a series of experiments that utilize scheduled gaze digests, irregular gaze events, last known gaze status, as well as all gaze captured for a user at a given moment in time. Experimental results suggest that irregular gaze events are most informative for early predictions, while increased gaze is most often associated with peak accuracies. Furthermore, cognitive workload appears to be most indicative of overall user performance, while task type may impact predictive outcomes irrespective of the gaze analysis approach in use.
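The four approaches compared in the abstract differ in how gaze samples are windowed before feature extraction. As a minimal sketch of that distinction (all function names, thresholds, and the `Fixation` record are illustrative assumptions, not the paper's actual pipeline), the four strategies over a timestamped fixation stream could look like:

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    t: float         # timestamp of the fixation (seconds) -- hypothetical field names
    duration: float  # fixation duration (seconds)

def scheduled_digest(fixations, t, interval=10.0):
    """Scheduled gaze digests: only the most recent fixed-length window before time t."""
    return [f for f in fixations if t - interval <= f.t <= t]

def irregular_events(fixations, t, long_threshold=0.5):
    """Irregular gaze events: only notable fixations (here, unusually long ones)."""
    return [f for f in fixations if f.t <= t and f.duration >= long_threshold]

def last_known_status(fixations, t):
    """Snapshot: the single most recent fixation observed before time t."""
    past = [f for f in fixations if f.t <= t]
    return past[-1:] if past else []

def cumulative_gaze(fixations, t):
    """Cumulative gaze: everything captured for the user up to time t."""
    return [f for f in fixations if f.t <= t]
```

Each strategy selects a different subset of the same stream, which is why they trade off early availability (events, snapshots) against data richness (cumulative gaze) when used as classifier input.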
Decision: Major Revision

Solicited Reviews:
Review #1
Anonymous submitted on 04/Jan/2023
Major Revision
Review Comment:

This paper describes an experimental study where four approaches to predictive gaze analytics are compared. The study is geared towards the development of adaptive tools that depend on the user task. The context of the study is ontology visualization.
Given that the paper has been submitted as a 'full paper', it is reviewed along the following dimensions for research contributions: originality, significance of the results, and quality of writing.

In general, the contribution is original, i.e. using classification experiments and comparing accuracies to analyze the predictive capability with respect to the correctness and completeness of a task.
The results are significant as a contribution to the research on predictive gaze analytics; however, they are not a contribution to the particular task at hand, ontology visualization, so the emphasis on this particular use case is not relevant for this research, as I will detail in my comments below.
The paper is well written, clear and quite straightforward to follow.

Detailed comments:
(1) Title. I suggest deleting from the title “during interaction with ontology visualization”. As mentioned below, the task that is used in the experimental study is quite limited in scope.

(2) Introduction
a. Ontology visualization use cases mostly refer to visualizing an ontology as a data model, having a graphical representation of the model. The suggestion is to focus more on ontologies as data models and not as datasets.

(3) Related work
a. In order to present the originality and contribution of this work, the authors should detail clearly the differences with the related work on the use of gaze data as an input (among others) to predict user success or failure in ontology visualization [59-61]. The same can be said of studies in [37-39] and [58].

b. Visualization of large knowledge graphs does not seem related to this work. The emphasis in this type of visualization research is on the problems of deploying large graphs and the issue of viewing parts of the graph.

(4) Controlled eye tracking user studies
a. Given the wide spectrum of ontology visualization tools currently available and described in detail in the survey presented in [13], one would expect the experimental setting to use some of these tools. Protégé is one of them, but why not use another available visualization tool instead of the authors’ own implementation of a node-link diagram visualization? This should be justified in the paper.

b. Many of the use cases in ontology visualization are performed over ontologies that represent classes (entities) and relationships. In both domains used in the experimental study, the ontologies are limited to hierarchies. The authors should clearly justify why the use case tasks are performed over taxonomies rather than full-fledged ontologies.

c. In the subsections “Gaze metrics” and “Gaze feature sets”, it would be clearer to complement the bullets on gaze feature sets (in the last part of this section) with a table that summarizes what each measure may indicate. This table could have three columns: “Metric”, “Measure”, “What it indicates”. For example: Count of fixations; total number under a certain value (could a percentage be given?); indicates more effective visual cues.
Another example of an entry in the table: Search-to-process ratio; 1; equal time spent on searching and processing.

(5) Results and Findings and discussions
a. The result analysis regarding “Cumulated gaze analytics” and its association with higher accuracies seems rather straightforward, as more data is collected in this case, i.e. “likely to contain richer data to inform ….”. The question in the case of cumulated data is whether it is feasible in general (for other use cases) to collect data in this way.

b. It is not clear whether there is a correlation between completeness and correctness success scores. The authors could explain how the study and its results would be affected if this correlation were taken into account.

(6) Grammar and typos
a. Page 2, first column: subsequent predictions based upon -> subsequent predictions based upon them
b. Page 2, first column: in their entirety -> entirely
c. Page 2, second column: capturing all known gaze of a user -> capturing all known gazes of a user
d. Page 4, second column: automatedly generated mappings -> automated/automatically generated mappings
e. Page 4, second column. Caption: Figure a. Conferene Domain -> Figure a. Conference domain
f. Section 5.1: When predicting users’ succuss -> When predicting users’ success
g. Table 1. Results of prediction accuracy: Correctness Succuss -> Correctness success
h. Page 13, second column: oberserved -> observed
(7) Resources: https://github.com/TheD2Lab/Eye.Tracking.Data.Analysis.For.Tobii.2150
Resources have been provided. However, a general README file describing how to reproduce the generation of the results from the experiment data is missing.

Review #2
Anonymous submitted on 14/Apr/2023
Review Comment:

The article compares four approaches to predictive gaze analytics through a series of experiments and situates its contribution towards supporting future adaptive ontology visualization systems by taking first steps towards detecting notable visual events experienced by a user.

On the positive side, the topic is highly relevant, especially within the context of adaptive ontology visualization systems. The authors cover, although briefly, challenges and existing work in adaptive ontology visualization and also provide a good outlook on gaze analytics. In this respect, I find the article quite informative both in terms of relevant challenges and potential directions.

Having said this, I think the article has fundamental problems when it comes to its scientific contribution. The adaptation of ontology visualizations and its intricacies are highly relevant for the Semantic Web community; however, that is the future work or follow-up of the current study. The current study's connection to the Semantic Web, and how it addresses the intricacies of this particular field, is not clear. It rather reads as if an example/test case has been taken from ontology mapping to test different approaches (it could have been any other test case outside the Semantic Web). In its current form, the article is a better fit for a venue that publishes studies on gaze analytics. Even then, the actual contribution does not seem to be very novel, since these gaze analytics approaches have been studied extensively, and the article does not seem to report anything really new.

In short, I think the promised idea and topic are quite interesting; however, the actual content of the article and its contributions are rather generic and do not go beyond a well-written informative text.

Review #3
Anonymous submitted on 23/Apr/2023
Major Revision
Review Comment:

The authors present a study about different possible ways of capturing how a user interacts during ontology visualization, using some derived metrics in order to build a model and predict whether the user is struggling to perform the task at hand. In particular, the selected task is evaluating a list of ontology matches, also asking the user to add any matches that they might consider missing from the presented list. Two interfaces are evaluated, namely a tree view (the one provided directly by Protégé) and an ad hoc graph-based visualization implemented in D3. The results show that cognitive workload metrics seem to be the most interesting ones in this particular context.

The paper reads well and, indeed, it tackles a problem that might have been overlooked by the ontology visualization community, namely using adaptive UI techniques in order to lower the mental burden on users of ontologies. This said, I must admit that I have mixed feelings about the contribution of the paper. It reads more like a CHI paper than a paper aimed at the Semantic Web community; this is not necessarily bad, but I think that the take-away message should be clearer for the use case at hand. In particular, I have a series of comments about the current manuscript:

- The task that has been selected to conduct the experiments is far from being the most lightweight one, so the cognitive workload metrics could be expected to be the most important ones from the outset. It is good to test them, but I miss other, more lightweight tasks which might be more oriented towards general users, such as ontology navigation (e.g., asking the users some questions to be answered by analyzing the model, or searching for a particular fact). In this sense, the task at hand requires understanding two possibly different ontological models and, on top of that, aligning them with the user's own mental model in order to evaluate them. So, while the experiment is indeed interesting, the kind of user being analyzed should almost be classified as a knowledge engineer rather than an average user. I miss a detailed analysis of the possible tasks affected by adaptive UI in ontology visualization and their user-profiling applicability (for example, which ontology visualization tasks might be affected by early predictions, and which ones should be considered longer and more prone to using cumulative data). This would make the take-away message clearer and closer to an actionable suggestion for the community.

- Reference 61 seems to be quite related and important for the current contribution but has not yet been published. To what extent do the contents overlap?

- I miss a presentation of the prediction task that is being modeled. In order to give an actionable take-away message for the Semantic Web community, the classification task should be defined (rather than just stating that some different example classifiers were used). That is, how is the model actually trained? The feature gathering is clearly explained, but not the prediction task. Besides, in this setup, at which points in time would the model be invoked for prediction? Is it classifying each user interaction? Given that this is a necessary first step towards adaptive UIs in ontology visualization, a further explanation of the integration of the techniques would be welcome (in a similar way to the section about the gaze metrics, which I personally found very interesting).

- Experimental setup:
- What are the sizes of the ontologies used in the experiments? Did they actually fit completely on the screen so the user could see all the information at a single glance? Were both ontologies presented at the same time on the same screen (i.e., using two instances of Protégé side by side), or did the users have full freedom to interact as they wanted? Moving continuously from one model to another might have imposed an extra overhead (the same would apply to the D3 visualization).
- When doing the Pizza tutorial, did the users use Protégé? If so, to what extent does this bias the experiment due to previous exposure to the tool? In the paper, it is mentioned that they were instructed to only use the tree view to solve the problems at hand, but this might imply that they were also previously aware of the rest of the tool.
- Was the D3 visualization somehow focused only on the contexts of the mappings? I lack information about the interface to actually assess its potential effect.
- While the goal was to predict the user's behaviour, just the tree view of the hierarchies might not be enough to precisely assess the validity of the matches (the information required for matching could be within the set of properties that a particular concept has, or in the definitions and GCIs in the ontology). Did the authors take this into account when doing the analysis? Wouldn't this affect the predictive capabilities of the model?

- Experimental results:
- Splitting the graphs across two pages makes them a little bit difficult to follow. Please consider placing them on the same page. Moreover, showing all the results at the same time in the graphs hampers readability; I would suggest summarizing the graph in the main body and leaving the complete one for the annexes. Moreover, the graphs do not include the baseline (if I'm not wrong).

To sum up, I really find the paper interesting, but I think that further effort must be made in order to bring it closer to the ontology visualization community (among others) by polishing the applicability of the study to different possible visualization tasks.

Typos:
- Figure 5a: Conferene -> Conference
- Page 11: user's succuss -> user's success (also in the table: Correctness succuss -> Correctness success)