Data journeys: explaining AI workflows through abstraction

Tracking #: 3199-4413

Enrico Daga
Paul Groth

Responsible editor: Guest Editors Ontologies in XAI

Submission type: Full Paper
Artificial intelligence systems are not built on single simple datasets or trained models. Instead, they are built using complex data science workflows involving multiple datasets, models, preparation scripts and algorithms. Given this complexity, to understand these AI systems we need to provide explanations of their functioning at higher levels of abstraction. To tackle this problem, we focus on the extraction and representation of data journeys from these workflows. A data journey is a multi-layered semantic representation of data processing activity linked to data science code and assets. We propose an ontology to capture the essential elements of a data journey and an approach to extract such data journeys. Using a corpus of Python notebooks from Kaggle, we show that we are able to capture a high-level semantic data flow that is more compact than the code structure itself. Furthermore, we show that introducing an intermediate knowledge graph representation outperforms models that rely only on the code itself. Finally, we reflect on the challenges and opportunities presented by computational data journeys for explainable AI.

Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 19/Sep/2022
Review Comment:

I want to thank the authors for the effort put in revising the manuscript.
Although the revised version of the manuscript improves on the previous one, my feeling is that this contribution is mainly out of scope for the special issue.
The connections missing from the original submission have been provided, but they seem somewhat forced with respect to the topic of the manuscript.
I would suggest that the authors refine their contribution and submit it as a regular paper to this journal.

Review #2
By Agnieszka Lawrynowicz submitted on 05/Nov/2022
Review Comment:

I appreciate the efforts the authors have made in the revised version of the paper to address the remarks raised.
My major remarks have been addressed. Therefore I recommend acceptance of the paper.