Continuous Multi-Query Optimization for Subgraph Matching over Dynamic Graphs

Tracking #: 2864-4078

Authors: 
Xi Wang
Qianzhen Zhang
Deke Guo
Xiang Zhao

Responsible editor: 
Guest Editors Web of Data 2020

Submission type: 
Full Paper
Abstract: 
There is a growing need to perform real-time analytics on dynamic graphs in order to deliver the values of big data to users. An important problem from such applications is continuously identifying and monitoring critical patterns when fine-grained updates at a high velocity occur on the graphs. A lot of efforts have been made to develop practical solutions for these problems. Despite the efforts, existing algorithms showed limited running time and scalability in dealing with large and/or many graphs. In this paper, we study the problem of continuous multi-query optimization for subgraph matching over dynamic graph data. (1) We propose annotated query graph, which is obtained by merging the multi-queries into one. (2) Based on the annotated query, we employ a concise auxiliary data structure to represent partial solutions in a compact form. (3) In addition, we propose an efficient maintenance strategy to detect the affected queries for each update and report corresponding matches in one pass. (4) Extensive experiments over real-life and synthetic datasets verify the effectiveness and efficiency of our approach and confirm a two orders of magnitude improvement of the proposed solution.
Full PDF Version: 
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Click to Expand/Collapse
Review #1
Anonymous submitted on 24/Sep/2021
Suggestion:
Minor Revision
Review Comment:

I find that the added related work section and added references are highly improved.

In terms of positioning, I would suggest mentioning that your approach is also good when it comes to re-optimizing a set of submitted queries at given time intervals or for a given amount of change in the input graph.

Glad you added both 6.10 and 6.11. For 6.10, I was expecting you would compare multiple matching orders for your approach as opposed to other ones. The goal here is to see the gap between what you are picking and the best possible ordering at least over some queries. This gives an idea of how robust your query vertex ordering is. For the last missing piece, I would say update 6.10 such that it looks at a sample of many query vertex orderings within your approach and gives an idea of how well it does empirically.

Some writing fixes:
. When there is a circle in the query graph, the spanning tree will ignore the non-tree edges -> If there is a cycle in the query graph, the spanning tree ends up not matching the cycle and producing more intermediate results.
In the abstract, add the following:
. Our approach can also be used as a re-optimization approach which given a time interval or a significant change in the input graph, re-optimizes the input queries globally.
. Like other multi-query optimization approaches, it is designed to take advantage of the common
computations to speed up the query process. -> Similar to prior multi-query optimization approaches, our technique relies on sharing computation to speedup query processing.
. This is very subjective, I would avoid future work promises in the conclusion in general.

Review #2
Anonymous submitted on 26/Sep/2021
Suggestion:
Accept
Review Comment:

Thanks for your efforts to address my comments. I have no more questions.