PRSC: from PG to RDF and back, using schemas

Tracking #: 3426-4640

Julian Bruyat
Pierre-Antoine Champin
Lionel Médini
Frédérique Laforest

Responsible editor: 
Stefan Schlobach

Submission type: 
Full Paper
Property graphs (PG) and RDF graphs are two popular database graph models, but they are not interoperable: data modeled in PG cannot be directly integrated with other data modeled in RDF. This lack of interoperability also impedes the use of the tools of one model when data are modeled in the other. In this paper, we propose PRSC, a configurable conversion to transform a PG into an RDF graph. This conversion relies on PG schemas and user-defined mappings called PRSC contexts. We also formally prove that a subset of PRSC contexts, called well-behaved contexts, can be used to reverse back to the original PG, and provide the related algorithm. Algorithms for conversion and reversion are available as open-source implementations.
Major Revision

Solicited Reviews:
Review #1
By Olaf Hartig submitted on 24/Aug/2023
Major Revision
Review Comment:

This manuscript introduces a well-defined approach to map Property Graphs to RDF-star graphs, including an algorithm that implements this approach. The mapping is user configurable in the sense that users have to provide a so-called "context" description that specifies for each type of node and edge in their Property Graph what triples are to be created for each node/edge of that type. The corresponding templates for such triples can refer to the values of the properties that the nodes/edges have in the Property Graph (such that these values can be used as literals in the triples created from these templates). By this approach, it is possible to use arbitrary RDF vocabularies for the resulting triples.
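To make the kind of mapping under review concrete, here is a toy sketch of a template-based conversion. It is an invented illustration, not the paper's actual Algorithm 1; all names (`convert`, the `?self` placeholder, the vocabulary terms) are assumptions of this sketch.

```python
# Toy sketch of a template-based PG-to-RDF conversion in the spirit of
# the reviewed approach. NOT the paper's Algorithm 1; names are invented.

def instantiate(term, node_id, props):
    """Replace placeholders: ?self -> the PG element itself,
    ?<key> -> the element's property value for <key>."""
    if term == "?self":
        return node_id
    if isinstance(term, str) and term.startswith("?"):
        return props[term[1:]]
    return term  # constant IRI: keep as-is

def convert(nodes, context):
    """For each PG node, emit the template triples registered for its
    type in the context, with placeholders filled in from the node."""
    triples = []
    for node_id, node_type, props in nodes:
        for s, p, o in context[node_type]:
            triples.append(tuple(instantiate(t, node_id, props)
                                 for t in (s, p, o)))
    return triples

# A "context": per node type, template triples whose placeholders refer
# to the node (?self) and to its property values (?name).
ctx = {"Person": [("?self", "rdf:type", "foaf:Person"),
                  ("?self", "foaf:name", "?name")]}

pg_nodes = [("_:n1", "Person", {"name": "Tim"})]
print(convert(pg_nodes, ctx))
# → [('_:n1', 'rdf:type', 'foaf:Person'), ('_:n1', 'foaf:name', 'Tim')]
```

Because the templates may use arbitrary IRIs, any RDF vocabulary can appear in the output, which is the flexibility the review highlights.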

The main technical contribution of the manuscript is to show formally that the mapping is reversible if the aforementioned user-provided context satisfies a number of conditions, where reversibility means that any input Property Graph can be reconstructed from the resulting RDF graph in combination with the context that was used for the conversion. As part of showing this reversibility property, the authors also provide an algorithm for the reverse conversion from resulting RDF graphs back to Property Graphs.
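The reverse direction can be pictured the same way: match each RDF triple against the context's templates to recover which PG element produced it and which property values it carried. Again a toy sketch with invented names, not the paper's actual reversion algorithm.

```python
# Toy sketch of reversion: match RDF triples against a context's
# templates to recover PG elements and their properties. Invented for
# illustration; a real reversion must also check constant positions and
# rely on templates being unambiguous (the paper's "well-behaved"
# conditions) so that each triple matches exactly one template.

def revert(triples, context):
    nodes = {}  # node_id -> (type, recovered properties)
    for s, p, o in triples:
        for node_type, templates in context.items():
            for ts, tp, to in templates:
                if tp != p or ts != "?self":
                    continue  # this sketch only matches on the predicate
                _, props = nodes.setdefault(s, (node_type, {}))
                if to.startswith("?") and to != "?self":
                    props[to[1:]] = o  # a property value is recovered

    return nodes

ctx = {"Person": [("?self", "rdf:type", "foaf:Person"),
                  ("?self", "foaf:name", "?name")]}
rdf = [("_:n1", "rdf:type", "foaf:Person"),
       ("_:n1", "foaf:name", "Tim")]
print(revert(rdf, ctx))
# → {'_:n1': ('Person', {'name': 'Tim'})}
```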

While there already are a few other publications about mappings from Property Graphs to RDF, the idea with the template triples as proposed in this manuscript is interesting. Moreover, to the best of my knowledge, there is no other work in the literature that formally (!) shows the reversibility property (sometimes also referred to as information preservation or losslessness) for an approach that provides the same level of flexibility in terms of the way the mapping can be customized by users as is done in this manuscript. I should also mention that I have checked the proofs in detail and can confirm that they are correct. In conclusion, I would be happy to see this work published in the journal ... but not in its current form. Instead, there are some things that need to be improved in the manuscript and some other things that I would like to see being added by the authors. The remainder of this review elaborates on these things. Additionally, a scan of a print-out of the manuscript with several minor comments and points that should be fixed can be found at:


The motivation for the presented work needs to be made more clear. In particular:

1.a) The introduction needs to state more explicitly what exactly the research question/problem is that the presented work aims to address.

1.b) Related to the previous point, the introduction also needs to make clear what is challenging about this research question/problem.

1.c) Moreover, I would expect some statement about why "the foreseen scenario is the conversion from PG to RDF" and not the other way around (and, by the way, by whom is this scenario "foreseen"?)


2.a) I would like to see a complexity analysis of the proposed PG-to-RDF conversion algorithm (Algorithm 1) and perhaps also of the reversion algorithm (if that one is also meant to be used in practice).

2.b) Given that the whole idea of the approach is to enable users to convert PGs into RDF in a "reversible" way, and that this reversibility requires that the given context is a well-behaved one, I think it is crucial that checking whether a given context is well behaved needs to be a computationally tractable task. Consequently, a missing piece of the presented work is a complexity analysis of the corresponding decision problem (for any given context ctx, decide whether ctx is well behaved), ideally in combination with an actual algorithm that can be used to decide.
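One plausible ingredient of such a decision procedure, hinted at by the paper's notion of signatures, is checking that every type owns at least one template triple whose "shape" no other type's templates share. The following is only a sketch under that assumption (the actual well-behavedness conditions are the paper's), but it illustrates that this part of the check reduces to a finite pairwise comparison of templates.

```python
# Naive sketch of a signature check over a context: does every type own
# at least one template triple whose shape no other type can produce?
# Invented names; the real well-behavedness conditions are the paper's.

def shape(template):
    """Abstract a template triple to its distinguishable shape:
    constants stay, property placeholders collapse to one marker."""
    return tuple("?" if isinstance(t, str) and t.startswith("?")
                 and t != "?self" else t
                 for t in template)

def has_signature(ctx):
    """For each type, report whether some template's shape is unique
    to that type across the whole context."""
    result = {}
    for typ, templates in ctx.items():
        others = {shape(t) for other, ts in ctx.items()
                  if other != typ for t in ts}
        result[typ] = any(shape(t) not in others for t in templates)
    return result

ctx = {
    "Person": [("?self", "rdf:type", "foaf:Person"),
               ("?self", "foaf:name", "?name")],
    "City":   [("?self", "rdf:type", "ex:City"),
               ("?self", "foaf:name", "?name")],
}
print(has_signature(ctx))
# → {'Person': True, 'City': True}
```

Here the shared `foaf:name` template is ambiguous between the two types, but each type still has a unique `rdf:type` template, so each retains a signature.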

2.c) Similarly, another decision problem that is relevant and should be discussed is to check for any given context ctx and any given PG pg whether ctx is complete for pg (as per Def.19), because this is another requirement for the contexts used for the conversion. In fact, the latter requirement is not only relevant for the reversibility but even for irreversible conversions (Algo.1).


Elements of PGs (nodes and edges) and blank nodes are typically considered as separate types of concepts in the literature. In contrast, the presented approach assumes that nodes and edges of PGs may be blank nodes. Since this is an unusual assumption, it deserves some discussion. Are there any consequences of this assumption? Can the same blank node be used in multiple PGs? Can it even be used as a node in one PG and as an edge in another PG? What would that mean?


4.a) The paper uses a lot of different symbols and forms of notation. It is very hard for the reader to keep all of these things in mind. I strongly recommend the authors to try to reduce these things and to repeatedly give the reader clues throughout the paper. One example of how this can be done: Def.3 introduces the notion of a PG as a tuple that consists of six elements, but the symbols for these elements contain the symbol that denotes the PG as a whole. That makes the formulas that use these symbols very overloaded, and a related problem is that later parts of the paper do not even use the introduced form of a tuple. My proposal to fix these issues for this example (and the same approach can be adapted to many other aspects of the formalization as well) is to remove the subscripts from the symbols that denote the elements that are part of the tuple that defines a PG (i.e., "a property graph $g$ is a tuple $(N, E, \mathit{src}, \mathit{dest}, \mathit{labels}, \mathit{properties})"). Then, whenever a PG is mentioned later, it is introduced with the tuple that defines it (which makes it easier for the reader to remember where the individual symbols in the tuple come from). For instance, Def.4 should be written as follows: "The empty PG is a PG $p = (N,E,\mathit{src},\mathit{dest},\mathit{labels},\mathit{properties})$ for which it holds that $N=E=\emptyset$, ..." and Example 3 should be written as follows: "... can be captured formally by the tuple $g_\mathit{TT} = (N,E,\mathit{src},\mathit{dest},\mathit{labels},\mathit{properties})$ with $N= ...$ ..." In a context in which two different PGs are discussed (e.g., Def.6) the symbols of these two PGs can simply be distinguished by adding an apostrophe to the symbols of one of the two PGs.

4.b) Another good way to reduce the number of symbols and, at the same time, to make the text easier to read is to get rid of any symbol that, for some kind of things, denotes the set of all of these things. An example would be the symbol 'PGs' that denotes the set of all property graphs (line 7 on page 6). Instead of using symbols like this, the text should simply state explicitly what kind of thing a particular other symbol is meant to denote. For instance, instead of saying "∀(G, H) ∈ PGs^2, G and H are isomorphic iff [...]", Def.7 should be written as follows: "Two property graphs $G$ and $H$ are isomorphic iff [...]." There are more such "all-of" symbols that should be dropped (e.g., the symbols 'RdfTriples', 'BPGs', 'Ctx', 'Ctx_pg', Ctx^+').

4.c) Similarly, the overly aggressive use of the two quantification symbols (∀ and ∃) as a replacement for writing actual text makes the paper more cumbersome to read than it should be. For instance, Def.7 should better continue as follows: "... iff there exists a renaming function $\phi$ such that $\mathit{rename}(\phi,G)=H$."

4.d) Yet another issue that makes it more difficult than necessary for the readers to follow the formalization is that, for some notions, the authors use multiple different types of symbols throughout the paper. For instance, property graphs are sometimes denoted by G and sometimes by pg, template triples are sometimes denoted by tp and sometimes by t. This form of inconsistency must be avoided.

4.e) Finally, an aesthetic issue: For symbols used in the formalism, all symbols that consist of more than one character should be wrapped in a \mathit{..} command instead of writing them directly within the math environments. When writing them directly, LaTeX treats each character as a separate symbol and uses weird-looking spacing in some cases. As an example, consider the symbol 'Str' in line 45 of page 4 or the symbols 'subject', 'object' and 'RdfTriples' in Def.9.


5.a) When reading that the conversion is "without information loss" or "reversible", I would assume that the resulting RDF graphs contain all the information that is present in the converted PGs. However, this is not actually the case for the presented approach. That is, while the information captured by the properties in a PG is somehow captured in the resulting RDF graph, the information captured by means of the labels of nodes and edges does not need to be captured in the RDF graph. Instead, when recovering the PG from the RDF graph, the information about the labels is taken from the context that has been used. Hence, the context is an essential input for the reversion algorithm (and, thus, would need to be kept available in cases in which users want to be able to restore the original PG). I don't see this as a problem but as an interesting aspect of the approach, which should be made explicit in the paper.

5.b) Regarding reference [10], instead of citing the workshop version of the paper, please cite the more recent and improved journal version:

@article{lassila2023onegraph,
  author  = {Ora Lassila and Michael Schmidt and Olaf Hartig and
             Brad Bebee and Dave Bechberger and Willem Broekema and
             Ankesh Khandelwal and Kelvin Lawrence and
             Carlos{-}Manuel L{\'{o}}pez{-}Enr{\'{\i}}quez and
             Ronak Sharda and Bryan B. Thompson},
  title   = {The OneGraph Vision: Challenges of Breaking the Graph Model Lock-In},
  journal = {Semantic Web},
  volume  = {14},
  number  = {1},
  pages   = {125--134},
  year    = {2023},
  doi     = {10.3233/SW-223273}
}

5.c) On page 2, line 3, you state that "RDF-star [...] does not provide exactly the same modeling capabilities as PGs" and the reference that you use to back up this statement is the RDF-star CG report [9]. I don't see how this report can be used as a reference for this statement. The report does not talk about PGs at all, let alone compares RDF-star and PGs.

5.d) On page 2, lines 10-11, you write that "the conversion from PG to RDF [is] without information loss, so that users can modify their data and convert them back to the original PG model." This does not sound right. If the RDF data resulting from the conversion is later modified, why would users expect it to be possible to convert the modified RDF data back to the original PG? What would be the point of modifying the RDF view of the data if, after converting back, the PG is still the original one?

5.e) In the context of Def.13, I suggest to say explicitly that this notion of "keys" is not meant to be something like the notion of keys as a form of integrity constraints that can be defined in a database schema. The reason why I mention this is because the section is about PG schemas. In fact, after reading that title of the section plus the first sentence and then seeing the word "key", I immediately assumed that the form of schemas that is defined here enables users to specify some form of key/uniqueness constraints.

5.f) Line 51 on page 13 provides a (fairly vague) informal definition of a notion of "reversibility" of functions and, then, claims that this definition implies some properties that are described by two bullet points in lines 1-6 of page 14. I do not see how the second of these properties ("the inverse function [...] must be computable in reasonable time") is implied by the given informal definition of reversibility. This needs to be clarified. Also, what does "reasonable time" actually mean?

5.g) The second bullet point on page 14 continues with an example of public-key encryption functions for which it talks about "applying the encryption function on all possible outputs." I do not understand this statement. What "possible outputs" does it refer to? Is it possible outputs of the encryption function? But then why would the encryption function be applied on its possible outputs? Shouldn't it be a corresponding decryption function that is applied on the outputs of the encryption function?

5.h) Example 16 is wrong because, in contrast to the title of this example ("a trivially non reversible context"), ctx_\emptyset is in fact "reversible", at least in the sense that it satisfies the condition given in the paragraph before the example. The condition is satisfied as follows. For every BPG pg that is not the empty PG, the antecedent of the condition (i.e., ctx_\emptyset \in Ctx_pg) is not true and, thus, the condition is true. For the empty PG, the antecedent is true, but so is the consequent (because the prsc function / Algorithm 1 does indeed produce the empty RDF graph if given the empty PG).

5.i) At the beginning of Section 5.2.1, instead of directly putting the definition of this \kappa function without any further explanations, first the idea behind this \kappa function should be described informally in order to make it easier for the reader to understand what the formal definition aims to achieve.

5.j) While Example 17 illustrates an application of the \kappa function, none of the template triples in this example is nested. It would be helpful if there was also an example of \kappa being applied to a nested template triple. The reason for this is that it is not immediately obvious (and also not explicitly stated--see my previous point) that the template-triple version of \kappa (i.e., the second case in Def.22) is indeed recursive within itself. Providing an example with a nested template triple can make this fact more apparent to the readers (ideally in combination with the informal description of the idea of \kappa, as suggested in my previous point).

5.k) The definition of sign_ctx(type) in Def.24 is ambiguous. In particular, it is undefined what exactly sign_ctx(type) denotes in cases in which tps contains multiple template triples "that will produce triples that no other template in ctx can produce." For instance, what is sign_ctx(tn1) for the type tn1 in Example 18? A consequence of this ambiguity is that the use of sign_ctx(type) in the formula of Lemma 2 is unclear.

5.l) Algorithm 4, line 6: The purpose of the second condition (after the "or") is not clear to me. If the first condition (before the "or") is not true, then the second one cannot be true either. Hence, the second condition seems to be irrelevant.

5.m) Line 9 on page 24: It is not clear what you mean by "cut". Additionally, this part comes out of the blue; it is not clear at all what this has to do with the theorem that came directly before it or with the topic of the section. Hence, there needs to be some text that makes the connection.

5.n) Lines 11-14 on page 24: A fifth bullet point should be added to this list, saying: "The triples in the template graph ctx(typeof_pg(m))."

5.o) Line 24 on page 24, "Proof of the algorithm": State explicitly what algorithm this is about. Also, where is the actual proof?? (i.e., where does "here" refer to?)

5.p) Line 24 on page 25: There should be some text right before Lemma 4 that introduces the lemma (e.g., what is the purpose of the lemma/why is it relevant).

5.q) Lines 33-35 on page 28: The latter part of the sentence about our related work is not entirely true. The mapping in my AMAR2019 paper [22] is based on the notion of an LPG-to-RDF* configuration, and the idea behind this notion is also present in the GRADES2022 paper [23] where it is captured via the mapping functions pnm, lm, idm, and elm (but, admittedly, not really exploited to its full potential in the experiments of that paper). This notion of an LPG-to-RDF* configuration gives users the flexibility to model the resulting RDF/RDF-star graphs with different RDF vocabularies for different types of PGs. In this sense, it is also a user-configurable mapping, not so different from the approach in your manuscript!

Review #2
Anonymous submitted on 06/Jan/2024
Major Revision
Review Comment:

The article describes PRSC, a method to transform a property graph into an RDF graph.

General evaluation: The method presented in the article is interesting. However, the writing must be improved to reach the quality expected of a research article. Moreover, the authors should conduct an empirical evaluation to demonstrate the characteristics, advantages and disadvantages of the method.

(1) Originality: fair
The method is new with respect to the current methods for converting property graphs into RDF graphs. However, the use of mapping rules or mapping templates is not new in data management. Moreover, the method is restricted to basic conversions (e.g. the method does not allow complex conditions to be placed on the elements to be transformed).

(2) Significance of the results: poor
The article describes the method and studies the feature of reversibility, which is related to information loss. Although it is an interesting result, it is not hard to see that it is true because of the simplicity of the mapping rules supported by the method. In this sense, I was expecting an analysis of cases (including illustrative examples) in which reversibility does not apply, showing some features of such cases.
In terms of theoretical results, I was also expecting an analysis of other features, like complexity and expressiveness, including a comparison with other methods.
In terms of practical results, the article lacks an empirical evaluation showing the performance of the method in comparison with other methods.

(3) Quality of writing: very poor
The article is very hard to read for many reasons: there is an excessive number of definitions, many of them used only a couple of times in the article; there are confusing statements like "Let PG be a PG"; some definitions, algorithms and examples are not explained (e.g. Definition 21, Example 15, Algorithm 1) or their explanation is unclear.

Specific comments:

-- Page 1 --

"Property Graphs are not a uniform model: some implementations like Neo4j only allow exactly one label for each edge." What is the idea of this statement? It could be interesting to see an analysis of the property graph features supported by current graph database systems, and use such features to describe the requirements for a "good" method for transforming property graphs into RDF graphs.

"In this model, data are represented with triples that represent links between resources." From a data modeling point of view, it is better to use "relationship" instead of "link", and talk about "Web" resources.

-- Page 2 --

"... and more generally any Property Graph with the same schema into the corresponding RDF graph". There is no schema in the example.
I found a weak point here. Given the simplicity of the example, it is possible to create the mapping rules by "looking at" the graph. However, it would be very complicated to do that with a larger graph. I recommend the authors to include property graph schemas as an input to understand the structure of a property graph and create the mapping rules.

-- Page 3 --

Is it possible to create URIs instead of blank nodes?

How to deal with a multi-valued property? i.e. when the property-value is an array of values.

How to generate constants or new values?

-- Page 5 --

"Let E be a set, we recall that 2^E denotes the set of all parts of E." 2^E denotes the set of all subsets of E, including the empty set and E itself.

All the definitions related to functions can be reduced to a paragraph.

Definition 2 and Remark 2 are the same.

-- Page 6 --

Definition 6: in the definition of N_H, replace exists x by exists n.

-- Page 7 --

Definition 7: to say that G and H are isomorphic, the inverse mapping is also required, or not?

Definition 8: "L = Str x I". I should be replaced by the set of datatypes.

Check Definition 9: By definition, an RDF-star triple cannot contain itself and cannot be nested infinitely.

-- Page 10 --

Example 11: It requires a better explanation.

Definition 17: What exactly is being defined here?

Definition 18: The notion of "valid" template graph must be defined before. Improve the explanation.

-- Page 11 --

"The set of all ctx functions is noted Ctx". This type of notation makes the article difficult to understand.

Examples 12, 13 and 14 are neither explained nor referenced.

Example 15. This example must be explained.

-- Page 13 --

Algorithm 1: This algorithm must be explained with clarity and detail. It is not clear if this algorithm works with RDF-star.

-- Page 14 --

Example 16. This example is unclear. It does not help to understand the notion of reversible context.

Definition 22. Add an explanation. What are the input and output of function k?

-- Page 15 --

Lemma 1. The sketch of the proof is more a comment than a short explanation of the proof.

Definition 23 is unclear.

-- Page 16 --

Theorem 1:
Why is Theorem 1 important?
Ctx^+_pg is defined below the place where it is used.
What is the contradiction in the proof?

Definition 24: Improve the writing.

-- Page 17 --

The notion of well-behaved is interesting, however it must be presented and studied with more clarity.

Table 8 is very helpful for understanding the method. It could be extended to explain more features.

-- Page 18 --

Add a short explanation of the RDF graph shown in Listing 3.

It is not hard to see that the reversion algorithm is sound and complete, in particular for well-behaved contexts.
But, what happens with non well-behaved contexts?

-- Page 21 --

Theorem 3.
Why is this theorem important?

-- Page 22 --

Theorem 4.
What is the theorem? An initial description is required.
Why is this theorem important?

-- Page 23 --

Definition 26.
The explanation is enough to understand the notion of projection. The formal definition is unnecessary.

Lemma 3.
This is not relevant for the article.

-- Page 24 --

The statement in Theorem 5 is not a strong result to be presented as a theorem.

Table 11 should be explained.

"Here, we prove the correctness of the buildpg function in Algorithm 5."
Where is the proof? It must be presented with more clarity.

Fix: "The no value losscriterion"

-- Page 25 --

"into the graph g PG". Improve writing.

"Theorem 6. The PG returned by Algorithm 5 is pg."
This title is unclear.
This theorem is very important, so it must be explained and discussed with more clarity.


-- Page 29 --

Fix "RDG graph"

Discuss the advantages of PRSC with respect to the related work.

Review #3
Anonymous submitted on 06/Jan/2024
Minor Revision
Review Comment:

The paper introduces PRSC, a conversion tool designed to facilitate interoperability between Property Graphs (PGs) and RDF Graphs. The authors propose a configurable conversion approach that leverages PG schemas and user-defined mappings known as PRSC contexts. These contexts enable the transformation of PGs into RDF Graphs and vice versa, allowing for seamless data exchange between the two graph models. The paper outlines the key components of PRSC, including the conversion process and the formal proof that a subset of PRSC contexts can be utilized to reverse back to the original PG. Additionally, the authors discuss the benefits of using PRSC for converting between PGs and RDF Graphs, highlighting its potential to enhance data integration and interoperability in diverse application scenarios.

The motivation of the paper is not explicitly articulated, and there is room for improvement in clearly conveying the significance and relevance of the research. The paper lacks a strong and compelling narrative that effectively communicates the specific problem or challenge in the domain of graph database management that the PRSC tool aims to address. Additionally, the paper could benefit from a more explicit discussion of the broader implications and potential impact of the proposed solution.
To enhance the motivation of the paper, it would be beneficial to provide a more detailed and explicit discussion of the following aspects:
1. Identification of a Clear Problem Statement: The paper should clearly articulate the specific challenges or limitations in the current landscape of graph database management, particularly in the context of interoperability between Property Graphs and RDF Graphs.
2. Relevance and Significance: The motivation should emphasize the broader relevance and significance of addressing the identified problem. This could include discussing real-world scenarios or use cases where seamless interoperability between different graph models is crucial, and the potential impact of improved interoperability on data integration, knowledge representation, or application development.
3. Gap in Existing Solutions: The paper should explicitly highlight any existing limitations or gaps in current approaches to graph database interoperability, underscoring the need for a novel and effective solution such as PRSC. This would help to position the research within the context of existing literature and solutions.
4. Potential Benefits and Implications: The motivation should clearly outline the potential benefits and implications of the proposed PRSC tool, such as enabling cross-model data utilization, ensuring information preservation, and fostering broader adoption and collaboration within the graph database community.

The paper demonstrates originality in several key aspects:
1. Novel Approach to Interoperability: The paper introduces PRSC as a configurable conversion tool that leverages PG schemas and user-defined mappings to facilitate interoperability between Property Graphs and RDF Graphs. This approach represents a novel contribution to the field of graph database management, addressing the pressing need for seamless data exchange between different graph models.
2. Formal Proof of Reversibility: The paper provides a formal proof that a subset of PRSC contexts, termed well-behaved contexts, can be used to reverse back to the original PG without information loss. This emphasis on reversibility sets the PRSC approach apart from traditional conversion methods and contributes to its originality.
3. Open-Source Implementation: The paper mentions that the PRSC engine is available under the MIT license. This emphasis on open-source availability contributes to the reproducibility of the work, as it allows for broader scrutiny and potential adoption by the research and practitioner community.
4. Focus on Graph Database Interoperability: While there is existing research on graph databases and conversion methods, the specific focus of PRSC on enabling interoperability between Property Graphs and RDF Graphs represents a unique contribution to the field. By addressing the challenges of data exchange between these graph models, the paper offers original insights and solutions to a specific and relevant problem.

Significance of the results
The results presented in the paper hold significant implications for the field of graph database management and have broader relevance in the context of data interoperability and integration. The significance of the results can be understood through several key points:
1. Addressing Interoperability Challenges: The paper's results are significant as they directly address the challenge of interoperability between Property Graphs and RDF Graphs. By introducing PRSC as a configurable conversion tool, the paper offers a practical solution to facilitate seamless data exchange between these two graph models. This is particularly relevant in scenarios where organizations need to integrate data from diverse sources and systems.
2. Enabling Cross-Model Data Utilization: The results are significant in enabling users to leverage the strengths of both Property Graphs and RDF Graphs without being constrained by the limitations of a single model. This flexibility allows organizations to make use of a wider range of tools and technologies, thereby enhancing the utility and value of their graph data assets.
3. Reversibility and Information Preservation: The formal proof of reversibility for a subset of PRSC contexts is a significant result, as it provides assurance that the conversion from Property Graphs to RDF Graphs can be achieved without information loss. This is crucial for ensuring data integrity and maintaining the fidelity of the original graph data, which is essential in various domains such as data migration, data warehousing, and knowledge representation.
4. Future Research and Development: The paper's results also have significance in shaping future research and development efforts in the domain of graph database management. The proposed avenues for future work, such as extending PRSC's expressiveness and addressing scalability issues, provide a roadmap for further innovation and refinement of the PRSC conversion tool, thereby contributing to the advancement of graph database interoperability.

The soundness of the paper can be evaluated based on several key aspects:
1. Formal Definitions and Proof: The paper provides formal definitions of PGs and RDF graphs, as well as a formal definition of PRSC conversion. It also includes a formal proof that a subset of PRSC contexts, termed well-behaved contexts, can be used to reverse back to the original PG without information loss. This demonstrates a rigorous approach to establishing the soundness of the proposed conversion method.
2. Relevance to Interoperability: The paper addresses the need for interoperability between Property Graphs and RDF Graphs, which is a significant challenge in the field of graph databases. By introducing PRSC as a tool to facilitate seamless data exchange between these graph models, the paper addresses a relevant and pressing issue in the domain of graph database management.
3. Discussion of Limitations and Future Work: The paper acknowledges potential limitations of the PRSC approach, such as scalability issues for large PGs and the need for further expressiveness in PRSC contexts. By addressing these limitations and proposing avenues for future research, the paper demonstrates a comprehensive and self-aware approach to the topic, contributing to the overall soundness of the work.
4. Related Work: The current iteration touches on key studies, but a more comprehensive review of the literature could enhance the paper's depth and context. By broadening this section, the authors can better position their work within the existing body of research, highlighting both the advancements they are contributing and the gaps they are addressing. Moreover, an expanded Related Work section could offer a more detailed comparison with similar studies, thereby strengthening the paper's argument and its scientific significance.

The paper is generally clear and well-written, with several features that contribute to its clarity:
1. Organization and Structure: The paper is well-organized, with clear section headings and a logical flow of ideas. The introduction provides a clear overview of the problem and the proposed solution, while subsequent sections delve into the technical details of the PRSC conversion tool. The paper concludes with a summary of the key findings and future research directions. This structure helps readers to navigate the paper and understand the main points.
2. Definitions and Terminology: The paper provides clear definitions of key terms and concepts, such as Property Graphs, RDF Graphs, and PRSC contexts. The authors also use consistent terminology throughout the paper, which helps to avoid confusion and ensure clarity.
3. Examples and Illustrations: The paper includes several examples and illustrations to help readers understand the PRSC conversion process. For instance, Figure 1 provides a visual representation of the PRSC conversion process, while Table 1 shows an example of a PRSC context for converting a PG into an RDF Graph. These examples and illustrations help to clarify the technical details of the PRSC conversion tool.
4. Language and Style: The paper is written in clear and concise language, with technical terms and jargon explained in a straightforward manner. The authors also use a consistent style throughout the paper, which helps to maintain clarity and readability.