ConSolid: a Federated Ecosystem for Heterogeneous Multi-Stakeholder Projects

Tracking #: 3248-4462

Jeroen Werbrouck
Pieter Pauwels
Jakob Beetz
Erik Mannens

Responsible editor: 
Guest Editors SW for Industrial Engineering 2022

Submission type: 
Full Paper
In many industries, multiple parties collaborate on a larger project. At the same time, each of those stakeholders participates in multiple independent projects simultaneously. A double patchwork can thus be identified, with a many-to-many relationship between actors and collaborative projects. One key example is the construction industry, where every project is unique, involving specialists for every subdomain, ranging from the architectural design over technical installations to geospatial information, governmental regulation and sometimes even historical research. A digital representation of this process and its outcomes requires semantic interoperability between these subdomains, which however often work with heterogeneous and unstructured data. In this paper we propose to address this double patchwork via a decentral ecosystem for multi-stakeholder, multi-industry collaborations dealing with heterogeneous information snippets. At its core, this ecosystem, called ConSolid, builds upon the Solid specifications for Web decentralisation, but extends these both on a (meta)data pattern level and on microservice level. To increase the robustness of data allocation and filtering, we identify the need to go beyond Solid's current LDP-based interfaces to a Solid Pod and hence introduce the concept of metadata-generated 'virtual views', to be generated using a SPARQL interface to a Pod. Building on top of these generic interfaces, domain-specific (higher-level) interfaces can be set up. A recursive, scalable way to discover multi-Pod aggregations is proposed, along with data patterns for connecting and aligning heterogeneous (RDF and non-RDF) resources across Pods in a mediatype-agnostic fashion. We demonstrate the use and benefits of the ecosystem using minimal running examples, concluding with the setup of an example use case from the Architecture, Engineering and Construction (AEC) industry.

Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 22/Oct/2022
Major Revision
Review Comment:

This is a review of a previously reviewed paper presenting a decentral ecosystem, called ConSolid, for multi-stakeholder, multi-industry collaborations dealing with heterogeneous information snippets. I'd like to thank the authors for addressing my comments and improving the paper.

Understandability and Clarity: The paper is better structured and scoped in this version. The research questions are explicitly described, and my previous concerns about the abstract and the related work section were addressed.

Experimental evaluation:
However, my main concern, the missing evaluation section (even a simple one, such as collecting feedback from domain experts), is again not properly addressed in this version. It is still not clear whether the framework has been used by stakeholders or industry partners, even though some company names are given. If it has been used, the extent of the usage is not obvious to the reviewer.
The authors added a validation section; however, this section describes how the framework is used with some given queries. The results of the queries are not given or explained in context.

Moreover, the GitHub repo still has some problems and ends in a 404 error. Replication of the results is not possible. This could indicate why there are no query results for the framework.

Considering the points mentioned, I believe the study is still immature and requires an evaluation section in which the benefits of using this infrastructure are discussed thoroughly. The significance of the results is highly important for research papers, and IMHO, with some more time, the authors can provide a high-value validation and evaluation section.

Minor: Page 11 Line 26 Listin 7 -> Listing 7

Review #2
Anonymous submitted on 28/Oct/2022
Review Comment:

This revision is a thorough rewrite of the paper. It has significantly improved from the previous version. The content is more focused, substantial, and understandable. Language and typography are excellent. The paper makes an original and important contribution by providing a concrete implementation of a decentralised approach for managing building data. It also reveals the challenges/shortcomings of Solid that should be addressed to support such a demanding use scenario. There is obviously a lot of work still to be done, which is also indicated in the paper.

Minor notes to consider for the published version:
- p 4, lines 25-26: "a folder system of more or less independent researchers" - I don't understand this part of the sentence
- p 4, lines 39-40: For consistency, shouldn't the Github address also be in the footnote?
- p 5, lines 6-9: Perhaps a mention of the concepts "information model", "information container" and "federation" of ISO 19650 would be appropriate here as well?
- p 5, lines 33-43: In my view, the relevant meaning of "data sovereignty" here is that the owner remains in control of the data she produces, which implies the need for (some level of) decentralised management of data. The term is used in this way in [...]. The term may also be used to refer to compliance with national regulations etc., but the term more often used in that case is "digital sovereignty". Divergent regulation is, of course, a real problem, but it is beyond the scope of this paper and it would be better to remove the unnecessary discussion of it here.
- p 8, subsection 3.1: This is really an important section that all future linked data developers (especially if they come from an API programming background) should read! Nodes in graph data do not generally have a single parent. Great! However, the SPARQL access control part at the end of the subsection (from lines 45-46 onwards) could be separated into its own subsection.
- p 11, lines 21-41: The example (listings 6 and 7 and the related description) are difficult to follow; please try to make it more explicit.
- p 14, Fig 5: Do the two-way arrows mean that there is a dcat:catalog link in both directions? If so, you should mention that looping is avoided when accessing the catalog structure using property paths.
- p 15, section 5: This section has greatly improved and is more focused and understandable. However, it also reveals the complexity of the decentralised setting.
- p 24, line 18: Since the data is already given above, could the results of the query also be included in Listing 16? To get a feeling of closure.
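
The looping concern raised in the note on Fig 5 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the Pod URLs and the `CATALOG_LINKS` dictionary are hypothetical stand-ins for dcat:catalog triples that point in both directions, and `discover_catalogs` is a made-up helper name.

```python
from collections import deque

# Hypothetical dcat:catalog links between catalogs in three Pods.
# Every link exists in both directions, as two-way arrows would suggest.
CATALOG_LINKS = {
    "https://pod-a.example/catalog": ["https://pod-b.example/catalog"],
    "https://pod-b.example/catalog": [
        "https://pod-a.example/catalog",
        "https://pod-c.example/catalog",
    ],
    "https://pod-c.example/catalog": ["https://pod-b.example/catalog"],
}


def discover_catalogs(start, links):
    """Return every catalog reachable from `start`, visiting each one once."""
    visited = {start}
    order = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in links.get(current, []):
            # Skipping catalogs already seen is what makes the cycle terminate.
            if neighbour not in visited:
                visited.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


if __name__ == "__main__":
    for catalog in discover_catalogs("https://pod-a.example/catalog", CATALOG_LINKS):
        print(catalog)
```

A SPARQL 1.1 engine evaluating a path such as `dcat:catalog+` within a single endpoint should do this bookkeeping itself, since arbitrary-length property paths are defined so that each node is matched once. The explicit visited set matters mainly when links are followed by hand from one Pod to the next.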

Review #3
Anonymous submitted on 03/Jan/2023
Review Comment:

This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing. Please also assess the data file provided by the authors under “Long-term stable URL for resources”. In particular, assess (A) whether the data file is well organized and in particular contains a README file which makes it easy for you to assess the data, (B) whether the provided resources appear to be complete for replication of experiments, and if not, why, (C) whether the chosen repository, if it is not GitHub, Figshare or Zenodo, is appropriate for long-term repository discoverability, and (4) whether the provided data artifacts are complete. Please refer to the reviewer instructions and the FAQ for further information.

** ConSolid: a Federated Ecosystem for Heterogeneous Multi-Stakeholder Projects

This paper is the revised version of the submission LBDserver - a Federated Ecosystem for Heterogeneous Linked Building Data.

My main concerns in the original paper were two:
1. It is hard to discern between the software architecture and the metadata management. I think this could be improved by giving more importance to the LDP section in the state of the art (which is only one paragraph), and guiding the user through how it is implemented to solve the specific problem the authors try to solve. Right now the paper looks like a usual microservices architecture, implemented using the Linked Data Platform recommendation and the Solid ecosystem.
2. This is a more interesting problem, about which I have not seen much written in the article. I think that addressing the consistency of data produced across nodes, and how it is managed by the platform, would be a better approach.

From my point of view, 1. has been solved by focusing more on the actual process of how to access and manage building data, which has been really nice. Now the paper reads fluently and the message is clear. The authors describe how they deal with URIs, datasets, aggregating data, accessing it, etc.

Regarding 2., the authors do not deal with it. I was hoping for a response letter from the authors to clarify the points in my review, but I did not see one.

Overall the paper has improved greatly and I think it is a contribution for the special issue.