Review Comment:
This paper presents Blue Brain Nexus, an ecosystem for data and knowledge management that is open-sourced by the Swiss brain research initiative Blue Brain.
Blue Brain Nexus is composed of three main components:
- Nexus Delta: composed of Cassandra (scalable NoSQL database), Blazegraph (scalable triple store) and ElasticSearch (scalable text search). These datastores are all exposed under a single secured API.
- Nexus Fusion: Web interface for Nexus Delta, allowing users to upload and search for data, and configure access permissions.
- Nexus Forge: Python framework that allows users to build knowledge graphs from various sources and formats, and to validate the data using SHACL.
The authors describe in detail the architecture of this system and its components, and provide several use cases in Section 7.
This is a system report submission, so it will be reviewed along the following two dimensions:
(1) Quality, importance, and impact of the described tool or system (convincing evidence must be provided).
I think the project presented in this paper is exemplary in its approach to developing scalable data and knowledge management solutions, using open standards, for practical use in complex real-world scenarios. I believe that the approach adopted by the Blue Brain Nexus ecosystem is further evidence of the added value of these open standards and existing technologies, such as SPARQL and SHACL.
The value of Blue Brain Nexus is clear to the reader from the way the system is designed, and its importance is demonstrated by its adoption in several use cases described in Section 7.
-> My only remark is that usage statistics would have been valuable in this section (e.g. number of active users, number of commits, number of queries per day, etc.) to show clearly how this system is being adopted in real-world scenarios.
The system is extremely well documented on its website and GitHub repository, and includes a number of getting-started examples and code samples.
(2) Clarity, illustration, and readability of the describing paper, which shall convey to the reader both the capabilities and the limitations of the tool
Overall, I think the paper is very well written. It manages to provide developers with the large amount of technical detail required to fully understand the architecture and the performance of the system, while at the same time clarifying for the average researcher and user the important functionalities that this system covers and the problems it solves.
-> That being said, I believe that some of the detailed technical parts of this paper (mostly in Section 4) can be shortened and moved to the supplemental material or to separate technical documentation that the authors can refer to in the paper.
The motivation and background section shows that the authors are fully aware of the challenges and the requirements of such systems.
-> Here, I would have liked the authors to mention a few other interdisciplinary projects that, like Blue Brain, face the same challenges in managing the data and knowledge produced in data-driven science cycles. It would be great if the authors could list the significant high-level differences in the approaches taken by these projects to solve the common problems, and state whether existing components and tools from other projects could be reused.
----
Finally, given the quality and the value of the introduced system, and given the clarity of the paper and the available external documentation of this system, I recommend the acceptance of this system report in the Semantic Web Journal.