Declarative Construction of Knowledge Graphs from NETCONF Data Sources

Tracking #: 3809-5023

Authors: 
Ignacio Dominguez
Luis Bellido
Diego Lopez

Responsible editor: 
Guest Editors KG Construction 2024

Submission type: 
Ontology Description
Abstract: 
The knowledge graph paradigm is drawing attention in the network industry as a technology for integrating heterogeneous data silos such as model-driven telemetry based on the YANG language. In this sense, declarative mapping languages have emerged as scalable and flexible solutions for constructing knowledge graphs. A prominent mapping language is the RDF Mapping Language (RML), which enables the integration of heterogeneous data sources by reusing ontologies that describe access to them. However, when it comes to the network domain, there is a lack of ontologies that describe access to YANG data exposed by network devices. This paper introduces the YANG Server Ontology for describing YANG servers and the interactions with them using network protocols like NETCONF. Additionally, guidelines for reusing the ontology in RML mappings are provided and validated in a use case by extending a reference RML engine.
Tags: 
Reviewed

Decision/Status: 
Reject (Two Strikes)

Solicited Reviews:
Review #1
By Edna Ruckhaus submitted on 19/Feb/2025
Suggestion:
Accept
Review Comment:

The authors have adequately addressed all of the concerns raised in the previous review. I have only one additional comment:

The authors have added some clarifying text to the paper regarding concerns #1 and #2. It would be advisable to extend this text with the results of the OOPS! and FOOPS! evaluations. For example, for OOPS!, the paper could state that all reported pitfalls but one are "Minor" and explain why the only "Important" pitfall is not relevant to this ontology. For FOOPS!, it would also be useful to know whether the metadata recommendations given in the result were followed.

Review #2
Anonymous submitted on 06/Mar/2025
Suggestion:
Reject
Review Comment:

I am reviewer 3 of the original version of this paper. I see that the comments of reviewers 1 and 3 (me) have been bluntly ignored in the authors' point-by-point response, and none of my points have been addressed. Therefore, I will copy my previous evaluation here and adapt it only as necessary. I maintain my recommendation of "reject" on principle.

This manuscript is submitted as 'Ontology Description'.
In 12 pages with a total of 41 references, it introduces the YANG Server Ontology, which describes the core concepts of the YANG data model and adds extensions specific to the NETCONF protocol, which encodes YANG data as XML and relies on SSH for interactions between clients and servers. The ontology is developed following the LOT methodology and conceptualized using CHOWLK; the documentation is generated using Widoco; and the ontology and its documentation are available at https://w3id.org/yang/server
The sources are on GitHub, with a total of 22 requirements listed in a CSV document and converted to SPARQL queries that can be executed through a Jupyter Notebook. An example Turtle file is available too.

In addition to introducing the ontology, the paper describes how the YANG Server Ontology can be combined with RML to generate a knowledge graph from the data available at a YANG server. More specifically, the paper and the implementation focus on servers implementing the NETCONF protocol. The combination of this work with RML has been integrated in the reference RML implementation BURP (pull request #5).

In my opinion, the article does not meet the quality standards required for publication in the Semantic Web Journal:

(1) Although the paper is officially an ontology paper and should therefore be concise (12 pages would be fine), its actual scope is broader. The title shows that the paper focuses on integrating the ontology into RML-based KG construction.

(2) As an ontology paper, I'd say that it has some shortcomings.

- It is the result of applying a mature methodology well, but many details are missing, including the timeline of the sequence of sprints, the number and type of participants (domain experts, ontology engineers), the number of pitfalls reported by OOPS! and how they were solved, the same for FOOPS!, ...
- Some statistics about the ontology would be welcome: how many classes and properties, what expressivity, ...
- Some additional considerations, such as modularity, are missing: it would have been useful to better separate what is generic (YANG) from what is specific to NETCONF. I guess basic authentication, for example, is not relevant for all YANG protocols. For a journal paper, it would probably also be appropriate to support at least one more YANG protocol such as RESTCONF, gNMI, or CORECONF (CORECONF is not mentioned in the paper).
- the way the YANG Server Ontology and RML can be combined could be specified using simple alignments, or more formally using SHACL rules.

(3) If I consider the part of the paper that focuses on the construction of knowledge graphs from NETCONF data sources (what's the focus as per the title, and also the most relevant to this special issue):

- A proper validation of the approach is missing. It is a good point that the proposal has been merged into the BURP code base; however, this does not properly justify the validity of the approach. I would expect some validation through experiments in the paper, with a clear description of the setting (based, if I understand correctly, on CESNET/netopeer2). Statistics about KG generation would be relevant, including the duration, how this duration is split between the YANG server, the network, and BURP, the size of the exchanged XML documents, the number of triples generated, the relevance of having filters on the server, etc.
- I miss some discussion about alternative ways to support the conversion of XML data on NETCONF servers. From my understanding of RFC 6241, NETCONF must support SSH as a transport protocol (specified further in RFC 6242), but other transport protocols could be defined, including SOAP/HTTP/TLS. One alternative could be to have data sources in RML send a SOAP request message and interpret the SOAP response message. Another could be to extend RML with support for SSH connections to a server, and then have the logical source element describe what needs to be sent to the server and how the response must be interpreted ...
- I miss some discussion about what would be different for another YANG protocol. What can be reused from the ontology and implementation, and what needs to be added
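
To make the kind of measurement requested in point (3) concrete, the following is a minimal sketch, not the authors' implementation. It builds a NETCONF `<get>` RPC with a subtree filter and derives simple statistics (payload size, number of modules, parse time) from a reply. The reply is a hand-written sample rather than real server output, the SSH transport is left out entirely, and the function names `build_get_rpc` and `reply_stats` are hypothetical.

```python
import time
import xml.etree.ElementTree as ET

# Namespaces of the NETCONF base protocol and the IETF YANG library module.
NC = "urn:ietf:params:xml:ns:netconf:base:1.0"
YL = "urn:ietf:params:xml:ns:yang:ietf-yang-library"

# Hand-written sample reply (assumption: stands in for real server output).
SAMPLE_REPLY = b"""<rpc-reply message-id="1" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <data>
    <modules-state xmlns="urn:ietf:params:xml:ns:yang:ietf-yang-library">
      <module><name>ietf-interfaces</name><revision>2018-02-20</revision></module>
      <module><name>ietf-ip</name><revision>2018-02-22</revision></module>
    </modules-state>
  </data>
</rpc-reply>"""


def build_get_rpc(message_id, subtree_filter=None):
    """Serialize a <get> RPC, optionally restricted by a subtree filter."""
    rpc = ET.Element(f"{{{NC}}}rpc", {"message-id": str(message_id)})
    get = ET.SubElement(rpc, f"{{{NC}}}get")
    if subtree_filter is not None:
        flt = ET.SubElement(get, f"{{{NC}}}filter", {"type": "subtree"})
        flt.append(subtree_filter)
    return ET.tostring(rpc, encoding="utf-8")


def reply_stats(reply_bytes):
    """Return payload size, module names found, and XML parse time."""
    start = time.perf_counter()
    root = ET.fromstring(reply_bytes)
    data = root.find(f"{{{NC}}}data")
    modules = [] if data is None else data.findall(f".//{{{YL}}}module")
    names = [m.findtext(f"{{{YL}}}name") for m in modules]
    return {
        "bytes": len(reply_bytes),
        "modules": names,
        "parse_seconds": time.perf_counter() - start,
    }


if __name__ == "__main__":
    request = build_get_rpc(1, ET.Element(f"{{{YL}}}modules-state"))
    print(len(request), "request bytes")
    print(reply_stats(SAMPLE_REPLY))
```

Running the same measurement with and without the filter against a real server would give a first indication of how much server-side filtering reduces the XML that the RML engine has to process.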

(4) Finally, I believe the paper could use more references or could choose its references better. For example, there is a reference for the modular RML that resulted from three years of work by the KGC community group (ISWC 2023 Resource Track). The following papers may also be highly related work:
- Ismail, H., Hamza, H. S., & Mohamed, S. M. (2018, December). Semantic enhancement for network configuration management. In 2018 IEEE Global Conference on Internet of Things (GCIoT) (pp. 1-5). IEEE.
- Sahlmann, K. (2021). Network management with semantic descriptions for interoperability on the Internet of Things (Doctoral dissertation, Universität Potsdam).
- Sahlmann, K., Scheffler, T., & Schnor, B. (2018, June). Ontology-driven device descriptions for IoT network management. In 2018 Global Internet of Things Summit (GIoTS) (pp. 1-6). IEEE.
The section about related work focuses almost exclusively on RML, with only 4 references. If the paper is about the ontology, then related ontologies should be considered.

Review #3
Anonymous submitted on 24/Mar/2025
Suggestion:
Minor Revision
Review Comment:

As reviewer 2 of the previous version of the paper, I reiterate my positive appreciation of the proposed work and thank the authors for considering some of my suggestions and those of the other reviewers. Among other examples: the small clarifications added to the explanations and the additional details facilitate the second reading; the reuse of third-party vocabularies (e.g. UCO and FOAF) aligns with the principle of interoperability of Linked Data; etc.

However, I would like to draw the authors' attention to the fact that fundamental expectations for a paper of the "Ontology Description" type, outlined in the four reviews of the previous version, are still not fully met. This is particularly true of the evaluation of the proposal, where the alignment with the requirements is overlooked (which would involve translating these requirements into competency questions and corresponding authoring tests), and where the implementation of a prototype is presented as the sole justification for the proposal without qualitative or quantitative analysis.
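
As an illustration of what translating a requirement into a competency question and an authoring test could look like, here is a minimal sketch. It uses a toy in-memory triple set and plain pattern matching in place of a SPARQL engine, and every term in it (`ex:YangServer`, `ex:supportsProtocol`, the two servers) is hypothetical and not taken from the YANG Server Ontology.

```python
# Hypothetical namespace and terms, used only for illustration.
EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Toy instance data: server2 deliberately lacks a protocol statement.
TRIPLES = {
    (EX + "server1", RDF_TYPE, EX + "YangServer"),
    (EX + "server1", EX + "supportsProtocol", EX + "netconf"),
    (EX + "server2", RDF_TYPE, EX + "YangServer"),
}


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def servers_without_protocol(triples):
    """Competency question: which YANG servers declare no supported protocol?

    An empty result means the authoring test passes for this requirement.
    """
    servers = {t[0] for t in match(triples, p=RDF_TYPE, o=EX + "YangServer")}
    return sorted(
        srv for srv in servers
        if not match(triples, s=srv, p=EX + "supportsProtocol")
    )


if __name__ == "__main__":
    print(servers_without_protocol(TRIPLES))
```

In practice each of the 22 requirements would map to one such check expressed as a SPARQL query, and the paper could report how many of them pass.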

It seems to me that the authors have all the keys and material to address this by naturally following the questions and suggestions from the previous reviews, along with the few additional comments I provide below, which should lead to a straightforward process of reformulation and enrichment of certain parts of the paper. On this basis, the presented work will have the deserved impact.

Major questions and remarks:

- §2 Related work, p.2 l.37-41: could you please clarify what is the key takeaway from reference 11? Is it the challenge discussed for references 10 and 12?
- §3.2 Implementation p.4 l.1-2: could you please clarify the relationship in your approach between CHOWLK and Protégé? Was CHOWLK "only" used for the convenience of a graphical notation to share the ontology design between stakeholders? Why not use the CHOWLK converter to translate the conceptualization into an ontology implementation?
- §3.2 Implementation p.4 l.2-3: it is a nice initiative to leverage OOPS! and FOOPS! and share the results in the project's GitHub repository. However, could you summarize the results in the paper to make it standalone, and provide your perspective on the results in relation to your design?
- §5 Use Case, p.7+: Overall, the content of this section reads more like a test than an evaluation (a comment already made in the reviews of the previous version of the paper). To address this, I suggest that the authors change the title of the section (e.g. 'Evaluation through the YANG Catalog Use Case') and reorganize/reformulate the content to clarify the goals. A typical narrative flow could be: 1) In this section, we propose to exemplify the utilization of our framework by building a knowledge graph on top of the YANG catalog; 2) The YANG catalog is ...; 3) However, it lacks ... and using our framework could bring ... such as enabling semantic-based searches; 4) For this, we present a two-step approach ...; 5) Results; 6) Discussion.
- §6 Conclusion, p.10 l.3: I suggest that the authors clarify the meaning of 'network management operations,' as this formulation is very specific to network administration activities and does not directly correspond to the scope of the developed ontology.
- §6 Conclusion, p.10 l.12-17: I suggest that the authors clarify the purpose of this feedback by relating it to concrete work they have carried out. Indeed, is it meant to convey that they chose the LOT methodology to simplify interactions with domain experts, as these experts typically do not have skills in semantic modeling? Is it intended to indicate that this work is the first of its kind (to the best of their knowledge) and therefore had to rely on standards for the R/D stage when no related work was available?

Minor remarks:

- §3.2.1 YANG Server Core, p.4 l.19: referring to the 'on the other hand' formulation, could you please clarify in comparison to what?
- §3.2.3 YANG Operations, p.6 l.5: "Please, note that" => "Note that"
- §5 Use Case, p.9 l.45-46: there are unnecessary spaces between the text and the footnote numbers.
- Bibliography: I suggest that the authors add more details to ref. 15. I also suggest that the authors replace refs. 31 and 32 by URLs to the projects' repositories in footnotes.
- I suggest that the authors make the footnote URLs clickable.