Review Comment:
This paper presents an important contribution to the field of teaching Knowledge Graphs (KGs) by proposing a structured approach to organising educational resources for KG courses. The work is timely and relevant, especially given the increasing adoption of semantic technologies in education. The study is well-motivated, and the proposed system has the potential to be useful for both educators and learners. That said, I believe the results are still preliminary, and a more comprehensive evaluation would strengthen the findings. The quality of writing also needs improvement, as there are several grammatical errors and structural issues that affect clarity. Overall, the paper requires major revisions to improve clarity, structure, methodology, and writing quality before it can reach its full potential.
Abstract
The abstract provides a solid and concise overview of the paper, but it could be clearer and more informative. First, the intended audience for the Knowledge Graph (KG) should be mentioned early on, along with the level of education it targets. This would help readers understand its relevance and application. Additionally, the abstract does not reference the KG construction pipeline, which is a key contribution of the paper. A brief mention of the methodology used to build the KG would add important technical context. Lastly, the abstract should specify the nature of the evaluation. Since the assessment is qualitative and based on expert interviews, highlighting this would give readers a better sense of the study’s validity and approach.
Introduction
One key area for improvement in the introduction is the early definition of key educational terms. The paper introduces important concepts that may not be familiar to the computer science community, but their explanations appear too late in the text. For instance, "knowledge" and "skills", which are central to the discussion, are not defined until page 8. I recommend moving these definitions to the introduction, ensuring a solid conceptual foundation from the outset. Additionally, commonly used terms such as "teaching materials," "open educational resources," and "learning objects" should be clearly defined early in the paper to enhance readability.
Another point that needs more balance is the discussion of intended users. Lines 24-33 focus heavily on how teachers will use the KG, but there’s little mention of students or learners. This is a noticeable gap, especially since the problem statement highlights the difficulty learners face in finding high-quality learning resources. If the KG is meant to support both teachers and students, the introduction should make that clear by explaining how each group will engage with it.
On the technical and methodological side, the introduction could benefit from more details about data collection and the KG construction pipeline. While these are acknowledged as contributions at the end, they are fundamental to the framework and should be introduced earlier to provide necessary context. Even a high-level summary of how the KG is built would help readers unfamiliar with the process. The same applies to the evaluation approach—while the specifics belong in the methodology section, briefly mentioning the qualitative nature of the evaluation, particularly the use of expert interviews, would strengthen the introduction and reinforce the credibility of the study.
Finally, the list of contributions could use some expansion. The paper presents a lot of technical and methodological details, but the introduction doesn’t fully capture their scope. Providing a bit more context around these contributions would improve readability and help the audience better understand the significance of the work. Expanding this section slightly—without making it too dense—would make the introduction more effective in setting up the rest of the paper.
Related Work
The Related Work section does a good job of covering key papers, including some recent contributions, but there are areas where it could be clearer and more structured. In lines 9-14, the classification of prior work needs to be more comprehensively presented. Each category should be explicitly described, with key references provided for each, allowing readers to better understand the distinctions and refer to relevant literature. Without this, the classification feels underdeveloped and lacks the necessary context.
Following this classification, the discussion appears to focus primarily on literature related to assisted learning. However, it is unclear whether this is the main category the authors are emphasizing or if other categories were intended to be covered but were omitted. If assisted learning is the central focus, this should be explicitly stated. Otherwise, the section needs to be revised to ensure that all categories are properly represented and discussed.
Additionally, the section tends to list prior work without critically engaging with it or clearly positioning it within the context of the current research. To improve, the discussion should go beyond summarization and instead analyze how existing research relates to the current study. This includes making explicit the research gap and positioning the present work within the broader context. Strengthening these aspects will enhance the section’s coherence and ensure it effectively justifies the need for the proposed research.
Methodology
Section 3.1. Second paragraph: Given that a semi-structured approach was used to construct the KG, the phrase "generating an educational teaching KG" might not be the most precise wording. "Generating" often implies a more automated, fully algorithmic process, whereas a semi-structured approach usually involves a combination of automated methods and manual curation.
Section 3.1. Second paragraph: “Although there are some recent courses ….by the semantic web community”. Is there any way to support this claim? In what specific ways do they differ? Are there particular aspects—such as structure, depth, or target audience—that set them apart? Providing an example of such a course and explaining how it diverges from standard university curricula would help substantiate this point.
Section 3.1. Second paragraph: “Based on input we received…”: this phrase lacks specificity regarding the nature of the input received. Was this input gathered through formal approaches such as structured surveys, or was it informal feedback? The rest of the paragraph also seems anecdotal rather than well-supported. It could benefit from citing examples or references that substantiate the claim about the relative availability of educational content in these domains compared to KG resources.
Section 3.1. Second paragraph: Please include links for both the Alan Turing Institute and the COST Action on Distributed Knowledge Graphs (DKG). Additionally, spell out the full name of the latter.
Section 3.2. "capturing intricate relationships" and "enabling semantic interoperability", this could be more precisely defined. For instance, what specific types of intricate relationships are difficult to capture? Are there existing methods that partially address this but fall short? Since the text refers to "current" limitations, it implies that existing approaches do attempt to address these challenges but fall short. However, the passage does not provide any concrete examples or explanations of where and why these methods are insufficient (something that could be included in the Related Work section).
The research questions would benefit from greater precision and clearer definitions of key terms. In RQ1, for example, the phrase "minimum pedagogical requirements" is quite broad and could mean different things. Are these requirements focused on curriculum design or metadata standards? A brief explanation of what is meant by these minimum requirements would help bring clarity and focus. Similarly, in RQ2, the idea of "contextual alignment" is not clearly defined. Does this refer to the semantic relationships between resources or the way content is structured? A clearer explanation would help strengthen the research framework and ensure the question is well contextualised.
Section 3.3.
The phrase "supplementary resources were obtained from authors’ private lists with resources" requires elaboration. What types of resources are included? What is their provenance?
Lines 25–27:
It is essential to be absolutely clear about the sources of the collected data. The Alan Turing Institute, COST Action DKG, and publicly available courses are mentioned, but further clarification is needed. What specific types of courses and materials were collected from these sources, and what criteria were used for their selection?
Regarding the data collection form:
• Was this distributed online or in a paper format?
• How were Semantic Web experts identified and contacted for participation?
• The mention of "requesting personal information" needs clarification. What specific types of personal data were collected, and for what purpose?
• The reference to URLs is vague: does this mean URLs of course webpages, specific course materials, lecture recordings, or other resources? This should be made explicit.
• The "optional template" needs a clearer explanation. Was this the format of the data collection form, or a separate resource provided to guide responses? If it relates to the user interface in Fig. 1, this should be signposted accordingly. Additionally, does it correspond to the contents of Table 4? If so, please signpost this as well.
The phrase "by a group of three participants" should specify the nature of their collaboration. How was this collaboration organised?
The sentence "The collection of educational materials began with voluntary contributions from the course instructors" should appear earlier in the section as it logically introduces the data collection process. The final sentence in the section should also specify who was responsible for incorporating additional resources and where these materials were sourced from.
Regarding Table 1, the explanation lacks detail on how this information was presented to the experts and how it guided them in the data collection process. More context would help readers understand its role in guiding expert contributions.
How were the data collected from the Alan Turing Institute group? The process is clear for COST Action DKG but missing here.
Section 3.4.
What “diverse formats” were collected? Please give examples.
A short overview of the referenced paper [1] here would be helpful, along with a clear signpost to Table 5 for better readability.
Line 11: What type of raw data is being referred to? Please specify.
The sentence "We implemented a data pipeline to enable..." would benefit from a signpost directing readers to Section 5 for further details.
Section 3.5.
"As Semantic Web experts are typically not trained..." Aren’t they the instructors mentioned in the previous section? If so, wouldn’t they already be familiar with educational parameters? Or are you referring to researchers and practitioners in general? If it’s the latter, consider using a different term to avoid confusion with the previously mentioned participants.
Line 24: What specific educational parameters are being referred to? Please clarify.
Section 4.1.
Line 2 – Please add a signpost to Table 5.
Section 4.2.
The definitions of skill and knowledge should be introduced earlier.
Lines 31–32: The last module should be written as a new paragraph for consistency.
Lines 32–33: Who carried out the annotation? Please specify.
Section 4.2: Please provide concrete examples for each module by referencing specific ontological components that were defined.
Line 42: "The classification was developed through an iterative process, leveraging both pedagogical expertise and feedback from project stakeholders." Could you elaborate on how this iterative process was conducted and what specific input was incorporated?
Page 11
Line 4: What constitutes the bare minimum? Please elaborate.
Section 4.3.
Lines 14–15: Explain the reasoning behind the shift from slides to theoretical educational material. Is this change influenced by the structure of the vocabulary used? Please explain.
SPARQL queries 4.3.1(d) and 4.3.2(e) essentially express the same query for different entities (topic vs. course). Please ensure consistency in the terminology used to refer to "laboratory".
The section states, "We group the SPARQL queries thematically as presented in Table 5" which lists four groups: sub(topic), course, dataset, and material. However, the section only presents SPARQL queries for sub(topic), course, and material. Where are the SPARQL queries for the dataset?
Section 5.1.
It would be helpful to include a brief overview of the LOT methodology, along with its four key activities, to give readers a clearer understanding of its workflow and relevance to the study.
Section 5.1.1.
Line 2: please specify the format(s) of the semi-structured input.
This section is quite brief, especially considering that it’s the only place where the technical details of the process are discussed. While it outlines the general methodology and tools used, it lacks the depth needed to make the approach fully clear. Expanding on each step with more detailed explanations and concrete examples would not only improve clarity but also enhance the reproducibility of the framework, which is one of the key contributions of the paper. If the goal is to provide a practical guide for constructing educational knowledge graphs, then adding more details would make the framework more actionable for others looking to implement it.
A major limitation of this section is that while it lists tools like OWL2YARRRML, Yatter, and Morph-KGC, it doesn’t illustrate how they work in practice. For instance, when discussing the transformation of semi-structured data into RDF, an actual example dataset could be introduced, perhaps a simple table of educational concepts or learning resources. The authors could then walk through how this dataset is mapped using YARRRML templates, translated into RML, and eventually converted into RDF triples. Including a small snippet of a YARRRML mapping template and its corresponding RML translation would be particularly helpful to give readers a practical sense of how the transformation process is structured.
Additionally, this section would greatly benefit from a figure that visually represents the transformation pipeline. Given that multiple tools are involved, a diagram showing how data flows from its semi-structured form through the mapping and transformation process into RDF would help clarify the steps.
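To illustrate the level of detail I have in mind, a minimal sketch of the final materialisation step is given below. It uses Morph-KGC's documented Python API, but the config section name, mapping file (assumed to have been produced from a YARRRML template via Yatter) and output path are assumptions I made for illustration; the authors' actual pipeline and file names will of course differ.

    # Minimal sketch of the RML materialisation step, under assumed file names.
    # resources.rml.ttl is a hypothetical mapping file produced from a YARRRML
    # template; the config section name and output path are also invented.
    import morph_kgc

    CONFIG = """
    [TeachingResources]
    mappings: resources.rml.ttl
    """

    # materialize() runs the RML mappings and returns an rdflib Graph
    graph = morph_kgc.materialize(CONFIG)
    print(f"Generated {len(graph)} triples")
    graph.serialize(destination="teaching_kg.ttl", format="turtle")

A short walkthrough of this kind, combined with the pipeline figure suggested above, would make the ontology → YARRRML → RML → RDF flow much easier to follow and reproduce.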
The statement "we also ensure high performance and scalability during the transformation process thanks to Morph-KGC" is lacks empirical support. Does Morph-KGC achieve better scalability compared to other RML engines? If there are benchmarks or performance comparisons available, including those results would strengthen the claim and provide readers with concrete evidence of its advantages.
The proposed use of LOT4KG and OCP2KG for schema-level changes is well-founded, but the specific mechanism by which these tools "automatically propagate ontology changes over RML mappings" should be briefly elaborated. Do they merely update mappings, or do they also validate consistency with existing RDF data?
The explanation of life cycle changes is vague, particularly when discussing how modifications at the data level will be handled. The phrase "there is still missing a novel solution" is ambiguous: does this mean that no existing methods are sufficient, or that no framework currently exists to address this issue? The mention of IncRML for incremental updates is promising but lacks details. How does IncRML determine which triples need to be regenerated? Does it work based on a versioning mechanism, a change-tracking log, or another strategy?
Section 5.2.1.
The description of the interface design process is quite broad and lacks the depth needed to fully explain how the interdisciplinary collaboration informed the final design. It highlights the involvement of experts in pedagogy, the Semantic Web, teaching, and software development; however, it does not specify how their expertise influenced decision-making. What specific contributions did each domain bring to the design process? Another key aspect missing from this description is how the experts were recruited and selected.
Section 5.2.2.
This section lacks some details. Was a pre-trained SBERT model used, or was it fine-tuned on a domain-specific dataset? Also, given that the system is expected to scale, mentioning the storage mechanism would be useful: how is the latent space structured? Is it stored as a simple lookup table, a graph-based structure, or something else?
Regarding the fact that the similarity computation is extended to sub-topics and educational resources, are all these entities embedded separately, or is there a shared representation across courses, topics and materials? If similarity is computed at multiple levels, how is the final ranking determined?
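To make the question concrete: with a pre-trained Sentence-BERT model, the pairwise similarity computation itself is straightforward, as in the sketch below (the model name and example texts are my own assumptions, not the paper's setup). What the section should spell out is which texts are embedded for courses, topics, and materials, how and where the embeddings are stored, and how the per-level similarity scores are combined into the final ranking.

    # Illustrative sketch of pairwise similarity with a pre-trained SBERT model.
    # The model variant and the example texts are assumptions, not the paper's setup.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # pre-trained, not fine-tuned

    course_descriptions = [
        "Introduction to Knowledge Graphs and RDF",
        "Ontology engineering with OWL",
    ]
    topic_labels = ["SPARQL querying", "Ontology design patterns"]

    course_emb = model.encode(course_descriptions, convert_to_tensor=True)
    topic_emb = model.encode(topic_labels, convert_to_tensor=True)

    # Cosine similarity between every course and every topic
    scores = util.cos_sim(course_emb, topic_emb)
    print(scores)

Even a brief statement of the model variant used and the text fields fed into it would answer most of these questions.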
Section 6.1.
The passage claims that enabling an "ontology-driven data creation pipeline will minimise errors in the data." However, it does not define which errors are being minimised and how OWL's reasoning ensures correctness.
Section 6.2.
The section does not specify how the two stakeholders were identified and recruited.
The section states that only two stakeholders were interviewed. This is a very small sample for deriving meaningful conclusions. Please explain why two stakeholders were deemed sufficient, and if constraints such as time and availability limited the sample size, acknowledge this as a limitation.
The section does not specify the method of analysis (thematic analysis? a coding scheme?).
The section does not clearly distinguish whether the stakeholders are evaluating the actual KG or providing conceptual input on what an ideal Teaching KG should include. Please make this explicit.
The phrase "to inform the design and implementation" is vague. Does this mean the KG is already in development, or is stakeholder input being gathered before any KG is built?
Before introducing the questions, it would be better to describe the key themes that were targeted when designing the interviews (similar to what we see in Table 10).
Section 6.2.1.
This section requires significant restructuring for better clarity and readability. In addition to the typos and grammatical errors that need to be corrected, the overall flow is difficult to follow. The current format makes the responses feel like a rigid Q&A list rather than a natural discussion of stakeholder insights. One major issue is inconsistency in attribution: the section first states that findings come from "the interview 1", but later refers to "participants suggested including labs...", making it unclear whether the responses came from a single interviewee or from both.
Furthermore, the use of numbers in parentheses (e.g. Q2, Q3, …) to refer to questions is confusing and disrupts readability. If these numbers indicate participant responses to specific questions, this should be made explicit. However, I strongly recommend removing this notation and instead integrating brief descriptions of the questions into the discussion for a more natural flow.
Additionally, not all answers to the 13 questions were included, and there is no explanation for why certain responses were omitted. This omission raises concerns about selective reporting and should be clarified.
To improve structure and readability, I suggest organising the findings into thematic categories rather than presenting them as a loosely connected list. This would provide a clearer narrative structure and make it easier to follow key insights. Also, stakeholder insights could be reframed as a set of requirements, which could then be mapped to the KG in Section 6.2.2 for better alignment between feedback and implementation.
Finally, were there any overlapping opinions between the two interviewees? If so, this should be highlighted to emphasise common themes rather than treating responses as isolated inputs.
Section 6.2.2.
These thematic categories should be explicitly linked to the interview questions and used to guide the structure of the previous section. Currently, the text does not specify how these themes were derived, making it unclear whether they emerged organically from the data or were predefined.
Section 7
The section begins with the statement "…we describe four use cases" yet only three use cases are actually presented.
This section requires significant restructuring as it does not fit well within the overall flow of the paper. The primary purpose of this section is to demonstrate the functionalities of the KG interface, yet these functionalities were already described in Section 5.2. Presenting them again at this stage disrupts the logical sequence of the paper, especially since the evaluation and results have already been discussed.
Additionally, many of the functionalities described here have already been introduced earlier, such as presenting search results as tables or graphs (already covered in use cases 1 and 3). Repeating this information does not add value and creates unnecessary redundancy. The screenshots of the interface are already informative and sufficient in illustrating the system’s capabilities, making this section largely unnecessary. Instead of repeating these details, I suggest incorporating a more detailed depiction of the interface into Section 5.2 for a more cohesive presentation.
Furthermore, rather than demonstrating advanced capabilities of the Teaching KG, the use cases primarily describe basic UI interactions. These descriptions do not effectively highlight the system’s unique strengths or advanced functionalities. If use cases are to be included, they should focus on more complex, value-adding scenarios, such as advanced querying or integration with external resources. Otherwise, they should be omitted entirely in favour of a stronger system description in Section 5.2.
In general, the placement of this section after the evaluation and results feels out of place. Readers would typically expect a discussion or conclusion at this stage, rather than the introduction of new interface details. If this section is necessary, it should appear earlier in the paper, within the system description in Section 5.2. Otherwise, I strongly recommend removing it and integrating relevant content into the earlier sections where it logically fits.
Section 8
Given the nature of the evaluation, certain claims should be more cautiously framed to reflect the preliminary nature of the findings. Currently, the evaluation is based on a very limited number of participants who did not actually use the KG in practice. As a result, the insights gathered should be considered exploratory rather than confirmatory. The authors must indicate that these results do not fully validate the effectiveness of the KG in real-world educational settings, but rather offer preliminary feedback that informs future improvements.
GitHub repo: Please expand the README file to provide comprehensive information about the project, including setup instructions and descriptions of each directory and file.