Review Comment:
Summary
The paper presents a detailed methodology for generating a knowledge graph from the hadith corpus, aiming to enhance the accessibility and interoperability of Islamic knowledge resources. It builds upon the SemanticHadith Ontology, introducing an updated version to better model entities and topics within hadith texts. The methodology incorporates meticulous data selection, natural language processing (NLP)-based entity extraction, and semantic modelling to facilitate the seamless exploration and retrieval of Islamic knowledge in the digital age.
Introduction Section
Strengths
1.Relevance and Importance: The section emphasizes the significance of structured knowledge representation using Knowledge Graphs (KGs) and highlights the underutilization of this technology in the Islamic knowledge domain. This is a relevant and important issue, addressing a clear gap in the field.
2.Context and Background: The introduction effectively sets the stage by providing context about the role of KGs in various domains and the specific challenges associated with the Islamic knowledge domain. This helps readers understand the motivation behind the research.
3.Clear Problem Statement: The section clearly outlines the problem: the lack of comprehensive semantic modelling for hadith texts and the need for better integration of Islamic knowledge into the semantic web ecosystem.
4.Methodology Overview: It briefly outlines the methodology, including data selection, NLP-based entity extraction, semantic modelling, and knowledge graph construction. This gives a high-level view of the approach without overwhelming details.
Potential Errors or Shortcomings
1.Generalization Issues: The section claims that existing hadith studies primarily focus on specific domains, such as prophetic medicine and the chain of narrators. This characterization is too broad and does not acknowledge other studies in the domain.
2.Lack of Detail on NLP Techniques: The mention of NLP-based entity extraction could be more explicit. Specific techniques, tools, or methodologies employed for NLP tasks are not described.
3.Assumptions and Limitations: There is no discussion of the methodology's assumptions or potential limitations. For instance, challenges related to linguistic diversity, dialects, and the varying authenticity of hadith texts could be briefly mentioned.
4.Comparative Analysis: The section lacks a comparative analysis with other methodologies or approaches. It needs to discuss why the proposed approach is expected to be more effective than existing ones.
5.Impact on Users: While the section mentions the potential benefits for research and knowledge discovery, it could better articulate the practical implications for end-users, such as scholars, students, or the general public.
6.Citation Accuracy: Ensure that all claims, especially those about the current state of research and previous work, are properly cited. For example, the section references multiple studies [1-16] without clear attribution to specific claims.
Background Context and Motivation Section
Strength: Overall, the "Background Context and Motivation" section effectively highlights the significance of hadith and the need for formalized semantic modelling.
Additional Recommendations
1.Incorporate More Examples: Including more examples throughout the section would enhance clarity. For instance, provide examples of hadith that elaborate on specific Quranic verses or explain Islamic concepts.
2.Elaborate on Computational Techniques: While the section mentions various computational techniques (e.g., NER, NLP, similarity computations), a brief explanation of how these techniques are applied in the context of hadith literature would be beneficial.
3.Address Future Directions: Briefly touching on potential future directions or research opportunities in the semantic modelling of Islamic knowledge would provide a forward-looking perspective.
Methodology Section
1.Strength: Clarity and Coherence: The section provides a clear overview of the approach. However, it would benefit from a summary or a flowchart that visually represents the stages mentioned (in particular, placing Figure 3 directly here would give immediate context).
Data Selection and Acquisition
1.Unicode Conversion: While the conversion to Unicode is mentioned, the exact methodology or tools used for this conversion should be briefly detailed to provide transparency and reproducibility.
2.Data Description: The description of the corpus as comprising 34,458 hadith is clear, but explaining how the data was curated, including any criteria for inclusion or exclusion, would add depth.
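To illustrate the level of detail requested for the Unicode conversion, the following is a minimal sketch of how such a step could be documented. The source encoding ("cp1256") and the function name `to_unicode` are assumptions made purely for illustration; the paper does not state the original encoding or the tools used.

```python
import unicodedata


def to_unicode(raw: bytes, source_encoding: str = "cp1256") -> str:
    """Decode legacy-encoded bytes and normalize the result to NFC.

    The default source encoding is a hypothetical choice; the actual
    encoding of the original corpus should be reported by the authors.
    """
    text = raw.decode(source_encoding)
    # NFC composes base letters with combining marks where a canonical
    # composed form exists, so identical strings compare equal.
    return unicodedata.normalize("NFC", text)


# Hypothetical sample: an Arabic word stored in the assumed legacy encoding.
sample = "سلام".encode("cp1256")
converted = to_unicode(sample)
```

Reporting the source encoding and the normalization form in this way would let other researchers reproduce the conversion exactly.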
NLP-based Custom Knowledge Extraction
1.Model Training: The use of spaCy and transfer learning is appropriate. However, details on the training dataset size, epochs, and performance metrics (like precision, recall, and F1-score) would enhance understanding.
2.Expert Validation: Expert validation is crucial. It would be beneficial to provide examples of how discrepancies were resolved or the extent of expert involvement (e.g., the number of experts and their qualifications).
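As an illustration of the performance metrics requested above, the following minimal sketch computes entity-level precision, recall, and F1-score from exact span matches. The span format `(start, end, label)`, the function name, and the toy annotations are assumptions for illustration and are not taken from the paper.

```python
def precision_recall_f1(gold, predicted):
    """Entity-level scores; a prediction counts only on an exact span and label match."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: (start token, end token, label).
gold_spans = {(0, 2, "PERSON"), (5, 6, "PLACE"), (9, 10, "ANGEL")}
predicted_spans = {(0, 2, "PERSON"), (5, 6, "PERSON")}

precision, recall, f1 = precision_recall_f1(gold_spans, predicted_spans)
```

Reporting these scores per entity label, in addition to the overall figures, would show which of the custom labels the model handles well.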
Similarity Computation and Interlinking of Hadith
1.Similarity Metrics: Using cosine similarity and pre-trained sentence transformers is appropriate. However, mentioning alternative methods considered or benchmark comparisons would add depth.
2.Expert Validation in Similarity: Elaborate on the process of expert validation in similarity computation. How many pairs were validated, and what criteria were used?
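For context on the validation criteria asked about above, the following is a minimal sketch of cosine similarity with threshold-based pair selection. The toy vectors stand in for sentence-transformer embeddings, and the threshold value, identifiers, and function names are hypothetical; the paper's actual settings should be reported by the authors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def similar_pairs(embeddings, threshold):
    """Return (id_a, id_b, score) for every pair scoring at or above the threshold."""
    ids = sorted(embeddings)
    pairs = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            score = cosine_similarity(embeddings[a], embeddings[b])
            if score >= threshold:
                pairs.append((a, b, score))
    return pairs


# Toy embeddings keyed by hypothetical hadith identifiers.
toy_embeddings = {"h1": [1.0, 0.0], "h2": [2.0, 0.0], "h3": [0.0, 1.0]}
candidate_links = similar_pairs(toy_embeddings, threshold=0.8)
```

Stating the threshold used and how many of the resulting candidate pairs were reviewed by experts would make the interlinking step easier to assess.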
Additional Recommendations
1.References: Verify all references are up-to-date and correctly cited. For instance, [12], [28], [34], [35], [36], [37], [38], and [39] should be cross-checked for accuracy and relevance.
2.Consistency: Maintain consistent terminology, such as "Hadith" vs "hadith" and "Sahih Bukhari" vs "Sahih al-Bukhari."
NLP Methodology for Entity Extraction Section
Strength: Overall, the section provides a comprehensive overview of the NLP methodology for entity extraction from the hadith corpus.
Additional Recommendations
1.Collaboration with Domain Experts: While the involvement of domain experts is noted, it would be helpful to provide more details on the expertise of these individuals and the extent of their participation (e.g., how many experts were involved and what their specific contributions were).
2.Customization Details: The modifications to the dataset are described in general terms. Including specific examples of the added concepts (e.g., holy books and angels) and of the removed labels would make this section more concrete and informative.
3.Transfer Learning: The explanation of transfer learning techniques is repetitive. Consolidating this information into a single, clear statement would improve readability.
4.Consistency: Ensure consistency in terminology and formatting throughout the section. For example, use either "NER model" or "Named Entity Recognition model" consistently.
5.References: Verify all references are up-to-date and correctly cited. Cross-check references [28], [29], and [40] for accuracy and relevance.
Design and Development of the Extended SemanticHadith Ontology Section
Strength: The section provides a comprehensive overview of the design and development process for the SemanticHadith ontology. It covers conceptual knowledge modelling, scope definition, reuse of existing ontologies, ontology design, modelling decisions, and integration and implementation.
Additional Recommendations
1.Interoperability: The emphasis on interoperability is good. However, the section could elaborate on any challenges faced while integrating these ontologies and how they were addressed.
2.Examples and Illustrations: Including more examples and illustrations throughout the section would enhance clarity and engagement. For instance, showing a sample ontology class with its properties and relations would be helpful.
3.Consistency: Ensure consistency in terminology and formatting throughout the section. For example, "RootNarrator" and "HadithNarrator" should be used consistently.
4.References: Verify all references are up-to-date and correctly cited. Cross-check references [10], [12], [13-15], [35], [36], [39], [41-43], [45-47], [48-52] for accuracy and relevance.
Results and Discussion Section
Strength: The "Results and Discussion" section is comprehensive and covers various aspects of the evaluation and application of the SemanticHadith ontology.
Additional Recommendations
1.Examples and Illustrations: Including more examples and visual aids throughout the section would enhance clarity. For instance, screenshots of the ontology in Protégé or WebVOWL, examples of SPARQL queries, and visual representations of knowledge graph integrations would be useful.
2.Consistency and Formatting: Ensure consistency in terminology and formatting throughout the section. For instance, terms like "SemanticHadith ontology," "knowledge graph," and "annotation" should be used consistently.
3.References: Ensure that all references are up-to-date and correctly cited. Cross-check references [51], [55], [56], and [57] for accuracy and relevance.