An Ontology for the Tajweed of the Quran

Tracking #: 2768-3982

Authors: 
Amna Basharat
Ramsha Amin

Responsible editor: 
Special Issue Cultural Heritage 2021

Submission type: 
Ontology Description
Abstract: 
In the current information systems, many fields use ontologies for modeling domain knowledge to enable interoperable semantics. There is a plethora of knowledge sources within the Islamic heritage that derive from the primary sources of the Quran and the Hadith (Prophetic narrations), however, there is lack of sufficient ontologies and linked data to better describe and semantically annotate the information related to the Quran. Although several Quranic themes based ontologies have been developed to facilitate the retrieval of knowledge from the Quran, there is still a lack of comprehensive knowledge-based reasoning models created for the Tajweed of the Quran - the science of Quranic recitation. In this paper we propose the design of an ontology for capturing the core elements of Quranic recitation i.e. Tajweed. The knowledge model was developed by using the Protege framework and state-of-the-art semantic web technologies (OWL and SPARQL). METHONTOLOGY, an iterative design methodology was used for its development. The ontology focuses on describing the articulation points of Arabic letters and their characteristics together with the Tajweed rules (rules of recitation). Semantic Web Rule Language (SWRL) was used for the implementation of the Tajweed rules. To evaluate the ontology model, a hybrid approach was used. Expert driven validation and criteria based evaluation was conducted for the Arabic letters and their characteristics to evaluate the accuracy and structure of ontology. Results from the experts were incrementally improved before evaluating it with the next expert which results in 100% accuracy. Tajweed rules were evaluated using data driven approach on the complete text of the Holy Quran. Also, an annotated dataset of the entire Quran was generated in OWL format using the developed tajweed ontology.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
By Victor de Boer submitted on 21/Jul/2021
Suggestion:
Minor Revision
Review Comment:

SUMMARY
This ontology description describes the process of creating an ontology for Tajweed as well as the results of that process. Tajweed, as I understand it, concerns the way of reciting Quran verses, which is based on various pronunciation rules. The paper describes the following key contributions:
- An ontology describing the pronunciation concepts related to recitation of the Quran. This is constructed using an established methodology and based on existing source literature on the subject
- Use of SWRL to implement the actual pronunciation rules for these concepts
- A complete annotated Quranic Tajweed dataset
- Three types of evaluation: 1) intrinsic quality metrics, 2) iterative expert evaluation (n=5), and 3) data-driven evaluation.

POSITIVE POINTS
- The paper presents both the process and the resulting resources quite clearly, even while some details are difficult to grasp for non-domain experts
- The work is well-placed in related work
- The ontology is well-documented, including the method to produce it. I also appreciate that some of the modeling decisions are described in more detail, such as in section 3.4.3, where multiple options are discussed. For reuse and reproducibility this is very useful.
- The digital resources are linked in the paper and are generally correct, potentially persistent and useful
- The various evaluations are interesting, relevant and paint a good picture of the strengths and weaknesses of the solution

NEGATIVE POINTS
- While the motivation for the work is clear, the re-usability of the ontology outside of the main use cases that include Tajweed education is unclear. This limits the impact a bit.
- While the github repository with the ontology is simple and easy to use, the second repository (with the dataset) lacks any instruction on setup and use. I would suggest adding that to the repository

SUGGESTIONS FOR IMPROVEMENT
- For completely uninformed readers (such as myself), the Introduction could introduce the concept of Tajweed more clearly, maybe through one or more examples of Tajweed rules
- While the work is well embedded in ontology engineering and Quran-related literature, a valuable addition would be to relate it to other non-Quran-related ontologies. Are there, for example, other linguistic or religious ontologies that are related?
- P4 col2 line 21: What is the difference between a Tajweed learner and a Tajweed seeker?
- P5 Table 1: please explain what the different rulesets are (Qalqalah etc). In what way are they different/similar in shape/origin?
- P8 col 2 line 27: Here the is-a relation (rdf:type) is claimed, which is incorrect. This should be subClassOf (it is correct in the ontology itself)
- P12 Table 5: Is this a complete list or are these examples?
- P12 col1 line 21: Explain what the "tajweed factory" is

- P17 Table 8: It is unclear to me what the source of these errors is. Are they caused by disagreement between the source material and the expert? Or did the errors occur somewhere in the ontology engineering process? If so, where?

LANGUAGE
- The paper is overall quite well-written, but there remain some non-native speaker errors, mostly missing "a" and "the"'s, for example
p2 col 1 line 13: "a formal evaluation"
p2 col 1 line 14: "a hybrid approach"
p2 col 1 line 23: "a description of"

It would be good to try to fix as many of these as possible

Review #2
Anonymous submitted on 05/Aug/2021
Suggestion:
Major Revision
Review Comment:

Summary: the paper describes a new ontology for Quranic Recitation (Tajweed). There are many knowledge sources derived from Islamic heritage such as primary sources of the Quran and the Hadith, but there are insufficient semantic models that describe them for purposes of semantic annotation, and subsequent information retrieval and knowledge-based reasoning. To validate the ontology, they use it to annotate the Quran, and validate the model using a criteria-based approach in combination with expert validation and data-driven validation.

Although I am not an expert on Quranic recitation, the design choices seem well motivated, and the work seems to be comprehensive and a relevant extension to existing work. The work has been carefully evaluated in a hybrid approach, both qualitatively with domain experts and quantitatively, which I appreciate. The only small concerns I have relating to the evaluation are that (i) the user scenarios and competency questions seem to have been developed by the authors themselves (at least, it was not mentioned how they came to be), whereas it would have been interesting if these had come from the domain, and (ii) due to my lack of domain knowledge, the data-driven evaluation was hard to follow.

My major concern with the paper is that (i) the motivation and domain of Quranic recitation, as well as the system architecture and annotation process, are not described clearly, (ii) the unclear writing style and paper structure make the paper a tough read, and (iii) there are many spelling and grammatical errors. Additionally, the paper could be written more concisely. The SWJ journal mentions descriptions of ontologies should be short papers, whereas this paper consists of 22 pages. Detailed arguments could potentially be submitted as supplementary files.

More specifically:
- convincing evidence of the relevance and quality of the ontology must be given, and although the authors have (IMHO) succeeded in the latter, the former should be addressed. Some use cases are briefly enumerated later, but the introduction would benefit from a clear description of the field (on rules and the recitation process) and a motivation for how the ontology can be used in practice.
- the structure of the text is not always logical and consistent. E.g., the section 'system architecture' contains descriptions related to the annotation process, data cleaning, and evaluation.
- consistency and correctness of terminology, spelling and use of capital letters is lacking (e.g., the tajweed ontology, Tajweed ontology, the Tajweed Ontology, the Tajweed ontology model, the ontology-based quranic tajweed knowledge model, the Tajweed ontology knowledge model).
- Often, the determiner is missing from a sentence (e.g., page 12 line 30 'Rule engine' --> 'The rule engine')
- there are many mistakes in tense and in plural/singular agreement (Table 5, SWRL iqlab rule: 'This rule implies [...] and have' --> 'This rule implies [...] and has')
I have annotated some spelling and grammatical mistakes only for the abstract and first section, to show examples.

ABSTRACT

Specific comments:
Semantic Web Rule Language --> The Semantic Web Rule Language
Expert driven --> Expert-driven
criteria based --> criteria-based
structure of ontology --> structure of the ontology
the ontology model --> the ontology (for consistency)
results from the experts --> results from the expert-driven evaluation
Unclear and grammatically incorrect sentence: "Results from the experts (?) were incrementally improved before evaluating it (?) with the next expert which (?) results in 100% accuracy."
Unclear and grammatically incorrect sentence: "Tajweed rules were evaluated ... Holy Quran."
data driven --> data-driven

INTRODUCTION

General comments:
* I suggest the authors add one or more references related to keyword search through Quranic heritage, and Tajweed in particular. In general it would be interesting to have a short description of the relevance of semantic search over keyword search in Quranic texts.
* Many readers might have no background in, or knowledge of, the content, structure and appearance of these texts, nor of "recitation" in general -> what is Tanween or Un vowel noon, what do articulation points, characteristics, letter occurrences and rules mean in this context? Part of this could be further explained in the background, but the relevance of a semantic model for the Tajweed should be made clear early on in the introduction. Some of this information is actually described in the github link containing the Tajweed ontology.
* I suggest not to use letters as well as numbers in the same enumeration.

Specific comments:
line 48 col 1: will make easier --> will make it easier, the intelligent systems --> intelligent systems
Line 38 col 2: literal meaning --> The literal meaning
Line 41-42 col 2: ungrammatical/unclear sentence: from which it originates (?) and have some characteristics (?)
Line 47 col 2: matching keywords approach --> could benefit from a reference
OWL -> would be better to write it out when mentioned for the first time and link to the OWL specifications.
Page 2 line 2-3 col 1: ontology based --> ontology-based
Page 2 line 9 col 1: a complete annotated Quranic Tajweed dataset has been constructed of high accuracy --> unclear. What was annotated and what does it mean to have a high accuracy dataset.
Bhybrid approach --> a hybrid approach
over-view of the literature review work --> overview of the literature

Background and related work
* The related work would benefit from a section that describes the body of literature related to, for instance, Linked Data and semantic annotation of cultural heritage. I would personally place section 2.1 on ontology engineering methodologies in the methodology section, as it relates more to the methodology you have chosen than to the scope of your research.
* I suggest starting with a section on the background of Quranic recitation. For a layman it is difficult to figure out what articulation points, characteristics, letter occurrences and rules are in the domain of Quranic Tajweed, and how they are useful.
* I suggest revising section 2.2, since the title does not seem to reflect the content, which makes the section difficult to understand.

Section 3:
* I suggest the authors split this section into fewer sections, for instance a development process section and a model section which would describe modeling decisions. These could then be further subdivided using your subsections.
* I suggest the authors rewrite the rules in Table 5, as they are long and difficult for a layman to understand.

Section 4:
This section describes the system architecture and the annotation process using the ontology
* Some terms are mentioned but not explained (Tajweed Factory, Automated Ontology Population)
* The section is difficult to follow. I suggest the authors organise it according to Figure 9, as at the moment both section 4 and Figure 9 are difficult to understand. What makes section 4 unclear is that there is no distinction between the architecture of the system (which components does it contain), the data processing step, and the actual annotation process (how is the system used for annotation, by whom, was there agreement on annotation between annotators (IAA), what type of data does it ingest and in what form, etc). Figure 9 does not seem to be supported by the text.
A separate description of the dataset that you annotate (examples of the Quranic Tajweed, I see a short description occurs later in section 5.3), the system, the process, and examples of the resulting output dataset, would greatly help.
Additionally, some parts of this Architecture section already contain sentences about validation ("When validating the results on the Quranic text, the rule of Noon-Sakinah and MeemSakinah was not predicted") which is discussed only later in section 5. These parts are difficult to interpret, as we do not know yet what this validation looks like and how it is performed.

Section 5:
Completeness: [...] by using SPARQL queries -> what was the outcome? How did you assess completeness this way?
* Data-driven evaluation: are you comparing the rules in your populated ontology with those in the data source for which the github link is provided? I suggest the authors provide a clearer description of the evaluation task, and a more consistent naming for their ontology as well as populated ontology, so that it is clear which resources are compared.

Data file assessment:
1. https://github.com/ramshaamin/ArabicLettersOntology
The folder contains an OWL ontology accompanied by a README.md containing a clear description of the background and available resources. The conceptual image, however, is unreadable due to the use of a black background, and the README could benefit from a per-class or per-predicate description with, for instance, their namespaces, related properties/classes, and domains and ranges.
2. https://github.com/ramshaamin/TajweedThesisV5/
The folder lacks a README.md, making it difficult to figure out what is in the data folder, and how to run it. Java code which integrates OWL API libraries in an Eclipse project appears to be there as well, but this is only briefly mentioned in the paper, without explanation of how to set the environment up (which software is used, what are the prerequisites). Some folders appear to refer to a Tajweed Factory, which is also briefly mentioned by name in the paper, but is not further explained.

From looking through the files, it appears that most of the files mentioned in the paper are in the folder: the Tajweed ontology, the rules and the Tajweed factory. The paper mentions that the annotated Quran dataset (the populated Tajweed ontology) should be in one of these folders, but it seems to be missing. Preferably provide it as a SPARQL endpoint. A nice addition would be some example queries to run on top of the populated ontology.

Review #3
By Mohamed Sherif submitted on 10/Aug/2021
Suggestion:
Major Revision
Review Comment:

The paper proposes an ontology for capturing the Quran Tajweed (i.e., rules of Quran recitation). The proposed ontology focuses on describing the articulation points of Arabic letters, Arabic letters’ characteristics and the Tajweed rules. The authors use SWRL for implementation of the Tajweed rules. For evaluating the ontology model, an expert-driven validation as well as a criteria-based evaluation were conducted.

(1) Quality and relevance of the described ontology:
The provided Tajweed ontology, as well as its rules, is relevant to the Quran script, and the authors' evaluation shows promising results. Personally, I liked the amount of effort put into the paper, especially in the evaluation part.

(2) Illustration, clarity and readability of the describing paper:
The writing of the paper is generally good, but it needs some polishing, please see my detailed notes.
The one thing still missing in this work is a SPARQL endpoint to serve the generated ontology and the annotated Quran dataset, together with W3C standard-compliant IRI dereferencing (maybe via https://github.com/LodLive/LodView).
I would also suggest adding a dataset characteristic table that contains the links to the github project (currently hidden within the text of page 19) as well as the endpoint location of the dataset, example resources, some statistics such as number of triples, classes, entities, etc.

Next, I give more detailed comments about each section of the paper:

Abstract:
- “lsamic” → “Islamic”, also in other places
- “In this paper we propose” → “In this paper, we propose”
- “Quranic recitation i.e. Tajweed” → “Quranic recitation (i.e., Tajweed)”
- “METHONTOLOGY, an iterative design methodology was used for its development.”, you need to refer to [5]

Introduction:
- “Creating ontologies in the field of Islamic knowledge will make easier for ...” → “...will make it easier ...”
- Emphasize new terms when they are first defined, such as “Quran” (line 37) and “Tajweed” (line 43)

Background and Related Work
- The sentence starting at line 49 needs rephrasing
- I think that some Semantic Web Quranic datasets, such as Semantic Quran and Semantic Hadith, are worth including in this related work section
[1] Sherif, Mohamed Ahmed, and Axel-Cyrille Ngonga Ngomo. "Semantic Quran: A Multilingual Resource for Natural-Language Processing." Semantic Web journal 6, no. 4 (2015): 339-345.
[2] Basharat, Amna, Bushra Abro, Ismailcem Budak Arpinar, and Khaled Rasheed. "Semantic Hadith: Leveraging Linked Data Opportunities for Islamic Knowledge." In LDOW@ WWW. 2016.

Overview of Tajweed Ontology:
- “Specification(identification …)” →“Specification (identification …)”, also many other places - missing a space before an opening brace
- Page 4, line 39: Emphasize “Pellet, HermiT”
- Page 4, line 48: “... resource i.e,” → “... resource. i.e.,”
- What is the difference between Tajweed student, seeker and learner or even a teacher within this context?
- Page 4, line 44 right: broken sentence, needs rephrasing
- Page 7 end: “rules:HaroofAshshafawiya” → “rules:HaroofAlshafawiya” or even better “rules:OralLetters ”
- Figure 6 is of low quality; I cannot read it
- Page 8, Line 47: “nd” → “and”
- Use “\texttt{}” for predicate names such as “is-a, partOf, hasOpposite, involvesArticulationPoint”
- Page 8, line 33, right: “Figure8” → “Figure 8”
- The RDF-XML listing at the beginning of page 9 would be better replaced with a Turtle serialization, as it is easier for humans to read. Also, “https://dbpedia.org/page/Place_of_articulation” → “https://dbpedia.org/resource/Place_of_articulation”, as the page is just the HTML representation of the resource. How did you find such links? By hand? Using some tool? According to OWL, “The built-in OWL property owl:sameAs links an individual to an individual.”, but in your case you link a class to an individual. Maybe you need to find a linking property other than owl:sameAs.
- The rows of Table 3 and 4 are not aligned, you need to add \midrule after each relation, in order to distinguish them from each other.
- Table 5: description of the 1st rule: ‘sakinah’ → “Sukūn”; this should also be fixed for all other occurrences.
- Table 5: the second rule is suboptimal, as the Qalqalah appears only for the specific Arabic letters “ق، ط، ب، ج، د”, which are not included in the rule. You may also add the type of the Qalqalah (i.e., “Kobra” (Big) and “Soghra” (small))
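
To illustrate the Qalqalah comment, here is a minimal Python sketch (not the authors' SWRL rule) of a check that is restricted to the five letters and also tags the type. It assumes fully vocalized input in which Sukūn is encoded as U+0652; Uthmani-script editions may use a different code point. Treating a word-final letter as “Kobra” is a simplifying assumption, since that type depends on where the reciter actually stops. The function name find_qalqalah is hypothetical.

```python
# The five Qalqalah letters named in the comment: qaf, ta, ba, jim, dal.
QALQALAH_LETTERS = set("\u0642\u0637\u0628\u062C\u062F")
SUKUN = "\u0652"  # ARABIC SUKUN (assumed encoding of Sukun in the input)


def find_qalqalah(word):
    """Return (index, letter, kind) for each Qalqalah letter carrying a Sukun.

    Assumption: kind is "Kobra" when the letter and its Sukun end the word
    (as if the reciter stops there) and "Soghra" otherwise.
    """
    hits = []
    for i, ch in enumerate(word):
        has_sukun = i + 1 < len(word) and word[i + 1] == SUKUN
        if ch in QALQALAH_LETTERS and has_sukun:
            is_word_final = i + 2 == len(word)
            hits.append((i, ch, "Kobra" if is_word_final else "Soghra"))
    return hits


# Two vocalized example words, written with escapes to avoid encoding issues.
MID_WORD = "\u064A\u064E\u0642\u0652\u0637\u064E\u0639\u064F"  # qaf + Sukun mid-word
WORD_FINAL = "\u0623\u064E\u062D\u064E\u062F\u0652"            # dal + Sukun at the end
```

A rule written this way makes the letter restriction explicit, so a Sukūn on any other letter produces no match.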

Evaluation:
- Page 13, the mention of Table 4 (line 37) is wrong, should be replaced by Table 6
- Again, you need to add \midrule(s) to Table 6
- The text in Equation 1 needs to be written in normal text mode for better readability
- Table 7: use \texttt{} for SPARQL as it will look better
- Table 7: I don't see a binding to the Urdu language in the first SPARQL query
- Table 7: I don't see a binding to Surah Alfalak in the last SPARQL query
- Equations for precision, recall and F-measure are well known and could be removed from the paper. i.e., remove equations 2, 3 and 4
- In Section 5.3.1, computing the F-measure requires the existence of a gold standard, in which you know where each rule actually applies. How did you acquire such a gold standard? Maybe from the Uthmani script of the Quran; if that was the case, this should be made clear at the beginning of the section.
- Page 18: “Iqlab” is mentioned twice in line 31 (right)
- Figure 13: I do not get how you get the rules in the data source (violet-colored bars)
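
The gold-standard point can be made concrete with a small Python sketch of how precision, recall and F-measure depend on a reference annotation. The representation of a rule occurrence as a (verse, letter position, rule) tuple and all example values are invented for illustration; they are not taken from the paper or its data source.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted rule occurrences against a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical occurrences: (verse, letter position, rule name).
gold = {("112:1", 4, "Qalqalah"), ("113:1", 2, "Qalqalah"), ("2:10", 7, "Iqlab")}
predicted = {("112:1", 4, "Qalqalah"), ("113:1", 2, "Qalqalah"), ("2:10", 9, "Iqlab")}

precision, recall, f1 = precision_recall_f1(predicted, gold)
```

Without the gold set there is nothing to intersect the predictions with, which is why the origin of that set needs to be stated.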

Figure 1:
- “Tanween&NoonSakenah” → “TanweenAndNoonSakenah” or “TanweenOrNoonSakenah” or “Tanween_NoonSakenah”
- Hiding, Change, MergingWithghunnah, MergingWithoutghunnah should be rdfs:subClassOf Tanween&NoonSakenah, as they are classes, not instances (as far as I understand). They will have rdf:type only if they are instances. The same applies to all red-marked classes in Figure 1.
- The “Item” class is very confusing, it could be anything. Maybe rename it to “LexicalEntity” or you may find a better name for it.
- The predicate “rules:hasArticulationpoint” has no direction
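
The rdf:type versus rdfs:subClassOf distinction can be sketched in a few lines of Python over plain triples. This is an illustrative toy, not the authors' model: the class name rules:TanweenAndNoonSakenah follows the renaming suggested above, and the individual ex:occurrence1 is hypothetical.

```python
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

triples = {
    # A class is related to its parent class via rdfs:subClassOf ...
    ("rules:Hiding", SUBCLASS_OF, "rules:TanweenAndNoonSakenah"),
    # ... while only an individual is related to a class via rdf:type.
    ("ex:occurrence1", TYPE, "rules:Hiding"),  # hypothetical individual
}


def superclasses(cls, triples):
    """Collect all direct and indirect superclasses of a class."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def inferred_types(individual, triples):
    """Asserted types of an individual plus the superclasses of those types."""
    direct = {o for s, p, o in triples if s == individual and p == TYPE}
    result = set(direct)
    for cls in direct:
        result |= superclasses(cls, triples)
    return result
```

With subclass links in place, an individual typed as rules:Hiding is also inferred to belong to the parent class, whereas the class rules:Hiding itself has no type in this toy graph.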

References:
- The names and the title of Ref. 2 are capitalized.