Semantic Model for Legal Resources: Annotation and Reasoning over Normative Provisions
Submission in response to http://www.semantic-web-journal.net/blog/semantic-web-journal-special-is...
Solicited review by Paolo Ciccarese:
The article presents a semantic model for implementing the Hohfeldian fundamental relations so that legislation could be semantically annotated, reasoned upon and searched for.
Having read the entire paper, I find the author's goal clear. However, the introduction to the problem that the Provision Model is trying to solve, or, in other words, the transition from the work in [5] to the motivation for the work presented here, can be improved.
Hohfeld defines the relationship between Right/Duty as 'correlative': these interests exist on opposing sides of a pair of persons involved in a legal relationship. If someone has a right, it exists with respect to someone else who has a duty. Therefore, if a provision is classified as a Right, it is also classified as a Duty. The author should state this clearly before digging into the 'equivalent business' of the [5] work in section 4.
The author correctly states that considering terms such as Right and Duty equivalent is a problem for querying. In fact, I would argue that the use of the term 'equivalent' between Hohfeldian terms such as Right/Duty is a problem in general. Therefore, equating 'correlative' and 'equivalent' in section 4 without first clearly introducing the correct Hohfeldian terminology is confusing.
I like the definition of the Implicit/Explicit subclasses as it preserves the original provenance/intentions and it has a real semantic value while marking up the text.
The link to the ontological model, or a draft of it, has to be provided. A file with the statements provided as examples would be helpful as well. That would allow reviewers and readers to experiment directly with the model or, in other words, reproduce your experiments. Likewise, a link to the DALOS ontology should be provided.
The example of section 5.1 deserves a little more explanation. Without the ontology and the complete example, it is complicated to cross-check. For instance, 'Table 2' does not include art7;par1;spa1 among its results. Is that because the provision type is not ExplicitDuty? Looking just at the query "?par prv:hasExplicitRightBearer cl:Consumer", the results influenced by the 'owl:equivalentClass' should be commented on a bit more. As I don't have the ontology available, I am not able to verify those results directly.
Looking at the markup (section 5.1) I see spans and paragraphs. Is there a difference? In the RDF/OWL excerpt I don't see any reference to the type of document section so I would guess it does not matter? A few words on that could help as well.
It is not completely clear to me why the query in Section 7 also returns the Procedure with id="art5;par2". Once again, the ontology and the example file would have been of help. A better explanation is needed.
The word Counterpart is not always spelled correctly. I would run a spell check on the whole document and carefully revise the compound CamelCase terms.
Very little has been shared in relation to the annotation process. Besides the examples provided in the paper, is there a knowledge base that has been generated with this approach? Are there any plans to use this model in a real experiment? Are there suitable annotation tools that can be used for this purpose, or has the existing markup been performed manually?
Solicited review by Eva Blomqvist:
The article describes an extension to the Provision Model for distinguishing between explicit and implicit expressions of Rights/Duties, Powers/Liabilities, etc. The work relies on the notion of so-called Hohfeldian relations that are already expressed in the Provision Model, but in an unsatisfactory manner. The paper is well-written, with a clear narrative and good English. Nevertheless, I am missing some important pieces that would make this paper a high-quality research paper, and some claims of the author are not supported by the research results presented in the paper.
Before proceeding, I would like to mention that I have reviewed this paper mainly from a knowledge representation point of view, since this is my field of expertise. Hence, some of my concerns and questions may have obvious answers to a researcher in the legal domain. Still, this paper is submitted to the Semantic Web Journal and would also be read by people who are not experts on legal theories (despite the focus of the special issue), which requires extra care to be taken when explaining and motivating the solution.
First of all, I am missing three important pieces of the narrative: motivation for the approach, related work, and evaluation. With respect to the motivation, it is not clear to me why this approach is needed. It is clear that it can be used, e.g. as exemplified in sections 5 and 7, but the paper only very briefly motivates why such queries and reasoning are necessary. What real-world problems of legal practitioners does this approach address? In what cases is the current Provision Model not enough? Are those problems frequent and severe? Who are the users that will benefit from this reasoning?
Next, the discussion on related work in section 2 looks to me primarily as a description of work that the current results are based on (without a clear motivation why), rather than a critical discussion of related work. Have others attempted to solve this problem before (apart from the Provision Model)? How? Why are those approaches not enough? Could there be alternative ways of solving the same problem? Perhaps there are no "competing approaches" but then this should be mentioned, so that the reader understands why the Provision Model is picked as the basis of the work and why the proposed solution is not later contrasted against alternative solutions.
In terms of evaluation, I believe that any research result that is proposed should be validated or evaluated in some way, whether analytically or empirically. This paper contains two "case studies" according to the author; however, I would rather say they are "examples", not case studies. Although obviously based on real-world data, it is clear neither how representative and frequent this type of data is (i.e. no conclusion about the applicability of the approach can be drawn here) nor how representative and frequent the tasks are (i.e. no conclusion about the usefulness of the approach can be drawn). The so-called use cases simply conclude that "this can be done", which is too weak for a high-quality research paper in my opinion. First of all, as they are written now, the sections should be renamed from "case-study" to "example", and then I would like to see a proper evaluation or case study added. Even if no large-scale case study has been performed, at least a discussion of a realistic scenario within the legal domain where this type of reasoning is essential would help to motivate the usefulness of the approach. Also, a critical discussion of any potential drawbacks of the proposed approach would improve the credibility of the paper.
Apart from these major aspects that I feel are missing, there are also some claims made by the author in the paper that are not really supported by what is presented. For instance, already in the first sentence of the abstract the author writes "A Semantic Web approach for the legal domain is presented...", which leads the reader to expect some kind of overall framework for the complete legal domain, which is not exactly what the paper is about. Next, in the last paragraph on the front page the author claims that this approach is a combination of top-down and bottom-up. How? Throughout the paper I see examples of the top-down approach, i.e. carefully modeling the legal domain and applying this model to perform reasoning on pre-annotated text. I am afraid I do not see the bottom-up aspect in that. Perhaps I am missing something, but then the author needs to make this aspect much clearer: how does the proposed approach work bottom-up? How does it cater for Linked Data with minimal semantic markup, as expressed in section 1?
Also in the conclusions section, there are some claims that, to the best of my understanding, are not really supported by the text. In the first sentence of the conclusions section the author writes "The combination of Provision Model and domain ontologies can represent an effective approach to the Semantic Web for the legal domain." What does this mean, what kind of "combination", and what is "effective"? To my understanding the paper deals with a metamodel of legal concepts for annotating provisions, which I would not really call a domain ontology, so where does this claim come from? Also, the last sentence brings up an issue that is not really discussed in the paper: the reasoning complexity. It is clear that the approach has a reasonable DL complexity, but why is this the main benefit of the approach? To me this seems to indicate either that there are other approaches that are much more complex (which should then be mentioned as related work), or that reasoning performance is really of the essence in this case (which should then be exemplified through some real-world scenario). However, neither of these potential interpretations is discussed by the author in the paper.
In addition to the broader issues discussed above, I have a number of more specific questions, comments, and requests:
- Throughout the paper the author talks about "patterns", but never really defines exactly what a pattern is nor what the patterns he discusses/proposes consist of. I find it very interesting that the author has discovered some modeling patterns in the legal domain, and would urge the author to make this discussion clearer in the paper. I assume that the author means that there is a general abstract pattern underlying the way to model all the Hohfeldian relations? Could this pattern be described in its own right, i.e. apart from its instantiation in the different cases of Rights/Duties, and so on? What would it look like? In addition to the modeling patterns, it seems that there are also some potential query patterns at work here. Could it be the case that the query illustrated in section 7 actually exemplifies a general pattern of how data expressed using the extended Provision Model will be queried? Could it be useful to describe such patterns further, e.g. as query templates?
- Section 7 illustrates the use of the model in another context, but what does the extension to the Provision Model actually add in this case? Perhaps I am missing something, but it is not clear how this illustrates the application and usefulness of the proposed model, i.e. why it is important to distinguish between implicit and explicit roles in this case.
- The notations used in the paper are quite diverse. In the text some kind of semi-formal DL notation seems to be used, where sometimes triples are expressed (such as in the small table in the last paragraph of section 2) and sometimes DL expressions (although I am not sure what notation is actually used). Then figures use a UML-like notation (perhaps from TopBraid Composer), and suddenly in section 5 the notation switches to what seems to be RDF/XML, coupled with tables containing triples (but additionally containing the type of the individuals in the second column, which is a bit confusing). Also, the semantics of == and = are not so clear in this context: in bullets 1 and 2 at the bottom and top of page 5, == is used to illustrate the condition of a rule, while later in the same section = seems to illustrate something similar. In Table 2 it is not clear what the equality in square brackets inside the triple pattern means.
- Is the ontology online somewhere? Since this is a Semantic Web approach, I would expect the ontology to be published online so that it can be both reviewed and reused by others.
- Please add references to the work mentioned in footnote 1, or if it is not relevant to the paper - remove the footnote.
- First sentence of section 1: "most challenging area" -> "most challenging areas"
- Last sentence of section 2: "a ontology" -> "an ontology"
Solicited review by José Manuel López-Cobo:
This paper describes some mechanisms to annotate and reason over normative provisions using a combination of Provision Model and domain ontologies, like DALOS.
The paper presents how normative provisions can be modeled and how those provisions can be linked to existing ontologies, differentiating the implicit views of the provision (like abstract classes) from the explicit views of the provision (actual instances extracted from normative texts). The approach used to model relationships among provisions (functional relations and functional-thematic relations) clearly implies that this exercise can only work in theory and wouldn't scale to a real environment, where reasoning and inference must operate over an open model with thousands (or tens of thousands) of instances derived from the semantics of a directive or a law.
Regardless of practical consequences, the paper is well structured, and the gain in expressivity from using SPARQL to query an OWL-DL repository of semantic annotations, as the case study proposes, is evident. The link between the bottom-up approach for a semantic refinement and the Linked Open Data paradigm is not clear, and the author does not provide enough evidence that LOD is used throughout the paper.
In general, the paper presents a good effort in combining a meta-model such as the Provision Model with a domain vocabulary such as DALOS to describe normative provisions, and the consequences of their use, more precisely. Whether this approach is scalable and practical for real-world use remains to be seen and is a subject for subsequent studies.
Below, the author can find some suggestions or minor comments:
- Open Linked Data and Linked Open Data are used with the same meaning. Please choose one (LOD is the one usually selected).
- A language revision by a native English speaker would be beneficial.
- Images are really difficult to follow in a regular A4 printout. I would suggest bigger pictures.
- Section 6 promises more than it provides. Some examples of how functional-thematic relations are used in the retrieval of provisions would be interesting. More to the point, so would a discussion of how this affects the annotation process and the possibilities it opens. In my opinion, this has not been sufficiently described in the paper.