Review Comment:
This manuscript is the second revision. In this revision, the authors have expanded the discussion of the state of the art and provided more details on the expressiveness of the SNL and on the query optimization strategies. It is an interesting case of semantic rule checking on real-world building models, and I encourage the authors to continue this work. However, I do not recommend acceptance of this manuscript, since the authors have not addressed some major issues.
Originality: The idea of developing an SNL to write rules with high-level concepts and transform them into SPARQL queries is similar to existing work, such as the work mentioned by the authors in section 3 and the following:
K.R. Bouzidi, B. Fies, C. Faron-Zucker, A. Zarli and N.L. Thanh (2012). Semantic web approach to ease regulation compliance checking in construction industry. Future Internet, pp. 830-851.
In my opinion, the paper does not show sufficient added value in this respect.
The originality of this work can be attributed to the extended application of semantic rule checking to real-world, large-scale building models, but the authors should present the advances in their approach (especially in the application of Semantic Web technologies), provide additional datasets about their cases, and offer an insightful discussion of results, limitations and challenges.
Significance of the results: In section 7, the authors have shown that a real-world, large building model can be checked effectively by using SPARQL queries, in order to prove the applicability of this approach. In section 3, the authors have also stated that 85% of the national building codes and more than 95% of the domain codes can be supported by the SNL regarding its expressiveness. There are a few major issues with these results:
First of all, the applicability of this approach is proved by checking a high-rise building model with a set of rules. However, in this process, the model extraction, model transformation and checking result generation have very little to do with Semantic Web technologies and applications. The only closely related part is the SPARQL query process, which is tied to the query optimization strategies described in section 6. This part, however, is presented without reference to existing research and development on, e.g., query rewriting and indexing techniques. What is the contribution of the optimization strategies in comparison with existing work?
Secondly, the statement about the SNL’s expressiveness is not convincing. The authors should provide additional datasets or a detailed description and discussion of it. In my opinion, this SNL is a user interface language rather than an executable layer, and how many rules it can represent depends on how many concepts are formalized and mapped to data concepts. Mapping high-level concepts to building data concepts is not trivial (a space named “Bedroom” is not necessarily a bedroom); on the contrary, it is usually the most difficult and time-consuming part of developing a rule checking system. In this research, the mapping is realized by a configuration file, which is not specified and does not seem advanced with regard to Semantic Web applications.
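To make the mapping concern concrete, here is a minimal sketch that contrasts a purely name-based mapping with one that also inspects a classification property. The `Space` record, the `Category` property and both functions are hypothetical illustrations; they are not taken from the manuscript or its configuration file.

```python
from dataclasses import dataclass, field


@dataclass
class Space:
    """Hypothetical, simplified stand-in for a space in a building model."""
    name: str
    properties: dict = field(default_factory=dict)


def is_bedroom_by_name(space: Space) -> bool:
    # Naive mapping: trust the free-text name entered by the modeller.
    return space.name.strip().lower() == "bedroom"


def is_bedroom_by_rule(space: Space) -> bool:
    # Assumed alternative: prefer an explicit classification property and
    # fall back to a substring match on the name only when it is absent.
    category = space.properties.get("Category")
    if category is not None:
        return category == "Bedroom"
    return "bedroom" in space.name.lower()


spaces = [
    Space("Bedroom"),
    Space("Master Bedroom 2"),
    Space("Room 203", {"Category": "Bedroom"}),
    Space("Bedroom", {"Category": "Storage"}),
]

by_name = [is_bedroom_by_name(s) for s in spaces]
by_rule = [is_bedroom_by_rule(s) for s in spaces]
print(by_name)
print(by_rule)
```

The two mappings disagree on three of the four spaces, which is why the mapping itself, and not only the rule syntax, determines how many rules can actually be checked.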
Thirdly, the Revit building model is 93,596 KB. What is the size of the corresponding IFC model? The two should not be equivalent. The extracted OWL model has 5531 entities and 402,364 attributes. Do you mean that the transformed RDF dataset has 402,364 triples? If so, it is not really a large building model, even if the original model is 93,596 KB. This part needs to be described clearly.
Overall, the manuscript is still a general description of programming and implementation work for a single case. The authors should describe the value of their approach and its improvement over the state of the art, and provide additional datasets to support their results; otherwise other researchers can hardly profit from this paper.
Quality of writing: The general structure of this manuscript is good. I suggest providing a detailed evaluation of the SNL (e.g., its expressiveness and usability) in section 7, since it is part of the results. I enumerate a few detailed issues as follows:
Section 1:
Building information model (BIMs) -> Building Information Model (BIM) or Building Information Modeling (BIM).
“Representing design codes with rule description languages like SWRL [11], N3Logic [12], and then taking ontology reasoners like Jess [13] for checking are popular solutions.” Could you cite some work here?
“we propose a lightweighted method which rule checks big BIM models based domain knowledge on building codes and the feature of BIM models.” This sentence needs to be rephrased.
Section 3:
“logic based language” -> logic based languages
“…and SNL has no need to do re-logical and reorganization or other processing.” I am not convinced by this sentence. SNL rules are defined on the basis of informally defined concepts, and I do not think users who have no programming or data modelling experience can define ready-to-use and consistent SNL rules that require no additional adjustment. Concepts need to be mapped to specific building models to make rules executable.
“…writing SQUALL statement will use the concept of RDF”. What do you mean by “concept of RDF” here? Do you mean the RDF vocabulary?
“reference among different building codes or variety of conditions in an item”. The authors give no examples to illustrate this sentence.
The authors should provide the grammar of the SNL rather than describing it in text.
“Currently, SNL is able to cover…” I think the authors should provide additional datasets to prove it. Also, what does “domain codes” mean here? There should be references for the building codes GB 50016-2014 and GB 50096-2011.
Section 4:
What is the difference between E and EI? Do you mean that E is at the metamodel level while EI denotes instances?
“…, we extract the attribute set that belongs to the set A and referencing r…”. I don’t quite understand this sentence.
The extraction process and algorithm are not clear enough, and I doubt the scalability of this approach. The authors need to present clearly, or provide additional datasets about, how the rule library is structured, how different concepts are mapped to IFC elements (not just saying it is based on a configuration file), and what the connections with Semantic Web technologies are.
Section 5: The transformation process between SNL rules and SPARQL queries is also related to mapping between concepts. How many concepts have been mapped? Can they all be mapped with the configuration file?
Section 6: “… we made BIM domain specific optimization strategies…” Are the strategies domain specific? They look applicable to any RDF graph.
The second strategy is not presented clearly. How many structures have you refactored? What pre-queries did you define? Could you give an example?
In general, what are the differences between your strategies and existing query rewriting methods?
Section 7: In section 6, the authors say that the pattern “FILTER EXISTS {}” is transformed into normal graph patterns, but the example presented in Fig. 6 and Fig. 8 does not transform this pattern. I do not see why the query in Fig. 8 would perform much better than the query in Fig. 6. What is the time difference between them? In my opinion, the performance in this specific example might improve, but only to a limited extent. The pattern “FILTER EXISTS {?x0 ifc2x3:hasBoundaryElement ?x}” needs to be refactored and put into a proper position in the graph pattern. In this case, it is related to how to combine two trees.
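A minimal sketch of the kind of refactoring meant here, using a toy in-memory triple set and a naive left-to-right pattern matcher rather than a real SPARQL engine. Only the predicate `ifc2x3:hasBoundaryElement` and the variables `?x0` and `?x` come from the pattern quoted above; the triples, the type patterns and the helper functions are assumptions for illustration, and the equivalence shown holds for this case where both variables are already bound by the outer pattern.

```python
# Toy triple set; all resource names are made up for illustration.
TRIPLES = {
    ("space1", "rdf:type", "ifc2x3:IfcSpace"),
    ("space2", "rdf:type", "ifc2x3:IfcSpace"),
    ("wall1", "rdf:type", "ifc2x3:IfcWall"),
    ("wall2", "rdf:type", "ifc2x3:IfcWall"),
    ("space1", "ifc2x3:hasBoundaryElement", "wall1"),
}


def is_var(term):
    return term.startswith("?")


def match_pattern(pattern, binding):
    """Yield every extension of `binding` that matches one triple pattern."""
    for triple in TRIPLES:
        new = dict(binding)
        ok = True
        for p_term, t_term in zip(pattern, triple):
            if is_var(p_term):
                # Bind the variable, or check it against its existing value.
                if new.setdefault(p_term, t_term) != t_term:
                    ok = False
                    break
            elif p_term != t_term:
                ok = False
                break
        if ok:
            yield new


def match_bgp(patterns, binding=None):
    """Evaluate a list of triple patterns by joining them left to right."""
    bindings = [binding or {}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match_pattern(pattern, b)]
    return bindings


TYPES = [
    ("?x0", "rdf:type", "ifc2x3:IfcSpace"),
    ("?x", "rdf:type", "ifc2x3:IfcWall"),
]
BOUNDARY = ("?x0", "ifc2x3:hasBoundaryElement", "?x")

# Form A: the boundary pattern is kept as an existence filter, so the two
# type patterns first produce every space/wall combination.
candidates = match_bgp(TYPES)
filtered = [b for b in candidates if match_bgp([BOUNDARY], b)]

# Form B: the same pattern is moved into the graph pattern and placed first,
# so it restricts the bindings before the type patterns are joined.
joined = match_bgp([BOUNDARY] + TYPES)

print(len(candidates), filtered, joined)
```

Both forms return the same solution, but Form A builds four candidate bindings before filtering while Form B never holds more than one. Whether a real engine shows a comparable gain depends on its optimizer, which is why measured times for Fig. 6 and Fig. 8 are needed.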
Regarding the last paragraph of this section, execution time depends on how complex the rules are. I would appreciate seeing some additional datasets about the experiment (e.g., building models, RDF datasets, query sets, SNL rules and the original building codes).