Review Comment:
This paper deals with a very important problem in image-based information
retrieval, known as the semantic gap. The semantic gap arises from the
different ways in which an image database can be queried. Given one or a few
images, one may ask for similar images (query by example), which is solved by
identifying and comparing features in an image. Alternatively, one may specify
a set of properties (keywords) of the images to be retrieved. Since features
and keywords are not interrelated, one cannot go from one of these queries to
the other.
This work uses three ontologies to give a semantic relationship between features
and keywords. Keywords are organized through a high-level ontology which can be
defined by the user to specify the meaning of these keywords. Features are
extracted automatically through previously existing approaches, and also organized
in a low-level ontology, taking many properties of the image into account (e.g.,
location, density, etc.). Finally, the connection between these two ontologies
is specified through a linking ontology. The overall result is three rather
simple TBoxes, which are then populated with large ABoxes.
The authors apply this general approach to create an ontology of lettrines from
medieval texts, to allow experts to query and retrieve important information from
existing data. It is important to note that the construction of the ontology
requires much manual work, not only in defining the TBox (which is small) but
also in populating the ABox. In fact, although the authors use automatic feature
extraction methods, their results need to be manually verified. Still, the result
is an ontology which can be queried at both levels: from its features, or from
its keywords (or both at the same time), as was required by the original problem.
This is a nice piece of work. Although it does not offer any technical innovations,
it shows how existing techniques can be used for practical problems, and also
points out issues that need to be solved towards this goal. The application
scenario is very interesting and within the scope of this special issue. I have
no doubt that it can be accepted.
I have only a few (pedantic) remarks:
- "T-Box", "Tbox", etc. -> "TBox"
- the same for "A-Box", "A Box", etc. -> "ABox"
- Section 2.2: "Ontologies approach have" -> "Ontology approaches have"
- "[32], image content was modeled ..." -> I could not understand this sentence
- end of Section 4.4: if something is correct 17 out of 18 times, the error
rate is 1/18, i.e., about 5.6%, not 2%