Focused Categorization Power of Ontologies: General Framework and Study on Simple Existential Concept Expressions

Vojtěch Svátek
Ondřej Zamazal
Miroslav Vacura
Jiří Ivánek

When reusing existing ontologies for publishing a dataset in RDF or for developing a new ontology, preference may be given to those providing extensive subcategorization for the classes deemed important in the new dataset schema or ontology (focus classes). The reused set of categories may consist not only of named classes but also of compound concept expressions that the knowledge engineer views as meaningful categories and may later transform into named classes in a local setting. We define the general notion of focused categorization power of a given ontology, with respect to a focus class and a concept expression language, as the (estimated) weighted count of the categories that can be built from the ontology’s signature, conform to the language, and are subsumed by the focus class. For the sake of tractable experiments we then formulate a restricted concept expression language based on existential restrictions, and heuristically map it to syntactic patterns over ontology axioms. The characteristics of the chosen concept expression language and the associated patterns are investigated using three different empirical sources derived from ontology collections: first, the frequency of the concept expression types in class definitions; second, the occurrence of the heuristic patterns (mapped to the expression types) in the TBox of ontologies; and third, the ‘meaningfulness’ of two different samples of concept expressions generated from the TBox of ontologies (through the heuristic patterns), as assessed by different groups of users, which yields a ‘quality ordering’ of the concept expression types. The results of these complementary analyses and experiments are then compared and summarized. Aside from the various quantitative findings, we also offer qualitative insights into the meaning of both explicit and implicit compound concept expressions appearing in the semantic web realm.
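
To make the definition of focused categorization power more concrete, the following is a minimal illustrative sketch in Python, not the implementation used in this work. It assumes a toy ontology given as plain dictionaries, only two category types (named subclasses of the focus class, and existential restrictions of the form "exists P.C" derived from a property's declared domain and range), and arbitrary weights. The names ToyOntology, subclasses, existential_categories and focused_categorization_power, the weights w_named and w_exists, and the example classes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyOntology:
    """Hypothetical stand-in for an ontology TBox (an assumption of this sketch)."""
    subclass_of: dict = field(default_factory=dict)  # class -> set of direct superclasses
    domain: dict = field(default_factory=dict)       # property -> declared domain class
    range_: dict = field(default_factory=dict)       # property -> declared range class


def subclasses(onto, cls):
    """Return all strict (transitive) named subclasses of cls."""
    found = set()
    frontier = [cls]
    while frontier:
        parent = frontier.pop()
        for child, parents in onto.subclass_of.items():
            if parent in parents and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def existential_categories(onto, focus):
    """Return (property, filler) pairs read as 'focus AND exists property.filler'.

    Heuristic assumed here: a property applies to the focus class if its
    declared domain is the focus class or one of its superclasses; the
    filler may be the declared range or any named subclass of the range.
    """
    categories = set()
    for prop, dom in onto.domain.items():
        if dom == focus or focus in subclasses(onto, dom):
            rng = onto.range_.get(prop)
            if rng is not None:
                for filler in {rng} | subclasses(onto, rng):
                    categories.add((prop, filler))
    return categories


def focused_categorization_power(onto, focus, w_named=1.0, w_exists=0.5):
    """Weighted count of categories subsumed by the focus class.

    The weights are arbitrary placeholders, not values from the paper.
    """
    named = subclasses(onto, focus)
    compound = existential_categories(onto, focus)
    return w_named * len(named) + w_exists * len(compound)


# Toy example: Person has two named subclasses; 'wrote' links Person to Book.
onto = ToyOntology(
    subclass_of={
        "Student": {"Person"},
        "Professor": {"Person"},
        "Novel": {"Book"},
        "Textbook": {"Book"},
    },
    domain={"wrote": "Person"},
    range_={"wrote": "Book"},
)
fcp_person = focused_categorization_power(onto, "Person")
```

In this toy setting the focus class Person receives two named categories (Student, Professor) and three compound ones (exists wrote.Book, exists wrote.Novel, exists wrote.Textbook), so the weighted count is 2 * 1.0 + 3 * 0.5 = 3.5. Changing the weights or the set of admitted expression types changes the score, which is why the definition is stated relative to a concept expression language.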
