Abstract
When reusing existing ontologies for publishing a dataset in RDF (or developing a new ontology), preference may be given to those providing extensive subcategorization for important classes (denoted as focus classes). The subcategories may consist not only of named classes but also of compound class expressions. We define the notion of focused categorization power of a given ontology, with respect to a focus class and a concept expression language, as the (estimated) weighted count of the categories that can be built from the ontology’s signature, conform to the language, and are subsumed by the focus class. For the sake of tractable initial experiments we then formulate a restricted concept expression language based on existential restrictions, and heuristically map it to syntactic patterns over ontology axioms (so-called FCE patterns). The characteristics of the chosen concept expression language and associated FCE patterns are investigated using three different empirical sources derived from ontology collections: first, the concept expression pattern frequency in class definitions; second, the occurrence of FCE patterns in the Tbox of ontologies; and last, the ‘meaningfulness’ of class expressions generated from the Tbox of ontologies (through the FCE patterns), as assessed by different groups of users, which yields a ‘quality ordering’ of the concept expression patterns. The complementary analyses are then compared and summarized. To allow for further experimentation, a web-based prototype was also implemented, which covers the whole process of ontology reuse from keyword-based ontology search through the FCP computation to the selection of ontologies and their enrichment with new concepts built from compound expressions.
Introduction
The main motivation for providing machine-readable semantics to data on the web in the form of ontologies is to achieve interoperability of independently built data sources and applications. For example, if the same kind of product offered by different e-shops is semantically described using the same web ontology, comparison and automatic recommendation of these offers can be provided to customers.
Obviously, interoperability depends not only on the existence of ontologies but also on their reuse. Rather than coining new entities in isolation, a dataset publisher should invest in finding relevant ontologies and integrating their entities into the schema of the given dataset. Similarly, the designers of a new ontology should consider reusing parts of existing ontologies from the same domain. Besides the interoperability benefit, reusing a categorization structure already existing in another ontology may also save a part of the design effort.
Since the majority of ontologies are nowadays published in the same standard language, OWL [27], reuse is easy from the technological point of view, whether the method is the direct reuse of existing ontology entities or their subsumption/equivalence mapping from the dataset schema or from the new standalone ontology. However, despite the general agreement on the benefits of ontology reuse, this best practice is not yet widely adhered to [3]. One reason might be that selecting an ontology, or a fragment of it, suitable for being reused, from a larger pool of ontologies (typically pre-selected via keyword-based search in ontology repositories) is a non-trivial task for which formal methods have emerged only recently. They mostly rely upon
However, these approaches face the ‘cold start’ problem. Due to rapid growth of the ontological ‘ecosystem’ on the web, many emerging ontologies relevant for a certain dataset (or, generally, a reuse case) might not yet have achieved significant popularity ratings. Furthermore, the fact that an ontology as a whole is thematically related to the reuse case does not mean that it structurally fits well the data that is to be semantically described. All this calls for complementing such ‘extrinsic’ sources of evidence with ‘intrinsic’ ones, reflecting what is in the ontology itself, beyond mere unstructured keyword matching; the ontology’s axiom structures should be examined.
Our proposed approach to enhancing web ontology reuse is based on four assumptions:
The first two assumptions are supported, among others, by the findings of a 2014 study on competency questions by Ren et al. [17]. They collected 168 competency questions (CQ) from two ontology projects, and clustered them into twelve archetypes. Of them, at least three correspond to tasks that can be characterized as sub-categorization of instances of a relatively generic class (here, pizza or software), namely:2
#1: “Which [
#3: “What type of [
#4: “Is the [
Note that while Assumption #2 gives a value to compound class expressions, we can realistically expect that for certain reuse cases (e.g., when the categories have to be imported into a static hierarchy such as a thesaurus or product catalog on a web page) the compound class expressions have to be structurally and lexically transformed to named classes.
The validity of Assumption #3 obviously depends on the meaning of “(ontology) provides (categories)”. Namely, an ontology is, structurally, a collection of axioms rather than of categories (class expressions). The generation of categories from ontology axioms, allowing for their aggregated ‘counting’ by a metric, will essentially be a heuristic process.
Assumption #4 brings a bottleneck: the respective notion of quality escapes an automatic empirical evaluation, as it depends on the subjective perception of users. We however believe that the direct human perception factor is indispensable for assuring the versatility of the reused schema with respect to different use cases. We could of course imagine some alternative, indirect quality measures, in particular, the distinction whether an expression would fit the expressiveness of the query engine to be used for querying the dataset described with the ontology. There could, however, be many mutually overlapping queries, and the expressions corresponding to each of them would only mildly contribute to the semantic underpinning of the dataset. Imagine for example two ontologies: ontology A has two relevant atomic concepts (subclasses of the focus class), and ontology B has four of them. Without caring for human perception, we might be tempted to construct the compound categories using the conjunction operator, since conjunctive queries are the cornerstone of ontology-based data access. Ontology B would then probably achieve a far higher value of the aggregating metric, having five times more possible non-empty conjunctions (15 vs. 3) than ontology A. It could however easily happen that none of the conjunctions would have any special meaning beyond that trivially following from its constituents, which would make this strong degree of preference for ontology B rather inadequate. For these reasons, and also due to the fact that the computational use of the reused ontologies may be rather diverse and unpredictable a priori, we view the metric computation as rather agnostic of this usage.
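The counts in the two-ontology comparison can be checked with a short sketch. It assumes that a ‘non-empty conjunction’ is any non-empty subset of the atomic concepts (single concepts included), which is the reading that yields the 15 vs. 3 figures; the class names are hypothetical and only serve the illustration.

```python
from itertools import combinations


def nonempty_conjunctions(atomic_concepts):
    """Return every conjunction (as a tuple) of one or more atomic concepts."""
    concepts = sorted(atomic_concepts)
    return [
        combo
        for size in range(1, len(concepts) + 1)
        for combo in combinations(concepts, size)
    ]


# Hypothetical subclasses of the focus class in ontologies A and B.
ontology_a = {"Sedan", "Hatchback"}
ontology_b = {"Sedan", "Hatchback", "Coupe", "Van"}

count_a = len(nonempty_conjunctions(ontology_a))  # 2**2 - 1 conjunctions
count_b = len(nonempty_conjunctions(ontology_b))  # 2**4 - 1 conjunctions
print(count_a, count_b, count_b / count_a)
```

The count grows as 2^n − 1 with the number n of atomic concepts, so a purely conjunctive metric would reward B exponentially, regardless of whether any of the conjunctions is meaningful.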
Let us now present a concrete motivation example, which we will use throughout the paper to illustrate various components of our approach.
A used vehicle retailer website is to be enhanced with RDF descriptions of the offered vehicles. The descriptions should refer to suitable ontologi/es whenever possible, in view of achieving interoperability with (presently unknown) applications of partners, search engines and aggregators. In other words, the ontologies will be reused as parts of the schema of the dataset (or, as recently called, knowledge graph) consisting of all the structured descriptions. Now, how can we assess the potential of different existing ontologies for being reused in this case? And how will the reuse itself then take place?
Various customers are interested in different categories of vehicles (in terms of technical parameters, make, operation history, etc.); the reused ontologies should thus make it possible to represent as many such user-demanded categories as possible. The categories can then be assigned to data items representing the corresponding vehicles within the website, which thereby become easier to search and browse. Note that the number of vehicle categories may not be correlated with the number of classes in the ontology. For the task of recommending a particular vehicle, categories refining the ‘vehicle’ concept are more relevant than others. For example, if a considered ontology O were a broader one covering transport in general, its capability to express categories of, e.g., traffic signs, would not be an argument in favor of its reuse in the given case. Ontologi/es in which many ‘meaningful’ categories of vehicles can be expressed – whether as named classes or as newly constructed compound concepts – are good candidates for becoming part of the dataset schema. The whole or part of the reused ontologi/es (after a merging step, if more ontologies are reused for the ‘vehicle’ concept) would possibly give rise to a product taxonomy published as a navigation structure of the shop website.
To name some hypothetical applications (in the context of this particular domain) where the category-centric processing of data would make sense:
within the retailer website, the user could navigate over a (multi-)tree consisting of both original and constructed categories;
the designed categories would be considered as dimensional values in an internal sales reporting dashboard;
meaningful sales segments to be coherently marketed would be identified.
We will now build on the presented example in order to explain the idea of our approach in intuitive terms, prior to formalizing it in Section 2. We will present the basic principles of focused categorization power computation, explain its broader ontology reuse context, and clarify the terminology used.
Focused categorization power computation OWL [27], being a primarily concept-based formal ontology language, allows the creation of compound class expressions of many kinds using a large collection of constructors. We define the notion of concept expression language3
Formally we should use the term ‘OWL concept expression language’; we omit the ‘OWL’ attribute for brevity, since we do not go beyond the limits of OWL in any part of this research.
Depending on the chosen language, we can build, from the signature (i.e., set of classes, properties and individuals) of an ontology, composed class expressions, for example,
These expressions can further subcategorize entities of which we already know their focus class, i.e. Car (or, Vehicle or similar). The conjunction of the focus class
A concept expression language and the signature of the ontology alone are not sufficient as input for building ‘meaningful’ categories. In most cases, the vast majority of the combinations of entities (fitting the language structure) would be nonsensical, as the entities would be mutually irrelevant. Therefore, we need one more ingredient: heuristic syntactic patterns relying on templated axioms. Upon successfully matching such a pattern (called FCE pattern) to the ontology as a collection of axioms, a class expression will be constructed, with the help of a mapping function specific for the given FCE pattern. An example of an FCE pattern (later formally defined in Section 4, Equation (7)) is one that expects the existence of a domain axiom for a certain property, stating that the subject of this property is the focus class. In our example, an axiom might state that the domain of the hasParkingCamera property is Car. Consequently, under some conditions, the CE
The ability of the ontology to subcategorize the focus class, the focused categorization power (FCP), would ideally be expressed as a sum of weights (degrees of ‘meaningfulness’) of all individual class expressions that have been generated. For example, as humans we can judge that
Ontology reuse process Considering the used-cars example as an end-to-end scenario, the (re)usability analysis of a set of ontologies should first seek in them, using lexical methods, classes expressing the general notion of ‘vehicle’. In each ontology, it will make such a class the focus class, and perform the FCP calculation with respect to it. A high value of the FCP will indicate the suitability of the given ontology for becoming a part of the schema of the website’s data. Then the actual reuse will take place, in which the previously generated compound class expressions could become the definitions of new named classes. In the navigation structure of the used-cars website, the compound concepts thus would not have to be served as logical formulas, but could be transformed (ideally, in a partially automated way) into regular noun phrases that would, explicitly or implicitly, correspond to named OWL classes – e.g., Cars that have parking camera.
Terminological note In the context of introducing the terminology to be employed in the rest of the paper, we should also make explicit our usage of the (closely related) terms concept, class and category. For the first two, we subscribe to the common practice within the semantic web community, where both are used nearly interchangeably, with ‘concept’ being more prominent within the description logic literature, and ‘class’ being the official term used in OWL specifications. ‘Concept’, in more general terms, can also denote the intension of the class, i.e., its underlying meaning independent of its extension (set of instances). To maintain the link with OWL specifications, we denote the individual expressions as ‘class expressions’; however, in the newly introduced terms (‘concept expression pattern’, ‘concept expression language’) we use the ‘concept’ alternative, to hint at the important role of human perception in our research. Finally, as regards the term ‘category’, we endow it with a particular flavor: as we see, a category is always related to a more general class/concept (i.e., the focus class) which it subcategorizes. Additionally, we associate a category with a certain concept expression language
The research presented in this paper aims to bring the following contributions:
Introduce the notion of focused categorization power of ontologies as a fundamentally new concept in analyzing ontologies (particularly relevant in ontology reuse scenarios, but possibly even beyond), position it with respect to related research, and provide it with a formal underpinning and an algorithmic solution.
Demonstrate this ontology analysis approach using a particular concept expression language (and an associated collection of FCE patterns), which is simple but sufficiently rich to show various aspects and challenges of the approach.
The demonstration is not merely descriptive, but also features empirical and cognitive studies, which on the one hand produce tentative pattern weighting needed for operational FCP computation, and on the other hand provide some general insights potentially interesting for the ontology engineering community in general.
Finally, further experimentation is made easier for other researchers by making available a prototype implementation of the whole approach, including a convenient GUI.
The present paper is an evolution of a previous conference paper [22], which provided a brief explanation of the notion of FCP, informally proposed a concept expression language with FCE patterns, and provided the results of a first cognitive experiment. The present paper however extends the previous one along numerous axes, and the majority of its volume consists of entirely new content:
The core FCP model has been completely reworked and extended by a number of notions, thus gaining more formal rigor.
The survey on FCE pattern occurrence in ontology collections, only present in an online addendum to the previous paper [22], is now an integral part of the paper, and has been re-run to provide fresher statistics. Its results are now also more thoroughly discussed.
A study on the presence of compound class expressions inside axioms has been newly added.
The cognitive experiment has been repeated on a different data sample, with more guidance provided to the experimental subjects, and with additional meta-data collection.
The different analyses / experiments in the paper are now framed by a comparative analysis.
The prototype implementation (named OReCaP), now available both as a running instance and in source code, is an entirely new result.4
Its early version was presented as a short demo paper [14]; the current version however has substantially enhanced functionality.
The rest of the paper is structured as follows. Section 2 provides a formalization of the focused categorization power framework (outlined in intuitive terms in Section 1.3). Section 3 introduces a CEL having suitable properties for an initial study. Section 4 complements this language with FCE patterns. Section 5 surveys the occurrence of concept expression patterns in ontology axioms. Section 6, analogously, surveys the occurrence of FCE patterns in the Tbox of ontologies. Section 7 describes a series of cognitive experiments in which humans provided an assessment of ‘meaningfulness’ of class expressions belonging to different patterns. Section 8 provides a comparison of those complementary surveys and experiments and a summarizing discussion. Section 9 describes a tentative operationalization of the results from the analyses, and explains the functionality of the implemented OReCaP prototype. Section 10 reviews some related methods and projects. Finally, Section 11 wraps up the paper and outlines the directions for future work.
The aim of this section is to formally underpin the whole approach as well as to motivate the empirical analyses to which most of the remainder of the paper is devoted. We will proceed from the notion of concept expression language to the notion of
Concept expression language
The use of the syntactic constructors in OWL can be restricted in different ways, producing a formal system in logic often called a fragment or sublanguage. There are a number of decidable fragments of description logics [8]. The so-called OWL Profiles, sublanguages of OWL defined in the current OWL 2 standard, are examples of such fragments [26]; however, non-standardized restricted sublanguages of OWL may also be useful for particular tasks, as here.
The notion of concept expression language, in our terms, is based upon a set of concept expression patterns constructed over a set of variables together with a set of allowed substitutions for those variables. Given a particular ontology signature, a concept expression language generates a set of concept expressions by applying all allowed substitutions to all patterns.
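The generation step just described can be illustrated by a minimal sketch. It assumes a toy representation that is not taken from the paper: a signature is a dictionary of entity names by kind, each CE pattern is a string template, and the ‘allowed substitutions’ are approximated by typing each variable with the kind of entity it may be replaced by. All names below are hypothetical.

```python
from itertools import product

# Toy signature: classes, properties and individuals of an ontology.
signature = {
    "class": ["Car", "Sedan"],
    "property": ["hasParkingCamera"],
    "individual": [],
}

# Each CE pattern: a template plus the kind of entity allowed for each variable.
ce_patterns = [
    ("{A}", {"A": "class"}),                        # named class
    ("∃{P}.⊤", {"P": "property"}),                  # existential restriction to top
    ("∃{P}.{A}", {"P": "property", "A": "class"}),  # qualified existential restriction
]


def generate_expressions(patterns, sig):
    """Apply every allowed substitution to every pattern."""
    expressions = []
    for template, var_kinds in patterns:
        variables = list(var_kinds)
        pools = [sig[var_kinds[v]] for v in variables]
        for values in product(*pools):
            expressions.append(template.format(**dict(zip(variables, values))))
    return expressions


expressions = generate_expressions(ce_patterns, signature)
for expression in expressions:
    print(expression)
```

Even this tiny signature yields expressions such as ∃hasParkingCamera.Car that fit the language but are unlikely to be meaningful, which is why the signature and the language alone do not suffice for building categories.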
Let us first recall or introduce a few preliminary notions. A signature is a triple
In order to bring in the notions of variable and substitutions, let us introduce a special signature
Further, given a signature
With preliminaries in place, we may proceed to our definitions.
A concept expression language (CEL) is a pair
One example of a CEL may be the trivial language of named concepts
Another CEL example is the language of existential restrictions to the top concept
For clarity, we explicitly define the notion of class expressions generated by a CEL: Given a CEL
Only some of the
Let
For simplicity, we will sometimes simply write ‘category’ rather than
Within this paper we will sometimes informally refer to a DCE as the subcategory of the class
The capability of O to more finely subcategorize the resources that are already known to belong to
Still remaining at the (non-operational) ‘guideline’ level, we should discuss some desiderata of the weight function just introduced. Intuitively, leveraging on some of the assumptions from Section 1.1, an ideal function
The weight w should be 0 if the extension of
The weight w should be 0 if
The weight w should increase with the decreasing complexity of
The weight w should be lowered if a category is assembled from thematically unrelated entities. An example derived from the DBpedia ontology (we also refer to it in Section 7.1) is ‘person beatified in a wine region’; it is unlikely that such a concept spanning across the religious and agricultural domain would be of high interest in any of them. However, this characteristic is least eligible for automatic assessment of all four,6
Presumably, graph-based metrics relying on the number of different paths connecting the constituent entities might be applied.
Checking whether the size of a category
Before proceeding to an operational elaboration of the category weight function in FCP computation, we need to return to the problem of category generation. While the CE patterns defining a given CEL allow for substitutions producing class expressions to be ‘counted’ in the FCP computation, such expressions cannot be directly extracted from the OWL code, which consists of (TBox, and sometimes ABox) axioms and not of standalone expressions. We will thus introduce the general notion of focused category extraction pattern – a pattern for transforming OWL axioms occurring in ontologies to class expressions to be used as the DCE of categories of a focus class.
The definition of a focused category extraction pattern relies on OWL axiom templates (axioms containing variables) that can be matched to an OWL ontology through a unification substitution (returning a set of variable bindings with ontology entities). In order to eliminate undesirable matches without burdening the axiom templates with complex (negated) descriptions of exceptions, validity constraints are furthermore introduced. Finally, the generation of CEs is carried out by mapping functions.
A focused category extraction pattern (an FCE pattern, for short) is a 5-tuple c is a CE pattern. For simplicity we only consider conjunctions of axioms. Disjunctions could be expressed by means of multiple FCE patterns for the same CE pattern, if needed.
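The interplay of axiom template, validity constraint and mapping function can be sketched as follows. This is a simplified illustration rather than the formal 5-tuple: axioms are modeled as hypothetical (kind, subject, object) tuples, the template is reduced to an axiom kind whose object must be the focus class, and the mapping of a domain axiom to an unqualified existential restriction is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Toy ontology: each axiom is a (kind, subject, object) tuple (hypothetical names).
axioms = [
    ("domain", "hasParkingCamera", "Car"),
    ("domain", "hasSignShape", "TrafficSign"),
    ("subClassOf", "Sedan", "Car"),
]


@dataclass
class FcePattern:
    axiom_kind: str                       # axiom template: kind(?x, focus class)
    is_valid: Callable[[str, str], bool]  # validity constraint on the binding of ?x
    to_expression: Callable[[str], str]   # mapping function producing a CE


def extract(pattern, ontology_axioms, focus_class):
    """Match the template, filter by the constraint, and map bindings to CEs."""
    result = []
    for kind, subject, obj in ontology_axioms:
        if kind == pattern.axiom_kind and obj == focus_class:
            if pattern.is_valid(subject, focus_class):
                result.append(pattern.to_expression(subject))
    return result


# Assumed mapping: a domain axiom on the focus class yields ∃P.⊤.
domain_pattern = FcePattern("domain", lambda p, fc: True, lambda p: f"∃{p}.⊤")
# Assumed mapping: a named subclass of the focus class yields the class itself.
subclass_pattern = FcePattern("subClassOf", lambda c, fc: c != fc, lambda c: c)

categories = (extract(domain_pattern, axioms, "Car")
              + extract(subclass_pattern, axioms, "Car"))
print(categories)
```

The axiom about hasSignShape is ignored because its domain is not the focus class, mirroring the point that categories of traffic signs should not count in favor of an ontology when vehicles are in focus.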
Considering we have two kinds of patterns (CE and FCE), which both abstract over some OWL structures, a question naturally arises why these two could not be more integrated or based one upon the other, e.g., in the sense that both the LHS and RHS of an FCE pattern axiom template would be CE patterns. There are however two factors that would make such an integration cumbersome or impossible:
First, while the variables in CE patterns are strictly typed (in the sense that the abstract signature
Second, the role of the FCE patterns is not merely in matching the OWL structures but also in constructing class expressions based on the match. The CE pattern of the constructed expressions will not always (probably, just rarely) correspond to the structure of expressions explicitly present in the axioms matched by the FCE pattern.
For these reasons, we keep the two notions completely separate. In the rest of the paper we will carefully distinguish which type of pattern is being discussed (using the ‘CE’ and ‘FCE’ acronyms) unless the type of pattern is entirely clear from the context.
Assignment of a weight w to a certain
FCE pattern axiom template matching, applied on the context of
Inferential relationships between
As pointed out above for the specific case of D being a named class, but valid generally, if
Moreover, if
Frequency of instantiation of
Likelihood that the CE pattern
Let us demonstrate these sources on the vehicle domain of our motivating example. A considered ontology contains in its signature a focus class Car8
For easy readability of the DL formulas, we will mostly use the simple DL notation without IRI prefixes in the examples. The namespace will be irrelevant for artificial examples and clear from the context in real-world examples.
FCE pattern axiom template: Let the property hadAccident have Car as its domain. An FCE pattern based on the rdfs:domain property (see Section 4 for details) is then matched in the ontology TBox. The rationale of heuristically linking this FCE pattern to the CE pattern
Inferential relationships: If the axioms of the ontology are consistent with reality, the reasoner would not infer any prohibitive statement, since the set of cars that have had an accident is neither equivalent to the set of all cars, nor is it empty or a singleton. (Hypothetically, if all cars in the universe have had an accident,
RDF datasets (Abox): Let us assume there is already a publicly available car dataset published using this ontology, which contains 2500 instances of Car, of which 1000 appear as subject in at least one triple with predicate hadAccident. The category, if translated to a SPARQL9
CE pattern likelihood: As we partially verify in Section 7, the CE pattern
As suggested in the previous subsection, the ultimate weight of a category can be influenced both by the factors specific for the particular category and by the CE pattern type of the DCE of this category. The category weight computation (for the sake of FCP computation, see Equation (1), i.e., still at the non-operational level) could be possibly decomposed as
⊗ is a function for combining the two partial weights into a single weight of the category.11
As we exemplified (on ‘cars with an accident’) in Section 2.5, the presence of (trustworthy) Abox data should overrule the assumptions made based on the specific Tbox information (within
As regards the CE pattern weight
Humans, who can assess concrete expressions conforming to this CE pattern as less or more meaningful subcategories of a focus class.
RDF datasets instantiating expressions of this pattern (under the inferential closure) with smaller or bigger numbers of Abox statements.
Existing ontology axioms in whose right-hand sides12
By the RHS we mean the syntactic RHS as indicated in the code. For the overwhelming majority of axioms inside the OWL code available in ontology documents, the syntactic LHS is a named class. Although the OWL grammar allows for general concept inclusions, having a compound concept on the left-hand side (see
Orthogonally, we can also aggregate the occurrence counts of FCE patterns in ontologies. This fourth source, in turn, will indicate the upper bound of the quantity of categories that can be obtained for a focus class using such patterns.
In different sections of the paper we will empirically investigate three of the sources identified in this subsection: the CEs in axioms in Section 5, the FCE patterns in Section 6, and the human assessment of FC+category pairs in Section 7 (thus only deferring the Abox analysis to later research), and discuss their potential and limitations in more detail.
To avoid any mismatch of the presented four ‘weight sources’ list with the (also four) ‘weight sources’ from Section 2.5, note that the sources from Section 2.5 are applied ‘deductively’, to estimate the weight of a particular category, while the sources in this section serve for ‘inductive’ derivation of the (average) weight pertaining to a whole CE pattern.
We will now specifically consider the situation when the Abox information on the usage of a specific category is either unavailable or unreliable, and we thus have to derive the category weight from the ontology alone. Under such conditions, the individual weight function
0 if
Let us then assume that we distinguish among k CE patterns allowed in the CEL
Note that in terms of Eq. (3), the multiplication by each CE weight were equal to its CE pattern weight, and, the FCE patterns were ‘perfect’ in the sense of generating exactly those CEs producing a non-zero weight (if used as the DCE of a category).
Notably though, these assumptions are entirely unrealistic in any practical setting. This entails that the resulting ranking of ontologies for reuse according to
Pattern-based CE extraction and approximate FCP computation algorithm
The computational process leading to extraction of class expressions using FCE patterns is rather straightforward with respect to the general description of the approach in Section 2.7, see the algorithm pseudo-code in Listing 1. Each call of the function
For completeness, we also include the approximate FCP computation algorithm, as Listing 2. It simply amounts to summation of CE pattern weights

Listing 1: Pattern-based CE extraction algorithm

Listing 2: Computation of approximate FCP
With the algorithm, we now have a general operational framework allowing us to approximately calculate the FCP of an ontology with respect to a focus class, relying purely on the axioms from the ontology.
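As a rough illustration of the summation step, the following sketch adds up CE pattern weights over the class expressions extracted for one focus class. The pattern names, the weights and the extracted expressions are hypothetical, and the removal of duplicate expressions (which may arise when several FCE patterns yield the same CE) is an assumption of this sketch rather than a documented step of the algorithm.

```python
# Hypothetical CE pattern weights (tentative weights would come from experiments).
pattern_weights = {"named_class": 1.0, "exists_top": 0.5, "exists_class": 0.25}

# Hypothetical extraction result for the focus class Car: (expression, CE pattern).
extracted = [
    ("Sedan", "named_class"),
    ("Hatchback", "named_class"),
    ("∃hasParkingCamera.⊤", "exists_top"),
    ("∃hasParkingCamera.⊤", "exists_top"),   # same CE produced twice
    ("∃madeBy.Manufacturer", "exists_class"),
]


def approximate_fcp(extracted_ces, weights):
    """Sum the CE pattern weights over the distinct extracted expressions."""
    distinct = set(extracted_ces)  # assumption: each CE is counted once
    return sum(weights[pattern] for _, pattern in distinct)


fcp = approximate_fcp(extracted, pattern_weights)
print(fcp)
```

Since every expression of a pattern receives the same weight, the result depends only on how many distinct expressions each pattern contributes, which keeps the computation independent of any Abox data.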
The framework still needs to be instantiated with three components:
a concrete CEL, providing the CE patterns,
sets of FCE patterns, linked to these CE patterns via mapping functions, and, finally,
weights of these CE patterns.
In Section 3 we provide a CEL that is simple to work with but still sufficiently rich to demonstrate various aspects of the method. In Section 4 we suggest a set of FCE patterns for its CE patterns. Finally, tentative CE pattern weights are deduced from the experiments in Section 7, while the preceding sections, 5 and 6, provide some empirical insights on the proposed CE and FCE patterns, respectively.
Simple existential CEL:
Having explained the whole proposed general framework of FCP computation, let us now return to its most essential element: the concept expression language (CEL) that determines what kinds of CEs are considered for the differentiating class expressions (DCEs) specializing the focus class.
As set up by Assumption #2 from the introduction (importance of both atomic and compound CEs for the FCP), for a meaningful analysis of the FCP computation landscape we have to combine some variant of the language of named concepts (
The interplay between properties and classes, which are prone to be viewed separately in some ontology design environments, e.g., Protégé, may intuitively give rise to concepts ‘interesting to look at’, whatever their ultimately judged contribution to the FCP will be.
The number of matching CEs only grows linearly with the number of classes
The existence of a property assertion (witnessing the validity of the existential restriction for the subject entity) can be easily checked in the Abox, even while complying with the open-world assumption (OWA). This favors the existential restriction over the universal restriction or max-cardinality restrictions, whose validation has to rely on the closed-world assumption (CWA).
OWL concept constructors’ usability for focused categorization is analyzed in more detail in Section 5.
When considering the CE pattern
One of them is
The other is
Along with the CE patterns based on existential restriction, we will also consider the sole CE pattern of
Based on these considerations (and, additionally, eliminating some uninteresting edge cases) we defined a suitable CEL, for the sake of the empirical research described further in this paper, as follows.
The simple existential CEL is the CEL
Therefore, the CE patterns considered for the sake of the presented research are: a named class; an unqualified property restriction (i.e., one with ⊤ as the filler); a property restriction with a named class as the filler; and, an individual value restriction (a property restriction with a singleton class as the filler).
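To make the four CE patterns more tangible, the sketch below checks whether an individual belongs to an expression of each pattern, using only explicitly asserted Abox triples. It is an assumption-laden illustration: the triples and entity names are hypothetical, the Abox is a plain set of tuples, and no inference (e.g., over subclass axioms) is applied.

```python
RDF_TYPE = "rdf:type"

# Hypothetical Abox as a set of (subject, predicate, object) triples.
abox = {
    ("car1", RDF_TYPE, "Car"),
    ("car1", "hasParkingCamera", "cam7"),
    ("cam7", RDF_TYPE, "RearCamera"),
    ("car2", RDF_TYPE, "Car"),
    ("car2", "madeBy", "acme"),
}


def is_named_class(ind, cls, triples):
    """Named class: one rdf:type triple."""
    return (ind, RDF_TYPE, cls) in triples


def has_some(ind, prop, triples):
    """Unqualified restriction ∃P.⊤: any triple with the property."""
    return any(s == ind and p == prop for s, p, _ in triples)


def has_some_of_class(ind, prop, cls, triples):
    """Restriction ∃P.C: a property triple whose object is typed with C."""
    return any(
        s == ind and p == prop and (o, RDF_TYPE, cls) in triples
        for s, p, o in triples
    )


def has_value(ind, prop, value, triples):
    """Individual value restriction ∃P.{a}: one specific property triple."""
    return (ind, prop, value) in triples


print(has_some("car1", "hasParkingCamera", abox))
print(has_some_of_class("car1", "hasParkingCamera", "RearCamera", abox))
```

Each check only needs to find witnessing triples, so a positive answer remains valid under the open-world assumption; a negative answer merely means that no witness has been asserted.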
Since
Table 1 gives an overview of the CE patterns:
The first column assigns the CE patterns a numbering local to the given CEL, for convenience.13
We denote these specific CE patterns in bold face, to differentiate them from CE pattern meta-variables using the math font with subscript notation (
The second column indicates the structure of the CE pattern itself in DL notation; the CE patterns substitutions are already restricted according to
The third column indicates which variables from the CE structure are to be substituted by corresponding
The fourth column measures the length of the Abox path (as number of triples) connecting the individual j to be assigned to the CE with entities (‘responsible’ for the assignment) substituted for variables from the third column, in other words, the minimal size of the SPARQL graph pattern to be used for the CE instance detection in the Abox. The SPARQL pattern will rely on the rdf:type predicate (
The order of the CE patterns in the table reflects the increased complexity of their detection in the Tbox using FCE patterns designed for
Summary of CE patterns in
Looking back at the competency questions of Ren et al. [17] referenced in Section 1, we see that while the archetypes #3 (“What type of [CE] is [I]?”) and #4 (“Is the [CE1] [CE2]?”) assume named subclasses of the FC, i.e. are covered by the CE pattern
Having defined
Inventory for
FCE pattern axiom templates
In principle, the FCE patterns could feature any OWL constructs within their axiom templates (
Both the existential and universal restriction require three triples in the RDF representation; one of them in both cases relies on an auxiliary predicate
While the construction of the heuristic FCE patterns for
Existential restriction is also used, but it is in the constraint part of the FCE patterns rather than in the axiom template part. Therefore, while the pattern setting assumes the presence of both ‘global’ (domain and range) and ‘local’ (existential) restrictions, omission of the latter by the designers does not preclude the generation of CEs but only disables their pruning.
We present below five FCE patterns for
Pattern
Also note that even such a simple FCE pattern might not be the only possible for the given CE pattern. If we accept the possibility that not all classes in different hierarchical paths are pairwise disjoint, we could also apply to
As we move from
The relationship between
Pattern
The restriction can also be inherited from a superclass or part of a complete definition, or can have the form of a
In order to illustrate the joint application of the two FCE patterns introduced so far in the calculation of the approximate FCP (Eq. (4)), we return to our running example. Unlike the previous, artificial examples, it refers to a real-world ontology.
The schema.org ontology contains a class
Here and in the rest of the running example we omit the
There is actually a tweak in this example. schema.org uses schema:domainIncludes instead of rdfs:domain, and analogously for the range axioms (schema:rangeIncludes). These properties do not have inferential semantics, and merely relate a given property to a class that “is (one of) the type(s) the property is expected to be used on” (
The instantiated
Intuitively, while
Let the weights be, for example,
Let us now proceed to the FCE patterns for the remaining CE patterns included in
Pattern
As this FCE pattern is already a bit more complex, we will also demonstrate its application (and the added value compared to

Tbox of
Reliance of FCE patterns on asserted ontology structures might make the patterns vulnerable to ontology versions in which the inferences have already been materialized. However, while the transitive rdfs:subClassOf inference is sometimes materialized in ontologies (i.e., transitive superclasses, rather than just the direct superclass, are linked to a class in the ‘asserted’ ontology), it is very unlikely that this would be the case for the upward propagation of domain/range restrictions.
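Should a materialized subclass closure be a concern, one conceivable preprocessing step (not part of the approach described here, and offered only as a sketch) is to recover the direct superclasses before matching the patterns. The code assumes an acyclic hierarchy given as a dictionary from each class to its asserted superclasses; the function names and the example classes are hypothetical.

```python
def ancestors(sub_of, cls):
    """All classes reachable from cls by following superclass links."""
    seen = set()
    stack = list(sub_of.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(sub_of.get(current, ()))
    return seen

def direct_superclasses(sub_of):
    """Drop superclass links that are implied by another asserted link.

    Assumes the hierarchy has no cycles (no mutually subsuming classes).
    """
    direct = {}
    for cls, supers in sub_of.items():
        implied = set()
        for sup in supers:
            implied |= ancestors(sub_of, sup)
        direct[cls] = set(supers) - implied
    return direct

# Hypothetical hierarchy where Sedan -> Vehicle has been materialized.
materialized = {
    "Sedan": {"Car", "Vehicle"},
    "Car": {"Vehicle"},
    "Vehicle": set(),
}
reduced = direct_superclasses(materialized)
```

Here `reduced["Sedan"]` keeps only `Car`, since the link to `Vehicle` already follows from `Car`.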
Pattern
The inferential closure is used as in
Pattern
We can now illustrate the application of all considered FCE patterns on our used cars example.
The property
Such special-purpose classes would have to be filtered out; typically they could be automatically detected by appearing in the range of a huge proportion of properties.
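A minimal sketch of such a filter follows, assuming the range axioms are available as (property, class) pairs. The cut-off `min_share` is a hypothetical parameter; the text does not fix what counts as a ‘huge proportion’.

```python
from collections import Counter

def special_purpose_classes(range_axioms, min_share=0.5):
    """Return classes that occur in the range of at least min_share of all properties.

    range_axioms: iterable of (property, class) pairs.
    min_share: hypothetical cut-off, to be tuned per ontology collection.
    """
    pairs = set(range_axioms)            # ignore repeated axioms
    properties = {prop for prop, _ in pairs}
    if not properties:
        return set()
    per_class = Counter(cls for _, cls in pairs)
    return {
        cls for cls, n in per_class.items()
        if n / len(properties) >= min_share
    }

# Hypothetical range axioms: 'Thing' is the range of 3 of the 4 properties.
axioms = [
    ("p1", "Thing"), ("p2", "Thing"), ("p3", "Thing"),
    ("p3", "Fuel"), ("p4", "Color"),
]
flagged = special_purpose_classes(axioms)
```

With the default cut-off, only `Thing` is flagged in this toy input.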
All five FCE patterns assure (
In contrast, this would for example not be the case for the alternative FCE pattern for
As regards the complexity of the FCE patterns in terms of (templated) RDF triples in the template axioms, it is as listed, for the respective CE patterns, in the fifth column of Table 1.
We have previously stated that the proposed FCE patterns are merely heuristic (approximate) with respect to the optimally chosen set of
First of all, the Assumption #4 from Section 1, i.e., that the quality of compound categories should be correlated with the degree to which users would consider a corresponding named class as meaningful, rules out any kind of optimality guarantee, since human judgment is always subjective. The designer wishing to reuse an ontology may even question the asserted subclass axioms from the source ontology, thus lowering the precision of the
The situation is analogous for the FCE patterns’ recall. Even
As regards the FCE patterns
The generic algorithm in Listing 1 ends by generating a list of CEs (plus their CE patterns) for a given focus class. However, to finalize our schema.org example narrative, we will also consider the subsequent step of named category generation, already briefly mentioned in Example 1.
Let us assume that the categories discovered for the purpose of the FCP computation are retained for future use. The vehicle retailer might then decide to rebuild the product catalog so as to cover the categories, even the compound ones, that are populated by a significant number of vehicle items. The categories derived from the same property, say,
This structure might either become materialized in the underlying ontology as such or might only be generated at the web engineering level. The taxonomy would naturally follow the specialization of CE patterns as in Table 1: the categories with DCE conforming to
FCE patterns are a crucial element of the operationalization of FCP computation over the ontology axioms. We therefore described the FCE patterns for
CE patterns in ontology axioms
We analyzed the axioms of publicly available ontologies as an empirical source for estimating the frequency of occurrence of CE patterns of the CEL
Ontology axiom sources
For our analysis we used three collections of ontologies:
The collection indexed by the Linked Open Vocabularies (LOV) portal,23
The BioPortal collection,24
A small experimental collection of ontologies having heterogeneous styles and relatively rich in axioms, from the domain of conference organization, called OntoFarm.25
The impetus for this empirical analysis was the close relationship between the central motivating task of the research, that of using CEs (either named classes, or compound CEs that can possibly be transformed to named classes) for categorizing individuals, on the one hand, and the task of defining new named26
We can for now ignore general concept inclusions (having a compound CE as their left-hand side), which are allowed in some dialects of OWL but only used sparingly in real-world ontologies.
Note that the subsequent use of the constructed CEs is, in some aspects, similar in both tasks, too. Notably, in all cases, an aspect of categorization is present:
A compound CE in the RHS of an equivalence axiom allows one to intrinsically (i.e. using the apparatus of DL tightly coupled with OWL as representation language) infer individuals to be instances of the named class in the left-hand side (LHS) of the axiom.
A compound CE in the RHS of a subsumption axiom (which represents a mere necessary, and not sufficient, condition) analogously allows one to intrinsically rule out individuals from being instances of the named class in the LHS of the axiom.
A compound CE that can be merely constructed from the ontology signature under some CEL and for some FC still allows one to subcategorize individuals, but merely extrinsically, i.e., either using manual assignment or some other external source (e.g., a machine-learning-based classifier).
Technically, the expected outcomes of the analysis were the following:
Findings about the frequency of various concept expression patterns in the RHS of axioms in existing ontologies (possibly interesting for the community even beyond the focused categorization setting)
Positioning of the CEL
The considered list of CE patterns is, essentially, that of first-level constructors in the axiom RHS. However, since we were particularly interested in the compound CE patterns of
Strictly speaking, the singleton enumeration
The frequency of CE pattern occurrence, in absolute counts, was classified by three dimensions:
By the analyzed ontology collection.
By the distinction of equivalence or subclass axioms (and the sum of both).
By the level of nesting: outermost constructor vs. further levels.
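The three-way classification above can be sketched as a simple tally. The input format (one tuple per constructor occurrence, with a numeric nesting depth) and all names are assumptions made for illustration; they do not describe the tooling actually used for the survey.

```python
from collections import Counter

def tally(occurrences):
    """Count constructor occurrences by collection, axiom kind and nesting level.

    occurrences: iterable of (collection, axiom_kind, depth, constructor),
    where depth 0 is the outermost constructor of the axiom RHS.
    """
    counts = Counter()
    for collection, axiom_kind, depth, constructor in occurrences:
        level = "outermost" if depth == 0 else "nested"
        counts[(collection, axiom_kind, level, constructor)] += 1
    return counts

def collapse_nesting(counts):
    """Sum over the nesting dimension, keeping the other two."""
    collapsed = Counter()
    for (collection, axiom_kind, _level, constructor), n in counts.items():
        collapsed[(collection, axiom_kind, constructor)] += n
    return collapsed

# Hypothetical occurrences, for illustration only.
sample = [
    ("LOV", "subclass", 0, "some"),
    ("LOV", "subclass", 1, "some"),
    ("LOV", "equivalence", 0, "and"),
]
by_all = tally(sample)
by_two = collapse_nesting(by_all)
```

Collapsing the nesting dimension merges the outermost and nested counts of the same constructor into one cell per collection and axiom kind.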
The results are in Table 2, however omitting the last dimension (nesting level) for brevity. The most frequent constructors are listed: 10 for LOV and BioPortal, and 5 for OntoFarm (where only the top of the ranking is relevant, due to very low counts). The constructors corresponding to
CE patterns in axiom RHS
As regards the
The other constructors appearing at the first level of the axiom RHS nesting are:
Conjunction (⊓). The possibility of building logical conjunctions via plausible axiom patterns, analogous to those from Section 4, deserves a further study. E.g., for the
Disjunction (⊔). Unlike for conjunctive expressions, any pair of expressions is mutually compatible in a disjunction. As mentioned in Section 3, the number of disjunctive combinations will thus be very high for larger ontologies, and the question of which combinations should be taken into account for FCP or not would not be easy to decide at the axiom pattern level. Intuitively, concepts ‘closer to each other’ in the ontology might be more adequately combined (e.g., the customer might be more likely to seek a “
Existential restriction to a compound CE (‘ExistAnon’). This could be one of the meaningful extensions to
Universal restriction (∀). The observation that universal restrictions rarely appear in equivalence axioms can be explained by the fact that classifying individuals by universal restrictions is impossible on the open semantic web due to the OWA: the fact that all known individuals
Cardinality restrictions (
Enumeration (
Negation (¬). In some contexts, a negated concept might perhaps be a sensible subcategory of a focus class; however, such contexts would be hard to guess a priori. Considering the negation categories in a generic way would simply double the number of categories for each CE pattern to be negated, and would thus have no impact on the FCP other than that of doubling the weight of those CE patterns.
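The remark on negation can be made explicit with a short worked equation. It assumes the FCP is approximated as a weighted count of categories per CE pattern, as described in the abstract; the symbols w_p (weight of pattern p) and n_p (number of CEs of pattern p generated for the focus class) are notation introduced here for illustration, and the negated categories are assumed to receive the same weight as the original ones.

```latex
\begin{align*}
\mathrm{FCP} &\approx \sum_{p} w_p \, n_p \\
\text{with negation added for pattern } p:\quad
w_p \,(2\, n_p) &= (2\, w_p)\, n_p
\end{align*}
```

Admitting the negation of every CE of a pattern therefore contributes nothing that a doubled weight for that pattern would not already express.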
The contribution of this section was twofold. First, it featured the CE pattern occurrence frequencies as such, which could be of general interest to the ontology design community. Second, specifically in the line of the current paper, it provided intellectual analysis of axiom type roles in FCP.
The analysis suggested that the
FCE pattern occurrence in the Tbox of ontologies
The questions to be answered by this analysis were:
How many ontologies, and for how many FCs, provide a decent number of ‘categorizing’ CEs through heuristic mapping from the patterns from Section 4?
What are the differences in the occurrence of the individual FCE patterns overall and across different ontology collections?
The answers to these questions would help us estimate how likely it is that the particular FCE pattern would yield potential categories when a focus class is provided as an input. (They will not, though, provide an estimate of the quality of such categories; this question will be addressed in Section 7.)
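A minimal sketch of the kind of aggregation this analysis involves: given, for each class taken as FC, the number of CEs generated per FCE pattern, it flags the classes that reach a threshold and reports their share within the ontology. The data layout, the pattern labels `p1` and `p2`, and the function names are assumptions for illustration.

```python
def categorizable_classes(counts, pattern, tau):
    """Classes for which the pattern yields at least tau CEs.

    counts: dict mapping each focus class to a dict {pattern: number of CEs}.
    """
    return {
        fc for fc, per_pattern in counts.items()
        if per_pattern.get(pattern, 0) >= tau
    }

def share_categorizable(counts, pattern, tau):
    """Fraction of the ontology's classes that are categorizable for the pattern."""
    if not counts:
        return 0.0
    return len(categorizable_classes(counts, pattern, tau)) / len(counts)

# Hypothetical per-class counts for one ontology.
counts = {
    "Vehicle": {"p1": 4, "p2": 1},
    "Offer": {"p1": 0},
    "Person": {"p1": 3},
}
hits = categorizable_classes(counts, "p1", tau=3)
ratio = share_categorizable(counts, "p1", tau=3)
```

Raising the threshold can only shrink the set of categorizable classes, so the share is non-increasing in the threshold.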
Ontology sources and data aggregation method
In the analysis we made use of our Online Ontology Set Picker framework28
In order to provide aggregate results, we counted the occurrences of FCE patterns from Section 4 across all classes of all ontologies evaluated in the role of FC. We summed up these results at ontology level by identifying ‘categorizable’ classes, i.e. classes for which the pattern occurrence reached some threshold τ (1, 3, and 5) for patterns
Pattern
For illustration of the bottom-up calculation steps, let us take the example of an OntoFarm ontology called cmt. This ontology satisfies the parameter thresholds for
Ratio of ontologies with a
Let us now try to answer the questions posed at the start of the section, by examining the result table.
Unsurprisingly, the percentage of ‘categorizable’ classes is in most cases highest for
For
Results for pattern p5
For completeness, we performed an analysis of ontologies which use SKOS concepts for entity categorization (pattern
Table 4 presents information about those 7 ontologies in terms of the number of SKOS concepts that can be used as the ‘categorization individual’ and the number of SKOS concept schemes from which those concepts come. Further, we include information about the date of the last modification of the ontology. In two cases, no concept schemes are available. For the other ontologies the number of SKOS concepts (and SKOS schemes, respectively) usable for categorization varied from 11 to 274 (from 1 to 16, respectively). Although the phenomenon captured by
Categorization via SKOS concepts (FCE pattern p5)
The findings about
Cognitive experiments: CE assessment
The previous analyses carried out over the ontology Tbox (axioms RHS in Section 5 and FCE pattern occurrence in Section 6) only indirectly contributed to the central question: whether the compound CEs from
In order to get finer insights, we proceeded to a detailed investigation of sample CEs by human ‘ontologists’, both experts and relative novices (students of relevant subjects). We performed two campaigns of experiments, the first in Spring 201630
Already described in our early publication [22]. Here we provide a synoptic view of both campaigns.
Provide the human assessors with a set of ‘focus class – subcategory’ pairs, such that the subcategories correspond to different CE patterns. Each pair constituted an elementary assessment task.
Collect the assessment for each task, reflecting the degree to which the assessor perceived the category as ‘meaningful’, or, better, ‘meaningful and reusable’ (or also, for short, what is the perceived quality of the category), via a questionnaire.
Aggregate the collected data across the assessors.
Seek correlations between the category quality and the CE pattern, possibly corrected through metadata (also collected in the questionnaire) such as the level of written English of the assessors or their general comprehension of the meaning of the constituent entities of the category.
Examine the qualitative feedback also collected through the questionnaire.
A summary of the experimental setting in both campaigns is in Table 5.
Overview of cognitive experiments with students
Initial sampling As regards the FC+subcategory (from now on, ‘task’) sampling for both threads of analysis (expert/novice), we used the same collections as in the survey from Section 6, i.e. LOV (a 2016 snapshot) and OntoFarm (this collection has remained unaltered for nearly a decade). From each collection 40 tasks were randomly generated while maintaining an approximately even proportion of subcategories for patterns
Expert ontologist assessment and insights In the first stage, the assessment was made by three researchers with 10–20 years of experience in ontology engineering (authors of this paper: VS, OZ and MV). They first examined the 59 tasks independently and assessed them on the 5-point Likert scale: for each task, with a focus class Y and a subcategory X, the question “Is X a meaningful subcategory of Y?” was answered as either ‘certainly’, ‘perhaps’, ‘borderline’, ‘perhaps not’ or ‘certainly not’. Then a consensus was sought in a face-to-face session. The independent assessment had 76% agreement: in 45 out of 59 cases there was no contradictory assessment (certainly/perhaps yes vs. certainly/perhaps not); we will call these cases clear positives (42 cases) and clear negatives (3 cases), respectively. The consensus session then yielded a complete consensus on the remaining cases; in 12 out of the 14 ‘clash’ cases the final result was ‘yes’ (namely, a conceivable situation was formulated in which the CE would be a meaningful subcategory of the FC), one case was found dubious due to an implausible inference (see the second ‘insight’ below, on the ‘village head chef’) and in one case the subcategory was assumed semantically equivalent to its FC, both resulting in ‘no’. Of the five ultimately negative results, four were of
In this section we write the example CEs in Manchester syntax, with the keyword
Ontologies tied to software applications, such as some OntoFarm ones (capturing the processes supported by conference software) use object properties to capture relationships that are only relevant within a short time frame, e.g.,
In some cases the use of inferential closure for the filler class in
Some CEs of
Novice ontologist assessment There were two groups of students involved: Bc-level students in a course on Artificial Intelligence (AI) and MSc-level students in a specialized course on Ontology Engineering (OE). Both courses provided a certain degree of OWL modeling experience (in Protégé and Manchester syntax) prior to this exercise, although OE went into more depth as regards the underlying DL and reasoning. There were 17 AI students and 10 OE students altogether. In both courses the students were first provided with a 30-minute overview of the notions of CE (in
The questionnaire was in Czech. The English translation of a sample task is available in Appendix A.
The 59 tasks from the initial sample were randomly divided into three questionnaire versions (one
We aggregated the results by questionnaire task, and then both by the course and by CE pattern. The aggregation was carried out by simple summation over the answer values rescaled to the
A short digest of the results follows:
The average NS over all 60 tasks was 0.07, i.e. rather low, although positive. Of the 60 NS values, 28 were positive, 5 zero and 27 negative. The values strictly below 0.25 and above −0.25, possibly viewed as ‘borderline aggregates’, were 34 (57%).
Perhaps most important, the average NS was highest for
The cases33
Most namespace prefixes used can be expanded using the prefix.cc service. Prefixes unlisted by this service follow:
The average NS was higher for the OE students (0.12) than for the AI students (0.04), which might be attributed to more developed ‘ontologistic thinking’ of the former. The inter-task variance, indicating the tendency towards giving uneven values (averaged over the students filling the same task) across the questionnaire, was about the same (0.16) for both courses. However, the intra-task variance, indicating the degree of disagreement among the students filling the same task, was higher for the AI students (2.51) than for the OE students (2.12), i.e. the rating of the latter was more coherent.
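For readers who want to reproduce this kind of aggregate, the sketch below shows one plausible reading of the normalized sum. It is an assumption, not a restatement of Equation (12): answers are coded from −2 (‘certainly not’) to 2 (‘certainly’), and the sum is divided by the largest attainable magnitude, so that NS falls in the interval from −1 to 1. The answer labels are those of the expert scale; function names are hypothetical.

```python
# Assumed numeric coding of the 5-point scale (not taken from Equation (12)).
SCALE = {
    "certainly": 2,
    "perhaps": 1,
    "borderline": 0,
    "perhaps not": -1,
    "certainly not": -2,
}

def normalized_sum(answers):
    """Sum of coded answers divided by the maximum attainable magnitude."""
    values = [SCALE[a] for a in answers]
    return sum(values) / (2 * len(values))

def borderline_share(ns_values, bound=0.25):
    """Share of NS values strictly between -bound and bound."""
    inside = [v for v in ns_values if -bound < v < bound]
    return len(inside) / len(ns_values)

# Hypothetical answers of three students to one task.
ns = normalized_sum(["certainly", "perhaps", "certainly not"])
```

Under this coding, three students answering ‘certainly’, ‘perhaps’ and ‘certainly not’ give an NS of 1/6, which would count as a ‘borderline aggregate’.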
CEs with highest and lowest average NS of student scores, 2016 campaign
In comparison with the ‘expert ontologist’ assessment:
The students gave a significantly lower score: only about a half of the tasks had a positive NS, compared to 92% (54/59) in the final consensus of experts. This can be explained by their lower ability to figure out specific situations in which less obvious categories might become meaningful.
If we apply the same method of average NS computation on the initial assessment of experts, the proportion of ‘borderline aggregates’ between −0.25 and 0.25 is only 14% (in contrast to 57% for the students’ values).
There is agreement on the less frequent ‘meaningfulness’ of
As regards the case-by-case comparison between the students and the experts, there is also a correlation in the sense that the 43 experts’ clear positives obtained a positive average NS from students (0.14), while the 14 initially ‘clash’ cases obtained a slightly negative average NS (−0.07) and the 3 negative cases obtained a clearly negative average NS (−0.24).
In the second campaign we tried to modify the setting so as to avoid some biases and gaps that appeared in the first campaign, in particular:
The even distribution of tasks between LOV and OntoFarm was judged inadequate, as OntoFarm is smaller by an order of magnitude, addresses one domain only, and its ontologies have been created artificially, even if based on real-world non-ontological resources.
Assessing the CEs solely based on their formal representation risked suffering from a comprehension bottleneck.
There was no a priori expert assessment this time (assuming that the correlation of the expert and novice assessment had been adequately studied in the first campaign).
Initial sampling and task preparation This time, all tasks were generated from the LOV. In contrast to the 2016 campaign, we also added CEs matching the pattern
they had more than 90% of their classes equipped with the
they had at least 10 classes (to eliminate the long tail of very small ontologies).
The actual sampling was then performed on approx. 130 thousand CEs generated from the 72 ontologies that satisfied the above conditions. From this pool we randomly sampled ten tasks for every CE pattern (
verbal definitions of the involved entities, as the
selected axioms in which the entities appeared in the RHS.
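The per-pattern random sampling described above can be sketched as follows. The pool is assumed to be a list of (pattern label, CE) pairs; the fixed seed, the pattern labels and the function name are hypothetical and serve only to make the illustration repeatable.

```python
import random

def sample_tasks(pool, per_pattern=10, seed=0):
    """Draw up to per_pattern CEs for each CE pattern, without replacement.

    pool: iterable of (pattern, ce) pairs.
    """
    rng = random.Random(seed)            # fixed seed for repeatability
    by_pattern = {}
    for pattern, ce in pool:
        by_pattern.setdefault(pattern, []).append(ce)
    return {
        pattern: rng.sample(ces, min(per_pattern, len(ces)))
        for pattern, ces in sorted(by_pattern.items())
    }

# Hypothetical pool: 25 CEs of one pattern, 4 of another.
pool = [("p1", f"ce{i}") for i in range(25)] + [("p2", f"ce{i}") for i in range(4)]
tasks = sample_tasks(pool)
```

Sampling within each pattern, rather than from the whole pool, keeps rare patterns represented even when one pattern dominates the generated CEs.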
Unfortunately, the sampling results exhibited some potentially undesirable features, and we did not have time to redesign the sampling because of the planned experiment dates (within the schedule of both courses) that we were unable to shift. Namely:
Some domain ontologies contained links to upper-level ontologies. If the FC was then picked from an upper-level ontology, it was highly abstract (e.g. ‘Feature’, ‘Object’ or ‘Endeavor’), and its relationship to domain-specific concepts of the CE was hard to figure out. The assessment then had ‘strong philosophical flavor’, and the setting was unrealistic wrt. our target use case, since upper-level entities would not typically be sought as focus classes when publishing linked datasets.
One of the tasks referred to an ontology in a language different from English (namely, Spanish).
Judging by the results and the students’ feedback, however, these infelicities do not seem to have seriously biased the experiment.
The structure of the CEs was verbalized, using simple NLP patterns plus occasional manual tweaking to assure grammatical correctness of the generated sentences. For example, if the property label P in a CE matching the pattern
Novice ontologist assessment There were, again, two groups of students involved, from the same courses as in 2016 (Bc-level AI and MSc-level OE). The amount of prior training in OWL was also similar to that in the 2016 campaign. There were 15 AI students and 16 OE students altogether. In both courses the students were first provided with a 30-minute overview of the notions of CE (in
The questionnaire was again in Czech, except the CEs (verbalized in English, to avoid issues with the inflection grammar of Czech). The English translation of a sample task is available in Appendix B.
The task question was slightly modified: it explored to what degree the category is meaningful and reusable. The rationale was that possibly even subcategories with very small absolute or relative frequency might be viewed as meaningful (this term being rather vague and subjective), but undoubtedly their reusability should be perceived as low. Besides, the novelty of the assessment task setting compared to the 2016 campaign was in the following:
The questionnaire separated the meta-question on comprehension from the actual ‘meaningfulness’ assessment, for each task. A separate question now inquired to what degree the student understood the meaning of the individual entities (assuming that the lack of familiarity with the entities strongly impacts the competency to assess a compound CE), with possible values that can be shortened as: ‘quite familiar’, ‘roughly’, ‘pretty vague idea’ and ‘no clue’.
The students could textually justify a negative value.
For the compound CEs, the students could provide a noun phrase to which the verbalization of the category could be compressed.
We will however not discuss the last two types of metadata elements (the unstructured ones) in the current paper, to avoid thematic dilution.
In addition to the FCP tasks assessment, the questionnaire also examined the students’ assessment of their own level of written English.
We computed the normalized sum (NS) of the task assessment values, as in the 2016 campaign (using Equation (12)). The core results are as follows:
The average NS over all 40 tasks was 0.27, i.e. much higher than in the first campaign (0.07). Of the 40 NS values, 34 were positive, 2 zero, and only 4 negative. This can possibly be attributed to the longer time available for each task, to the higher amount of available documentation, and/or to the verbalization of the CEs.
The relative position of the compound CE patterns did not change from the 2016 campaign. The average NS was 0.33 for
The cases with highest positive and lowest negative values are in Table 7; both the FC and the subcategory are now shown at the level of labels, just as presented to the students. The property and its filler, whether a class or an individual, are separated with a colon. The underlying ontology is referenced in the third column, through its nickname, which can be resolved against the LOV portal by appending it to
CEs with highest and lowest average NS of student scores, 2018 campaign
We also computed the relative frequencies reflecting the impact of the (declared) English writing skills and of the comprehension of entities on the assessment value:
Of the 168 assessments by students with excellent or very good English skills, 102 (61%) were positive (‘certainly’ or ‘perhaps’); in contrast, of the 80 assessments by students with fair or basic English skills, only 37 (46%) were positive.
Of the 176 assessments where the students comprehended the meaning of the CE entities (‘quite familiar’ or ‘roughly’), 126 (72%) were positive (‘certainly’ or ‘perhaps’). In contrast, only 13 assessments of 72 (18%) where the students did not comprehend the meaning of the CE entities (‘pretty vague idea’ or ‘no clue’) were positive.
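The relative frequencies above can be recomputed from the reported counts with a few lines of Python. The snippet only restates the arithmetic; the dictionary keys are labels chosen for this illustration.

```python
def percent(positive, total):
    """Relative frequency as a whole-number percentage."""
    return round(100 * positive / total)

# (positive assessments, all assessments), as reported in the text.
by_english = {"excellent/very good": (102, 168), "fair/basic": (37, 80)}
by_comprehension = {"comprehended": (126, 176), "not comprehended": (13, 72)}

english_pct = {k: percent(*v) for k, v in by_english.items()}
comprehension_pct = {k: percent(*v) for k, v in by_comprehension.items()}

# Both breakdowns partition the same assessments, so their totals must match.
total_english = sum(t for _, t in by_english.values())
total_comprehension = sum(t for _, t in by_comprehension.values())
```

Both breakdowns cover the same 248 assessments, of which 139 were positive, which is a quick consistency check on the reported figures.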
In order to reflect the degree of entity semantics comprehension in the assessment (with the assumption that more weight should be given to ‘more informed’ assessment), we also applied simple numerical weighting: the formula from Equation (12) was changed to
The role of the cognitive experiments was to eventually attempt to assess the reusability degree of different individual CEs and thus (indirectly) their patterns, which is tied to the central idea of the whole approach (in which, for example, the FCE patterns are merely instrumental).
Across the different campaigns and settings, the order of the CE patterns according to the average ‘meaningfulness’ of the categories remains stable: (
Data availability The data from both experiments is available on the web through
The description of each task (consisting in evaluation of the meaningfulness of a class expression) in each questionnaire variant
The table with calculation of aggregated results.
Discussion
Since the empirical part of the paper may appear a bit fragmented and the results hard to align, in this section we first provide an integrative meta-view of the surveys / experiments settings and results.
From this we depart to a discussion of limitations and open questions of the analysis.
Meta-view of the empirical analyses
In Table 8 we synoptically summarize the three empirical pillars of our research so far, as elaborated in Sections 5, 6 and 7. We see that the surveys/experiments are to a large degree complementary, differing in their features: in the structural type of the data source; in the focus on either direct analysis of CEs or on their underlying patterns that are only indirectly tied to the CEs; in the objectivity/subjectivity of the obtained data; and, finally, in the actual scope of the CE and FCE patterns, and of the focus classes examined. Notably, all data sources currently refer to the Tbox. As mentioned before (esp. in Section 2.1), the fourth pillar of the empirical analysis of the FCP problem should be the analysis of the CEs occurrence in the RDF datasets Abox (which is ongoing but did not fit into the current paper).
The last row in the table attempts to summarize the core findings of each analysis. At first sight the arrangement of the CE/FCE patterns might look incoherent. In particular, Section 5 considers the CEs within the RHS of axioms, where they primarily serve as a means for inferring the subordination of arbitrary instances (or classes) to the named class appearing in the LHS of the axiom; it is thus desirable that the RHS would specify some restrictive filler (as in On the other hand, Sections 6 and 7 already study the categorization in the setting with a known focus class, i.e. not for arbitrary instances. Then even the categorization of individuals based solely on the property they appear together with, irrespective of the filler (as in
As regards the high ranking of
Summary of the complementary surveys / experiments, with core findings (last line)
Since the paper explores a substantially novel problem space, the coverage of its different corners is still rather limited. In this section, we discuss five limitations and/or open questions, in turn: the omission of Abox data in the whole process, the simplification made in the empirical analysis of CE pattern frequency, the reliance on particular Tbox design principles in the application of FCE patterns, the neglect of logical considerations when creating the definitions of new named classes, and, finally, the actual choice of existential restriction as the central primitive of the initial CEL.
The presented research currently ignores the role of Abox analysis, both in the actual FCP computation (within a concrete reuse problem) for a specific ontology as well as in the inductive process of CE pattern weight estimation. The former means that the presented approach is tuned towards the ‘cold-start’ setting (new ontologies not yet referenced in data); otherwise the relevance of concrete CEs could be estimated (though with some computational cost) from existing datasets instantiating them. The omission of the latter (which would consist in generating CEs for a given CEL over a representative set of ‘training’ ontologies, measuring their frequency of instantiation in data, and aggregating those frequencies by CE pattern) to date does not have any principled reason, and is only owing to limited human resources on the project.35
While some early experiments are under way, their inclusion (in a low-maturity state) would have made the current paper lengthy and the contribution diluted.
The computation of the CE pattern frequency in axioms from Section 5 targeted the top-level structure in the axiom RHS (with additional distinction of the CE patterns from
The design of our FCE patterns, where global restrictions (domain+range) are used in the pattern template and local (existential) restrictions in the validity constraints, puts a strong assumption on the engineering approach used in the ontology analyzed. If the ontology lacks global restrictions36
Note that some best practices, e.g., in biomedicine, used to argue against domain/range axioms, see e.g. [16].
The generation of new named classes with their compound definitions in the reuse step, as described in Section 4.3 (as a follow-up technique, less central than the FCP computation, which is the main topic of the paper), is currently conceived rather naïvely from the logical semantics point of view. The definitions are assumed to be generated one by one locally, without taking into consideration the inferential structure of the ontology as a whole. This probably does no harm if the resulting ontology is to be used merely as a schema for data to be processed at the assertional level. However, possible exploitation of such an ontology by reasoners might call for more sophistication in the generation of new definitional axioms.
Finally, even the assumption under which we gave preference to the existential property restriction as the cornerstone of our initially chosen CEL, over the Boolean conjunction (which would have been another simple candidate for a CE pattern), namely, that we cannot rely on proper equipment of ontologies with disjointness axioms, may be considered as potentially arguable. As mentioned in Section 5, in cases where we can (through pruning the ontology using those axioms) single out truly compatible pairs of classes, the consideration of conjunctive categories instead of existential ones would be promising (in other cases it would however lead to explosion of output CEs, which is what we wanted to avoid).
While most of this paper is devoted to the description and formalization of the FCP ‘view’ of ontologies and to the associated empirical analyses, we also provide a tentative (or rather, illustrative) operationalization of the empirical findings into FCP computation weights, and describe an early prototype of an ontology search (for reuse) tool leveraging FCP.
Tentative operationalization of the empirical results
We understand the obtained insights into the usage of OWL class expressions and their perception by humans in general as a research achievement per se. However, the starting point for the overarching empirical study was an ‘engineering’ goal (possibly modest compared to the extent of the performed surveys and experiments): to propose adequate weights for the CE patterns to be used when computing the FCP in the context of an ontology reuse scenario.
In these terms, based on the cognitive experiments in particular, we can see that
Alternative CE pattern weights derived from the cognitive experiments
Considering that the lowered average NS values of
The current version of the OReCaP tool (see the next subsection), when launched, proposes these values, i.e.
To demonstrate the whole focused categorization framework, specifically for
The acronym refers to the terms ‘ontology reuse’ and ‘categorization power’.
The interaction workflow with the tool consists of several, possibly iterative, phases:
The process starts with a keyword-based search where the input consists of at least one focused class keyword and of optional additional keywords. The intuition behind the search interface is that the focused class keyword(s) denote the high- or medium-level type(s) of entities whose instances are to be further sub-categorized using concepts from the ontology; the additional keywords, on the other hand, correspond to arbitrary domain terms. Imagine, for example, that the data is currently stored in a relational database. The focused class keyword might then often be the name of the top-level table (which can be, e.g., ‘Client’, ‘Patient’, ‘Vehicle’, ‘Account’, or the like); the additional keywords can be taken, e.g., from the names of subordinate tables, table columns, or predefined values for the fields.
The search returns a sorted list of ontologies whose classes match one or more of the provided keywords by their IRI, name or description; classes matching a focused class keyword are listed first. The matched classes are listed for each ontology. Classes that match the focused class keywords are preselected (i.e., checked) by default; classes that match the additional keywords are not preselected but can be selected (checked) manually by the user.
The next step is to execute the FCP calculation for a chosen ontology, given the selected classes as focus classes, by clicking on the ‘Calculate FCP’ button. In a pop-up window, metadata about the ontology including its IRI and namespace is displayed, along with the total FCP score, which is calculated based on the FCP weight values and the categorizations (i.e., class expressions) listed at the bottom. This score is the sum of all partial scores for each focus class. The weight values can be adjusted for each calculated ontology according to the user’s assessment of each CE pattern, and the resulting FCP score will change accordingly. The global FCP weights can be changed in the settings section, so that every new FCP calculation would use them as the default weight values.
The calculated FCP score is then displayed below the ontology overview in the search results, and also saved to a comparison list, which shows the FCP-based ranking of the ontologies.
From the comparison list it is possible to proceed to the reuse summary view of a given ontology. There the user can inspect all generated categories in the form of a pop-up tree, and can check those s/he eventually wants to reuse.
The actual reuse is currently performed in two ways: first, a particular ontology (chosen by the ranking) can be downloaded as is. Second, the axioms defining the named classes newly constructed from compound expressions can be downloaded separately from the reuse summary view, to be later manually added to the ontology.
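The search, scoring and ranking phases above can be illustrated by the following minimal sketch. It is not the OReCaP implementation: the class and function names, the pattern labels `P1` and `P2`, the weight values and the example IRIs are all hypothetical, and matching is simplified to case-insensitive substring search. The sketch only assumes what is stated above, namely that classes are matched by IRI, name or description, that the total FCP score is the sum of partial scores per focus class, and that each generated category contributes the weight of its CE pattern.

```python
from dataclasses import dataclass


@dataclass
class OntologyClass:
    iri: str
    name: str
    description: str = ""


@dataclass
class Category:
    focus_class: str  # IRI of the focus class the category is subsumed by
    pattern: str      # hypothetical CE pattern label, e.g. "P1"
    expression: str   # the class expression, here just a string


def matches(cls, keyword):
    """Case-insensitive substring match on IRI, name or description."""
    kw = keyword.lower()
    return any(kw in text.lower() for text in (cls.iri, cls.name, cls.description))


def search(classes, focus_keywords, extra_keywords):
    """Return (focus matches, additional matches); the former are preselected."""
    focus = [c for c in classes if any(matches(c, k) for k in focus_keywords)]
    extra = [c for c in classes
             if c not in focus and any(matches(c, k) for k in extra_keywords)]
    return focus, extra


def fcp_score(categories, focus_iris, weights):
    """Sum the pattern weights of the categories, per focus class and in total."""
    partial = {iri: 0.0 for iri in focus_iris}
    for cat in categories:
        if cat.focus_class in partial:
            partial[cat.focus_class] += weights.get(cat.pattern, 0.0)
    return sum(partial.values()), partial


def rank(scores):
    """Comparison list: ontologies sorted by FCP score, highest first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical example data (not taken from any real ontology).
BASE = "http://example.org/onto#"
classes = [
    OntologyClass(BASE + "Contract", "Contract", "A binding agreement"),
    OntologyClass(BASE + "Payment", "Payment", "A transfer of money"),
    OntologyClass(BASE + "Person", "Person"),
]
categories = [
    Category(BASE + "Contract", "P1", "hasModification some Thing"),
    Category(BASE + "Contract", "P2", "hasPayment some Payment"),
    Category(BASE + "Payment", "P1", "hasPayer some Thing"),
]
weights = {"P1": 1.0, "P2": 0.5}  # assumed, user-adjustable pattern weights

focus, extra = search(classes, ["contract"], ["payment"])
total, partial = fcp_score(categories, [c.iri for c in focus], weights)
ranking = rank({"ontology A": total, "ontology B": 0.5})
```

In this sketch, adjusting a value in `weights` and calling `fcp_score` again mirrors the per-ontology weight adjustment described above; the category built for `Payment` does not count, since `Payment` only matched an additional keyword and was not selected as a focus class.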

OReCaP interface: two ontologies found via keyword search, with focus class and additional match.
Let us assume that the user wants to publish data about business contracts and their payment, and seeks a suitable ontology for their subcategorization. A partial screenshot showing the overall search results together with FCP scores for two ontologies is in Fig. 2. The focus class keywords (here, just one) are entered in the top-left field; their matches are therefore proposed (highlighted in blue) as focus classes by the tool. The user has also provided additional keywords, which may improve the result ranking but do not produce further focus classes (unless the user pro-actively highlights them, too).
For the first ontology, PPROC, a snippet of the reuse summary window is then shown in Fig. 3; the numbers (e.g., “(1/5)”) indicate how many categories were chosen in the given sub-tree. The user has chosen two categories, corresponding to ‘contract that has been anyhow modified’ (CE pattern

OReCaP interface: the reuse summary, with user’s choices of two CEs.
OReCaP makes use of the Linked Open Vocabulary API38
Since the research described in this paper addresses the focused categorization power problem from various angles, multiple areas of related research can be identified. In this section we report on the following, in turn: abstract notions similar to our notion of FCP; empirical studies on presence of class expressions and structural patterns in ontology repositories; cognitive experiments on assessing ontological structures; concept learning in DL; ontology reuse metrics and methods.
Abstract notion of focusing or categorization (power) in ontologies
We are unaware of prior work on the same topic of FCP as we coin it in the current paper. We will however reference some related research that overlaps with ours at the abstract level.
The notion of focusing recently appeared in the work of Gogacz et al. [7]. The so-called focusing solution pairs a set of predicate symbols that describe a database schema (that is, a set of predicates) with a set of assumptions on the partial completeness on the data and the ontology (closed and fixed queries). In their approach, focusing is about choosing which parts of the data and ontology are to be declared complete, to allow for efficient reasoning. In our approach, focusing is about choosing ontology classes for whose instances in data we would like to obtain many meaningful categorization options; the categorization itself however need not rely on logical reasoning (never mind using the ontology), but can be based on whatever kind of classification model or even made by humans.
The term classification/categorization power previously appeared in many scientific texts, however, rarely as a rigorously defined notion. For example, on many occasions, automated classifiers (typically, machine-learning-based) are reported to have certain ‘classification power’ with respect to classes from an ontology, which is merely an informal circumscription of measures such as accuracy or error rate. The ‘power’ also clearly pertains to the classifier and not to the ontology. The association of the notions of ‘categorization’ is thus merely verbal.
Partially relevant is the analysis made by Giunchiglia & Zaihrayeu [6], who categorized ‘lightweight’ ontologies with respect to two dimensions: complexity of labels (simple noun phrases vs. use of connectives and prepositions) and use of ‘intersection’ operator allowing to combine atomic entities of different nature (e.g., the atomic concepts ‘Italy’ and ‘vacation’ implicitly combine into ‘vacation in Italy’). Maximal ‘classification power’ is obtained when both explicitly complex labels and implicit concept combinations are allowed. This however only applies to classifying documents extrinsic to the ontology, since ‘intersection’ of concepts of different nature is not coherent with the set-theoretic semantics of DL. Overall, their ‘classification power’ is a global property of the method by which the ontology has been built. In contrast, our notion of FCP applies to individuals intrinsic to the DL world of the ontology and is calculated with respect to a focus class.
A. Rector’s work on entangling hierarchies (normalization) [15] addressed a different problem than us, but to some degree analogously considered the compound concepts as an alternative to named ones. This applies to ‘partitioning’ or ‘refining’ concepts, that only modify the ‘self-standing’ concepts; secondary partitioning aspects should not be expressed through subclassing (yielding a multi-hierarchy) but through existential restrictions filled with classes from separate ‘codelist’ taxonomies. For example, a class
Our own ongoing work on the PURO modeling language [21] deals with various options for how the same ‘background’ state of affairs can be expressed in OWL. PURO structurally resembles OWL but relaxes some of its modeling constraints. A library of transformation patterns allows one to proceed from one PURO model to alternative OWL ontologies in different encoding styles. An example relevant to our case is the notion of entrepreneur, which is likely to be expressed as type in PURO, but could be translated to relationship (
Certain research in cognitive psychology might also be relevant wrt. the notion of FCP. In particular, the notion of graded structure of categories [4] can be applied to our concept pattern weighting. The authors suggest that there is “a basic level of abstraction (e.g. CHAIR, DOG), …” which is “further discriminated at the subordinate level (e.g. KITCHEN CHAIR, SPANIEL) and abstracted at the superordinate level (FURNITURE, ANIMAL)” [4]. Presumably, this basic level might often coincide with the focus classes to be subcategorized in practical settings. Furthermore, the graded structure may include “ad hoc, goal derived categories such as GOOD PLACES TO HIDE FROM THE MAFIA” [4], analogous to our CEs defined via an existential restriction.
As regards the analysis of ontology repositories in terms of various aggregated features and metrics (logical, graph, lexical etc.), there has recently been renewed interest, following up on the early work of Tempich et al. [23] (aiming to build a benchmark for testing ontology tools). A large scale study of OWL ontology metrics was carried out by Matentzoglu et al. [13]. However, the categorization power of ontologies has not been, to our knowledge, studied, let alone with the flavor presented here.
Our study on class expression frequency in axioms from Section 5 looks similar to the recent study carried out in the MontoloStats project [12]. Both studies essentially analyze the same ontology repositories (primarily, LOV and Bioportal), and refer to the suitability of ontologies for reuse. There is however a difference in the restrictions coverage. For an unclear reason, the MontoloStats study does not cover existential restrictions (which are central for our study, and also shown as empirically very frequent) at all, nor the conjunctive concepts. On the other hand, it covers (subclass axioms with) named class in the RHS, and also property axioms such as domain/range or functional property. Notably, even over the common subset of CE patterns in restrictions (such as disjoint, universal and cardinality restriction CEs), some differences in the computed ranking appear between MontoloStats and our research; these may be due to additional distinct features in the methodology used.
Yet another stream of empirical research aims to study ontologies not on their own but from the point of view of LOD datasets in which they are used. This was the subject of the project by Asprino et al. [1], which produced a condensed representation of the global, virtual, ‘LOD ontology’ in the form of so-called equivalent set graphs. Various metrics related to the connectedness and extensional size of ontology entities were computed; while this research does not address compound concepts, it is in line with our ongoing activity in analyzing the Abox imprint of (named as well as compound) class expressions.
Cognitive experiments on assessing ontological structures
Several cognitive studies using ontologies as material have been published in the ontological engineering research. However, they primarily address the capability of humans themselves to carry out the categorization of objects to a set of classes or to understand the structure of OWL expressions. A recent example of the former is a study on classifying domain entities to upper-level ontology classes [20]. An example of the latter is an earlier study on the human capability of deriving useful information from differently verbalized OWL statements [24]. Our research in Section 7 of this paper differs in that the humans were to assess the automatically built concepts as more or less plausible, thus generating ground truth. (Semi-automated) verbalization was present, too, but only played an auxiliary role, the actual subject of assessment being the formal CEs themselves.
Concept learning in DL
The heuristic construction of compound CEs from the ontology axioms, triggered by the identification of a focus class, bears some resemblance to concept learning in DL [11], which is triggered by the specification of the target concept. However, (supervised) concept learning aims to identify CEs that best approximate a concrete goal concept as a whole. In contrast, CEs logically equivalent to the focus class are uninteresting as its subcategories, while categories that only cover a few per cent of the FC instances may be valid means for partitioning its instances. Moreover, concept learning fully depends on the presence of instance data for the particular concept to be learned. Yet, for a newly designed ontology such data might not be available (the well-known cold start problem). Finally, the overall goal is different: while concept learning primarily aims at enriching the ontology Tbox with new axioms [10], in our approach the prime task is to reuse the ontology (or its fragments) ‘as it is’, and the formation of new equivalence axioms is an optional, secondary step carried out with human interaction.
Ontology reuse metrics and methods
The broad context of our research, the task of ontology reuse, was studied by Schaible et al. [18]: the users expressed their preferences on reuse strategy in a survey. The results indicate that reusing multiple entities from the same vocabulary may often be preferred; this corroborates the relevance of our approach to measuring the categorization power of ontologies with respect to focus classes.
Vocabulary reuse techniques similar to the use of FCP-based metrics also appeared in a recent project on combining popularity metrics with the credibility of the vocabulary designers [19]. As regards the designer credibility, this is a feature of the ontology itself similarly to FCP, but it is completely orthogonal.
Reuse support [5] is also systematically sought by the maintainers of LOV [25], primarily at keyword relevance level; we are in contact with them and will seek to integrate our complementary approaches.
Conclusions and future work
Ontologies are an important means of subcategorizing entities already known to belong to a general focus class. Ontologies with the best categorization power, in terms of the number and quality of available subcategories of the focus class/es, should have preference in ontology reuse scenarios. We demonstrated that the scope of subcategories need not be confined to named classes but can also cover compound ones. In the first approximation, we treated the computation of the focused categorization power of ontologies, beyond named subclasses, in terms of simple existential restrictions over properties. This appears particularly relevant for publishing datasets on the linked data cloud, which is relatively ‘property-centric’.
Ongoing research addresses the analysis of the CE instantiation in linked datasets (Abox). This information can be employed in two ways. First, directly for the CEs generated from the Tbox of the given ontology considered for reuse. Second, the occurrence of instantiation of a compound CE can be compared to the occurrence of its constituent entities in the same dataset. The ratio of the compound vs. individual instantiation can then be aggregated per the pattern of compound CE, and provide an empirical grounding for the ‘meaningfulness’ of the CE pattern, complementary to the human feedback from the cognitive experiments. A dedicated thread of the Abox analysis will also study the availability of external, dereferenceable SKOS codelists, which would enter the FCP computation through FCE patterns such as
Another area of ongoing research concerns the techniques of pattern-based lexical transformation of compound CEs to named ones (now merely as a trivial concatenation with
In the medium term, we plan to extend the concept expression language considered, as regards the theoretical analysis, empirical studies, and support by a next version of the OReCaP prototype. A simple extension to
Within the scope of the different CE patterns, the syntactic FCE patterns providing the CEs will also be extended (e.g., by considering the subPropertyOf relationship) and new heuristic alternatives possibly added. The algorithm for pattern-based generation of CEs will also be enhanced, primarily by applying post-pruning according to the pattern applicability constraints. Overall, the transition to more expressive CELs and thus more expressive patterns will of course mandate deeper computational complexity analyses.
The pool of testing subjects in cognitive experiments should be extended to external ontology engineering experts (on top of students and internal experts, engaged in the previous experiments). The new experiments should also feature more carefully pre-selected (rather than random) assessment tasks, so as to provide targeted feedback on so far unclear constellations of focus class and subcategory.
Since the main foreseen practical application of the whole approach is the improvement of ontology reuse, we also plan to explore this task in full, considering the FCP-based approach merely as a single element to be combined with other state-of-the-art supporting techniques for ontology reuse.
Acknowledgements
The research has been supported by CSF 18-23964S, “Focused categorization power of web ontologies”, by projects ORBIS, funded by Slovak SRDA agency under contract No. APVV-19-0220, KATO, funded by Slovak VEGA agency under contract No. 1/0778/18, and by TAILOR, funded by EU Horizon 2020 research and innovation programme under GA No. 952215.
Questionnaire task description from the 2016 cognitive experiment campaign
FC and category in Manchester syntax:
?i is an instance of a class including all objects that are in the relationship ‘bornIn’ to at least one object.
Hint: the expression, e.g. (bornIn some Thing), suitable for categorization of the FC should satisfy the following conditions:
Possible values:
certainly (2); perhaps (1); borderline (0); perhaps not (−1); certainly not (−2); no judgment, since I don’t understand the example (N)
Questionnaire task description from the 2018 cognitive experiment campaign
The focus class is from the ontology no. 162945,
Ontology name: Proton Ontology
Ontology description: “PROTON (PROTo ONtology) was developed in the SEKT project as a lightweight upper-level ontology, serving as a modeling basis for a number of tasks in different domains. To mention just a few applications: PROTON is meant to serve as a seed for ontology generation (new ontologies constructed by extending PROTON); it can be used for automatic entity recognition and more generally Information Extraction (IE) from text, for the sake of semantic annotation (metadata generation). PROTON was extended to cover the conceptual knowledge encoded within the most popular datasets from Linked Open Data like DBPedia, GeoNames, etc.”
Involved entities:
Focus class:
Property:
Target class:
Proposed category:
