Abstract
Conjunctive Query (CQ) answering is a primary reasoning task over knowledge bases. However, when considering expressive description logics, query answering can be computationally very expensive; reasoners for CQ answering, although heavily optimized, often sacrifice expressive power of the input ontology or completeness of the computed answers in order to achieve tractability and scalability for the problem. In this work, we present a hybrid query answering architecture that combines various services to provide a CQ answering service for OWL. Specifically, it combines scalable CQ answering services for tractable languages with a CQ answering service for a more expressive language approaching the full OWL 2. If the query can be fully answered by one of the tractable services, then that service is used, to ensure maximum performance. Otherwise, the tractable services are used to compute lower and upper bound approximations. The union of the lower bounds and the intersection of the upper bounds are then compared. If the bounds do not coincide, then the “gap” answers are checked using the “full” service. These techniques led to the development of two new systems: (i) RSAComb, an efficient implementation of a new tractable answering service for RSA (role safety acyclic) (ii) ACQuA, a reference implementation of the proposed hybrid architecture combining RSAComb, PAGOdA, and HermiT to provide a CQ answering service for OWL. Our extensive evaluation shows how the additional computational cost introduced by reasoning over a more expressive language like RSA can still provide a significant improvement compared to relying on a fully-fledged reasoner. Additionally, we show how ACQuA can reliably match the performance of PAGOdA, a state-of-the-art CQ answering system that uses a similar approach, and can significantly improve performance when PAGOdA extensively relies on the underlying fully-fledged reasoner.
Introduction
Conjunctive Query (CQ) answering over Knowledge Bases (KBs) is a crucial reasoning task for many applications. However, when considering expressive Description Logic (DL) languages, query answering is computationally very expensive, even when considering only complexity w.r.t. the size of the data (data complexity) [62]. Fully-fledged reasoners oriented towards CQ answering over unrestricted OWL 2 ontologies exist but, although heavily optimized, they are only effective on small to medium datasets. In order to achieve tractability and scalability for the problem, we need to rely on limiting the expressive power of the input ontology or sacrifice the completeness of the computed answers.
Query answering procedures have been developed for several fragments of OWL 2 for which CQ answering is tractable with respect to data complexity [9]. Three such fragments have been standardized as OWL 2 profiles, and CQ answering techniques for these fragments have been shown to be highly scalable at the expense of expressive power [10,46,51,68,72,73]. An interesting fragment of OWL 2, tractable for standard reasoning tasks, is RSA, an ontology language that subsumes all the OWL 2 profiles, first presented by Carral et al. [14] and for which a CQ answering algorithm based on the combined approach technique [46,47] was proposed by Feier et al. [22].
In order to deal with more expressive ontologies, several techniques have been proposed to compute a sound subset of answers to a given CQ. One such technique is to approximate the input ontology to a tractable fragment, so a tractable algorithm can then be used to answer CQs over the approximated ontology. A particularly interesting approach to CQ answering over unrestricted OWL 2 ontologies, using a combination of the aforementioned techniques, is adopted by PAGOdA [85]. Its “pay-as-you-go” approach uses a Datalog reasoner to handle the bulk of the computation, computing lower and upper approximations of the answers to a query, while relying on a fully-fledged OWL 2 reasoner (HermiT [24]) only as necessary to fully answer the query.
While PAGOdA is able to avoid the use of a fully-fledged OWL 2 reasoner in some cases (i.e., when the lower and upper answer approximations coincide), its performance rapidly deteriorates when the input query requires (extensive) use of the underlying OWL 2 reasoner. The computation of lower and upper bounds is achieved by under- and over-approximating the ontology into the
This work borrows from this “pay-as-you-go” technique and builds upon existing CQ answering techniques over OWL 2 ontologies. We propose a new hybrid query answering architecture that combines black-box1
By “black-box” we mean that the details of the reasoning process are not relevant, and indeed any reasoner providing comparable reasoning services could be used. However, the semantics are completely transparent, and the results could be interpreted/explained using a wide variety of techniques from the extensive literature on this topic (e.g., [3]). Note that this differs from the definition of the term that is often used, e.g., in the context of Machine Learning (ML), where the semantics of the system is opaque to the user.
An efficient implementation [39,41] of the combined approach algorithm for RSA [22], reorganized to fit the new implementation design and the integration of RDFox [54,55,57,60] as a backend Datalog reasoner. We streamlined the execution of the algorithm by factoring out those steps in the combined approach that are query independent to make answering multiple queries over the same knowledge base more efficient. In addition, we included an improved version of the filtering step for the combined approach. The system accepts any OWL 2 KB and includes a customizable approximation step to languages compatible with the RSA combined approach. The system is further extended with a reference implementation of the novel approximation algorithms for the computation of answer bounds mentioned above.
A reference implementation [42] of the hybrid architecture mentioned above, combining RSAComb, PAGOdA [85], and HermiT [24] to provide a CQ answering service for OWL. The resulting system ensures the same “pay-as-you-go” capabilities of the systems it is based on. The system has been designed to accommodate a high degree of modularity; the services it is built upon can be potentially substituted or augmented with more capable ones to improve the overall performance.
We carried out an extensive evaluation both for RSAComb, as a standalone tool, and for ACQuA, to assess their effectiveness, and compare our results with PAGOdA, aiming, primarily, at improving some shortcomings of the latter tool. Our experimental results show that the new technique yields significant performance improvements in several important application scenarios. Both ACQuA2
The present paper includes some previously published work:
We assume familiarity with standard concepts of first-order logic (FOL) such as term, variable, constant, predicate, atom, literal; refer to [1,5] for a formal introduction to these concepts.
We define a rule as an expression of the form
We will call a rule definite without negation in its body, and Datalog a function-free definite rule. A Datalog rule is disjunctive if it admits disjunction in the head. A fact is a Datalog rule with an empty body. The definition can be trivially extended to sets of rules.
A program Π is a set of rules. Let
for every
for every
The stratification partition of Π induced by δ is the sequence
A stratified program Π has a least Herbrand model (LHM), which is constructed using the immediate consequence operator
Given a stratified program Π, we define
Ontologies and conjunctive query answering
Next we give a brief overview of the description logic languages used in the paper. We will define them as restrictions of
In order to achieve decidability of reasoning,
Normalized
Table 1 also introduces a normal form for each of these description logic languages, and w.l.o.g. we assume that any ontology introduced is restricted to these axioms. Each axiom in Table 1 corresponds to a single logic rule, provided on the right. We define
OWL 2 profiles [53] have been defined as fragments of OWL 2, designed to provide a desirable balance between computational complexity of standard reasoning tasks and expressiveness of the ontology language. We will define these standard profiles as fragments of Horn-
Property chain and transitivity axioms are not taken into consideration here to keep definitions compatible with [22].
OWL 2 EL is based on the
OWL 2 RL is inspired by Description Logic Programs [29] and corresponds to a subset of Datalog; it does not contain axioms of type (T5);
OWL 2 QL is based on the DL-Lite
A conjunctive query (CQ) q is a formula
A knowledge base
CQs can be alternatively represented as Datalog rules. Let
We say that q is internalizable if it can be turned into an ontology axiom. This process is known as rolling-up [36] and is implemented in some solvers (e.g., Pellet [71]) to provide sound and complete CQ answering over OWL 2 DL over certain answer semantics when considering internalizable queries.
Given a knowledge base
PAGOdA is a hybrid reasoner for sound and complete CQ answering over OWL 2 KBs, adopting a “pay-as-you-go” technique to compute the certain answers to a given query. The idea is to compute lower/upper bound approximations to the answers to a query by approximating the input ontology into a less expressive language and possibly provide a fallback (more expensive) algorithm to process the answers in the gap between the bounds; to achieve this, it uses a combination of a Datalog reasoner and a fully-fledged OWL 2 reasoner. PAGOdA treats the two systems as black boxes and tries to offload the bulk of the computation to the former and relies on the latter only when necessary.
The capabilities, performance, and scalability of PAGOdA inherently depend on the ability of the fully-fledged OWL 2 reasoner in use, and the ability to delegate the workload to a given Datalog reasoner. In the best scenario, with an OWL 2 reasoner, PAGOdA is able to answer internalisable queries [36] under certain answer semantics [85] over OWL 2 DL.
In the following is a high level description of the procedure adopted by PAGOdA to compute the answers to a query. This will prove useful to understand how this approach will be later integrated in our system, ACQuA. For a more in-depth description of the algorithm and heuristics in use, we refer the reader to [84].
Given a KB
In the following we consider the input KB to be consistent and normalized. This is ensured by PAGOdA’s preprocessing step.
the Datalog reasoner is exploited to compute a lower bound the disjunctive Datalog subset of the input ontology, denoted with using a variant of shifting [18], a first materialization is performed, i.e., the the combined approach for
The upper bound
the ⊥ concept is substituted with a fresh concept name
disjuncts in the head of an axiom are reduced to a single disjunct. The “most favourable” disjunct is chosen according to a polynomial choice function that reasons over the dependency graph of the input ontology;
existential axioms of type (T5) are constant Skolemized.
If lower and upper bound coincide (i.e.,
otherwise, the “gap” between the upper and lower bound (i.e.,
for each
once all spurious answers have been removed from
Let’s take the lower bound computation as an example: the two performed approximations (i.e., to disjunctive Datalog and to
OWL 2 RL does not allow axioms (T5) and
PAGOdA’s reference implementation8
PAGOdA ensures that the returned answers are always complete under ground semantics, while being ultimately limited by the capabilities of HermiT when considering the returned answers under certain answer semantics. HermiT does not natively support CQ answering and the process needs to be reduced to fact entailment first. This is possible when the input query is internalisable, i.e., the query can be rolled-up into a set of DL concept assertions. In this scenario PAGOdA returns a set of answers that is sound and complete under certain answers semantics if the bounds match or the query can be internalized into a DL concept. Otherwise, PAGOdA will return a sound set of answers (complete under ground semantics) and a bound on the incompleteness of the computed answers (under certain answers semantics).
As we mentioned above, ACQuA combines multiple ontology approximation algorithms to compute the answers to an input query. As part of this work, we propose novel approximation algorithms that target RSA (and its extension
RSA (role safety acyclic) is a class of ontologies designed to subsume all OWL 2 profiles, while maintaining tractability of standard reasoning tasks. The RSA ontology language is designed to avoid interactions between axioms that can result in the ontology being satisfied only by exponentially large (and potentially infinite) models. This problem is often called and-branching and can be caused by interactions between axioms of type (T5) with either axioms (T3) and (R1), or axioms (T4), in Table 1.

Example of exponential model enumerating numbers from 0 to
Interaction between existential quantifiers (axioms of type (T5)) and universal quantifiers (encoded by axioms of type (T3) and (R1)) can lead to an ontology that may only be satisfied by models of exponential size.
Consider the following knowledge base with ABox
RSA is based on the Horn-
A role R in S is in an axiom (T4) and
A role R in
Note that, all OWL 2 profiles (
In Example 2.1, R is unsafe. In fact, even for
Then, R appears in an axiom of type (T5),
The distinction between safe and unsafe roles makes it possible to strengthen the translation π from Table 1 while preserving satisfiability and entailment of unary facts.
Let
([14], Theorem 2).
A Horn-
Note that if
Potential bad interactions between unsafe roles can be avoided by detecting any cyclic or diamond-shape materialization involving unsafe roles. This check is performed by constant Skolemizing all existential axioms with an unsafe role, and building a graph representing the materialization process, projected on unsafe interaction. Checking that the graph is an oriented forest, along with some additional conditions which preclude harmful interactions between equality-generating axioms and inverse roles, leads to the definition of RSA.
([22], Definition 3).
Let
Let for each pair of atoms for each pair of atoms
We say that
An oriented forest is a directed acyclic graph whose underlying undirected graph is a forest.
With reference to Definition 2.3, let
The fact that
If
Tractability of standard reasoning tasks for RSA ontologies follows from Theorem 2.1 and Theorem 2.2.
Combined approach for RSA
The combined approach for RSA consists of two main steps to be offloaded to a Datalog reasoner able to handle stratified negation and function symbols.
The first step computes the canonical model of an RSA ontology over an extended signature (introduced to deal with inverse roles and directionality of newly generated binary atoms). The computed canonical model is not universal and, as such, might lead to spurious answers in the evaluation of CQs.
The second step of the computation performs a filtration of the computed answers to identify only the certain answers to the input query.
Canonical model computation The computation of the canonical model for a knowledge base
First we define the Datalog program
Translation of Horn-
axioms to build
Translation of Horn-
Let
Finally,
Let
The set
The canonical model for an RSA input ontology is defined as
([22],Theorem 3).
The following holds:
if
there are no terms
Filtering spurious answers For the filtering step, a query dependent logic program
Rules in
. Variables u, v, w from U are distinct
Rules in
Filtering program
Let
([22], Theorem 4).
Let
if
We can then build a worst-case exponential algorithm that, given an ontology
([22], Theorem 5).
Checking whether
It is worth noting that, to the best of our knowledge, at the moment a similar bound for
We propose a hybrid query answering architecture which provides CQ answering capabilities for OWL 2 by means of combining different answering services treated as black-boxes. In particular, we combine scalable CQ answering services targeting tractable ontology languages with answering services for more expressive languages approaching the full OWL 2.
Given an input CQ over a certain knowledge base, we process the query using the tractable services; if the query can be fully answered by one of these tractable services, we simply provide the resulting answers to the user. Otherwise, we compute multiple lower and upper bounds to the answers to the query by approximating the knowledge base “from above” and “from below” and taking the union of the lower bounds and the intersection of the upper bounds. Finally, if the bounds do not coincide, the “gap” answers are validated by using the “full” service.
As part of this work, we introduce a novel algorithm to compute a lower bound to the answers to an input query by means of approximation to RSA. Similarly, we propose an algorithm to compute an approximation “from above” targetting
The reference implementation ACQuA is built on top of the following tools:
RSAComb, a novel system for CQ answering over RSA ontologies, based on the combined approach, extended with algorithms to computes bounds of the answers to a query via approximation of the input KB to RSA; PAGOdA, providing lower and upper bounds to the answers to a query and techniques to further refine these bounds to provide CQ answering capability over OWL 2 DL; a fully-fledged reasoner (such as HermiT) for CQ answering over a certain ontology language.
These tools allowed us to build a fine-grained “pay-as-you-go” approach, offering suitable, performant solutions depending on the inputs to the system; overall, this results in a lower complexity of the answer computation, when support for high expressivity is not needed. Note that we included both RSAComb and PAGOdA in ACQuA because, in general, the bounds computed by RSAComb are incomparable with the ones produced by PAGOdA, as we will show in Sections 4 and 5. However, any of these components could be potentially substituted or augmented with more capable ones; in particular, any relevant service mentioned above could be used in ACQuA (e.g., the fully-fledged reasoner HermiT could be substituted with Konclude).

Workflow of the ACQuA system.
Given a generic KB
A preliminary satisfiability check is performed over the input knowledge base
If
In this case,
If
In this case, RSAComb provides a sound and complete algorithm for CQ answering over
Compute the bounds to the answers to q as
If
Compute
Use HermiT on
Return
The choice of fully-fledged reasoner ultimately determines the class of ontologies for which CQ answering is sound and complete under ground and/or certain answer semantics for the overall system. Thanks to RSAComb, ACQuA is sound and complete for CQ answering under certain answer semantics for ontologies in RSA [22]. With the integration of PAGOdA, and a suitable fully-fledged reasoner, like HermiT, ACQuA is able to answer internalisable queries [36] over OWL 2 DL under certain answer semantics [85].
Steps 2,6,7 and the computation of
To help the reader follow along with the description of the proposed techniques, we consider the following running example.
Consider the KB
Running example
Intuitively, the ABox contains a collection of statements about researchers and their research outputs; on top of that, the ontology (RBox and TBox) models additional information about the relationships between different types of papers and their publications processes. The CQs ask about venues and works published in multiple venues. For reader’s convenience, a visual representation of the ABox
Graphical representation of the ABox 
Some axioms are not expressed in normal form (see Table 1) and can be further normalized as follows: axiom (t1) can be rewritten as
In this section we present a technique to compute a lower bound to the answers to an input query, by means of approximating the input KB to RSA.
RSA is not purely syntactically defined, and instead introduces a set of constraints over the ontology language Horn- From a generic From From Horn-
Approximation to
This first step is performed by discarding any axiom that is not in
Let
In Example 3.1,
Approximation to Horn-
We will now describe how to reduce an
While axioms of type (T4) do not use disjunction explicitly, their translation into definite rules involve disjunction in the head of the rule.
To address this and improve the approximation to Horn-
In Example 3.1, we know that assertions
Program shifting is formally defined as follows.
Let r be a normalized disjunctive Datalog rule. For each predicate P in r, let if r is of the form
if r is of the form
This can be generalized to sets of rules Σ as follows:
We apply this technique to our
Let
See the Appendix. □
In this section we provide a description of an algorithm to approximate the Horn-
checking whether
checking whether
We first consider step 1. If
Cycles can be broken by removing nodes from
Let
This is proven by observing that, by definition of dependency graph, constant
Using the Datalog reasoner, we compute

Cycle detection in
Next, we need to deal with equality safety (step 2). According to the definition of RSA, the following steps can be performed to ensure this property:
delete any (T4) axiom that involves a role S such that there exists
if there is a pair of atoms
Again, by removing some selected axioms we are able to force the input Horn-
Let
By Section 4.1 and Theorem 4.1 we know that
As mentioned in Section 2, PAGOdA uses a similar approach to compute a lower bound by approximating the input ontology first to disjunctive Datalog and then to Datalog; this is done by discarding any axiom that is not in the language, while introducing some additional heuristics to handle specifically disjunctive and existential axioms.
Note that, in general, the lower bound resulting from the algorithm proposed here is incomparable with the one produced by PAGOdA.
The next example shows a scenario in which the lower bound computed by our algorithm is tighter than the one produced by PAGOdA. On the other hand, the RSA language fully captures OWL 2 RL (used internally by PAGOdA) only when not considering property chain axioms.
Both approximation techniques will be later combined into ACQuA.
Consider our running Example 3.1 and
whereas all other roles are safe.
Now, let
The dependency graph Graphical representation of 
If we consider the query (q1), then:
We will now look at the problem of approximating a generic input KB
We adopt a similar approach to the one used in the lower bound computation and divide the procedure in steps. Given an replace any occurrence of ⊥ in the knowledge base with a fresh nullary predicate approximate disjunctive rules by removing all but one disjunct from the head of the rule. For each rule, the selected disjunct is chosen deterministically using an efficient choice function; enforce the constraints that define the RSA ontology language on the Horn-
⊥ substitution
As described above, ACQuA performs a preliminary satisfiability check on the input KB; in spite of this, while strengthening the KB, we might cause the KB to become unsatisfiable.
In order to provide a meaningful upper bound even in cases where the approximation leads to an unsatisfiable KB, we adopt an approach initially proposed in PAGOdA. The idea is to substitute every occurrence of ⊥ with a fresh nullary predicate
This step has been included purely for theoretical purposes. RDFox, used in the implementation of the approximation algorithm, will not explicitly check for satisfiability during query answering, making it possible to consider correct the answers to a query even when the KB is unsatisfiable.
Approximation of disjunctive rules
According to Table 1, axioms of type (T1) and (T4) can introduce disjunction in the head of rules. This usually results in non-determinism in the answering process and a corresponding jump in computational complexity. In order to rewrite these axioms and avoid the introduction of this operator, we borrow a technique used in a similar fashion in PAGOdA. The approach consists in replacing any disjunction in the head of a rule with one of the disjuncts. It is easy to see that this strengthens the KB and eliminates any non-determinism introduced by the disjunction. The surviving disjunct is chosen deterministically using an efficient choice function; the idea is to analyse the dependency graph of the KB and choose a disjunct which does not eventually lead to a contradiction. To this end, a standard dependency graph of the KB is built and disjuncts are ordered according to their distance from
Given a choice function Let δ be a function from axioms to axioms eliminating disjunction from the head of any axiom of type (T1) and (T4):
Consider axioms (t3) from our running example. Let
The final step of the approximation process consists in enforcing the additional constraints that the RSA language introduces on top of Horn-
Given
checking whether
checking whether
In order to ensure equality safety we proceed similarly to the lower bound case. For any pair of atoms
On the other hand, for each pair of atoms
This is equivalent to rewriting the axiom as
Finally, we reduce
These steps can be summarized in the definition of We define Finally, given
Let
if
See the Appendix. □ Consider, again, our running example (Example 3.1) and If we consider the query (q2), then:
Our tests show that, general property chain axioms (axioms of type (R4) in Table 1) are quite uncommon in practice. Transitive property axioms, on the other hand, are a specialization of (R4) that can be easily found in common ontologies. While we ignored the presence of these axioms so far, it can be shown that completeness is still guaranteed when including them in the language [14, Theorem 2,Proposition 1]. Intuitively, due to monotonicity of FOL, including more axioms in the computation of the canonical model will lead to a strengthening of the KB. Furthermore, the computational complexity for the computation of the canonical model is still bound by the translation of the problem into Datalog, for which new heuristics have being recently proposed to efficiently handle transitive closure of roles [38]. Note that, in this case, we are not modifying the filtration step, which will then only be able to detect a fraction of the spurious answers, effectively computing an upper bound of the certain answers.
Design and architecture
We proposed a new framework to compute CQ answering over unrestricted OWL 2 ontologies by using answer bounds and further refinement steps. The approach has been implemented in a system called ACQuA [42], which, as discussed in the previous sections, offloads different steps in the computation to a selection of underlying systems used as black boxes, i.e., RSAComb, PAGOdA and HermiT.
ACQuA is inspired by the “pay-as-you-go” philosophy that drove the development of PAGOdA and as such shares similarities and capabilities with the latter tool. The idea is to take different steps depending on how the input KB is classified. The input KB needs to go through a consistency check and normalization procedure first. If the normalized KB is inside one of the two ontology languages for which PAGOdA provides full support (i.e., OWL 2 RL and
In this section we will describe some design and implementation details that led to the development of ACQuA. In particular, we will focus our attention on RSAComb, a novel implementation of the RSA combined approach for CQ answering, and how the tool can be used to compute lower and upper bounds to the answers of an input query.
RSAComb
RSAComb [41] is an optimized implementation of the combined approach for CQ answering in RSA. We streamlined and reorganized the algorithm to make the different steps either ontology or query independent. On top of that we designed and implemented an API to introduce approximation capabilities in the system; RSAComb is able to take an unrestricted ontology as input and potentially apply an approximation algorithm (targeting RSA or
The system is written in Scala and uses the OWLAPI [35] to interface with the input ontology and manipulate OWL 2 axioms. RDFox is used as an underlying Datalog reasoner; RSAComb has been designed to maximize the amount of computation to be offloaded to RDFox, by redefining problems in terms of queries over a materialized RDF store.
RDFox is used as a black box, and RSAComb can be adapted to use any Datalog reasoner with support for stratified negation and Skolemization. Nonetheless, the use of RDFox allowed us to introduce some optimizations based on particular features provided by the tool.
These are:
a
support for named graphs to isolate and cache partial computation;
support for “TBox reasoning” in order to reason directly on the structure of an ontology even when outside the supported OWL fragment.
We designed and built RSAComb around these general principles:
The code should be modular and different steps in the algorithm should be as independent of each other as possible. It should be easy to reimplement (or enhance) an intermediate step of the algorithm as long as the signature and the interface with the system as a whole remain unaltered. We achieved this by an extensive use of Scala traits, building a collection of interfaces that describe the behaviour of the different actors that take part in the execution of the combined approach for RSA. As explained in the following sections, the integration with RDFox was also key to providing a good level of modularity to the systems.
The system has to be able to scale efficiently even for large amounts of data. Partial results are computed when needed and reused whenever possible. A more detailed analysis on the performance and scalability of the system is provided in Section 8.
It should be equally possible to use the system as a self-contained application or integrate it in another system. As such, our software presents a simple but effective command line interface alongside a well-structured set of classes exposing all the necessary tools to work with RSA ontologies, while hiding unnecessary implementation details. The different steps can also be disabled for user convenience.
We will first provide a description of RSAComb as an implementation of the RSA combined approach and then go into details on how lower and upper bound algorithms are implemented in the system.

Workflow of the RSAComb system.
Figure 5 summarizes the workflow of RSAComb:
the approximation steps take an unrestricted OWL 2 KB as input and approximate it to a target language handled by the RSA combined approach;
the canonical model for the resulting RSA KB is computed by materializing the data against a logic program derived from the input ontology;
a filtering program is derived from the input query and is combined with the canonical model to produce the set of certain answers to the input query over the approximated knowledge base.
The process of importing the input ontology (TBox, RBox) into the system is performed using the OWLAPI. Since importing large amounts of data (ABox) into the system might be expensive, data files are read and data is loaded on demand and reused whenever possible to maximize performance.
As mentioned above, two approximation algorithms ship with the system. The first approximation algorithm is an implementation of the algorithm presented in Section 4; it targets the RSA ontology language and maintains soundness w.r.t. CQ answering, i.e., answers to a CQ are a lower bound to the answers to the query over the original KB. A copy of the ontology, translated into Datalog according to Def. 2.3, is imported into RDFox along with the data and materialized by the reasoner. The dependency graph and equality safety checks (see Definition 2.3) are implemented as queries over the RDF store exposed by RDFox; the original knowledge base is altered accordingly. The second approximation is an implementation of the algorithm introduced in Section 5; it targets
The canonical model is computed for the knowledge base in Step (ii); this is done by converting each axiom in the KB into a logic rule according to Def. 2.5 and uploading it into RDFox. Note that the translation from axioms into logic rules is different from the one in Step (i), hence the need to reload them into RDFox. The data, on the other hand, is safely reused. Finally, the potentially spurious answers to the input query introduced during the canonical model computation are filtered out in Step (iii). It is worth noting that, in this scenario, steps (i),(ii) are query independent, while step (iii) is ontology independent. As such, when multiple queries are submitted over the same KB, steps (i-ii) are performed “on-demand” and only once, while the third step is performed for each input query.

RSAComb: canonical model computation.
The computation of the canonical model (Figure 6) involves the conversion of the input RSA ontology into logic rules as described in Def. 2.5, and where function symbols are simulated using RDFox’s built-in Skolemization feature.
A Skolemized rule derived from an existential axiom (T5)

where the built-in operator
The system performs the conversion and then offloads the materialization of the rules, combined with the input data, to RDFox.
Since the canonical model is query independent, this process can be performed once and the result cached and reused for every subsequent query over the same input ontology. We achieve this using RDFox’s support for RDF named graphs, which enables us to perform operations on specific “named” subsets of the data. Further operations on the graph operate and produce additional data in different named graphs, leaving the materialized canonical model intact.
Axiomatization of ⊤ and ≈ RDFox has built-in support for ⊤ (
In both cases we are not able to use these features directly: in the case of top axiomatization, we import axioms as Datalog rules, which are not taken into consideration when RDFox derives new ⊤ subsumptions;16
RDFox accepts both OWL 2 axioms encoded as RDF triples and Datalog rules; these are very different entities in the system and the semantics of special concepts/roles (like ⊤ and ≈) is applied to the former.
To work around this, we introduce the axiomatization for both predicates explicitly. For every concept name
This gives us the correct semantics for
Similar rules are introduced to axiomatize equality. We make the role reflexive, symmetric and transitive:
and introduce substitution rules to complete the axiomatization. For every concept name
The
We generate the instances of the predicate
where
A final improvement has been made to the computation of the

RSAComb: answer filtering.
As depicted in Fig. 7, answer filtration involves the computation of the filtering program from the input query, the filtering of the materialized canonical model and the final process of gathering the answers.
RSAComb performs the translation of the query into a set of logic rules. This step was modified w.r.t. the original definition [22] to be completely ontology independent by moving the generation of
Let
Using the
Rule 30 showcases how the
The complete rewriting of the filtering program is provided in Table 5. According to the documentation18
Improved rules for filtering step for the RSA combined approach
The filtering program is, then, loaded into RDFox and the materialization is updated taking into account the newly introduced rules. The triples produced by this materialization update are stored in a separate named graph to keep the product of filtration separate from the canonical model. This is possible because the signature of the atoms in the head of rules introduced by the filtering program is separate from the signature of the canonical model. When processing a new query, the only step we need to take is to drop the named graph associated with the filtration from the previous query, leaving unaltered all other triples. Better yet, here we have the possibility to execute queries in parallel, each one associated with a separate filtering program and hence storing their derivations in different named graphs. The materialization update for each of the queries is isolated and does not interfere with the other processes.
At this point, the task of gathering the answers to the query over the input KB is reduced to querying a materialized named graph for the atoms representing the certain answers.
Given a query 
where we first collect all instances 
As described in Section 4, we propose a novel algorithm for approximating an unrestricted input KB to RSA. The procedure is composed of 3 main steps:
Approximation to Approximation to Horn- Approximation to RSA by reducing the ontology dependency graph to an oriented forest and ensuring equality safety properties. building and reasoning over a custom dependency graph derived from the materialization of the input data over a Horn- reasoning over the knowledge base itself, and in particular performing some RBox reasoning task.
The first two steps are entirely carried out by RSAComb in a straightforward way. The knowledge base is first filtered by axiom type and then program shifting (Def. 4.1) is applied to all relevant axioms. The last step is designed to partially offload the task to RDFox; this involves:
We first translate the axioms in the knowledge base according to Definition 2.3, and import them, along with the data, into RDFox. The imported data and its materialization contain all instances of the atom
RSAComb builds the dependency graph for the input KB. Using Algorithm 1 we detect and break cycles by iteratively removing nodes. The existential axioms corresponding to the nodes returned by the visit are removed from the input ontology.
For the equality safety check we need to reason over the ontology itself and in particular perform some reasoning over its RBox. Regardless of the support offered by the Datalog reasoner for this task, axioms in a knowledge base can be encoded as RDF triples.19
RDFox supports importing OWL 2 axioms and the conversion into RDF triples is performed automatically. RBox reasoning (Listing 1) is then achieved by importing the following rules into the RDF store.

Rules for role subsumption reasoning
These encode reflexivity and transitivity of sub-role axioms (R2) (lines 7–9), taking into account inverse (lines 3–5) and equivalent roles (line 1), as well.
Once both the data and the axioms have been imported and materialized according to their respective rules, the equality safety condition (i) of Definition 2.3 can be formulated as a query (see Listing 2).
For each pair of atoms

Condition 1 of equality safety in RSA definition
Similarly, condition (ii) can be formulated as a query (see Listing 3).

Condition 2 of equality safety in RSA definition
For each pair of atoms
The approximation algorithm proposed in Section 5 is implemented in a similar way. Again, the procedure is divided into the following steps:
rewriting of ⊥ into a new nullary predicate rewriting of disjunctive rules to eliminate disjunction, and approximation to
As discussed before, the first step is not performed in practice. During the computation of the KB approximation and the upper bound set of answers, we simply ignore the satisfiability of the KB. Note that, even if ⊥ is derived during the process of materialization, RDFox will not derive the entire Herbrand base, to keep the operation as efficient as possible. We can use this to our advantage and still compute a meaningful upper bound approximation.
The rewriting of disjunctive rules is also straightforward, and is performed directly by RSAComb. The choice function is implemented as in PAGOdA, in order to avoid the derivation of ⊥ (see Section 5.2).
Finally, the third step involves the same framework introduced in the previous section for the lower bound computation, and in particular the construction of the dependency graph and role subsumption reasoning are performed in the same way. Both the enforcing of equality safety and the reduction of the dependency graph to a forest involve a rewriting of the KB according to Def. 5.2, and are implemented directly in RDFox.
Finally, RSAComb can be used to run the combined approach algorithm on the resulting
Related work
Conjunctive query answering over knowledge bases is one of the foundational problems when reasoning over ontologies. This, along with its corresponding decision problem (conjunctive query entailment, CQE) have been analysed both from the theoretical point of view (with extensive research on its computational complexity) and from the practical point of view, leading to a number of algorithms and their implementation in various reasoning systems. For a summary of the complexity results on decidability of CQE for some of the ontology languages mentioned in this work, see Table 6.
Decidability of CQE for a selection of ontology languages
Decidability of CQE for a selection of ontology languages
We will next provide an overview of the techniques proposed in the literature to perform CQ answering and in particular look at those tools that make use of bounds computation in order to drive the query execution. A closer comparison of ACQuA with these tools is also provided.
Support for CQ answering is offered natively by several existing reasoners. Some of them achieve this by ensuring sound and complete answers for a specific semantics over a certain family of ontology languages, while others limit the language in which the queries can be expressed. We will now give an overview of the several CQ answering techniques present in the literature.
Reduction to entailment checking
The first technique we are going to discuss is based on the reduction of CQ answering to entailment checking. Tableau–based DL reasoners like Pellet [71], HermiT [24], RacerPro [31] construct a finite structure that represents a model for the input KB and use blocking conditions to ensure the termination of the procedure. These reasoners usually target standard reasoning tasks and only offer limited support for CQ answering. Still, internalisable CQs can be rolled-up and included in the KB, effectively reducing CQ answering to entailment checking of a fresh concept entailed by the rolled-up query.
Pellet [71] provides support for CQ answering under ground semantics and supports CQ answering under certain answer semantics limited to tree-shaped queries (which can be internalized using the rolling-up technique).
HermiT [24] is a fully-fledged reasoner for OWL 2, based on the hypertableau calculus [59]; it does not provide a direct interface to answer CQs but reduction to entailment checking can be manually performed as a preliminary step.
RacerPro is a tableau–based system for the
Another tableau-based reasoner, Konclude [77], has been recently adapted to perform CQ answering over expressive ontologies [75], using an absorption-based technique [74,76]. The assertional part of a KB is divided into small packets used to parallelize the model construction of the tableau algorithm. This parallelizable approach, along with the use of caching to avoid the need of synchronization mechanisms between workers, can be used to derive possible answers to a CQ. Candidate answers are then checked using entailment checking, where bindings for the answer variables are restricted to individuals appearing in the possible answers. According to [75], the approach works best when considering ground queries, while the presence of existential variables can require a substantial amount of additional computation.
Overall, the systems described in this section are not primarily designed for CQ answering under certain answer semantics and instead target other reasoning tasks. The technique of reducing CQ answering to entailment checking is supported for expressive ontology languages but may not scale as well as other approaches. Optimizations have been proposed to further limit performance issues; examples are query execution order, based on the input KB [45] and data summarization [17].
When considering the development of fully-fledged reasoners targetting OWL 2, such as HermiT and Konclude, improvements on these reasoners can translate into improvements for hybrid systems like ACQuA and PAGOdA, which directly use these tools as black boxes.
Materialisation-based reasoners
Materialization-based reasoners are also widely in use and implement the forward chaining algorithm on top of (some fragment of) Datalog. Materialization-based systems are often built on top of RDF management systems; i.e., data management systems based on the Resource Description Framework representing knowledge as statements in the form of triples.
Triple stores like Jena [52], Sesame [8] and Virtuoso [21] offer query answering capabilities over RDBMS and support the RDFS description language. OWLim [7] provides support for OWL 2 RL ontologies. A materialisation-based reasoner extensively used in this work is RDFox [60], an RDF store supporting arbitrary Datalog rules over unary and binary predicates. The nature of the tool allows for important optimizations, e.g., incremental updates, and parallel materialization, at the expense of a limited expressivity in the supported description logic language [55–57,60]. RDFox covers most SPARQL 1.1 over an extension of Datalog. There are several other engines that support CQ answering over (extensions of) Datalog; among them, it is worth mentioning DLV [49], which provides support for CQ answering over an extension of disjunctive Datalog.
Although OWL 2 RL is expressive enough to cover a large portion of practical use cases, it lacks some common patterns like disjunctive knowledge or existentially quantified knowledge, that would potentially render the materialization process either non-deterministic or infinite. Typically, materialisation-based reasoners can still process ontologies outside OWL 2 RL, ignoring axioms that do not fall into the language. Answers to queries are still sound, but might not be complete, effectively providing a lower bound to the set of certain answers. This technique is used in the system PAGOdA [85] to effectively compute a sound lower bound to the set of certain answers to a CQ.
Ontology-mediated query rewriting
DLs are often used to model the domain of interest as collections of concepts and roles. In this sense, ontologies offer a great tool to build high-level semantics on top of some structured data (e.g., relational database).
OBDA directly applies this principle, creating a layer of abstraction on top of an existing data store; an ontology becomes an entry point for the user to access the underlying data via query answering. Another advantage of this approach is that it can rely on the underling data store (e.g., a RDBMS) to carry out the reasoning tasks. The OBDA framework [83] uses an ontology to rewrite an input query (i.e., expanding it by incorporating parts of the ontology). It then uses a set of mappings20
Often expressed in the W3C standard R2RML language [15].
It is worth noting that, since the query addresses the data source(s) indirectly, any updates made to the source are immediately reflected into the system. This is in contrast with the materialisation-based approach, where updates in the source require the recomputation (or the update) of the materialized dataset.
The OBDA approach is based on ontologies that fall into the DL-Lite family of DL languages, and hence the OWL 2 QL profile, for which the rewriting of CQs into unions of first-order queries is guaranteed to exist [10]. Perfect reformulation is implemented in QuOnto [2] and further integrated into the MASTRO system [11]. Unfortunately, the query rewriting process can lead to an exponentially larger first-order query [10] and polynomial rewriting is guaranteed only for small fragments of OWL 2, such as OWL 2 QL. For a more in-depth analysis on the performance of the OBDA approach we refer the reader to the Optique project and their work with Equinor [37,44].
The query rewriting technique has been applied to
We briefly mention the work done in Clipper [19,20] which implements a query rewriting technique for Horn-
A different approach involves the manipulation and rewriting of the input query [25,28]. The authors propose a decision procedure for CQE for
The combined approach is another widely known technique for computing a sound and complete set of answers to a CQ. In this scenario the dataset is first augmented by materializing entailed facts w.r.t. the ontology in order to build a model for the input knowledge base. This process is usually query-independent and performed in polynomial time. Spurious answers are then systematically identified by means of a filtration step or by rewriting the query [47,51] in order to derive the certain answers to the input query. An example is the combined approach for CQ answering over RSA ontologies (see Section 2.3.1), widely exploited in this work. The technique has been applied to several description logics in the
In general these techniques are designed for a specific ontology language and do not support unrestricted OWL 2 ontology. On the other hand, as shown in this work and in PAGOdA, the combined approach can be easily used as an intermediate step in the computation.
Hybrid approaches
We will now look at tools that combine more than one technique described above to implement CQ answering.
Hydrowl [78] is a reasoner for CQ answering combining an OWL 2 RL reasoner, a query rewriting system and a fully-fledged OWL 2 reasoner. Hydrowl uses a repairing strategy [79] (limited to those ontologies for which a repairing exists) and query rewriting to answer an input query q. First a query base, i.e., a set of atomic queries that can be answered using the OWL 2 RL reasoner, is derived from the query. It is checked whether the query base “covers” the query, and in that case the OWL 2 RL reasoner is used to answer the query; otherwise the tool falls back to the fully-fledged reasoner. Further investigation on the computation of the query base [85] shows that the algorithm is not always able to automatically extract a set of atomic queries, thus compromising the correctness of the approach.
Absorption-based query entailment checking [74] (inspired by the absorption technique introduced by Steigmiller et al. [76]) also falls into the category of hybrid approaches. An input query is rewritten in order to make its entailment more efficiently detected by the model constructed using an extended version of the tableau algorithm. In this sense, the rewritten query is used to identify the individuals that are involved in the entailment of the query and, at the same time, to guide the construction of the model in the tableau algorithm. The technique is sound for CQE for expressive ontology languages, such as
PAGOdA [85] uses a hybrid technique to compute the answers to a query, mixing different approaches. The idea is to compute lower/upper bound approximations to the answers to a query by approximating the input ontology into a less expressive language and possibly provide a fallback (more expensive) algorithm to process the answers in the gap between the bounds. For a more detailed description of the approach see Section 2.2 and [85]. ACQuA builds on top of these techniques.
Ontology approximation
The idea of approximating an expressive language into less expressive (but more tractable) languages has been exploited before. This was first introduced by Selman and Kautz [70] and Del Val [81] in the context of logic theories (both propositional and FOL) and has been applied in the context of ontologies and CQ answering as well. Besides PAGOdA, some of the systems that use ontology approximation to explore and restrict the set of answers to a given CQ are SCREECH [34], TrOWL [80] and SHER [17].
The SCREECH system [34] is able to compute an (unsound or incomplete) approximation of the answers to a query under ground semantics. It achieves that by performing a query dependent (and possibly exponential) rewriting of the input
TrOWL [80] is a system providing CQ answering capabilities over OWL 2 DL. It uses a semantic approximation [63] technique to transform an OWL 2 DL ontology into OWL 2 QL for CQ answering and a syntactic approximation [67] from OWL 2 DL to OWL 2 EL for TBox reasoning. While being sound and complete for CQ answering, approximations steps in TrOWL are ontology and query dependent, making in harder to reuse partial results in the computation. Moreover, the semantic approximation requires the use of a fully-fledged reasoner to compute a KB approximation whose axioms are valid w.r.t. the input ontology.
The SHER [17] system is a tableau-based reasoner for
In addition, a way to approximate an OWL 2 ontology into an OWL 2 QL ontology maintaining completeness for instance queries is proposed as part of the filter and refine technique presented by Wandelt et al. [82]. The idea is to transform every axiom
Under the umbrella of approximate reasoning for CQ answering, the query extension technique [26,27] is of particular relevance. This algorithm aims at improving the bounds of the answers by extending the query with additional atoms obtained analysing the input ontology. The resulting query can then be used to restrict the bounds of subqueries of the initial query.
Finally, different notions of approximation for ontology-mediated queries over a selection of expressive languages like
Evaluation
We provide here an extensive evaluation over a range of benchmark ontologies. We start by looking at some performance results for RSAComb [41], our implementation of the combined approach for RSA, followed by a comparison of our system ACQuA [42] with PAGOdA.21
All experiments were performed on an Intel(R) Xeon(R) CPU E5-2640 v3 (2.60GHz) with 16 real cores, extended via hyper-threading to 32 virtual cores, 512 GB of RAM and running Fedora 33, kernel version 5.10.8-200.fc33.x86_64. We were able to make use of the multicore CPU and distribute the computation across cores, especially for intensive tasks offloaded to RDFox.
We use two different sets of benchmark ontologies:
the PAGOdA batch mimics the evaluation process originally performed for PAGOdA [85];
the Oxford Ontology Repository (OOR) batch is a subset of the Oxford Ontology Repository,22
The PAGOdA batch consists of a selection of ontologies and benchmark data that comes with the PAGOdA distribution.23
LUBM and UOBM, standard benchmarks with a data generator (depending on a numerical parameter) and sample queries. When referring to a dataset generated for a particular parameter we will use LUBM(n) and UOBM(n) for some number n. PAGOdA provides an additional set of queries more challenging for the tool.
Reactome and Uniprot, realistic ontologies for which both data and relevant queries are provided. To test scalability, the datasets of these ontologies have been sampled in subsets of increasing size.
Benchmarks statistics, with LUBM/UOBM data generators depending on a parameter n
This batch aims at testing the system with a set of well–established benchmark queries. These queries have been defined to stress a system on a broad range of aspects of an answering routine, and vary a lot in terms of complexity and number of answers.
For the OOR batch we selected 126 ontology from the repository with non-empty ABoxes. A summary of the statistics of the ontologies in the repository can be found online.24
Since the Oxford Ontology Repository does not provide any test queries, we generated, for each ontology, a set of sample queries by extracting atomic concept, atomic role and existential patterns from the structure of the ontology. In order to generate a suitable number of queries we used the following step:
Import the ontology into RDFox as RDF triples. Query for a specific pattern in the ontology, e.g., Convert those patterns into queries.
Using this method, we extracted 14135 concept atomic queries, 4434 role atomic queries and 3893 existential queries for a total of 22462 queries over 126 ontologies. Apart from the basic atomic patterns, we included existential queries of the form
to retrieve all existential axioms in the ontology.
By doing so, we aimed for an empirical confirmation of our ability to produce the correct set of answers under certain answer semantics, for a set of queries that potentially provide different results when considering them under ground semantics.
While the generated queries have a limited number of atoms, the primary aim of this batch is to stress the system with a high number of queries per ontology; this allows us to draw conclusions on the behaviour of the system on a wider variety of test cases.
The collection of SPARQL queries and the scripts to generate them are part of our benchmark distribution [43].
We now present the test result obtained using the first set of benchmarks. We first tested RSAComb as a standalone system, in order to evaluate its performance and scalability. Later we compare the performance of ACQuA against the original PAGOdA. This is particular usefully since we were able to draw a very close comparison between the two tools and improve upon the observations provided by PAGOdA [85]. This also helped us identify how and when ACQuA’s algorithm outperformed PAGOdA, solving some performance issues with the latter tool.
RSAComb
As part of this work, we introduced RSAComb, an improved implementation of the combined approach algorithm for RSA, released as free and open source software [39]. Given that the original reference implementation [22] was not available when we started this work, and some details about the testing process are not provided, we will not try to draw a comparison between the results provided here and those provided in the original paper.
Our implementation is written in Scala and uses RDFox25
In the following we provide tests result of our system against LUBM [30] and Reactome26

Scalability of approximation to RSA and canonical model computation in RSAComb.
In Fig. 8 we show the scalability of our algorithm for the lower bound approximation to RSA and the computation of the canonical model for the approximated ontology. The two steps are query independent and the trend appears to be linear w.r.t. the dataset size, both in LUBM and Reactome; this can be explained by observing that the approximation algorithm involves the materialization of the input dataset against a modified version of the ontology, hence depending on the size of the whole KB;

RSAComb answer filtering in Reactome.
The filtering process is instead less dependent on the size of the data and more dependent on its composition and distribution. As such, a bigger dataset does not necessarily correspond to a greater amount of filtering, as shown in Fig. 9, where we reported the execution time for query 1 and 2 in Reactome. This figure also shows how the filtering depends on the data distribution; both queries take longer on a 50% sample of the data than on other datasets (even larger ones) due to its specific content. In general, we noticed that the time spent by the system on the filtering step is considerably lower than the time spent on the canonical model computation (as described below, and shown in Fig. 11).

Answer filtering in Query 2 in LUBM.
This unpredictability of the filtering step can “backfire” when a huge amount of filtering is involved. In Fig. 10 we show the filtering time for query 2 in LUBM along with the amount of unfiltered answers that the filtering program needs to process. It is worth noting that less than 1‰ of the unfiltered answers are found to be part of the certain answers. Figure 10 confirms the previous claims that the filtering step grows proportionally to the amount of filtering that is needed for a particular query. Finally, this figure shows how our system is able to handle a gigantic filtering step, processing hundreds of millions of facts in a reasonable amount of time.

Percent time distribution of canonical model computation (at the bottom, in blue) and answer filtering (at the top, in yellow) in Reactome.
Finally, Fig. 11 shows how execution time is distributed among the two main tasks of the combined approach. Filtering takes consistently less that 20% of the total execution time, when considering bigger datasets. As mentioned before, we can limit the impact of the canonical model computation by computing it “offline” whenever we find ourselves in a scenario in which we need to perform query answering over a fixed ontology.
We will now provide test results for the PAGOdA batch against ACQuA; this will allow us to draw a direct comparison between the tool and PAGOdA [85], for which the same benchmarks were used during the evaluation phase. During our tests we were able to reproduce the results provided in the original paper except for UOBM, for which PAGOdA does not terminate with a timeout of 10h and outputs no relevant information.
We chose this as a first set of benchmarks because we were able to use the extensive analysis on PAGOdA’s performance to guide our research and easily detect those cases that our system could improve. PAGOdA initially divided its test results into three groups:
queries for which the bounds match;
queries with a non-empty gap, but for which summarization is able to filter out all remaining spurious answers;
queries where HermiT is called on at least one of the test datasets.
When considering RSAComb and PAGOdA, the building blocks of ACQuA, separately, efficiency in the two tools mainly depends on the input ontology and the type of query answered, with PAGOdA showing worse performance when heavily relying on HermiT. On the other hand, when combining the tools into ACQuA, RSAComb is able to further limit the occurrence of these cases, providing better performance overall.
In general ACQuA is able to match PAGOdA’s results in all queries in the (G1-2) groups. This should not come as a surprise, since the already excellent results from PAGOdA weren’t leaving much room for improvement and were showing that more complex CQ answering techniques were not needed for these families of queries. In particular, for query in the G1 group, ACQuA does not perform any additional step other than PAGOdA’s computation of lower and upper bounds (avoiding the use of HermiT altogether).
For this reason we will be focusing on those queries falling in the (G3) group, for which PAGOdA’s performance does not scale well.

Scalability of query processing times for LUBM, UOBM and Reactome in ACQuA vs PAGOdA.
According to [85, Section 10.3.2], and partially confirmed by our tests, PAGOdA falls back to HermiT in the following queries to compute the correct set of answers: queries 32 and 34 in LUBM, query 18 in UOBM (for some data sizes) and query 65 in Reactome. Figure 12 sums up the results for our tests. Pre-processing times for the ontology are not taken into account here since the process is common to both tools and, in general, can be computed offline.
LUBM For both queries we were able to compute matching bounds, skipping the call(s) to HermiT altogether. This resulted in a significant improvement on the query processing time (Fig. 12a). It is interesting to notice that in this case the nature of the data and the queries seem to lead to a linear growth with respect to the size of the data.
Additionally, while testing the tools, we noticed that PAGOdA was having some difficulties returning a sound set of certain answers to queries involving existential knowledge, potentially falling back to ground semantics instead.
LUBM TBox contains the following axiom describing the fact that each research assistant works for at least one research group

should return all 39 instances of
UOBM We could not perform a direct comparison with UOBM since we were unable to reproduce PAGOdA’s results [85], with the tool timing out on a 10h run with no relevant output, even for the smaller datasets. Regardless, we were able to observe a recognizable pattern in the results for query 18. In Fig. 12b, we report our results against an estimate of PAGOdA’s performance; the estimation was carried out by looking at the graphs of the original paper and considering the closest values that the resolution of images allowed us to read. Even in this case we were able to avoid the use of HermiT, consequently improving the query processing time overall.
Reactome We were able to answer query 65 with matching bounds, avoiding again the use of HermiT. This resulted in an improvement of almost 600 seconds for the full Reactome dataset.
Furthermore, we found that the answers returned by PAGOdA for some of the queries in LUBM were only correct if considering CQ answering under ground semantics. Examples of these are query 15–16 from the PAGOdA benchmarks, for which PAGOdA was able to return only an incomplete set of answers.27
This is most likely due to a bug in the PAGOdA codebase.
For the second batch of benchmarks executed on the Oxford Ontology Repository, we were able to identify a set of queries for which PAGOdA requires the use of HermiT for the full computation of the query answers.
PAGOdA and ACQuA statistics on OOR batch (over 126 ontologies and 22462 queries)
PAGOdA and ACQuA statistics on OOR batch (over 126 ontologies and 22462 queries)
As shown in Table 8, PAGOdA was able to process 103 out of 126 ontologies considered, executing around 81% of the generated queries; ACQuA, on the other hand, was able to process the entire set of ontologies, answering the full suite of generated queries. Around 10% of the queries have a non-empty answer set, and while ACQuA was able to answer all of them, PAGOdA can reliably answer only
We identified a set of 18 queries (role atomic queries) over DOLCE [23] for which PAGOdA required the use of HermiT. In these cases, only the lower bound computed by PAGOdA is exact, while ACQuA was able to compute a matching upper bound. This was detected in two different fragments of DOLCE from the repository, corresponding to ontology 14 and 24. Ontology 24 corresponds to the full DOLCE ontology, while ontology 14 is a fragment of 24 partially restricting the ABox. Both ontologies are classified as
In Fig. 13 we provide quantitative and performance results for the queries over ontology 24, where we denote the lower bound, upper bound and query processing time for the corresponding tools with L, U and T respectively. We omit ontology 14 since the results are similar to the ones reported for ontology 24.
In the first 16 queries we obtained comparable results (see Fig. 13). This is understandable since DOLCE is a relatively small ontology (with a very small ABox) and this ended up hiding the performance differences that would potentially appear with larger datasets. Moreover, it should be noted that PAGOdA is able to deal with the larger upper bound by performing a limited amount of calls to HermiT (up to 8). The number of calls increases to 19 in the last two queries; these are also the cases in which we can observe a greater gain in performance using ACQuA.

Execution time (T) on DOLCE queries in PAGOdA (red) vs ACQuA (orange), alongside quantitative results for the lower (L) and upper bounds (U) computed by the two tools.
Finally, we found a set of 23 queries across multiple ontologies for which PAGOdA returned an unsound set of answers. This is the whole set of queries that PAGOdA was unable to process. This behaviour was not observed in ACQuA, which was able to compute the correct answers to the queries, without calling PAGOdA.
For the rest of the tested queries PAGOdA and ACQuA had comparable performance and were able to compute matching bounds.
To conclude this section, we provide a list of performance results and improvements highlighted by our evaluation:
RSAComb shows linear scalability for preprocessing and canonical model computation steps. Moreover, the filtering time is lower on average than the canonical model computation;
RSAComb handles huge amounts of data in a reasonable amount of time;
the additional logic introduced by ACQuA is able to outperform PAGOdA in a variety of test cases, improving both the lower and upper bounds;
ACQuA is able to fix some performance issues present in PAGOdA, by computing matching and hence further limiting the use of HermiT.
In this work, we presented a new hybrid query answering architecture that combines black-box services to provide a CQ answering system for OWL. Our system builds upon scalable CQ answering services for tractable restrictions of OWL, combining them with a CQ answering routine for a more expressive language. The technique is based on the computation of lower and upper bounds to the answers to a CQ and their progressive refinement to compute the full set of certain answers. We proposed two novel algorithms to compute lower and upper bounds to the answers to a query via approximation to RSA and RSAComb, an efficient implementation of the combined approach for RSA [22], introducing a new design and the use of RDFox as the underlying Datalog reasoner. The system improves upon the theoretical contribution of the original work by introducing several heuristics, in order to render the algorithm more efficient in practice. The system accepts any OWL 2 KB and includes a customizable approximation step to languages compatible with the RSA combined approach. To this end, we included a reference implementation of the novel approximation algorithms for the computation of answer bounds mentioned above. ACQuA, a reference implementation of the hybrid architecture combining RSAComb, PAGOdA [85], and HermiT [24] to provide a CQ answering service for OWL. By building ACQuA, we showed how the novel idea of chaining multiple services to refine answer approximations is feasible in practice. Ideally, the services it is built upon can be substituted or paired with more capable ones to improve the system performance–wise.
We provided an extensive evaluation of the systems, first testing scalability and performance of RSAComb as a standalone system and then, comparing ACQuA against PAGOdA. We showed how the additional computational cost introduced by reasoning over a more expressive language like RSA can still provide a significant improvement compared to relying on a fully-fledged reasoner. Additionally, we showed how ACQuA can reliably match PAGOdA’s performance, and further limit performance issues originally present in PAGOdA, especially when the tool has to extensively rely on HermiT.
We intend to further extend this work in a few different directions. The RSAComb-based algorithms for the computation of answer bounds depend on a cycle-detection procedure over a KB dependency graph. We think that altering the traversal of the graph and adopting (query dependent) heuristics in the cycle-detection algorithm could improve the quality of the computed bounds.
Moreover, ACQuA mostly focuses on ontology manipulation for computing bounds and further processing gap answers. While query independent processes can be cached or computed offline, a different, complementary, approach would be to study the problem of computing answer bounds from a query perspective. An example of such a technique for computing bounds to answers to SPARQL queries has been presented by Glimm et al. [27].
To conclude, this work led us to believe that relying on hybrid frameworks and leveraging existing systems for CQ answering is a winning strategy that can render the problem more viable in practice. Thanks to its modularity, this approach can benefit from the broader research in the area of knowledge representation, description logics, and CQ answering.
Footnotes
Proofs
This chapter provides proofs for lemmas and theorems used in Sections 4–5.28
These proofs are the result of several discussions with the authors of [22].
In the following we will consider either an RSA or an
We start with the notations concerning terms and atoms. For terms s and t, we write
The derivation level of a ground atom
We next relate terms in
Acknowledgements
This work was supported by the AIDA project (Alan Turing Institute, EP/N510129/1), the SIRIUS Centre for Scalable Data Access (Research Council of Norway, project number 237889), Samsung Research UK, Siemens AG, and the EPSRC projects AnaLOG (EP/P025943/1), OASIS (EP/S032347/1) and UK FIRES (EP/S019111/1). For the purpose of Open Access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript (AAM) version arising from this submission.
