Abstract
In order to provide personalized recommendations, decision support methods need to take into account user preferences in the production of their outcomes. This is a particularly relevant issue in the case of argumentative approaches to decision support, which have often focused on the construction of user-agnostic arguments concerning the various options at stake. In particular, while gradual bipolar argumentation (GBA) has been successfully adopted as a formal basis for the realization of decision support in a variety of application domains, none of these previous works involved personalization aspects and, hence, the study of techniques to integrate user-provided preferences into GBA turns out to be an open research question. In this paper, we provide an initial contribution to this investigative direction from both a theoretical and a practical perspective. On the theoretical side, we introduce a property of local coherence to characterize the expected effects of user preferences on argument strength assessment in GBA and provide results concerning the relations between local coherence and global behaviors. On the practical side, we illustrate a preliminary experimentation in the context of a GBA-based review aggregation system extended with the handling of user preferences, which allows us to draw some considerations on the opportunities and challenges of putting the proposed approach into practice.
Introduction
Artificial intelligence’s (AI’s) rise to prominence in recent years has resulted in decision support systems being more integrated in users’ lives than ever before. However, in order to provide personalized recommendations, decision support methods need to take into account user preferences in the production of their outcomes. Consider, for example, the realm of review aggregation, where the ever-increasing quantity of products available online means that the websites and platforms that provide these products are dependent on reviews as a method of quality control for users. To provide a cognitively manageable representation of these reviews, they must be summarized for users, usually by simple aggregation measures. Some notable efforts have been made toward incorporating AI into this review aggregation process, with the goals of making it automated, interactive, and explainable. Several of these efforts, for example, Cocarascu et al. (2019), Ceolin et al. (2021, 2022), Mumford et al. (2024), Ceolin and Ootes (2024), and Rago et al. (2025), are centered around the use of computational argumentation (for overviews, see Atkinson et al., 2017; Baroni, Gabbay, et al., 2018), given its ability to represent and reason with conflicting information. However, existing argumentative review aggregation methods lack support for personalization to empower interactions as exemplified in Figure 1.

Example conversational interaction between a user and a review aggregation system concerning the recommendation of the movie Alice In Wonderland, where the user’s personal preferences are taken into account.
This is a particularly relevant issue in the case of argumentative approaches to decision support, which have often focused on the construction of user-agnostic arguments concerning the various options at stake. Indeed, quantitative bipolar argumentation frameworks (QBAFs) (see Baroni, Rago, et al., 2018; Baroni et al., 2019; Cayrol & Lagasquie-Schiex, 2005) endowed with a gradual argumentation semantics have been shown to provide a suitable formal basis for the development of applications for decision support in a variety of contexts, such as the evaluation of design alternatives (Baroni et al., 2015), multiparty cooperative work (Aurisicchio et al., 2015), and judgmental forecasting (Irwin et al., 2022), in addition to the aforementioned review aggregation (Cocarascu et al., 2019; Rago et al., 2025). However, while these gradual bipolar argumentation (GBA) approaches have been successfully adopted as a formal basis for the realization of decision support in a variety of application domains, none involved personalization aspects. Hence, the study of techniques to integrate user-provided preferences into GBA remains an open research question of great practical relevance. In fact, enabling personalized recommendations rather than providing just user-agnostic indications may significantly increase the effectiveness of a recommender system in terms of fitting users’ needs and hence increasing customer satisfaction.
From a technical viewpoint, QBAFs provide an argumentative representation of the network of reasons underlying the uncertain assessment of a given issue. These reasons are related by attack and support relations (representing negative and positive effects between arguments, respectively), with a base score constituting an intrinsic strength assigned to the arguments. A gradual semantics may then provide a numerical assessment of the strength of the arguments belonging to a QBAF, and these strength values may be used as the basis for informed decisions. When decision support concerns some personal choice (e.g., the selection of a product to purchase or a movie to watch), the issue of providing personalized outcomes, taking into account different user preferences, emerges. For example, in Figure 1, this would require that the strength of the argument representing the movie Alice In Wonderland being well-reviewed is further increased by the review aggregation system based on the fact that its supporter, the argument representing the writing being well-reviewed, is more important to the user than its attacker, the argument representing the directing being poorly reviewed. In the formal context sketched above, this requires user preferences to be taken into account in gradual argumentation semantics for QBAFs. While a number of works have considered preferences in different forms of argumentation, for example, Amgoud and Cayrol (2002), Modgil and Prakken (2013), Amgoud and Vesic (2014), Kaci et al. (2018), and Mailly and Rossit (2020), to the best of our knowledge, none have considered preferences for QBAFs.
In this paper, we provide an initial exploration of this investigative direction from both a theoretical and a practical perspective: first, we investigate some general principles concerning the effects that user preferences should have on the evaluation of argument strength in gradual argumentation semantics and then we illustrate their application in an enhanced version of the ADA review aggregation system (Cocarascu et al., 2019; Rago et al., 2025). Figure 2 shows a pipeline of how we envisage our approach to be integrated into the ADA system.

A comparison (adapted from Rago et al., 2025) of the simple aggregation methods used by websites such as Tripadvisor, Rotten Tomatoes, and Amazon, contrasted with ADAs (Cocarascu et al., 2019; Rago et al., 2025). The additions in turquoise to the original ADA pipeline are those which we propose in this paper, namely preferences provided by the user in response to the explanations provided by ADAs, which extend the QBAFs to PQBAFs (which we introduce in Section 2), thus allowing for personalized outputs to be provided to the user. Note. QBAF = quantitative bipolar argumentation framework; PQBAF = preference-based QBAF.
Our contributions are as follows:
We give a brief survey of the literature with regard to preferences in argumentation, introducing a simple taxonomy and identifying gaps in this research.
We introduce a novel property of local coherence to characterize the expected effects of user preferences on argument strength assessment in GBA.
We provide theoretical results concerning the relationship between local coherence and global behaviors.
We undertake a preliminary experiment in the context of a GBA-based review aggregation system, ADA (Cocarascu et al., 2019; Rago et al., 2025), extended with the handling of user preferences.
Finally, we draw some considerations on the opportunities and challenges of putting the proposed approach into practice.
This paper builds upon and extends Battaglia et al. (2024) in a number of ways, aiming to give a broader view of preferences in GBA and its application in personalized decision support. We have extended the theoretical definitions and analysis in the paper, namely Definitions 13–19, Propositions 2–6, and Lemma 1, which are all novel contributions in this work. We have also extended the empirical analysis with more experiments concerning the integration of preferences into the ADA system. Finally, we have rewritten a significant proportion of the paper, adding more description, motivation, and illustrative examples throughout.
The paper is organized as follows. After recalling some background notions in Section 2, we carry out in Section 3 a conceptual analysis on the role of preferences in formal argumentation, pointing out the different natures and uses that can be found in the literature. We then investigate in Section 4 general principles concerning the role of user preferences in the evaluation of argument strength and illustrate an application of the proposed approach in Section 5, while Section 6 discusses related work and Section 7 concludes and looks ahead to future work.
Our work lies in the family of abstract argumentation formalisms, which are focused on the evaluation of the acceptability of arguments based on the relations among them. Dung’s (1995) argumentation framework (AF) is the simplest model in this area: an argument is conceived as an abstract entity, whose role and status are uniquely determined by its relations of attack with other arguments. An AF is represented by a directed graph in which the nodes are the arguments, and the edges correspond to the attack relations.
Dung, 1995
An AF is a pair
A central notion of Dung’s theory is argument acceptance, that is, the identification of acceptable arguments on the basis of the relations of attack. In this context, an extension-based semantics is a criterion specifying which sets of arguments, called extensions, are collectively acceptable, that is, can together defend against any incoming attacks.
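As an illustration of the notions just recalled, the following minimal sketch represents an AF as a set of arguments plus a set of attack pairs and checks whether a set of arguments is conflict-free and defends all of its members (the standard notion of admissibility). The function names and the three-argument example are assumptions introduced here for illustration only.

```python
from typing import Set, Tuple

Attack = Tuple[str, str]  # (attacker, attacked)


def is_conflict_free(subset: Set[str], attacks: Set[Attack]) -> bool:
    """No member of the subset attacks another member (or itself)."""
    return not any((a, b) in attacks for a in subset for b in subset)


def defends(subset: Set[str], argument: str,
            arguments: Set[str], attacks: Set[Attack]) -> bool:
    """Every attacker of `argument` is itself attacked by some member of the subset."""
    attackers = {a for a in arguments if (a, argument) in attacks}
    return all(any((d, a) in attacks for d in subset) for a in attackers)


def is_admissible(subset: Set[str], arguments: Set[str],
                  attacks: Set[Attack]) -> bool:
    """Conflict-free and defending each of its own members."""
    return is_conflict_free(subset, attacks) and all(
        defends(subset, x, arguments, attacks) for x in subset
    )


# Hypothetical chain: a attacks b, b attacks c.
arguments = {"a", "b", "c"}
attacks = {("a", "b"), ("b", "c")}

print(is_admissible({"a", "c"}, arguments, attacks))
print(is_admissible({"c"}, arguments, attacks))
```

In this toy chain, the set containing a and c is collectively acceptable because a counterattacks the only attacker of c, whereas c alone cannot defend itself.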
We will use the more expressive bipolar AF (BAF), a formalism that enriches the AF with a relation of support. Furthermore, we will consider gradual argumentation semantics, where the evaluation of argument acceptability is expressed by a strength value on a given scale. In this context, arguments are equipped with an initial evaluation, called the base score, intuitively representing the strength of the argument in the absence of interactions with other arguments, while a gradual semantics produces a strength evaluation also taking into account the attack and support relations encompassed by the framework. These notions are formalized by the definition of a QBAF (Baroni et al., 2019; Cayrol & Lagasquie-Schiex, 2005):
Baroni et al., 2019; Cayrol & Lagasquie-Schiex, 2005
A QBAF is a quadruple
For any
A gradual semantics
With a minor abuse of notation, for
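To make the ingredients of a QBAF concrete, the sketch below stores base scores, attackers, and supporters, and evaluates strengths on a tree with a DF-QuAD-style gradual semantics. The choice of this particular semantics, the class and function names, and the base scores are assumptions made here for illustration; the definitions above do not commit to any specific semantics.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class QBAF:
    base_score: Dict[str, float]                       # intrinsic strength in [0, 1]
    attackers: Dict[str, List[str]] = field(default_factory=dict)
    supporters: Dict[str, List[str]] = field(default_factory=dict)


def aggregate(strengths: Iterable[float]) -> float:
    """Probabilistic sum 1 - prod(1 - s); yields 0 for an empty collection."""
    remaining = 1.0
    for s in strengths:
        remaining *= (1.0 - s)
    return 1.0 - remaining


def strength(qbaf: QBAF, arg: str) -> float:
    """DF-QuAD-style strength, computed recursively (assumes an acyclic QBAF)."""
    va = aggregate(strength(qbaf, a) for a in qbaf.attackers.get(arg, []))
    vs = aggregate(strength(qbaf, s) for s in qbaf.supporters.get(arg, []))
    v0 = qbaf.base_score[arg]
    if va >= vs:
        return v0 - v0 * (va - vs)          # attackers prevail: move toward 0
    return v0 + (1.0 - v0) * (vs - va)      # supporters prevail: move toward 1


# Hypothetical fragment: writing supports movie, directing attacks it.
example = QBAF(
    base_score={"movie": 0.5, "writing": 0.6, "directing": 0.2},
    attackers={"movie": ["directing"]},
    supporters={"movie": ["writing"]},
)
print(round(strength(example, "movie"), 3))
```

A leaf argument has neither attackers nor supporters, so its strength coincides with its base score, which matches the intuitive reading of the base score given above.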
We focus on applications of QBAFs for decision support: in this context, some of the arguments have a distinguished role, since they represent the possible answers (or options) of the decision process, while the reasons in favor or against each option are represented by the other arguments (called pro and con arguments), which in turn can be supported or attacked by other reasons corresponding to other pro and con arguments, and so on. QBAFs featuring this structure provide a formal counterpart to the issue-based information system (IBIS) method for decision making (Buckingham Shum & Hammond, 1994; Fischer et al., 1991) as illustrated in Baroni et al. (2015). As emerges from the description sketched above, QBAFs for decision support can be represented as sets of trees, with the root of each tree corresponding to an answer argument, other vertices corresponding to pro and con arguments, and the edges corresponding to the attack and support relations. Considering more general topologies, for example, encompassing cycles of attacks, is left to future work. For brevity, we will consider the treatment of preferences in QBAFs consisting of a single tree; the extension to the case of a set of trees is straightforward. Definition 3 (Rago et al., 2023) captures the structure of the QBAFs we focus on, namely QBAFs for decision support about an answer
Rago et al., 2023
Let
Given a QBAF for
Moreover, given arguments
Intuitively, a path is a sequence of arguments such that every pair of consecutive arguments in the sequence is connected by an attack or support relation. A QBAF for
In the following, if not otherwise specified, we will assume that every QBAF
Given a set
In AFs, preferences are expressed over arguments. In the following, we give the relevant definitions for AFs and QBAFs extended with preferences, with gradual semantics defined analogously to the QBAF case.
A preference-based AF (PAF) (Amgoud & Vesic, 2014; Kaci et al., 2018) is a 3-tuple
A PQBAF is a pair
A gradual semantics
A preference over a set
Given a preference if if if
Some comments on Definition 6 are in order. First, it requires that there is indifference between two sets in the absence of any reason to strictly prefer one to the other. Note that this includes, in particular, the case where no preference at all is specified at the level of arguments, that is, the case where
In the following, wherever a set-comparison criterion
Several approaches to the treatment of preferences have been considered in the formal argumentation literature. In the context of abstract argumentation, a major line of investigation has concerned the treatment of the so-called critical attacks in PAFs. An attack is said to be critical (Amgoud & Vesic, 2014) if the attacked element is strictly preferred to the attacker. The question then arises as to whether and how the preference for the attacked element influences the attack relation. Several approaches use preferences to reduce a PAF
Given a PAF
Four main reduction methods have been proposed in the literature (see the relevant references for details):
Method 1 (Amgoud & Cayrol, 2002): Method 2 (Amgoud & Vesic, 2014): Method 3 (Kaci et al., 2018): Method 4 (Kaci et al., 2018):
The first reduction suppresses the critical attack; this technique has been criticized in Amgoud and Vesic (2014) because it can lead to extensions which are not conflict-free with respect to the original PAF. For this reason, the second reduction aims to “repair” the AF and avoids that drawback by reversing the direction of the critical attack. Kaci et al. (2018) argued that the second reduction implies a strong constraint since a preferred argument can never be successfully attacked, hence they proposed the third reduction, which deletes a critical attack only if the opposite attack belongs to
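The first three reductions, as described in prose above, can be sketched as operations on a set of attack pairs. This is only an illustrative reading of that description: the function names are hypothetical, a pair (x, y) in `prefers` is assumed to mean that x is strictly preferred to y, and the fourth reduction is omitted.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def reduction_1(attacks: Set[Pair], prefers: Set[Pair]) -> Set[Pair]:
    """Suppress every critical attack (attacked strictly preferred to attacker)."""
    return {(a, b) for (a, b) in attacks if (b, a) not in prefers}


def reduction_2(attacks: Set[Pair], prefers: Set[Pair]) -> Set[Pair]:
    """Reverse the direction of every critical attack."""
    kept = {(a, b) for (a, b) in attacks if (b, a) not in prefers}
    reversed_attacks = {(b, a) for (a, b) in attacks if (b, a) in prefers}
    return kept | reversed_attacks


def reduction_3(attacks: Set[Pair], prefers: Set[Pair]) -> Set[Pair]:
    """Delete a critical attack only if the opposite attack is also present."""
    return {
        (a, b) for (a, b) in attacks
        if not ((b, a) in prefers and (b, a) in attacks)
    }


# Hypothetical PAF: a attacks b, but b is strictly preferred to a.
attacks = {("a", "b")}
prefers = {("b", "a")}
symmetric = {("a", "b"), ("b", "a")}

print(reduction_1(attacks, prefers))
print(reduction_2(attacks, prefers))
print(reduction_3(attacks, prefers))
print(reduction_3(symmetric, prefers))
```

On the one-directional example, the three reductions give three different results (no attack, a reversed attack, and the original attack, respectively), which mirrors the differences discussed in the paragraph above.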
In the context of the ASPIC+ formalism (Modgil & Prakken, 2013), a rule-based approach to argument construction is proposed, leading to the identification of different forms of attack between arguments, which are classified as preference-dependent or preference-independent. Only preference-dependent attacks are affected by preferences: they are ignored when the attacked argument is strictly preferred to the attacker. Since in ASPIC+ preference-dependent attacks are always symmetrical, this bears some similarity with the third reduction mentioned above.
While the above approaches concern extension-based semantics, the use of preferences in the context of
It can be remarked that this principle imposes a rather strong requirement on the evaluation of arguments, independently of the relations holding between them. Moreover, the role of preferences is quite different, as they are not used to modify the attack relation but rather are meant to affect the final evaluation.
The potential twofold role of preferences is also evidenced in Amgoud and Vesic (2014), where, in addition to handling critical attacks in PAFs, preferences are used to induce an ordering on the extensions prescribed by a given semantics. With respect to the goals of the present paper, two main limitations emerge from the above surveyed approaches: (i) none of them concern BAFs, that is, they do not consider the support relation, which is needed in our context and (ii) a conceptual analysis about the motivations underlying the different proposals is lacking.

Comparison of the different reduction methods from PAFs to AFs. Attacks are indicated by arrows accompanied by the − sign: those which are induced by the reduction are highlighted with dashed edges. Note. AF = argumentation framework; PAF = preference-based AF.
As to the latter point, we remark in particular that different uses of preferences may be required by different application contexts. In this respect, we propose here a simple taxonomy on the uses of preferences in argumentation based on two classification dimensions: (i) the origin of preferences, which can be endogenous or exogenous with respect to the argument construction process; (ii) the purpose of the formalization, which can be normative or descriptive.
Concerning the first point, we call endogenous preferences those which are induced on arguments from preferences concerning their constitutive elements, for example, premises and rules as in Modgil and Prakken (2013), while exogenous preferences are ascribed to arguments based on elements which are not involved in their construction, such as, for instance, the values they promote, as in Bench-Capon (2003). Concerning the second point, a normative approach aims to define a standard behavior on the basis of certain rationality principles, while a descriptive approach aims to represent how people actually behave, possibly in an unprincipled manner. We suggest that some links can be drawn between these notions and the uses of preferences in the literature.
For instance, the PP principle in Mailly and Rossit (2020), where, in a sense, preferences determine the evaluation outcomes, overriding any relations between arguments, can be justified in a descriptive approach with exogenous preferences. For instance, if some people have a preference for an information source they trust, they may accept all arguments from that source, no matter what their content is. This behavior would be in contrast with a normative approach, where arguments’ contents and relations should also play a role in the presence of preferences.
Concerning the treatment of critical attacks, suppressing all of them independently of any other condition (Amgoud & Cayrol, 2002) appears in line with a descriptive approach, where, as above, preferences have a sort of absolute priority over other factors, with the possible production of outcomes which are not conflict-free. On the other hand, the treatment proposed in Modgil and Prakken (2013) concerns endogenous preferences with a normative approach, where they have the role of converting mutual attacks into unidirectional ones when appropriate.
While an extended discussion of these aspects is beyond the scope of this paper, the observations above indicate that a proper characterization of the application context is necessary to lay the foundations of the approach we aim to propose. In particular, the preferences we are interested in are exogenous, since they can be provided by users as an additional element with respect to a QBAF representing domain knowledge in a given decision support context. Moreover, we aim for a normative approach where preferences are used to provide personalized recommendations in a principled manner, as discussed in the next section.
In this section, we illustrate the basic ideas of our approach to encompass user preferences in GBA for decision support. To support our presentation, we will use as a running example the simple framework presented in Figure 4, taken from a review aggregation by the ADA system (Cocarascu et al., 2019; Rago et al., 2025) within the movie recommendation domain. It should be noted that the conversational interaction in Figure 1 could be supported by such a BAF.

A simple bipolar argumentation framework (BAF) in the movie domain. Arguments are represented by vertices, attacks by red edges labeled “−” and supports by green edges labeled “
The first question we consider concerns the pairs of arguments on which preferences are given. In this respect, some differences with the approaches reviewed in the previous section have to be underlined. In particular, while, in principle, endogenous preferences can refer to any pair of arguments (since they involve constitutive elements common to all arguments), user-defined exogenous preferences can only be given on arguments whose comparison is meaningful to the user. In the family of frameworks we are considering, it is then natural to consider preferences between sibling nodes (i.e., between influencers of the same node) since they contribute together to the evaluation of the influenced node. For instance, in the example of Figure 4, it is reasonable to imagine that a user may give more importance to the themes of the movie than to the quality of acting, or may prefer one actor to another, while it does not seem meaningful to express preferences between an influencer and an influenced node (e.g., between love and themes) and more generally preferences across different levels of the tree. This represents a significant difference with respect to approaches whose main focus is the treatment of critical attacks.
Toward defining general principles for the treatment of preferences, a further question then concerns identifying the cases where their effect on argument evaluation can be univocally determined. In this respect, we distinguish the cases of preferences concerning arguments of different polarity (e.g., an attacker vs. a supporter) with respect to arguments with the same polarity (e.g., a supporter vs. a supporter). In the first case, the expected effect of preferences can be clearly identified. For instance, if an attacker is preferred to a supporter of a node
While the examples above concern a single preference between a pair of arguments, they can be extended to the case where multiple preferences are given, from which a comparison between sets of arguments can be derived. On this basis, we introduce a property of local coherence specifying the effects of preferences between the set of attackers and the set of supporters of a given node. The property refers to the comparison between two PQBAFs, which differ only in the preference relation concerning the influencers of a given argument
Given a QBAF,
Given a QBAF, if if if if
The property of strict local coherence with preferences is satisfied iff the following conditions hold:
Let us illustrate Definitions 7 and 8, with reference to the example of Figure 4. Consider a relation
One can then wonder whether the property of local coherence provides guarantees at the level of the whole tree. In particular, it is desirable that the effects of preferences are coherent with the roles of pro and con arguments along the structure of the tree. Intuitively, if one adds a preference for pros over cons, this should have a positive (or at least non-negative) effect on other pros and a negative (or at least non-positive) effect on other cons and vice versa in the case of a preference for cons over pros. Under mild requirements on the considered semantics, we show that, in fact, local coherence ensures that the effects of preferences are coherent with the roles of pro and con arguments along the structure of the tree. In particular, we show in Proposition 1 that a preference for pros over cons among the influencers of an argument
Toward this result, we require first of all that the strength of an argument is determined by its base score, the strengths of its attackers and supporters, and the preferences between them. This extends to the setting with preferences a property that is common to most gradual argumentation semantics for BAFs in the literature (see, e.g., Cayrol & Lagasquie-Schiex, 2005).
A gradual semantics
For example, in Figure 4, we would expect that the strength of the argument representing acting would only depend on its base score, the strengths of the four arguments attacking or supporting it, that is, Wasikowska, Depp, Carter, and Hathaway along with any preference relation involving them.
Moreover, we assume a semantics based on a monotonic strength function
Given two PQBAFs
In words, two arguments are
We then introduce a formal notion of strength comparison between sets of arguments.
Given two PQBAFs
Intuitively, two sets of arguments are strength equivalent if the multisets of their strength values are the same (i.e., the two sets have the same cardinality and one can establish a bijection linking arguments with the same strength). A set of arguments
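The multiset reading of strength equivalence given above can be sketched as follows. The function name and the strength values are assumptions for illustration, and only equivalence (not the ordering between sets) is covered.

```python
from collections import Counter
from typing import Dict, Iterable


def strength_equivalent(set_a: Iterable[str], strengths_a: Dict[str, float],
                        set_b: Iterable[str], strengths_b: Dict[str, float]) -> bool:
    """True iff the two sets induce the same multiset of strength values.

    The strengths may come from two different frameworks, hence two dictionaries.
    """
    multiset_a = Counter(strengths_a[x] for x in set_a)
    multiset_b = Counter(strengths_b[y] for y in set_b)
    return multiset_a == multiset_b


# Hypothetical strengths in two frameworks.
sigma_1 = {"a": 0.5, "b": 0.5, "c": 0.75}
sigma_2 = {"u": 0.75, "v": 0.5, "w": 0.5}

print(strength_equivalent({"a", "b", "c"}, sigma_1, {"u", "v", "w"}, sigma_2))
print(strength_equivalent({"a", "c"}, sigma_1, {"u", "v", "w"}, sigma_2))
```

Comparing multisets rather than sets matters here: two arguments with the same strength must be matched by two arguments on the other side, which is what the bijection in the intuitive description requires.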
Then, the property of monotonicity is defined using the notion of shaping triple, a structure which collects all the elements directly involved in the evaluation of an argument.
For any argument
Given two PQBAFs as boosting as at least as boosting as strictly more boosting than
A strength function if if
A strength function
The shaping triple includes the elements determining the strength of an argument: its base score and its supporters and attackers. The boosting relations are based on an element-wise comparison between shaping triples and essentially check whether two shaping triples are equal or one (strictly) dominates the other with respect to the strength values. A strength function is (strictly) monotonic if its outcomes on arguments (strictly) follow the (in)equalities between the relevant shaping triples. For example, in a PQBAF built from the BAF in Figure 4 with arbitrary base scores and preferences, for a (strictly) monotonic
Monotonicity ensures that, at local level, some variation in the base scores or the strengths of the influencers of an argument
Given a QBAF, C-pro-modification iff C-con-modification iff C-und-modification if neither of the above conditions holds.
Intuitively, a C-pro-modification is such that, in the modified framework, it reverses in favor of pro arguments the (not necessarily strict) preference relation (according to C) between the pro and con arguments included in the influencers of x. Analogously, a C-con-modification reverses the preference relation in favor of con arguments, while a C-und-modification is not in favor of either pro or con arguments. To exemplify, if we consider a PQBAF with arbitrary base scores built from the BAF in Figure 4: adding a preference for writing over directing would be a C-pro-modification; adding a preference for Wasikowska over Depp would be a C-con-modification; while adding preferences for Wasikowska over Depp and Hathaway, a pro argument, over Carter, a con argument, would be a C-und-modification.
One can then expect that if a local modification corresponds to a preference for pros over cons, then it can only induce an increase in the strength of other pros and a decrease in the strength of other cons, and vice versa in the case of a preference for cons over pros. This can be regarded as a globally coherent behavior induced by the local coherence property.
We can exemplify the usefulness of this behavior in the review aggregation setting, again considering a PQBAF built from the BAF in Figure 4 with arbitrary base scores and preferences. Here, we would expect that adding a preference for power over revenge can only increase the strength of the movie argument, given that power was reviewed well and was thus a pro argument, and revenge was reviewed poorly and was thus a con argument. The opposite effect can be seen if a user expresses a preference for Wasikowska, a con argument, over Depp, a pro argument.
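The pro and con roles used in these examples can be computed mechanically. The sketch below assumes that a non-root argument is pro when the path from it to the answer contains an even number of attacks and con otherwise; this parity reading, the tree, and its edge polarities are assumptions chosen to be consistent with the roles mentioned in the text, not a transcription of Figure 4.

```python
from typing import Dict, Tuple

# child -> (parent, polarity); "+" is a support, "-" an attack (hypothetical tree).
PARENT: Dict[str, Tuple[str, str]] = {
    "acting": ("movie", "+"),
    "writing": ("movie", "+"),
    "directing": ("movie", "-"),
    "themes": ("movie", "+"),
    "Depp": ("acting", "+"),
    "Hathaway": ("acting", "+"),
    "Wasikowska": ("acting", "-"),
    "Carter": ("acting", "-"),
    "power": ("themes", "+"),
    "revenge": ("themes", "-"),
}


def role(argument: str) -> str:
    """Return 'pro', 'con', or 'answer' (for the root) by counting attacks to the root."""
    if argument not in PARENT:
        return "answer"
    attack_count = 0
    node = argument
    while node in PARENT:
        node, polarity = PARENT[node]
        if polarity == "-":
            attack_count += 1
    return "pro" if attack_count % 2 == 0 else "con"


for name in ("power", "revenge", "Wasikowska", "Depp"):
    print(name, role(name))
```

Under these assumptions, a preference for power over revenge favors a pro argument over a con argument, whereas a preference for Wasikowska over Depp favors a con argument over a pro argument.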
Proposition 1 provides the desired result by showing that local coherence implies coherent variations on the strengths of arguments at the global level when adopting a monotonic semantics. The proof relies on the following lemma, which delimits the effects of a local modification.
Given a QBAF,
Given a QBAF, If If if If if if
Under the assumption of a monotonic semantics, in virtue of Proposition 1, C-pro- and C-con-modifications ensure an effect satisfying the expected inequality for the affected arguments, namely those included in
Assuming a strictly monotonic semantics and the property of strict local coherence, a strict version of Proposition 1 where the inequalities between
Given a QBAF, C-s-pro-modification iff C-s-con-modification iff C-s-und-modification if neither of the above conditions holds.
Given a QBAF, If if if If if if
While the above results and definitions concern only a set of local preferences over the influencers of a specific argument, they can be used as a basis to reason about the addition of a global set of preferences to a QBAF, under the assumption that preferences are focused on sibling nodes, which is formalized in the following definition.
A PQBAF
Intuitively, a PQBAF is sibling-focused if there are no preference relations involving nodes which are not siblings. A p-argument is an argument such that a strict preference relation holds for at least one pair of its influencers. If we consider our running example based on Figure 4, including only preferences between siblings, for example writing over directing or Wasikowska over Depp, means the PQBAF will remain sibling-focused, while adding other preferences, for example, Wasikowska over writing or Wasikowska over revenge, would result in this property being violated. It should be noted that sibling-focused PQBAFs seem to be especially intuitive in this review aggregation context, where the AF is based on a hierarchy of features.
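The intuitive condition above (no preference between nodes that are not siblings) is easy to check on a tree. The following sketch does so for a hypothetical parent map loosely modeled on the running example; the tree shape and the function name are assumptions, and a pair (x, y) is taken to mean that x is preferred to y.

```python
from typing import Dict, Iterable, Tuple

# child -> parent (hypothetical tree; the root "movie" has no entry).
PARENT: Dict[str, str] = {
    "acting": "movie", "writing": "movie", "directing": "movie", "themes": "movie",
    "Depp": "acting", "Hathaway": "acting", "Wasikowska": "acting", "Carter": "acting",
    "love": "themes", "power": "themes", "revenge": "themes",
}


def is_sibling_focused(parent: Dict[str, str],
                       preferences: Iterable[Tuple[str, str]]) -> bool:
    """True iff every preference relates two influencers of the same node."""
    return all(
        x in parent and y in parent and parent[x] == parent[y]
        for x, y in preferences
    )


print(is_sibling_focused(PARENT, [("writing", "directing"), ("Wasikowska", "Depp")]))
print(is_sibling_focused(PARENT, [("Wasikowska", "writing")]))
print(is_sibling_focused(PARENT, [("Wasikowska", "revenge")]))
```

The three calls mirror the cases discussed above: preferences between siblings keep the property, while a preference across levels or across branches violates it.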
A sf-PQBAF
We are interested in analyzing the case where some strict user preferences are added starting from a situation where preferences are uniform, resulting in a sf-PQBAF, which differs from the initial one only in the preferences involving the influencers of its p-arguments, as captured by the following definition.
Given a preference uniform sf-PQBAF
Proposition 3 shows that the transformation into a p-arg-variation can be put in correspondence with a sequence of local modifications.
Let
The sequence
Based on Proposition 3, by iterated application of Proposition 1 or Proposition 2, it is possible to characterize the variations induced by the preferences on the argument strengths, when they are univocal.
In particular, when the hypotheses of Proposition 1 or Proposition 2 are satisfied, a
If there are multiple modifications, they concurrently affect the arguments shared in the relevant paths: these include at least the root and may also involve other arguments (see Figure 5).

Two simple examples of the effects of multiple preferences. The non-leaf nodes are labeled by > and < to indicate, respectively, an increase and a decrease in their strengths, assuming Proposition 2. The effects on the root strength of the two preferences added in case (a) are concordant (they are both pro-modifications), while they are discordant in case (b), hence, the overall effect on the strength of
More formally, considering a generic sf-PQBAF
First of all, the following proposition identifies the arguments whose strength is subject to a variation given a set of preferences.
Let
We can see how this is intuitive in the running example: we would expect that adding a preference for Wasikowska over Depp would only affect the ancestors of these arguments, that is, acting and movie, and it would not affect the strength of other arguments, for example, writing or directing.
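The set of arguments that can be affected in this example can be obtained by walking from the common parent of the two compared siblings up to the root. The sketch below illustrates this under the assumption of a tree-shaped framework; the parent map and the function name are hypothetical.

```python
from typing import Dict, List

# child -> parent (hypothetical tree; the root "movie" has no entry).
PARENT: Dict[str, str] = {
    "acting": "movie", "writing": "movie", "directing": "movie",
    "Depp": "acting", "Hathaway": "acting", "Wasikowska": "acting", "Carter": "acting",
}


def affected_arguments(parent: Dict[str, str],
                       preferred: str, dispreferred: str) -> List[str]:
    """Arguments whose strength may change when a sibling preference is added."""
    common = parent.get(preferred)
    if common is None or common != parent.get(dispreferred):
        raise ValueError("preferences are only considered between siblings")
    chain = []
    node = common
    while node is not None:
        chain.append(node)          # the influenced node, then each ancestor
        node = parent.get(node)     # None once the root has been passed
    return chain


print(affected_arguments(PARENT, "Wasikowska", "Depp"))
```

Arguments outside this chain, such as writing or directing, do not appear in the result, in line with the expectation stated above.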
Considering now the arguments whose strength is affected by the change, namely those in
Let
In words, the affecting elements of
We can now define the notion of a determined sf-PQBAF with respect to an argument
Let
Let If If
Obvious modifications can be applied to Definition 18 and Proposition 5 to cover the case of the strict inequalities entailed by a strictly monotonic semantics, as reported below.
Let
Let If If
When the hypotheses of Proposition 5 or 6 are not satisfied, the effect of the preferences expressed in a sf-PQBAF on a given argument is not univocally determined by the general properties of (strict) local coherence, (strict) monotonicity, and local evaluation of the semantics, and may correspond to variations of different sign depending on the actual semantics adopted.
In short, in order to have a variation on the strength of an argument
Figure 5 presents a simple case where the propagation of the effects of preferences on the strength of the root node is determined, and a case where it is not.
While the condition that all preferences are concordant appears to be rather strong in general, as final comments to the analysis carried out in this section, we observe that:
The achieved results are significant from a foundational perspective as they ensure that, under some rather mild requirements on the nature of the semantics and on its behavior at local level, a coherent behavior at global level is guaranteed when preferences are concordant, which can be regarded as a confirmation of the soundness of the approach. In particular, this can be regarded as evidence of the cognitive plausibility of the approach, since it is guaranteed that the effects at the global level do not contradict the expectations induced by the preferences expressed by users at the local level. In some application domains, such as product recommendation, it is reasonable to assume that a user expresses explicitly a limited number of preferences or, when preferences are inferred from data or past user behaviors, that the system focuses on the strongest ones emerging from the analysis. In both cases, a scenario where the preferences are concordant does not appear to be particularly implausible. We leave to future work a deeper investigation about the behavior of different gradual semantics in the case of discordant preferences.
ADA is a review aggregation system introduced in Cocarascu et al. (2019) and Rago et al. (2025) within the movie domain. The ADA pipeline, illustrated in Figure 2, is organized as follows (see Cocarascu et al., 2019; Rago et al., 2025 for more details):
ADA is designed around a tree-structured ontology, which may be crafted by hand (Cocarascu et al., 2019), or be extracted with different levels of automation (Oksanen et al., 2021; Rago et al., 2025). This ontology is built using a part of relation, giving a tree with the item being reviewed at the top and its features below it. Natural language processing is then employed to break down the movie reviews into a feature-based review aggregation where essentially each entity in the ontology is assigned a polarity based on positive or negative votes from the reviews. For each movie, ADA generates a tree-structured QBAF to represent the feature-based review aggregation and the identified polarities (Figure 4 shows a simple example):
The base score of each argument, representing an item or a feature, is derived from the review aggregation. In a nutshell, the base score reflects how consistent the reviews about that feature are: if a feature has consistently positive or negative reviews, its base score will be higher; if reviews are more mixed, its base score will be lower. Attack and support relations, which map onto the part of relation and are directed toward the item in the ontology, are extracted based on the polarities of the arguments, that is, depending on whether they agree or not. A gradual semantics is applied to the QBAF to compute argument strengths, and the strength of the item is used as the review aggregation. The extracted QBAF thus provides the underlying structure for generating not only review aggregations but also dialogical explanations, which draw on the arguments’ effects on one another based on the computed strengths.
As a preliminary application of the concepts we previously introduced, we investigated a method to integrate user preferences in ADA, in compliance with the properties introduced in Section 4, and examined its behavior in a few realistic examples in the movie domain. Our goal was to carry out an initial analysis of the issues to be faced when putting our general notions into practice and obtain a first assessment of the approach from an application perspective. The outcomes of this assessment are meant to provide the basis for further developments and refinements of the proposal before carrying out a more extensive experimentation and evaluation, which is left to future work.
The main ideas underlying this preliminary integration of preferences in ADA are as follows. We assume that, for a given user

Bipolar argumentation framework (BAF) for the movie It ends with us, where acting is preferred over writing, and themes is preferred over directing.

Bipolar argumentation framework (BAF) for the movie Twilight where directing is preferred over themes, and the theme sacrifice is preferred over immortality.

Bipolar argumentation framework (BAF) for the movie Pirates of the Caribbean: Dead Men Tell No Tales, where directing is preferred over writing, and the actor Bardem is preferred over Depp.
For instance, in the case of the movie It ends with us (Figure 6), we consider a user who prefers acting to writing and themes to directing. Note that the examples correspond to different configurations of preferences in terms of the role (pro or con) of the arguments involved, of their distance from the root, and of their position on the same path to the root, leading to possible interactions between them in the example of Twilight (Figure 7).
As to the role of the preferences in strength computation, the idea was to adopt a simple parametric approach to combine preferences with existing gradual argumentation semantics and then carry out a preliminary assessment of the impact of the use of preferences on the final outcomes. The approach we adopted consists of decreasing the base score of the arguments which are less preferred, by multiplying them by a given discount factor
As to the first choice, we considered the following values for
In the following subsections, we discuss the results of our preliminary evaluations, focusing on three main aspects, namely the role of the base scores, the definition of the discount factor, and the role of the gradual semantics, on which we draw some general considerations based on the results of the application of our approach in the examples introduced above.
The idea of using preferences to adjust the value of the base scores, besides being intuitive, has the advantage of simplicity and of enabling the use of existing gradual semantics as interchangeable alternatives, while ensuring the satisfaction of the properties introduced in Section 4. Embedding the use of preferences inside gradual semantics would instead require a case-by-case modification of semantics definition: investigating this direction is beyond the scope of this paper and is left for future work.
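As a minimal sketch of this idea, the following code discounts the base score of the less preferred arguments and then evaluates a tree-structured QBAF with an unmodified semantics. It uses DF-QuAD as we recall its standard definition from the literature (probabilistic-sum aggregation followed by a linear combination with the base score); the parameter name `delta`, the helper names, and all numeric values are assumptions made for illustration and are not taken from ADA.

```python
from math import prod


def aggregate(strengths):
    """Aggregate a list of strengths; an empty list yields 0."""
    return 1.0 - prod(1.0 - s for s in strengths)


def dfquad_strength(base, attackers, supporters):
    """Combine a base score with aggregated attacker and supporter strengths."""
    va, vs = aggregate(attackers), aggregate(supporters)
    if va >= vs:
        return base - base * (va - vs)
    return base + (1.0 - base) * (vs - va)


def strength(arg, base, children):
    """Recursively evaluate arg; children maps arg -> [(child, relation)]."""
    kids = children.get(arg, [])
    att = [strength(c, base, children) for c, rel in kids if rel == "attack"]
    sup = [strength(c, base, children) for c, rel in kids if rel == "support"]
    return dfquad_strength(base[arg], att, sup)


def apply_preferences(base, preferences, delta):
    """Multiply the base score of each less preferred argument by delta.

    Assumption: an argument that is less preferred in several pairs is
    discounted only once.
    """
    discounted = dict(base)
    for _preferred, less_preferred in preferences:
        discounted[less_preferred] = base[less_preferred] * delta
    return discounted


# Hypothetical toy framework: acting supports movie, writing attacks it.
base = {"movie": 0.5, "acting": 0.4, "writing": 0.3}
children = {"movie": [("acting", "support"), ("writing", "attack")]}

before = strength("movie", base, children)
after = strength(
    "movie", apply_preferences(base, [("acting", "writing")], 0.5), children
)
print(before, after)
```

Because the semantics function is left untouched, it could be swapped for another gradual semantics without changing `apply_preferences`, which is the interchangeability the parametric approach aims for.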
It has to be noted, however, that our approach makes the effect of preferences dependent on the way the base score of arguments is determined. In the case of ADA, the base score of an argument corresponding to a feature is derived from a normalized count of the positive and negative judgments about that feature found in the set of reviews under consideration (for details, see Cocarascu et al., 2019). In the original version of ADA, the base score is calculated as the absolute value of the difference between the number of positive and negative votes of the feature corresponding to the argument, divided by the total number of reviews. This normalization method typically gave rise to rather small base scores (around 0.05 for the dataset in the movie domain), since a specific feature is commonly mentioned explicitly only in a small subset of the reviews. While these small base scores were in line with the original purposes of ADA, they turned out to be somewhat problematic here: the quantitative effects of preferences were dampened because they were conveyed through the adjustment of small values. To avoid this problem, we experimented with a different normalization method in which the absolute value of the difference between the number of positive and negative votes of the feature corresponding to the argument is divided by the number of reviews where the feature is mentioned, and hence receives a vote, rather than by the total number of reviews. This led to base scores of the features that are, on average, more than 10 times larger.
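The two normalization methods can be compared with a short sketch. The function names and the vote counts below are hypothetical and chosen only to illustrate the order of magnitude of the difference; they are not taken from the dataset.

```python
def base_score_old(pos, neg, total_reviews):
    """Original normalization: |pos - neg| over the total number of reviews."""
    return abs(pos - neg) / total_reviews


def base_score_new(pos, neg, mentioning_reviews):
    """Alternative normalization: |pos - neg| over reviews mentioning the feature."""
    return abs(pos - neg) / mentioning_reviews


# Hypothetical counts: 1000 reviews, 80 of which mention the feature
# (60 positive votes and 20 negative votes).
old = base_score_old(60, 20, 1000)
new = base_score_new(60, 20, 80)
print(old, new, new / old)
```

With these illustrative counts, the second method yields a base score more than ten times larger than the first, since the denominator shrinks from all reviews to only those that mention the feature.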
In particular, looking at Table 1 for the example of Alice in Wonderland, one can observe the significant base score difference for all the features. The ratio between the base scores ranges from a minimum of 3.33 to a maximum of 22.5 with an average of 14.21. Similar results hold for the other examples, which are not reported in detail for the sake of conciseness. Higher base scores of features lead in turn to a greater quantitative impact of the preferences on the strength of the elements affected and notably of the movie, which is the subject of the recommendation.
Comparison of the Base Scores Obtained for the Features of the Movie Alice in Wonderland With the Two Normalization Methods.
Table 2 shows the strength computed according to the four selected semantics for some significant items and
Strength Values and Percent Variations With
Note. QuAD = quantitative argumentation debate; DF-QuAD = discontinuity-free QuAD; REB = restricted Euler-based; QEM = quadratic energy model.
Table 3 focuses on the variations of the strength of movie (which is the most relevant for our purposes) in the same example, for all values of
Movie Alice in Wonderland. Focus on movie for
Note. QuAD = quantitative argumentation debate; DF-QuAD = discontinuity-free QuAD; REB = restricted Euler-based; QEM = quadratic energy model.
Movie It Ends With Us. Focus on movie for
Note. QuAD = quantitative argumentation debate; DF-QuAD = discontinuity-free QuAD; REB = restricted Euler-based; QEM = quadratic energy model.
Movie Twilight. Focus on movie for
Note. QuAD = quantitative argumentation debate; DF-QuAD = discontinuity-free QuAD; REB = restricted Euler-based; QEM = quadratic energy model.
Movie Pirates of the Caribbean: Dead Men Tell No Tales. Focus on movie for
Note. QuAD = quantitative argumentation debate; DF-QuAD = discontinuity-free QuAD; REB = restricted Euler-based; QEM = quadratic energy model.
In all cases, the greater impact with the new base score normalization is confirmed. The extent of the difference in impact, represented by the ratio With QEM semantics, the impact of preferences with the old normalization method is so small in some cases that In each example, Leaving QEM apart again, fixing the semantics and the example,
Ratios
Note. QuAD = quantitative argumentation debate; DF-QuAD = discontinuity-free QuAD; REB = restricted Euler-based; QEM = quadratic energy model.
The value of the discount factor
Focusing on movie, it can be seen that this expectation is confirmed for all semantics (and for both normalization methods, though being of course more evident with the new one) in the examples of Alice in Wonderland, It ends with us, and Pirates of the Caribbean. Just to exemplify, for QuAD semantics and the new normalization method, in the example of Alice in Wonderland
The example of Twilight differs since in this case the highest absolute values of
An explanation of this behavior can be given by observing that in this case the two added preferences are along the same path to the root (see Figure 7). This means that there is an interaction between the effects of single preferences, leading to a global effect which does not follow the simple pattern of dependence on
Thus, while choosing a lower value of
Concerning the choice of
While a proper modeling of these different attitudes represents another interesting subject of future investigation (see in particular the discussion about the work of Potyka and Booth (2024) in Section 6), the use of different discount factors can be regarded as a first crude method to give them a counterpart in our approach. In a future implementation, one may imagine that setting the desired weight of their preferences is left to the users themselves. Our approach is defined so as to be fully compatible with leaving this freedom of choice to them.
Role of the Gradual Semantics
As mentioned above, we experimented with our approach using four gradual semantics, which, while sharing the property of being monotonic, are rather different by design (a comparison is beyond the scope of the present paper; the interested reader may consult the cited papers for details).
This entails that a given change in base scores, as determined by preferences in our approach, may have significantly different impacts on strength evaluation outcomes, depending on the semantics adopted. In a sense, we can say that a gradual semantics can be more or less sensitive to preferences in the context of our approach. While the study of a formal notion of sensitivity to preferences is an interesting issue for future work, we draw here some preliminary considerations concerning the semantics we experimented with. In particular, we address two questions:
Does any semantics appear to be more sensitive to preferences than others in the context of the examined approach?
Are there any semantics-specific properties which may affect the sensitivity to preferences?
Concerning the first question, we want to identify whether the considered semantics can be ordered according to their sensitivity. The idea is that a more sensitive semantics gives rise to a higher value of
Concerning Alice in Wonderland, with the new normalization method we have the following order (in brackets the relevant absolute value of
With the old normalization method with
Concerning It ends with us, with the new normalization method (in all examples the same remarks as above apply), we get DF-QuAD (31.7) > QuAD (16.41) > QEM (11.37) > REB (8.86), while with the old normalization method we get QuAD (3.38) > DF-QuAD (3.28) > REB (1.47) > QEM (0.62).
Turning to Twilight, with the new normalization method we have QuAD (14.22) > QEM (6.48) > DF-QuAD (6.01) > REB (5.14), while with the old normalization method we get QuAD (2.51) > DF-QuAD (1.79) > REB (0.57) > QEM (0.04).
Finally, for Pirates of the Caribbean with the new normalization method we have QuAD (80.21) > QEM (37.48) > REB (29.31) > DF-QuAD (23.20), while with the old normalization method we get QuAD (7.71) > DF-QuAD (2.49) > REB (0.79) > QEM (0.25).
It turns out that, with the new normalization method, the order of sensitivity varies significantly across the examples, so it is prudent not to draw any general conclusion. With the old normalization method, instead, the order is the same in all examples: QuAD > DF-QuAD > REB > QEM. This suggests that REB and QEM tend to be less sensitive than QuAD and DF-QuAD in cases of small base scores, which, however, are not of interest for our application domain.
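The comparison of orderings can be reproduced with a short sketch that uses the values quoted above for It ends with us, Twilight, and Pirates of the Caribbean. The values for Alice in Wonderland are omitted because they are not listed in full in the text; the dictionary layout and function names are hypothetical.

```python
OLD = {
    "It ends with us": {"QuAD": 3.38, "DF-QuAD": 3.28, "REB": 1.47, "QEM": 0.62},
    "Twilight": {"QuAD": 2.51, "DF-QuAD": 1.79, "REB": 0.57, "QEM": 0.04},
    "Pirates of the Caribbean": {"QuAD": 7.71, "DF-QuAD": 2.49, "REB": 0.79, "QEM": 0.25},
}
NEW = {
    "It ends with us": {"QuAD": 16.41, "DF-QuAD": 31.7, "REB": 8.86, "QEM": 11.37},
    "Twilight": {"QuAD": 14.22, "DF-QuAD": 6.01, "REB": 5.14, "QEM": 6.48},
    "Pirates of the Caribbean": {"QuAD": 80.21, "DF-QuAD": 23.20, "REB": 29.31, "QEM": 37.48},
}


def ranking(values):
    """Order the semantics from most to least sensitive for one example."""
    return tuple(sorted(values, key=values.get, reverse=True))


def distinct_rankings(table):
    """Collect the different orderings observed across the examples."""
    return {ranking(values) for values in table.values()}


old_orders = distinct_rankings(OLD)
new_orders = distinct_rankings(NEW)
print(len(old_orders), len(new_orders))
```

For these three examples, the old normalization yields a single ordering, whereas the new normalization yields a different ordering for each example.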
As to the second question, we provide an illustrative example of how some semantics properties may affect sensitivity to preferences by making an observation concerning the QuAD and DF-QuAD semantics. By design, in computing the strength of an argument
To give an example, suppose that, in the case of It ends with us, the base score (and hence also the strength) of Slate or Lively (supporters of acting) is 1. Then, given also the fact that acting has no attackers, from the definition of QuAD and DF-QuAD it follows that the strength of acting is guaranteed to be 1 for every value of
Generalizing this observation, any semantics that includes some saturation effect such as the one exemplified above in the strength computation may become totally insensitive to some preferences under specific conditions. Whether this is (in a sense) a bug or a feature depends on the users’ expectations of the actual effect of their preferences in each specific application context.
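The saturation effect can be checked numerically with a minimal sketch of DF-QuAD, written here as we recall its standard definition from the literature. The base score of acting, the score of the second supporter, and the candidate discount values are hypothetical.

```python
from math import prod


def aggregate(strengths):
    """Aggregate a list of strengths; an empty list yields 0."""
    return 1.0 - prod(1.0 - s for s in strengths)


def dfquad_strength(base, attackers, supporters):
    """Combine a base score with aggregated attacker and supporter strengths."""
    va, vs = aggregate(attackers), aggregate(supporters)
    if va >= vs:
        return base - base * (va - vs)
    return base + (1.0 - base) * (vs - va)


# acting has no attackers and two supporters; one supporter has strength 1.
# The other supporter's (hypothetical) score 0.6 is discounted by each value.
acting_strengths = [
    dfquad_strength(0.5, [], [1.0, 0.6 * discount])
    for discount in (1.0, 0.75, 0.5, 0.25)
]
print(acting_strengths)
```

Since one factor in the product is zero, the aggregated support equals 1 whatever the score of the other supporter, so the discount has no effect on the strength of acting in this configuration.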
Altogether, even if restricted to a small set of examples in a specific domain, we suggest that this analysis provides some indications on the problem of integrating preferences within gradual argumentation for a practical application:
While there are theoretical guarantees that the effects of preferences follow some desirable properties given some mild requirements on the semantics, the amplitude of these effects, which is a key element from an application perspective, depends on a variety of factors and is not easily foreseeable even in the context of a rather simple approach to preference treatment.
Users’ needs and expectations are crucial to characterize an appropriate role of preferences in a given context, but different users might have different needs and expectations, thus the study of a framework for personalizing the use of preferences (exemplified in our simple case by the adjustment of
The choice of the semantics, however, cannot be left to the user, if only because of its inherent complexity. The variety of available gradual semantics and of their properties (e.g., the presence or absence of saturation effects) calls for the investigation of more stringent, and possibly domain-dependent, requirements for the semantics to be adopted. In this respect, while for our initial analysis we considered a parametric approach oriented to the reuse of semantics available in the literature, the investigation of deeper semantics adaptations or even of new preference-driven semantics will be worth pursuing.
Discussion of Related Work
While we have surveyed at a general level the uses of preferences in formal argumentation in Section 3, in this section, we discuss works that show some direct relations with our proposal and provide inspiration for future research.
The work presented in Potyka and Booth (2024) is relevant to the issue of how heavily preferences should affect strength evaluation with respect to the rest of the framework, introduced in Section 5.2. In particular, in Potyka and Booth (2024), two properties of a gradual semantics for a QBAF have been considered: open-mindedness, which refers intuitively to the fact that the strength of an argument can assume any value in a given range, independently of the initial base score, provided that enough attackers or supporters are present; and conservativeness, which intuitively means that there are some constraints on how much the strength of an argument can differ from its base score. While there is an interesting conceptual connection between these notions and the observations in Section 5.2, it is also worth remarking that the properties we study in the present paper are not directly related and, in a sense, are orthogonal to those considered in Potyka and Booth (2024).
First, in Potyka and Booth (2024), no notion of preference between arguments is considered. Second, our notions of coherence are qualitative properties, concerning the sign of a variation of the strength of an argument between two (suitably) different frameworks, while open-mindedness and conservativeness are quantitative properties concerning the amount of variation over all possible frameworks with respect to the base score. In particular, local coherence cannot provide information or guarantees with respect to the difference between the strength and the base score, since this difference depends on the specific gradual semantics adopted, and different semantics can behave differently in this respect, even if they satisfy local coherence.
Investigating the extension of the notions of open-mindedness and conservativeness to the case of “preference-aware” semantics would therefore represent an interesting direction of future work. Moreover, from an application perspective, characterizing gradual semantics as open-minded or conservative with respect to preferences would give an indication on which semantics are more appropriate for a context where base scores have to be somehow preserved with respect to contexts where base scores can be completely changed (e.g., because it is assumed that user preferences are of utmost importance).
At a higher abstraction level, in Gonzalez et al. (2021), an approach is introduced which extends traditional bipolar AFs with additional information which can be used to represent specific argument features which can play a role in the assessment of argument acceptability, in addition to the attack and support relations. It is worth remarking that, similarly to ours, this work focuses on the evaluation of acyclic frameworks. The additional information is formalized in terms of labels, and a key notion of the approach is that of algebra of labels. In a nutshell, it is assumed that each set of labels is ordered, includes a top (
It is interesting to remark that an algebra with ordered labels can be used to express implicitly a preference order between arguments and indeed the approach proposed in Gonzalez et al. (2021) is highly expressive in this respect, as it allows one to represent a variety of preference orders that may correspond to different criteria, one for each algebra. It is, however, worth noting that the formalism prescribes that there is a total function from the arguments to the labels of each algebra, implying that every argument is comparable with every other. This represents a first difference with our approach, since, as previously explained, we refer to an application context where only some preference comparisons are meaningful.
Other significant differences with Gonzalez et al. (2021) include the fact that these implicit preferences cannot have effects on the leaves of the framework, while in our proposal, the use of preferences can affect them too. Moreover, in Gonzalez et al. (2021), an explicit preference relation over arguments is introduced, which is used to suppress attacks, analogously to Method 1 in Section 3. This differs from our approach, where preferences are meant to be taken into account in strength evaluation, not to determine changes in the structure of the framework. Furthermore, since explicit preferences are not involved in the evaluation of the argument labels for each algebra, no property comparable to local or global coherence can be considered in the approach of Gonzalez et al. (2021). Notwithstanding these differences, the idea of considering a set of argument features for computing a set of “parallel” evaluations of arguments is certainly of interest in application domains such as recommender systems, where multifaceted product evaluations are formulated, and investigating a combination of ideas from Gonzalez et al. (2021) and our work appears to be a fertile ground for exploration.
Conclusions
The incorporation of preferences into argumentation-based methods for decision support is an open problem, particularly when gradual semantics are used, for which the use of preferences has not been investigated yet. With the aim of enabling personalized recommendations in this context, we explored a normative approach for the use of exogenous preferences in QBAFs for decision support. In particular, we undertook a foundational analysis centered around novel properties, such as local coherence, concerning the expected effects of preferences on argument strength. We then proved that, under the assumption of a monotonic gradual semantics, local coherence ensures that these effects are in line with the roles of pro and con arguments along the structure of the framework. Based on this approach, we extended a review aggregation system with the ability to deal with user preferences and carried out preliminary experiments, showing how the quantitative effects of preferences are significantly affected by alternative design choices, analyzing in particular the roles of the base scores, discount factors, and gradual semantics in this process.
This research provides much fertile ground for future work. Among the research directions on which we would like to focus is the study of further methods to deal with preferences in gradual argumentation semantics, for example, modifying both arguments’ base scores rather than only that of the argument which is not preferred, or considering situations where preferences can be equipped with varying degrees of preference, such as “I much prefer X to Y”. We also plan to investigate the relationships of our approach with methods adopted in other fields, such as Multi-Criteria Decision Analysis and Bayesian decision theory. Last but not least, a larger-scale validation is foreseen. In this respect, a first phase might concern a broader experimentation on various datasets in the domain of review aggregation, for example, considering recommendations in different contexts, as in Rago et al. (2025), or using artificially generated preferences for a more extensive empirical analysis of the behavior of the proposed approach, possibly leading to its revision. Then, a second phase would involve user studies with humans in a selected domain in order to verify that the induced effects in the QBAFs of the added preferences indeed lead to a better modeling of how users feel about the arguments, and thus better, more personalized decision support.
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Baroni was partially supported by MUR project PRIN 2022 EPICA “Empowering Public Interest Communication with Argumentation” (CUP D53D23008860006) funded by the European Union—Next Generation EU. Rago and Toni were partially funded by J.P. Morgan and by the Royal Academy of Engineering under the Research Chairs and Senior Research Fellowships scheme. Rago and Toni were partially funded by the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 101020934). Any views or opinions expressed herein are solely those of the authors listed.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Notes
Proofs of Section 4
Given a QBAF, Observe first that, by the hypotheses of the lemma and in particular taking into account Definition 9, Given a QBAF, If if if If if if Proof is by induction on the length of the path from any element of Induction step. We inductively suppose that the statement above is valid for every Considering the case (a), for every Let now Similarly, if The treatment of the case (b) is analogous. Let From the hypotheses that Let As in the proof of Proposition 3, let Let If If For the first part of the statement, consider the case where
