Abstract
Due to the ongoing and dramatic growth in the volume of consumer returns, retailers continue to struggle with a trade‐off in returns service strategies: implementing stricter return policies to lower operational costs and environmental footprint, or providing customers with lenient return policies to stimulate customers' value perceptions and patronage intentions. This paper argues that effective management of this trade‐off requires a deep understanding of the process through which consumers perceive, evaluate, and respond to return policies that vary in terms of leniency across five key dimensions identified in the literature: monetary, time, effort, scope, and exchange. To this end, we theorize on a cognitive process model and empirically test the model using randomized experiments with diverse consumer samples. By viewing each of the five leniency dimensions as returns service design levers, we examine (1) how a retailer's return policy leniency across different levers impacts a consumer's intention to purchase from a retailer, through the influence of leniency on the perceptions regarding returns service quality and transaction costs that jointly form perceived returns service value, and (2) how the different leniency levers compare in terms of their impacts. We find significant heterogeneity in the effectiveness of different leniency levers in influencing consumers' purchase intentions through increased perceived service quality, reduced perceived transaction costs, and subsequently increased perceived service value.
INTRODUCTION
Consumer returns are endemic to US retail practice. The annual value of returned products in the United States has grown to over $428 billion, representing 10.6% of total sales (National Retail Federation, 2021). Product proliferation, heterogeneity in customer expectations and valuations, and the rise of e‐commerce can all be listed among the drivers for this growth (Cheng, 2015). Another key contributor is the proliferation of generous return policies (Shang et al., 2017b). Fierce competition, decreased consumer switching costs, and a “Customer is the King” mantra lead many retailers to offer overly lenient return policies. Yet, these retailers bear returns‐related operational costs that exceed $100 billion per year (Blanchard, 2007). Generous return policies also engender moral hazard issues and result in the rise of fraudulent and opportunistic returns, which recently surpassed $25 billion per year (National Retail Federation, 2021). More dramatically, excessive returns driven by such lenient policies pose significant environmental problems. For instance, products returned to US retailers in a year generate 5 billion pounds of landfill, which is equivalent to trash produced by 5 million Americans (Constable, 2017). Handling processes of these returns annually generate 15 million metric tons of carbon dioxide (Optoro, 2018).
Lenient return policies have a long history in the United States, starting with J.R. Watkins' “Satisfaction guaranteed, or your money back!” policy in 1868. Today, consumer return policies come in a myriad of forms. Some retailers offer a no‐questions‐asked, full‐refund return policy, whereas others try to disincentivize returns by charging restocking fees, setting strict return deadlines, and imposing hassles. Recently, the tide seems to be turning against leniency. Many large retailers, such as Best Buy, Macy's, and Bed Bath and Beyond, have tightened their return policies over the last couple of years to rein in the cost of returns, though the specifics of how they choose to do so have varied significantly (ConsumerWorld.org, 2018). Some retailers imposed restocking fees on certain product categories, whereas others shortened the allowable time window for making returns. The variety of changes observed may be in part due to the lack of comprehensive research into the impact that these changes have on consumer responses. Providing such research is a primary purpose of this paper.
In general, return policy design poses an interesting operations–marketing interface problem in today's retail environment. From the marketing perspective, more leniency can stimulate customer purchases through positive product and service‐quality signaling and a reduction in purchase‐related risks and thereby enhance the consumer value proposition. From the operations perspective, return policies constitute an important strategic lever in the front‐end of closed‐loop supply chains, which influences the volume, timing, and quality of product returns (Guide & Van Wassenhove, 2009). Further, the choice of return policies, through documented influences on consumer purchase and return behaviors, is intertwined with and carries significant implications for other retail planning and execution activities. Such activities range from product pricing to assortment decisions to inventory management (Abdulla et al., 2019). Thus, retailers face a challenge to find a solution that balances the fundamental operational cost versus consumer value proposition trade‐off. In this paper, we contend that effective management of this trade‐off requires an understanding of the process through which consumers perceive, value, and react to different return policies.
From the customer's standpoint, a return policy is an important element of the returns service offered by a retailer and plays a role in the valuation of the returns service (Davis et al., 1998). Anecdotal evidence from popular press often suggests that “the best” return policies share some common characteristics that signal leniency (Kirkham, 2015; Mash, 2017). These characteristics include a full refund, a long return window, low return effort, among others. Janakiraman et al. (2016) develop a typology of return policies that consists of five leniency dimensions: monetary, time, effort, scope, and exchange. We note that all five dimensions of return policy leniency can be strategically chosen by retailers to influence consumer perceptions and behaviors. Following Abdulla et al. (2019), we refer to these dimensions as return policy leniency levers. Table 1 provides definitions for each of the five leniency levers.
Table 1. Conceptual definitions of return policy leniency levers
Consumer return policies are studied extensively in the analytical operations management (OM) literature as a strategic instrument that can be used to stimulate sales (Ketzenberg & Zuidwijk, 2009), control opportunistic returns (Shang et al., 2017a), signal quality (Moorthy & Srinivasan, 1995), compete (Shulman et al., 2011), and coordinate a supply chain (Su, 2009a). However, these papers predominantly focus on the monetary leniency lever with only rare examples studying other levers (for a list, see Abdulla et al., 2019). Empirically, several studies examine the antecedents (Shang et al., 2017b) and performance impacts (Ertekin & Agrawal, 2021) of return policy leniency. Though valuable contributions, these studies leverage unique contexts involving a single retailer or eBay auctions and use quasi‐experimental or nonexperimental approaches to estimate the impact of return policy leniency based on observed transactional data. As such, these works do not propose or empirically test causal consumer cognitive mechanisms that potentially drive the observed behaviors.
Empirical research in the operations–marketing and operations–information system interfaces focuses on the effect of return policy leniency on various cognitive, affective, and behavioral reactions such as purchase risk, willingness to pay, product‐ and service‐quality perceptions, and dissonance and regret, among others (e.g., Mollenkopf et al., 2007; Suwelack et al., 2011). The vast majority of these studies operationalize and manipulate return policy leniency through only one or two levers or use an overall leniency measure. In general, the collective evidence from the existing literature signals that different leniency levers may have varying effects on these perceptual and behavioral constructs. Nevertheless, the existing studies vary significantly in terms of their empirical models and study designs. As such, making comparisons on the relative effects of these levers is not feasible. In fact, there are inconclusive and even conflicting findings regarding the effect of a particular leniency lever when results across studies are cross‐examined.
A comparative analysis of the relative effectiveness of the leniency levers in terms of influencing consumer cognitive and behavioral responses has not been conducted to date. A limited number of works that discuss consumer valuation of a returns service have acknowledged this gap and made a call for future research in this direction (Griffis et al., 2012; Jeng, 2017; Mollenkopf et al., 2007). Janakiraman et al. (2016) conduct a meta‐analytical review of 21 studies to derive insights regarding the effectiveness of each of the five leniency levers. Though a valuable contribution, the considerable variation in the theoretical frameworks, research designs, and empirical contexts of the previous studies, as well as the exclusion of several more recent studies from the analysis, limits generalizability and applicability of the results (Abdulla et al., 2019). Therefore, there is a need for a systematic examination of all return policy leniency levers in the context of a unified empirical framework, in order to reconcile the earlier findings and obtain more reliable and actionable insights for retailers. Our research addresses this need. We theoretically and empirically explore how consumers perceive and value return policy leniency levers available to retailers and how this valuation process influences a consumer's intention to purchase from a retailer. In particular, we investigate how different levers influence consumer purchase intentions (PI) through a cognitive process that involves perceived service quality (PSQ), perceived transaction costs (PTCs), and perceived value of a returns service.
We make several contributions to the literature on consumer return policies with managerial implications. Using general merchandise stores (i.e., department stores, big‐box stores, variety stores) as the empirical context, first, we explain how return policy leniency across different levers indirectly influences a consumer's purchase intention through a cognitive process model predicated on a mental accounting framework that combines transaction cost economics (TCE) and signaling theoretical lenses. Second, we demonstrate that the policy aspects of a service (i.e., attributes pertaining to terms and conditions), rather than solely the process aspects (i.e., attributes pertaining to service encounter and experience), can generate service quality and transaction cost perceptions. Further, we find that these perceptions can trigger service value assessment and influence subsequent behavioral intentions and outcomes. In particular, we show that return policy leniency positively affects the valuation of a returns service and that the perceived value of the returns service significantly impacts a consumer's purchase intention. This finding supports the view that a return policy, which is fundamental to a retailer's returns service, is an important element of the overall service bundle, and may influence patronage decisions of consumers.
The cognitive process model also provides a foundation for limited theorizing, through deductive reasoning, on the relative influence of the levers. In particular, we hypothesize, and our empirical findings support, that consumer purchase intentions are more sensitive to levers that pose direct financial risks to both retailers and consumers (monetary and exchange) than to those that do not (effort, time, scope). Essentially, we find that monetary leniency and exchange leniency, in that order, are the two most influential levers. Time, effort, and scope leniency show a significantly lower impact on the perceived service value (PSV) and purchase intention of a consumer. These findings imply that retailers who consider their return policies to be overly lenient and unsustainable from an operational perspective should avoid tightening the monetary and exchange leniency levers. Rather, these retailers should prioritize restricting the remaining three levers in order to alleviate the cost burden of returns, because these levers minimally impact PSV and purchase intention. Overall, our paper constitutes an important inquiry into sustainable consumer return policies and can motivate future research to understand how retailers can design return policies that still provide value to customers (people), do not hurt the financial bottom‐line of retailers (profit), and reduce the volume of returns, more than half of which end up in landfills (planet) (Constable, 2017; Howland, 2017).
The remainder of this paper is organized as follows: In Section 2, we set forth the conceptual background, theory, and hypotheses. Section 3 introduces the empirical methodology to test the hypotheses and presents the data analysis and results. Finally, Section 4 is dedicated to theoretical and managerial insights, along with future research opportunities and some limitations of our work.
THEORY DEVELOPMENT AND HYPOTHESES
We start with a brief conceptual background for the key constructs and provide several propositions based on theories from behavioral economics, cognitive psychology, and marketing as well as empirical findings from the existing return policy literature. We then combine these propositions to theorize on a cognitive process model that explains how return policy leniency influences a consumer's purchase intention through a parallel‐serial mediation mechanism that involves PSQ, PTCs, and PSV. We propose a set of hypotheses for testing this parallel‐serial mediation mechanism. Then, through deductive reasoning, we hypothesize that leniency levers with direct financial risks to both retailers and consumers (i.e., monetary and exchange) have a stronger effect on purchase intentions than the other three levers. The cognitive process model is introduced in Figure 1.

Figure 1. Cognitive process model and empirical directionalities
Service quality is the degree of discrepancy between the customer expectations for a service and the actual perceptions of performance (Parasuraman et al., 1985). Expanding on this definition, service quality is the overall evaluation of service performance, compared to the customer's general expectations of what a service should offer (Parasuraman et al., 1988). In general, researchers agree that expectations have a strong effect on the perceptions of service quality (e.g., Boulding et al., 1993; Cronin & Taylor, 1994). Customers may have different sources of information that lead to expectations about a potential service encounter with a particular provider. Among the listed sources, exposure to similar services of competitors, word of mouth, and company‐controlled communications are most relevant to the returns service context (Parasuraman et al., 1991).
We draw upon the signaling theory to establish the relationship between return policy leniency and PSQ (Spence, 1973). Signaling theory posits that sellers can use costly mechanisms to reduce information asymmetry between buyers and sellers, signaling positive characteristics such as quality. An efficacious signal is one that is costly to the sender and observable to the receiver (Connelly et al., 2011). Return policy leniency embodies both characteristics. Lenient return policies are costly to retailers. For example, higher time leniency allows consumers to return products after a potentially long trial period, resulting in revenue loss, decreased salvage value, and increased return handling costs. Offering full refunds (high monetary leniency) or making returns process hassle‐free (high effort leniency) may stimulate convenience and opportunistic returns, resulting in revenue loss and increased operational costs. Further, offering cash refunds as opposed to store credits or exchange‐only policies (high exchange leniency) leads to significant opportunity costs, as the retailer becomes unable to secure repeat transactions and exchanges. More generally, lenient return policies make retailers vulnerable to opportunistic and even fraudulent return behaviors, which further increases the costs of returns (National Retail Federation, 2021). With respect to observability, return policies are observable to consumers as retailers are required by law to clearly communicate their return policies in stores or on websites.
To reinforce the signaling theoretical perspective, we employ the “will–should” expectations framework to explain the ex ante impact of return policy leniency on PSQ (Boulding et al., 1993). The framework proposes two expectation standards, will expectations and should expectations, and looks at their influences on PSQ in a behavioral process model. “Will expectations” are those expectations that pertain to what will happen during a service contact with a provider. “Will expectations” can arise due to a company's service process descriptions, terms, and conditions. Prior experience with the same service provider can also generate such expectations. “Should expectations” are formed on the basis of a customer's perceptions regarding what is feasible and reasonable to receive in a service encounter. “Should expectations” can also arise due to experiencing or being told of competitors' services. Through field studies, Boulding et al. (1993) find strong evidence that both classes of expectations influence a customer's perceptions of service quality. Specifically, the study reveals that “will expectations” have a positive impact, whereas “should expectations” have a negative impact on PSQ.
The will–should expectations framework of service quality has implications for the returns service context. A clearly stated return policy sets customer “will expectations,” as the policy specifies what customers will go through if they have to use a retailer's returns service. Meanwhile, prior return experience or familiarity with the return policies of other retailers engenders “should expectations.” When a customer faces the return policy of the retailer under consideration, we expect the customer's will and should expectations to activate. As a result, different levels of PSQ emerge, depending on the retailer's return policy leniency across different levers. Therefore, combining the signaling theory and will–should expectations framework leads us to the following proposition: Higher (a) monetary, (b) time, (c) effort, (d) scope, and (e) exchange leniency leads to higher perceived quality of a returns service.
The concept of transaction costs has its roots in TCE theory, which posits that buyers and sellers experience costs embedded within different aspects of transactions (Coase, 1937; Williamson, 1989). The theory suggests that the price of goods and services is not the sole criterion that influences a buyer's decision. The buyer tends to have an overall assessment of nonprice transaction costs before deciding whether to engage in a particular exchange. Originally employed to examine interfirm exchanges, the TCE perspective has been applied by researchers in a multitude of exchange contexts between economic agents, including between consumers and retail firms (e.g., Griffis et al., 2012; Grønhaug & Gilly, 1991). Asset specificity, uncertainty, and transaction frequency are important characteristics of a transaction that generate transaction costs (Williamson, 1989).
In the retail exchange and consumer returns context, transaction costs encompass monetary‐type, time‐type, and psychological‐type costs (Chircu & Mahajan, 2006). Monetary‐type costs may include restocking fees, nonrefundable forward or return shipping fees, transportation costs, and return packaging costs. Time‐type costs capture time spent on activities such as initiating a return order, commuting to the store, waiting for refund processing, and searching for a new product to make an exchange. Psychological‐type costs are hard to quantify yet are believed to have a strong impact on a customer's perceived sacrifice while engaging in an exchange (Woodall, 2003). Costs of this type may pertain to mental and physical effort, stress, inconvenience, frustration, and annoyance experienced in the postpurchase period.
In line with the TCE perspective, consumers incur transaction‐specific costs (i.e., asset specificity) described above when searching for, deciding on, and purchasing a product from a retailer and when returning the purchase to the retailer. Significant uncertainty exists in such transactions due to uncertainties related to product fit, quality, and personal valuation (Abdulla et al., 2019). Return policies can be viewed as contractual mechanisms that govern the allocation of transaction‐specific costs generated due to these uncertainties between retailers and consumers. Customers who have prior experiences of purchasing and returning to a particular retailer (i.e., transaction frequency) are likely to have an overall assessment of transaction costs that they incur. We posit that regardless of transactional history, retailers, by offering a lenient return policy, can reduce ex ante PTCs of consumers and subsequently increase purchase intentions.
For instance, low monetary leniency (e.g., a 15% restocking fee) may increase the perceived risk of losing a certain amount of money in case of a product mismatch. Low time leniency implies opportunity costs to consumers who could otherwise procrastinate returning and may generate time pressure in assessing the fit and value of the product. Low exchange leniency (i.e., a store credit or exchange‐only policy) may increase psychological costs associated with the possibility of getting locked‐in with the retailer by imposing a requirement to choose one of the products offered by the retailer. Low effort leniency may trigger consumers to mentally simulate the return process and perceive high time‐, monetary‐, and psychological‐type transaction costs. Low scope leniency, such as being disallowed to return discounted products, may generate psychological costs such as anticipated regret. Thus, we have the following proposition: Higher (a) monetary, (b) time, (c) effort, (d) scope, and (e) exchange leniency leads to lower PTCs.
Mental accounting theory explains consumers' patronage and purchase decisions under risk and uncertainty (Thaler, 1985). The theory posits that consumers evaluate a transaction with a party in two stages: (1) evaluating the potential transaction (judgment process) and (2) approving or disapproving the transaction (decision process). To evaluate the transaction, consumers weigh the perceived utility against perceived disutility, which encompasses all types of transaction costs discussed above. Based on this mental accounting process, consumers realize a perceived net utility (i.e., value). The perceived value of the transaction then leads to behavioral intentions and outcomes of consumers.
Perceived value is defined as a consumer's overall assessment of the utility of a product or service based on perceptions of what is received and what is given (Zeithaml, 1988). The construct of perceived value has been studied extensively in both product (e.g., Simpson et al., 2019) and service (e.g., Buell & Norton, 2011) contexts in the OM domain. Many researchers report a strong impact of PSV on consumer behavioral outcomes such as loyalty, (re)purchase intentions, and positive word of mouth (Cronin et al., 2000; Kuo et al., 2009).
There is a significant body of service literature that provides evidence of a positive relationship between PSQ and PSV. For example, Bolton and Drew (1991) develop a conceptual model for assessing service performance, quality, and value. The authors apply the model to residential telephone services and find a significant, positive association between service quality and service value. Gooding (1995) finds that perceived quality is a key antecedent to PSV, where the latter largely determines the choice of a healthcare service provider. Andreassen and Lindestad (1998) report the same effect in the context of package tour services.
Research also documents the negative effect of PTCs on PSV. In fact, following the standard definition of service value as the difference between what is received and what is given, PTCs stand for the sacrifice involved in a service exchange. Most of the works on this theme use the constructs of perceived sacrifice, perceived risk, or perceived cost to represent what we call PTCs in our model. For example, Spreng et al. (1993) report that a consumer's anticipation of future sacrifice, including purchase, psychological, and time costs, has a significant effect on the ex ante PSV. Gooding (1995) operationalizes perceived sacrifice through distance, transportation time to a hospital, and out‐of‐pocket costs, providing empirical evidence that perceived sacrifice (i.e., transaction costs) decreases PSV. By defining PTCs as a sum of price, time, and effort, Brady and Robertson (1999) show a negative association between PTCs and PSV. Finally, the marketing literature provides significant evidence from broad service contexts in which PSQ and PTCs are key antecedents of perceived value (Cronin et al., 2000; Dodds et al., 1991; Teas & Agarwal, 2000). Combining the mental accounting perspective and perceived value framework, we have the following proposition: PSQ is positively and PTCs are negatively associated with the perceived value of a returns service.
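The "received minus given" reading of perceived value can be written compactly. The expression below is only an illustrative sketch: the linear form and the weights \beta_Q and \beta_C are assumptions introduced here for exposition and are not taken from the cited studies.

```latex
% Illustrative (assumed) linear form of perceived returns service value:
% PSV rises with perceived service quality and falls with perceived transaction costs.
\mathrm{PSV} \;=\; \beta_Q \,\mathrm{PSQ} \;-\; \beta_C \,\mathrm{PTC},
\qquad \beta_Q > 0,\ \beta_C > 0
```

Under this sketch, the proposition corresponds to the sign restrictions on the two weights: the partial effect of PSQ on PSV is positive, and the partial effect of PTCs on PSV is negative.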
A return service is a postsales service included in a retailer's overall service bundle and consists of policy and process aspects (Mollenkopf et al., 2007). Policy aspects include the terms and conditions of the returns service offering and are communicated through formalized return policy statements. Therefore, a return policy is a fundamental element of the returns service design and is subject to consumer evaluation (Davis et al., 1998). In addition, a return policy is a means of marketing communication that informs consumers what to expect in case they need to return a purchase. As discussed earlier, we can characterize any return policy by the degree of leniency offered across five levers. We contend that all five leniency levers are important design levers for a retailer's return service and can be used to influence the perceived value of the returns service. These arguments together with a logical combination of Propositions 1–3 lead us to the following proposition: Higher (a) monetary, (b) time, (c) effort, (d) scope, and (e) exchange leniency leads to higher perceived value of a returns service.
Perceived value is found to be the most important predictor of purchase intentions in a variety of service settings (Baker et al., 2002; Parasuraman, 1997; Parasuraman & Grewal, 2000). With respect to our context, research on return policies lends support where the overall leniency of a return policy is positively associated with purchase intention (Bonifield et al., 2010; Oghazi et al., 2018). Jeng (2017) finds that the perceived value of a return policy mediates the relationship between overall return policy leniency and purchase intention. Therefore, we state the final proposition: Higher (a) monetary, (b) time, (c) effort, (d) scope, and (e) exchange leniency leads to a higher intention to purchase from a retailer, by increasing the perceived value of the returns service that the retailer offers.
Based on the theoretical grounding above, we now present the formal hypotheses regarding the proposed cognitive process for a consumer's returns service valuation and the resultant purchase intention by anchoring on the degree of leniency across five levers. In particular, we posit that given two return policies, ceteris paribus, the policy with a higher leniency in one of the levers results in higher PSQ and lower PTCs, which then leads to a higher PSV, and ultimately results in a higher purchase intention. More formally, we hypothesize: PSQ and PTCs in parallel and PSV in series mediate the relationship between (1) monetary, (2) time, (3) effort, (4) scope, and (5) exchange leniency and purchase intention, such that higher leniency leads to a higher purchase intention through increased PSQ, decreased PTCs, and increased PSV.
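To make the parallel‐serial structure concrete, the following minimal sketch simulates data with the hypothesized paths and recovers the two indirect effects (leniency → PSQ → PSV → PI and leniency → PTC → PSV → PI) as products of regression coefficients. It is an illustration only and not the estimation procedure of this paper: the data are simulated, all path values (0.8, -0.6, 0.5, -0.4, 0.7) are made‐up assumptions, and the helper `ols` is a hypothetical name introduced here.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on X; an intercept column is prepended."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef  # coef[0] is the intercept

rng = np.random.default_rng(0)
n = 20000

# Simulated data; every path value below is an assumption for illustration.
leniency = rng.integers(0, 2, size=n).astype(float)  # 0 = strict, 1 = lenient
psq = 0.8 * leniency + rng.normal(size=n)            # leniency raises PSQ
ptc = -0.6 * leniency + rng.normal(size=n)           # leniency lowers PTCs
psv = 0.5 * psq - 0.4 * ptc + rng.normal(size=n)     # quality up, costs down
pi = 0.7 * psv + rng.normal(size=n)                  # value drives intention

# Stage 1 (parallel mediators): leniency -> PSQ and leniency -> PTC.
a_q = ols(psq, leniency)[1]
a_c = ols(ptc, leniency)[1]

# Stage 2 (serial mediator): PSQ, PTC -> PSV, controlling for leniency.
_, b_q, b_c, _ = ols(psv, np.column_stack([psq, ptc, leniency]))

# Stage 3 (outcome): PSV -> PI, controlling for the upstream variables.
d = ols(pi, np.column_stack([psv, psq, ptc, leniency]))[1]

# Each indirect effect is the product of the coefficients along its path.
indirect_via_psq = a_q * b_q * d
indirect_via_ptc = a_c * b_c * d
total_indirect = indirect_via_psq + indirect_via_ptc

print(f"via PSQ: {indirect_via_psq:.3f}")
print(f"via PTC: {indirect_via_ptc:.3f}")
print(f"total indirect: {total_indirect:.3f}")
```

Both indirect effects come out positive in this simulation because the PTC path multiplies two negative coefficients. In applied work, inference on such products is commonly based on resampling rather than on the point estimates alone.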
We hypothesize on the positive effects of leniency across all five return policy levers on purchase intentions through the cognitive process model described above. However, the overarching theoretical model also leads us to expect significant heterogeneity among the levers in terms of their relative effectiveness. As signals of service quality, leniency levers would have different signal strengths or salience (Connelly et al., 2011). In particular, unlike high time, effort, and scope leniency, high monetary and exchange leniency involve direct financial risks and constitute cost‐risking signals for the retailers (Kirmani & Rao, 2000). The implied costs to low‐quality retailers, therefore, are higher when monetary and exchange leniency is high relative to high leniency across the other three levers. Consequently, high monetary and exchange leniency would become a stronger, more salient quality signal and increase PSQ of the retailer. Further, consumers would perceive greater transaction costs when monetary and exchange leniency is low due to direct financial losses, in addition to associated psychological transaction costs (Chircu & Mahajan, 2006). Due to their potentially stronger influences on PSQ and PTCs, we expect a stronger effect on PSV and subsequently on purchase intention from monetary and exchange leniency levers relative to time, effort, and scope levers. Hence, we hypothesize: Monetary and exchange leniency levers have stronger effects on a consumer's purchase intention than time, effort, and scope levers.
EMPIRICAL METHOD
To test the research hypotheses, we use primary data collected from six experimental studies involving US‐based consumers. The first five of these studies are prestudies that are conducted to (1) develop robust measurement scales, (2) empirically validate the constructs in our model, (3) pretest the experimental vignettes and manipulations, and (4) inform various design trade‐offs and choices for the main study (Eckerd et al., 2021). In Sections 3.1 and 3.2, we provide background regarding the first two purposes. Detailed discussions that are related to purposes (3) and (4) are provided in Supporting Information Appendix A.
All prestudies and the main study are designed using Qualtrics and are conducted via the Amazon Mechanical Turk (MTurk) online crowdsourcing platform. MTurk enables data collection from a diverse and representative consumer population, which is important in terms of the external validity of the empirical findings (Berinsky et al., 2012; Goodman & Paolacci, 2017; Paolacci & Chandler, 2014). The platform also allows participants to complete studies in their natural living or working environments, which mitigates concerns regarding observer effects and provides a lab‐in‐the‐field setting. Each participant received a compensation of $1.00 for completing a study, which took, on average, 7 minutes. On an hourly basis, this payment (roughly $8.57 per hour) is significantly higher than the average incentive paid to MTurk workers ($1.66; Paolacci & Chandler, 2014) and is also above the federal minimum wage ($7.25; US Department of Labor, 2021).
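The hourly comparison follows from simple arithmetic on the figures reported above. The short sketch below reproduces it; the function name `hourly_rate` is a hypothetical helper, and the calculation assumes the 7‐minute average completion time applies uniformly.

```python
def hourly_rate(payment_usd: float, minutes: float) -> float:
    """Convert a per-study payment into an implied hourly rate in USD."""
    return payment_usd / minutes * 60.0

rate = hourly_rate(1.00, 7)        # $1.00 for an average of 7 minutes
avg_mturk_rate = 1.66              # as cited from Paolacci & Chandler (2014)
federal_minimum_wage = 7.25        # as cited from US Department of Labor (2021)

print(round(rate, 2))              # 8.57
print(rate > avg_mturk_rate)       # True
print(rate > federal_minimum_wage) # True
```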
All experimental studies use vignette‐based methods. Experimental vignette methods, particularly when combined with diverse samples such as ours, are viewed as an effective, balancing solution to the methodological dilemma in choosing between conventional lab experiments with questionable external, but high internal validity and nonexperimental or quasi‐experimental methods that provide greater external validity, but engender many threats to internal validity (Aguinis & Bradley, 2014). Experimental vignettes are particularly relevant when the goal is to study cognitive‐affective perceptions (Eckerd et al., 2021). Service contexts provide an ideal setting for vignette‐based experiments because participants tend to be familiar with the contexts and can easily engage with the described situations or, more applicable to our context, innately comprehend the information provided in the vignettes (Eckerd et al., 2021; Rungtusanatham et al., 2011).
Measurement scale development and construct validation
We go through a rigorous scale development and construct validation process through a multisample, experimental approach with a replication logic (Pagell, 2020) for two key reasons. First, well‐established measurement scales are not readily available in the extant return policy literature to measure the constructs of interest in our empirical model. Second, our research objective is to provide a comparative assessment of different leniency levers, and robust measurement scales are crucial for conducting such an assessment with reliable results. To this end, the five prestudies involve joint experimental manipulations of different leniency levers, enabling a cross‐validation and stress testing of measurement scales.
In developing the measurement scales and validating the constructs, we follow the methodological practices recommended by O'Leary‐Kelly and Vokurka (1998) and MacKenzie et al. (2011). We start by generating measurement items to capture the domain of the constructs of interest. To do so, we analyze the existing return policy literature and theoretical research in marketing to generate an initial set of measurement items. These measurement items are either adapted from the scales in the existing literature to the context of our research or developed, as needed, based on theoretical and conceptual foundations of the constructs. All measurement items use a 7‐point Likert scale. We empirically validate our constructs by assessing four key components: unidimensionality, reliability, convergent validity, and discriminant validity. The full set of the measurement items and details of our construct validation process are provided in Supporting Information Appendix D.
Main study
Following the prestudies that establish empirically validated constructs and measurement scales, this section presents the main study in which we formally test our hypotheses. The next section discusses the details of the experimental procedure and sample characteristics, while the section after it presents the data analysis and results.
Experimental procedure and sample
The main study has a completely randomized, full‐factorial between‐subject design (i.e., five levers manipulated at low vs. high leniency levels constituting 32 cells). In designing the experimental manipulations of each leniency lever, careful consideration was given to achieve a balance among the following: (1) the operationalization of low and high levels of leniency should be realistic and actionable from a managerial perspective (Bachrach & Bendoly, 2011), (2) the low and high levels of leniency should constitute significant contrasts to be salient (Rungtusanatham et al., 2011), and (3) the nature of the manipulations should be aligned with the existing literature (Abdulla et al., 2019). To this end, the prestudy phase provided significant conceptual knowledge and preliminary empirical insights. In addition, we made a number of observations from practice to inform the design of the main study. We provide details for the design process in Supporting Information Appendices A and B.
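To make the structure of the design concrete, the following minimal sketch enumerates the 2^5 = 32 treatment cells and assigns participants to them at random. It is an illustration only: the lever labels follow the text, but the function names and the use of independent uniform draws for assignment are assumptions, not a description of the randomizer actually used in the study.

```python
import random
from itertools import product

LEVERS = ("monetary", "time", "effort", "scope", "exchange")
LEVELS = ("low", "high")


def full_factorial(levers=LEVERS, levels=LEVELS):
    """Return every combination of lever levels as a list of dicts."""
    return [dict(zip(levers, combo))
            for combo in product(levels, repeat=len(levers))]


def randomize(n_participants, n_cells, seed=0):
    """Completely randomized assignment (assumed): each participant
    independently draws one cell index with equal probability."""
    rng = random.Random(seed)
    return [rng.randrange(n_cells) for _ in range(n_participants)]


cells = full_factorial()
assignment = randomize(840, len(cells))
print(len(cells))   # number of treatment conditions
print(cells[0])     # lowest-leniency condition
print(cells[-1])    # highest-leniency condition
```

Because assignment is independent across participants, cell sizes vary slightly by chance, which is consistent with the mild imbalance in observations per condition noted for the final sample.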
A total of 840 participants (45.1% females) were randomly assigned to one of the 32 treatment conditions. Participants were asked to view a vignette designed as a website of retailer “ABC,” which was presented as one of the large retailers in the United States that sells products in multiple categories through online and offline channels. Our choice of a multiple‐category retailer based in the United States as the vignette context for this study rests on two considerations. First, the choice allows estimating the average treatment effects of different return policy leniency levers in a broad context (i.e., without priming participants on a narrow set of product types or price levels). Indeed, understanding the effect of different leniency levers averaged across all potential product categories and/or price levels would be of practical significance to large retailers (e.g., big‐box stores, department stores) that may view complex return policies with many category‐based exclusions negatively. Second, we surveyed the top 20 US‐based retailers by sales revenue. We found that 16 of them were general merchandise stores (i.e., department stores, big‐box stores) and e‐tailers that carried multiple‐category assortments sold through online and offline channels (the remainder were supermarkets specializing in groceries). Considering that the sales volume of these retailers constitutes a significant percentage of all US retail sales (i.e., $1.4 of $5.5 trillion as of 2019), at an average return rate of 10%, these retailers alone would account for approximately $140 of $369 billion of annual returns in the United States (National Retail Federation, 2019). Therefore, among a myriad of other options, choosing a multiple‐category retailer selling through both online and offline channels for the vignette served the purpose of broader managerial relevance.
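The back‐of‐the‐envelope calculation in this paragraph can be traced with the short sketch below. All inputs are the figures quoted in the text (in billions of US dollars); the variable names are illustrative, and the 10% return rate is the average rate assumed in the text rather than a retailer‐specific estimate.

```python
# Inputs quoted in the text, in billions of US dollars (names are hypothetical).
SALES_MULTI_CATEGORY = 1400.0   # sales of the multiple-category retailers ($1.4 trillion)
SALES_ALL_US_RETAIL = 5500.0    # all US retail sales ($5.5 trillion)
TOTAL_US_RETURNS = 369.0        # annual US returns
AVG_RETURN_RATE = 0.10          # average return rate assumed in the text

sales_share = SALES_MULTI_CATEGORY / SALES_ALL_US_RETAIL
implied_returns = SALES_MULTI_CATEGORY * AVG_RETURN_RATE
returns_share = implied_returns / TOTAL_US_RETURNS

print(f"Share of US retail sales: {sales_share:.1%}")
print(f"Implied returns: ${implied_returns:.0f} billion")
print(f"Share of US returns: {returns_share:.1%}")
```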
The vignette is designed to include common features in a typical retailer website (e.g., search box, product categories, store finder, member login, etc.) in addition to the return policy statement. This is to increase the realism and ecological validity of the experimental environment (Aguinis & Bradley, 2014). To check the perceived realism of the experimental vignettes, we asked participants two realism check questions on a 7‐point Likert scale ranging from 1 (Strongly disagree) to 7 (Strongly agree). First, the participants indicated a score with mean = 5.110 (SD = 1.350) to the statement “The ABC website presented in the study carried most of the elements that I find in other retailers' websites.” Second, the participants indicated a score with mean = 5.255 (SD = 1.100) to the statement “ABC return policy was realistic in its wording and format considering other return policies I have seen.” The vignettes for the highest and the lowest leniency (across all levers) treatment conditions are provided in Supporting Information Appendix C.
Demographic characteristics of the sample were comparable with those from prestudies (see Table A1 in Supporting Information Appendix A). At the end of the study, participants were required to answer two attention check questions, asking the restocking fee amount and return time window indicated in the return policy (Abbey & Meloy, 2017). These attention check questions qualify as factual manipulation checks (Kane & Barabas, 2019), meaning that (1) they have objective, correct answers and (2) they are directly related to the experimental manipulations used. Factual manipulation checks are considered more effective compared to other common types of manipulation checks, such as instructional or subjective manipulation checks (Oppenheimer et al., 2009). Of the 840 participants who completed the study, the 650 participants (77.4%) who answered the two attention checks (factual manipulation checks) correctly and whose locations were verified to be in the United States are included in the final analysis reported in the next section. In the final sample, we did not observe significant imbalances in terms of the number of observations per treatment condition, which ranged between 19 and 22.
Analysis and results
To test the hypotheses, we perform a regression‐based mediation analysis with the nonparametric bootstrapping approach using the PROCESS macro for SPSS (Hayes & Little, 2018). This is a modern method to reliably estimate mediation effects with adequate statistical power (Hayes & Little, 2018; Rungtusanatham et al., 2014).
We use 10,000 bootstrap resamples for the analysis. In all regression equations estimating the path coefficients, we also include several covariates to further improve the statistical power and precision of the estimates (i.e., to reduce the standard errors). These covariates and definitions are reported in Table 2. Latent variable scores are calculated as averages of the respective measurement items. Again, the measurement scales and constructs are successfully validated via an exploratory factor analysis (EFA), using the same two‐factor extraction‐rotation methods as in the prestudies, and via confirmatory factor analysis (CFA) (see Supporting Information Appendix E for details). Descriptive statistics for the latent variables are provided in Table 3 and the intervariable correlations are reported in Supporting Information Appendix E. Figure 2 shows the statistical diagram for the mediation analysis and notations to facilitate the discussion.
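To illustrate the logic of a percentile bootstrap test of an indirect effect, the sketch below implements a deliberately simplified case: one binary treatment, one mediator, one outcome, and no covariates. It is not the PROCESS macro and not the multiple‐mediator model estimated in this study; the function names, the simulated data, and the path values (a = 0.8, b = 0.5) are all hypothetical, and fewer resamples than the 10,000 reported above are used to keep the run time short.

```python
import random


def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns.
    Solves the normal equations by Gauss-Jordan elimination."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [ar - f * ac for ar, ac in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def indirect_effect(x, m, y):
    """Product of the X -> M path and the M -> Y path (controlling for X)."""
    a = ols([x], m)[1]
    b = ols([x, m], y)[2]
    return a * b


def bootstrap_ci(x, m, y, n_boot=1000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        draws.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    draws.sort()
    cut = int(n_boot * alpha / 2)
    return draws[cut], draws[n_boot - 1 - cut]


def simulate(n=300, a=0.8, b=0.5, seed=0):
    """Hypothetical data: balanced binary treatment, full mediation."""
    rng = random.Random(seed)
    x = [float(i % 2) for i in range(n)]
    m = [a * xi + rng.gauss(0, 0.5) for xi in x]
    y = [b * mi + rng.gauss(0, 0.5) for mi in m]
    return x, m, y


if __name__ == "__main__":
    x, m, y = simulate()
    low, high = bootstrap_ci(x, m, y)
    print("indirect effect:", round(indirect_effect(x, m, y), 3))
    print("95% CI:", (round(low, 3), round(high, 3)))
    print("mediation supported:", not (low <= 0 <= high))
```

The final line applies the decision rule used in the analysis: mediation is supported when the bootstrap CI for the indirect effect excludes zero.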
Table 2. Covariates included in the mediation analysis
Table 3. Descriptive statistics for focal variables
Abbreviations: PSQ, perceived service quality; PSV, perceived service value; PTC, perceived transaction cost.

Figure 2. Statistical diagram and notations for mediation analysis
Table 4 reports the results of the analyses and includes the estimated individual path coefficients, direct effects, total indirect effects, and all path‐specific indirect effects with 95% percentile‐based bootstrap confidence intervals (henceforth CI for brevity). A significant mediation effect exists when the CI for the estimate of an indirect effect does not contain zero. The analyses provide support for Hypotheses 1–5. In particular, we find statistically significant between‐subject mediation effects for all five levers. Further, we find that PSQ, PTC, and PSV fully mediate the relationship between return policy leniency across a given lever and PI, as evidenced by the fact that the direct effects of leniency on PI are not statistically significant when PSQ, PTC, and PSV are accounted for. By estimating the contrast between the two indirect effects (through PSQ vs. PTC), we find that in transmitting the effect of monetary leniency onto PI, PTC is a significantly stronger mediator (
Table 4. Mediation results
Abbreviations: M, monetary; T, time; F, effort; S, scope; X, exchange leniency.
Covariates: Nonfocal levers,
Next, we compare the total indirect (mediation) effects of the five levers on PI and find support for Hypothesis 6. In particular, we find that monetary leniency—operationalized through practically common cases of a full refund vs. 15% restocking fee—proves to be the most effective lever in influencing purchase intentions through the mediators. The second most effective lever is the exchange lever, which is manipulated through offering a cash refund versus store credit only. The effort lever comes third, with a slightly higher total indirect effect compared to the time lever. Scope leniency, operationalized through whether sales items are allowed to be returned, demonstrates the smallest effect. Figure 3 provides a visual representation of the total indirect effects and CIs for each lever. In order to test whether the differences are statistically significant, we estimate the differences between total indirect effects of adjacent levers in the effect size rank ordering (i.e., adjacent contrast effects), using nonparametric bootstrapping. The point estimates for these differences and the 95% bootstrap CIs are provided in Table 5. Overall, while we find that there are statistically significant differences among the mediation effects observed, the practical significance of the differences among effort, time, and scope leniency levers in particular is not very high.
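The adjacent contrast procedure described above can be sketched as follows. The example is a simplified illustration under stated assumptions: three hypothetical levers instead of five, a single mediator instead of the full PSQ/PTC/PSV structure, simulated data with made‐up path values, and function names that are not part of any library. For each bootstrap resample, the indirect effects of all levers are re‐estimated and the differences between levers adjacent in the point‐estimate ranking are recorded; the percentile CI of each difference is then inspected for zero.

```python
import random


def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [ar - f * ac for ar, ac in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def indirect_effects(levers, m, y):
    """Indirect effect a_k * b of each lever on y through the mediator m."""
    a = ols(levers, m)[1:]        # lever -> mediator paths
    b = ols(levers + [m], y)[-1]  # mediator -> outcome path, levers controlled
    return [ak * b for ak in a]


def adjacent_contrasts(levers, m, y, n_boot=500, seed=1):
    """Point estimate and 95% percentile CI for each adjacent difference."""
    point = indirect_effects(levers, m, y)
    order = sorted(range(len(point)), key=lambda k: -point[k])  # rank by effect size
    pairs = list(zip(order, order[1:]))
    rng = random.Random(seed)
    n = len(y)
    diffs = {pair: [] for pair in pairs}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        eff = indirect_effects([[col[i] for i in idx] for col in levers],
                               [m[i] for i in idx], [y[i] for i in idx])
        for big, small in pairs:
            diffs[(big, small)].append(eff[big] - eff[small])
    cut = int(n_boot * 0.025)
    result = {}
    for (big, small), d in diffs.items():
        d.sort()
        result[(big, small)] = (point[big] - point[small], d[cut], d[n_boot - 1 - cut])
    return result


def simulate(n=240, a=(1.5, 0.8, 0.2), b=0.6, seed=0):
    """Hypothetical balanced 2x2x2 factorial with one mediator."""
    rng = random.Random(seed)
    levers = [[float((i >> k) & 1) for i in range(n)] for k in range(len(a))]
    m = [sum(ak * col[i] for ak, col in zip(a, levers)) + rng.gauss(0, 0.5)
         for i in range(n)]
    y = [b * mi + rng.gauss(0, 0.5) for mi in m]
    return levers, m, y


if __name__ == "__main__":
    levers, m, y = simulate()
    for pair, (diff, low, high) in adjacent_contrasts(levers, m, y).items():
        print(pair, round(diff, 3), (round(low, 3), round(high, 3)))
```

Resampling whole cases, rather than each lever separately, preserves the dependence between the indirect effects being compared, since all of them are estimated from the same participants.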
Table 5. Statistical tests for the significance of the differences between total indirect effects

Figure 3. Total indirect effects of the leniency levers
Finally, we also analyze the interactions between different leniency levers in their impact on PI through the mediation mechanism that we study. To do so, we estimate an ANCOVA model that includes all leniency levers as fixed factors, PI as the outcome variable, and the remaining variables included in the mediation analysis above as covariates. We find no statistically significant (at
Robustness and generalizability of findings
Results of the main study suggest statistically significant and heterogeneous causal effects of the five leniency levers on purchase intentions through their influences on PSQ, PTCs, and PSV. One may wonder about the role of contextual dependencies (i.e., moderators) in the relative effects of the five levers in order to gauge the generalizability of the findings. Relevant contextual dependencies would be the assortment focus of the retailer (e.g., general merchandise vs. specialized stores), sales channel (i.e., online vs. offline), product category (e.g., electronics vs. apparel), and price (i.e., high vs. low) of products being considered for purchase. We know from the meta‐analysis conducted by Janakiraman et al. (2016) that the assortment focus of a retailer and the sales channel do not moderate the effect of return policy leniency on purchase intentions. Further, our empirical results from prestudy (1), where we manipulated the product category as electronics versus apparel (two product categories with the highest return rates and volumes), suggest that product category may not be a significant moderator. However, this is by no means conclusive and warrants further examination.
We experimentally examined the moderating role of price in determining the effect of monetary and exchange leniency (two levers with a significant financial risk) on consumers' purchase intentions. The experiment had a six‐cell (partial factorial) design, with price level (low = $50, high = $500), monetary leniency (low = 15% restocking fee, high = no restocking fee), and exchange leniency (low = no cash refund/store credit only, high = both cash refund and store credit as options) as the manipulated factors. We excluded the conditions that combine low monetary leniency with low exchange leniency because they (i.e., charging a restocking fee on a store credit) are unrealistic from a practical standpoint. We chose desk chairs for the purchase scenario because they have considerably high price variability in the market and significant product fit/quality uncertainty. We used experimental vignettes that are consistent with our earlier studies. We did not find a significant moderating effect of the price level of the product being considered for purchase for either of the leniency levers that we tested (see Supporting Information Appendix F for detailed results). As such, we cannot reject the null hypothesis that price, too, is not a significant moderating factor.
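The six‐cell structure follows from crossing the three two‐level factors (eight cells) and dropping the two cells in which monetary and exchange leniency are both low. A minimal sketch of that enumeration, with illustrative labels and a hypothetical function name, is:

```python
from itertools import product

PRICE = ("low ($50)", "high ($500)")
MONETARY = ("low", "high")   # low = 15% restocking fee, high = no restocking fee
EXCHANGE = ("low", "high")   # low = store credit only, high = cash refund or store credit


def partial_factorial():
    """Cross the three factors and drop the low-monetary/low-exchange cells."""
    cells = []
    for price, monetary, exchange in product(PRICE, MONETARY, EXCHANGE):
        if monetary == "low" and exchange == "low":
            continue  # restocking fee on a store credit: excluded as unrealistic
        cells.append({"price": price, "monetary": monetary, "exchange": exchange})
    return cells


design = partial_factorial()
for cell in design:
    print(cell)
```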
The prestudy phase provides supporting empirical evidence that consumers do have a general perception of which leniency levers matter most and perceive a consistent hierarchy of importance. Specifically, we asked participants to indicate on a Likert scale of (1) Most important to (7) Least important, how important they find leniency across different return policy levers in choosing which retailers to shop at (i.e., no restocking fees (monetary), a long return time window (time), easy and hassle‐free return process (effort), not only store‐credit/exchange but also cash refund option (exchange), sale and clearance items allowed to be returned as regularly priced items (scope)). The statements associated with the five leniency levers were provided in a randomized order to each participant. Here, we found consistent evidence across all five prestudies (which manipulated different subsets of these five levers) that on average, consumers have an order of importance of leniency across different levers (see Supporting Information Appendix A). In particular, all five prestudies revealed that on average, monetary leniency was by far the most important, followed by exchange leniency, followed by effort, which was followed by scope and time leniency. Multiple
CONCLUSION
In light of the empirical findings from the main study, this section discusses managerial and theoretical implications of our research and highlights a number of limitations and future research opportunities.
Overall, our findings support the contention that the value of a retailer's returns service is an important determinant of a consumer's purchase intention. We also demonstrate that each of the five levers that constitute return policy leniency is a significant antecedent of the perceived value of a returns service, through each lever's ability to ex ante signal PSQ and PTCs. However, our investigation further reveals heterogeneous effects of different leniency levers. This implies that retailers need to be careful in choosing the right leniency levers to tighten the return policies in order to reduce the operational cost burden, and our findings provide new insights regarding this decision as we discuss below.
We find that monetary leniency is the most effective lever in influencing the cognitive perceptions that ultimately impact purchase intentions. The positive effect of monetary leniency on purchase intentions has been documented in the prior literature. Extending this, we show that the effect of monetary leniency dominates the effects of the other four levers. Indeed, monetary leniency has the greatest salience among all levers in terms of financial risk. Consequently, anticipated regret can also be expected to be the greatest when leniency along this particular lever is low, increasing PTCs (Inman & Zeelenberg, 2002). From the retailer's perspective, offering a full refund as a part of the returns service offering would send strong signals regarding the retailer's understanding of consumer needs and willingness to absorb the consumer's risk regarding product fit and valuation uncertainty (Abdulla et al., 2019) and thereby significantly increase perceived returns service quality. The financial cost that a retailer bears from offering a full refund for all returns is also more salient to the consumer than the cost of offering leniency across other levers. This may also explain the dominance of the monetary leniency lever over the other levers.
A direct managerial implication of this finding is that retailers who impose or consider imposing restocking fees should be aware of strong negative perceptions and subsequent consumer behaviors (e.g., decreased purchase intention, negative word of mouth, switching). For example, many retail stores, such as Best Buy, Macy's, and Sears, currently charge restocking fees on returns made in select product categories (ConsumerWorld.org, 2018). Our results suggest that in these product categories, such retailers may be losing customers to their competitors who offer a full refund. Anecdotal evidence supports the view that customers typically avoid purchasing from retailers who charge restocking fees, finding such fees unfair.
A more interesting finding is the second strongest impact of exchange leniency, previously unexplored in a causal framework, on purchase intentions through its impact on PSQ, PTCs, and PSV. Research on mental accounting has established that restricted‐use funds, such as store credits, are evaluated and spent differently than equivalent cash, even in the context of the same retailer (Reinholtz et al., 2015). Our research suggests that offering a return policy that allows customers only to make an exchange or receive store credit in case of a return, instead of getting a cash refund, can be detrimental to the perceived value of a returns service and reduce purchase intentions ex ante. The strong impact of the exchange lever is likely due to financial risks associated with customer lock‐in with the retailer (Johnson et al., 2003; Zauberman, 2003). In fact, this explanation is supported by the anecdotal accounts from several participants at the end of the main study.
The overall conclusion is that retailers seeking to reduce the cost burden of returns and make the returns service more sustainable from an operational standpoint should avoid doing so by restricting their return policies through the monetary and exchange levers. Instead, retailers should consider opportunities across the remaining three levers. For example, our results imply that a longer return time window may not have a strong, positive effect on a consumer's ex ante returns service value perception and purchase intention. In fact, almost 60% of the participants in the aggregated sample of the prestudies indicated in a survey question that they would typically need less than a week to make a keep/return decision, with only 10% of the participants indicating that they would need more than two weeks. Therefore, providing excessively long return time windows, such as several months, may not provide any significant advantage to retailers. Rather, driven by customer inertia and procrastination, a time‐to‐return distribution with a long tail may cause unsustainable losses in recoverable value (Ferguson et al., 2006; Su, 2009b). This is because the longer the time it takes for a product to be returned after purchase, the larger the deterioration in salvage value of the product (Blackburn et al., 2004). Moreover, products that are returned late tend to have fewer disposition options that can generate value to retailers (Shang et al., 2019). Thus, by reducing the time window, retailers can better reduce the operational cost of returns. From this perspective, it is noteworthy that many of the recently restricted return policies involve tightening return time windows, such as the policies of Macy's, L.L. Bean, and Bed Bath & Beyond (ConsumerWorld.org, 2018).
Decreasing effort leniency by imposing additional hassles, such as tag and original packaging requirements or asking customers to fill out a return authorization form, may also help prevent some of the returns ex post, with a relatively smaller negative impact on purchase intentions. As a case in point, Nordstrom, well known for its generous return policy, has recently imposed tag requirements for special‐occasion dresses and designer items. The smaller effect of effort leniency on ex ante purchase intentions relative to monetary and exchange leniency is likely to be due to expected transaction costs of nonmonetary nature (i.e., psychological, time, physical) that loom smaller than the financial costs implied by monetary and exchange levers.
Further, we show that consumers may not dramatically decrease their value perceptions and purchase intentions if a retailer disallows discounted (i.e., sales or clearance) products being returned relative to when it applies a standard return policy for discounted products. The practically small effect of the particular operationalization of scope leniency in our context mirrors the dual entitlement principle (Kahneman et al., 1986). The dual entitlement principle posits that most consumers believe that they are entitled to a reasonable price and that firms are entitled to a reasonable profit. Knowing that offering products at discounted prices implies giving up on the usual sales revenue the retailer would be entitled to, consumers may feel that the retailer can also fairly disallow returns on discounted products, in order to attain a reasonable profit margin. Thus, decreasing scope leniency, particularly by disallowing returns for discounted products, provides another viable opportunity for retailers to decrease the burden of returns while keeping negative reactions at a minimum. Such a strategy, particularly when combined with effective pricing tactics (e.g., seasonal discounts, individualized pricing), can prove a better alternative to imposing restocking fees and other nonrefundable charges, with a similar or greater positive financial impact. Many apparel retailers, such as Gap, Tommy Hilfiger, and Michael Kors, do not allow returns or exchanges for final sale items. Dillard's does not allow returns for clearance items or for products sold under stacked discounts (e.g., items marked down 20% plus an additional 30% discount for a limited time). Overall, our research provides actionable guidelines to retail managers regarding how to make return policies more sustainable from an operational perspective, without significantly deteriorating consumer value perceptions and patronage intentions.
Our research makes multiple theoretical contributions to research on consumer return policy design and, more generally, on service design. First, while the previous literature on consumer return policies has predominantly focused on the quality signaling aspect of return policy leniency, our research postulates and demonstrates that return policy leniency can also influence transaction cost perceptions. In fact, we demonstrate that the strength of influence on purchase intentions through transaction cost perceptions is statistically no different than through service‐quality perceptions for all leniency levers, except the monetary lever. Thus, accounting for both service quality and transaction cost perceptions provides a more complete cognitive process model predicated on a PSV framework. In turn, this provides greater explanatory power to understand the relationship between return policy leniency and consumer perceptions and purchase intentions. A broader theoretical implication is the importance of recognizing that service design decisions of firms may influence consumer behavioral outcomes not only through the service‐quality mechanism but also through the transaction cost mechanism. As a result, different service policy levers may influence both service quality and transaction cost perceptions, resulting in heterogeneous effects on behavioral outcomes.
Second, we test value perceptions regarding a service and the resultant purchase intentions solely based on the service policy itself without any exposure to the process aspects of the service. An important theoretical implication is that certain attributes of a service—as indicated in service terms and conditions—can ex ante stimulate quality, transaction costs, and value perceptions that influence a consumer's patronage decisions. This highlights the importance of empirical research with respect to policy aspects of service design, in addition to the process aspects that are more commonly studied in the OM domain. Continued research can extend the boundaries of our cognitive model to other service contexts commonly studied in the OM literature, such as credit card, insurance, travel, and ticketing services.
Third, our cognitive process model is predicated on multiple theoretical perspectives and provides a generalizable framework to examine, in a comparative sense, the influence of different service design levers on the consumer valuation process and subsequent behavioral outcomes. Our model, combined with the study design, is a particularly good fit to study service contexts where levers, attributes, or strategies of interest involve trade‐offs from the firm's perspective (e.g., retailers typically do not simultaneously manipulate multiple levers while changing a return policy but rather try to choose which lever to optimize) and not necessarily from the consumer's perspective (e.g., participants did not have to make a trade‐off between leniency across different levers and could value leniency across all levers). To illustrate a future application of our approach in a different service operations context, researchers can study how different omni‐channel capabilities available to today's retailers would be compared in their effects on service quality, transaction cost perceptions, and resultant behavioral intentions and outcomes.
Abdulla et al. (2019) call for continuing analytical research that studies nonmonetary leniency levers, decisions across multiple levers, as well as design of return policies that involve the interaction amongst levers (e.g., time‐based restocking fees). Our empirical findings have implications for continuing and growing analytical OM research involving consumer return policies. First, as we find no significant interaction effects across multiple pairs of leniency levers, the utility gain due to return policy leniency can be reasonably modeled through an additive, rather than a multiplicative, functional form. Second, it is important to recognize the differences across levers in terms of their effect on purchase intentions while modeling aggregate demand or individual consumer purchase decisions. For instance, the marginal impact of monetary and exchange leniency on aggregate market demand should be modeled to be greater than the marginal impact of scope and effort leniency. Similarly, in terms of individual consumer utility, the ex ante utility gain from purchasing due to monetary or exchange leniency of return policies needs to be modeled as greater, by a factor, than that due to time and effort leniency.
There are a number of limitations to our work that we believe can motivate future research. First, we examine cognitive perceptions and purchase intentions but not manifest behavioral outcomes such as actual purchases. This allowed us to study consumer attitudes toward a retailer based on the return policy leniency operationalized through five levers available to the retailer. The theory of planned behavior and empirical evidence from the marketing literature suggest that purchase intentions can reasonably predict actual purchase decisions (Ajzen, 1985; Chandon et al., 2005). Still, testing how return policy leniency across the five levers influences different behavioral outcomes when real monetary stakes are involved (i.e., purchase and return transactions) would be an interesting direction for future research. This would strengthen the findings and implications by establishing predictive validity.
As a matter of scope, we propose a high‐level, parsimonious theory and document consistent empirical evidence for heterogeneous average treatment effects of leniency across different levers on purchase intentions. We do so by focusing on a broadly applicable context motivated from practice. Though we empirically examine product category in a prestudy and price level in a poststudy and find no statistically significant effect of these two contextual factors on the average treatment effects, our inquiries along these lines are not exhaustive. Future research can examine more systematically and exhaustively the role of different practically relevant contingencies on the overarching treatment effects of each leniency lever on purchase intentions.
Another fruitful line of research would be to investigate consumer perceptions of the complexity of return policies (e.g., the number of category‐specific policies in the overall return policy offering of a retailer). Here, researchers can note that imposing category‐based exclusions to a standard return policy would be another form of low scope leniency. Given that many retailers have a tremendous amount of transactional data on consumers' purchase and return behavior, an examination of the feasibility of and consumer reactions to personalized return policies (in a similar vein to personalized pricing) would also be a fruitful direction for future research.
In this paper, we provide insights for retail practitioners on how to make return policies more sustainable from an operational perspective, while minimizing damage to the value proposition offered to customers. Future research is needed to understand if there might be negative implications of restricting a long‐established lenient return policy due to negative signaling (Connelly et al., 2011). Another interesting future research avenue would be to investigate how retail managers evaluate potential costs and benefits of leniency across each lever, to understand whether the managerial perceptions are aligned with those of consumers.
