Abstract
Hotels that offer flexible cancelation terms often witness a high number of canceled bookings. A common assumption in the hotel industry is that cancelations are driven by random and exogenous factors beyond customers’ control. However, in transaction data obtained from a high-end U.S. hotel partner, we find that the cancelation rate increases substantially with the booking price. This points to the possibility that the booking price can be an important driver for cancelations. To investigate this relationship, we employ a hazard model and find empirical support. Specifically, a $50 increase in the booking price results in a 16% increase in the hazard of a cancelation. This finding is robust to several alternative model specifications, different subsets of the data, and an alternative operationalization of the booking price. Combining the reservation data with price data from competing hotels nearby, we test for customers’ continued price search after booking as a potential mechanism and show that it mediates 27% of the total effect of booking price on the hazard of a cancelation. Driven by this empirical evidence, we conduct a counterfactual analysis and find that the hotel may lose as much as 11% in revenue during high season by ignoring the effect of pricing on cancelation.
Introduction
Cancelations have long been a headache for hotel managers. Roughly 23% of hotel bookings in Europe were canceled in 2023 across various online booking channels, with the cancelation rate ranging from 18% for direct channels to 42% for some online booking platforms (D-Edge, 2024). Compared to many other services, hotels tend to offer generous cancelation policies (Freed, 2016). As of August 2025, some major hotels, such as Extended Stay America, still allow guests to cancel as late as 6 p.m. on the day of arrival. 1 The issue of high cancelation rates can be further exacerbated by social disruptions. For example, during the COVID-19 pandemic, a record 71% of hotel reservations were canceled on Expedia, and the cancelation rate for other booking channels varied from 32% to 63% (D-Edge, 2021). Although many hotels permit penalty-free cancelation until shortly before arrival, cancelations occurring well in advance can still impose meaningful costs, since they increase demand uncertainty, distort revenue-management forecasts, and can lead to suboptimal pricing and inventory decisions, especially during high-demand periods when released rooms may be difficult to resell at comparable rates.
Unsurprisingly, hotel cancelations have received considerable attention in the existing literature. There is a substantial body of work that explores how the likelihood of cancelations can be affected by various factors including booking channel, customer segment, seasonality, and cancelation policies (Antonio et al., 2017; Zakharya et al., 2011). With the exception of cancelation policies, almost all the factors considered in the literature are exogenous and cannot be directly controlled by hotels. To counter the high cancelation rates, hotels may tighten cancelation policies by imposing cancelation fees or offering non-refundable booking prices (Jet, 2017). Even though stricter refund terms can help reduce cancelations, hotels run the risk of leaving customers unhappy, incurring goodwill costs in the long term, and facing the potential erosion of their brands. This is especially true for high-end hotels. The two biggest hotel chains, Hilton and Marriott, only recently limited their no-penalty cancelation policies to 48 hours prior to arrival (Goldstein, 2017).
Other than cancelation policies, are there other factors controlled by hotels that may affect customers’ cancelation behavior? A prime candidate is booking price, which is actively managed by hotels practicing revenue management. Anecdotal evidence suggests that booking price is an important factor in customers’ cancelation decisions. For instance, advanced group blocks are prone to cancelation due to customers’ speculative bookings to lock in low rates (Freed, 2016). New technologies have also empowered customers to track hotel prices. A handful of websites, such as Tingo, Hopper, and Booking.com, send price-drop alerts to customers, even after reservations have been made. Yet, existing revenue-management systems do not account for the relationship between booking prices and cancelations when setting prices and managing cancelations.
In this paper, we examine how booking prices at the time of reservation influence the cancelation rate using booking data from a high-end independent hotel in a major city in the United States. The hotel is located in an area with many competing hotels nearby. The data we gathered from this hotel are unique for the following reasons. First, the data contain the entire booking history of the focal hotel across different channels with arrival dates between May 2022 and October 2023, including daily prices even for days with no bookings. Second, the data allow us to rule out two potential confounding factors in the analysis, namely customer loyalty programs and the existence of different cancelation terms. Customers enrolled in a hotel’s loyalty program may be less likely to cancel than customers who are not enrolled, and some hotels have the practice of offering different booking prices with different cancelation policies simultaneously, which would directly affect the relationship between booking prices and cancelations. Instead, our focal hotel is an independent hotel that does not have a loyalty program and only offers the same cancelation policy at all times. As a result, these potential confounding factors are not a concern for our analysis.
We employ an empirical strategy involving survival analysis, which allows us to investigate the role of booking price—along with other variables—on the hazard of a cancelation. Throughout the paper, we use price to refer specifically to the realized booking price at the transaction level, which is set by the hotel’s revenue-management system. In our empirical setting, pricing primarily serves as an operational revenue-management tool aimed at revenue production rather than broader strategic objectives such as market share expansion or long-term growth. Results from this analysis indicate that booking price is a significant factor in cancelations. The survival model predicts that a $50 increase in the booking price will result in a 16% increase in the hazard of a cancelation, even when controlling for other factors such as room type, channel, seasonality, and time lags. We check the consistency of our results to alternative model specifications, subsets of data, an alternative operationalization of the booking price, and potential endogeneity concerns. Specifically, we use coarsened exact matching (CEM) to control for potential omitted variables that could drive cancelations. Our result is robust across all these analyses. Finally, we propose a potential mechanism driving our results—continued price search behavior (i.e., customers cancel when lower prices are available with nearby competitors). Combining the reservation data with price data from competing hotels nearby, we test for customers’ continued price search behavior as a potential mechanism and show that it mediates 27% of the total effect of booking price on the hazard of a cancelation. Considering this result, we perform a counterfactual analysis to investigate the effect on the firm’s revenue should this relationship be ignored. 
The results suggest that depending on the specific room type and season, the hotel could experience a revenue loss of up to 11% if the effect of cancelations is not considered when making the pricing decisions.
Related Literature
This study is closely related to three streams of literature: (a) empirical and practice-driven studies that explore the likelihood of cancelations in the hotel industry; (b) revenue-management models that consider customer cancelations and no-shows; and (c) customer returns affected by pricing in retail settings.
Researchers have taken numerous paths to predict cancelations in hotel and travel industries, including statistical models (e.g., Falk & Vieru, 2018; Iliescu et al., 2008), simulation (Zakharya et al., 2011), and data mining (e.g., Antonio et al., 2017, 2019; Romero Morales & Wang, 2010; Sánchez-Medin & C-Sánchez, 2020). With the exception of refundability, this literature mainly considers factors that are exogenous, such as booking dates, arrival dates, agents, channels, room types, and customer types. Even though refundability is an endogenous factor that hotels can manage and control, refund policies cannot be changed frequently. An investigation of the role of pricing is missing. Unlike refund policies, pricing is a tactical decision that can be adjusted frequently. Our work complements this literature by quantifying the effect of pricing on cancelations.
In the revenue-management literature, cancelations have been studied since its early days (e.g., Rothstein, 1971, 1974). In line with the empirical literature, cancelations are usually treated as exogenous and passive, driven by changes in customers’ state or need (see Baker & Collier, 2003; Liberman & Yechiali, 1978; Subramanian et al., 1999; Talluri & van Ryzin, 2004; Weatherford & Bodily, 1992, and references therein). As a primary method to manage cancelations, overbooking has been extensively studied (e.g., Gallego & Phillips, 2004; Karaesman & van Ryzin, 2004; Liang & Anderson, 2023; Wilson et al., 2006), whereas pricing is largely taken as given (except for some recent studies such as Yilmaz et al., 2017). In contrast, the literature that studies pricing with limited inventory ignores cancelations with a few exceptions, such as Xie and Gerstner (2007) and Hu et al. (2018), which mainly focus on exogenous cancelation rates. Therefore, the effect of pricing on customers’ cancelation behavior, and consequently on hotels’ booking policies, remains understudied.
To some extent, cancelations of hotel reservations are similar to customer returns of physical goods in the retail industry. The extant literature on customer returns typically considers exogenous reasons such as the mismatch between customer taste and the purchased products or valuation uncertainty (e.g., Akcay et al., 2013; Che, 1996; Chen, 2011; Courty & Li, 2000; Davis et al., 1995; Huang & Zhang, 2020; Shulman et al., 2015; Su, 2009). A strand of this literature studies the intrinsic relationship between pricing and customer returns. Anderson et al. (2008) use a dataset on women’s clothing to establish empirically that customer return rates increase with the price paid. Powers and Jack (2015) use customer survey data to show that finding a better price elsewhere is among the primary reasons for retail product returns. Ketzenberg et al. (2020) analyze customers’ abusive return behavior using a large transactional dataset. Lee et al. (2023) explore the relationship between impulse purchase and consumer returns and show that they may result in undesirable product shortages. Based on transaction data from an international online retail market, Bandi et al. (2018) find opportunistic return behavior upon price drops. Samorani et al. (2019) consider customer returns as part of a product-search process and show that even though customers are more likely to return a high-priced item, they may also have a higher chance to buy another item upon the return of the previous purchase. Altug et al. (2021) consider customer returns driven by inter-temporal pricing discounts and investigate return policy designs to mitigate strategic consumer behavior in retailing. More recently, Zhang et al. (2022) also identify a positive effect of average item prices on whether an order will be returned or not for certain product categories. 
Our finding that price is an important driver of hotel cancelations contributes to this stream of literature by studying a service setting instead of a physical good setting. Different from many physical goods, service products, such as hotel rooms, are perishable. Most of the aforementioned papers do not explore the underlying mechanism for their findings. An exception is Powers and Jack (2015); their work is based on customer survey data. By contrast, we collected competitor prices and used mediation analysis to establish continued price search as a potential underlying mechanism.
A summary of these streams of literature, with respect to their methodology and consideration of pricing vs. cancelation, is outlined in Table 1. To the best of our knowledge, our paper represents a first attempt to empirically validate the effect of pricing on cancelation rate (which is lacking in the literature as acknowledged in Hu et al., 2018) and to incorporate this finding into analytical pricing models to improve decision-making.
Table 1. Summary of Relevant Literature.
Context and Data
Context
We obtained booking data from a high-end independent hotel in a major city in the United States. The focal hotel is a historic, upscale establishment located in a cultural and business district, characterized by its proximity to numerous dining, shopping, business, and arts venues. Originally opened in the late 19th century, this boutique hotel combines period-specific architectural elements with modern amenities. It has 94 rooms in eight different room types, which differ in amenities and square footage. Customer profiles include leisure travelers attracted by the historic ambiance and proximity to local attractions, as well as business travelers who benefit from its central location. As an independent hotel, this hotel does not have a loyalty program and does not offer different refund policies for cancelations. Instead, a cancelation and refund policy is uniformly applied to all bookings regardless of booking channel and price. More specifically, the hotel cancelation policy requires that cancelations be made by 4:00 p.m. local time at least 48 hours prior to arrival to avoid a fee equivalent to one night’s room rate plus tax. The hotel does not practice overbooking either. The data we have collected cover a service period of roughly 18 months from May 2022 to October 2023. Since the price information is not retained for days with no bookings in the hotel reservation data, we collaborated with our partner hotel to collect prices daily even for days with no bookings. The price information was critical in our analysis because of our emphasis on the effect of pricing on cancelations. Without this additional price information, we would have had to estimate the prices for days without bookings.
Each booking in the dataset corresponds to one reservation for one room. Reservation information in the data includes reservation ID, booking date, date of arrival (DOA), length of stay (LOS), booking channel, and room type. In the data, customers made reservations through seven different channels, including online travel agencies (OTAs) and the hotel’s own website. Notably, individual customer-identifying information, such as name, address, and contact information, is not included in the data. The data also contain detailed history on the reservation, including the date and time for cancelation, if any. When a customer makes changes in their reservation (e.g., wants to add a day or arrive early), the same reservation record is updated in the system (no new record is created).
Data Description
The data in support of our analysis contain 15,101 booking records, which are obtained after a preliminary cleansing that removed records with irregular booking prices (nightly rate ≤ $0.1), entries without a booking channel, and bookings that end up in no shows. 2 We present descriptive and summary statistics for all reservations, canceled reservations, and non-canceled reservations in Table 2. Panel A reports the number of observations; the range of dates for bookings, arrivals, departures, and cancelations; as well as the number of room types and booking channels. Panel B reports the mean and standard deviation (SD) for booking price, LOS, lag, time to cancelation, existing bookings, inventory, and capacity utilization. Booking price is the average price per night a customer commits to pay for the room over the entire length of their stay. LOS refers to the number of days for a reservation. Lag is the number of days between the booking date and the DOA. Among the bookings that were ultimately canceled, we also summarize the time to cancelation as the number of days between booking and cancelation. Using a simple counting mechanism, we obtain existing bookings as the number of rooms already reserved for the same room type and DOA as of the booking date. In a similar way, inventory can be obtained by subtracting existing bookings from the total number of rooms of the same type in the hotel. Finally, we report capacity utilization as the ratio between existing bookings and the total number of rooms of the same type.
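As an illustration of the counting mechanism described above, the following Python sketch derives Lag, existing bookings, inventory, and capacity utilization from a toy booking list. The room counts, record layout, and function name are hypothetical, and the sketch makes simplifying assumptions that the text does not state: each booking occupies only its arrival date, and cancelations do not release rooms.

```python
from datetime import date

# Hypothetical room counts per room type (illustrative, not the hotel's figures).
ROOMS_PER_TYPE = {"A": 10, "B": 5}

# Toy bookings: (booking_date, date_of_arrival, room_type).
bookings = [
    (date(2023, 1, 1), date(2023, 3, 1), "A"),
    (date(2023, 1, 15), date(2023, 3, 1), "A"),
    (date(2023, 2, 10), date(2023, 3, 1), "A"),
    (date(2023, 2, 12), date(2023, 3, 2), "B"),
]


def derive_variables(bookings, rooms_per_type):
    """Return one dict per booking, in booking-date order."""
    counts = {}  # (room_type, doa) -> rooms reserved by earlier bookings
    rows = []
    for booked_on, doa, room_type in sorted(bookings, key=lambda b: b[0]):
        key = (room_type, doa)
        existing = counts.get(key, 0)
        total = rooms_per_type[room_type]
        rows.append({
            "lag": (doa - booked_on).days,          # days from booking to arrival
            "existing_bookings": existing,          # same room type and DOA
            "inventory": total - existing,          # rooms still available
            "capacity_utilization": existing / total,
        })
        counts[key] = existing + 1
    return rows


rows = derive_variables(bookings, ROOMS_PER_TYPE)
```

A fuller version would spread each booking over its LOS and subtract rooms released by cancelations made before the booking date.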
Table 2. Summary Statistics.
Among all reservations, we observe that the average booking price is $289.55 per night, a typical booking involves a LOS of about 2 days on average, and customers book about 41 days prior to arrival on average. At the time of reservation, on average, more than four rooms of the same type have already been committed to earlier reservations for the same DOA. Approximately 20.8% of the reservations were ultimately canceled, corresponding to 3,141 canceled reservations out of the 15,101 reservations in our data. On average, these cancelations occurred 25.97 days after booking. As summarized in Table 2, canceled reservations share similar characteristics with non-canceled reservations in terms of booking, arrival, and departure date ranges. However, the average booking price for canceled reservations is $16.86 more than that for non-canceled reservations, and this difference is significant (p < .01) (we use two-sided t-tests to derive the p-values for our comparisons). The average LOS among canceled bookings (2.12) is longer than that of non-canceled bookings (1.99) (p < .01). Notably, canceled reservations have a substantially longer average lag (54.75) than the average lag across non-canceled bookings (37.28) (p < .01). For canceled reservations, the number of existing bookings at the time of reservation (3.77) is significantly lower than that for non-canceled bookings (4.34) (p < .01), suggesting that more inventory was available when the bookings that were ultimately canceled took place.
Next, we provide a brief exploration of the main variables of interest, namely cancelation rate and booking price. Specifically, we explore whether these variables exhibit any salient patterns across booking dates, arrival dates, room types, and booking channels.
Cancelation Rate
Figure 1A shows the weekly cancelation pattern for booking dates and dates of arrival. Reservations made on Sundays have the highest cancelation rate, while those booked on Fridays have the lowest cancelation rate. Overall, the cancelation rate ranges between 18% and 23% for reservations made on different days of the week. Arrivals from Thursday to Saturday are the most likely to be canceled, while those arriving on Sundays and Mondays are less prone to be canceled. Figure 1B shows a strong monthly pattern. In general, reservations made in October and November are more likely to be canceled, as are those for stays in February, July, and December.

Figure 1. Seasonality patterns of cancelation rate.
Cancelation rate also varies across different room types and booking channels. Figure 2 illustrates this variation by ranking room types and booking channels in descending order of cancelation rates. Figure 2A shows that average cancelation rates range between 13% and 27% across room types. Figure 2B shows that average cancelation rates range between 15% and 24% across booking channels.

Figure 2. Cancelation rates (with standard errors) for different room types and booking channels.
Booking Price
Booking prices also demonstrate weekly and monthly patterns as shown in Figure 3. Figure 3A shows that booking prices are lower for arrivals on Sundays and Mondays and typically higher for arrivals from Thursday to Saturday. Although the magnitude of variations over different days of the week is less than $35, comparing Figures 1A and 3A, we observe that days of arrival with higher booking prices (i.e., Thursday to Saturday) are also more likely to be canceled. Figure 3B indicates that monthly variations are larger in magnitude for both booking and arrival months than for weekdays. For instance, stays in December cost about $90 less on average than those in July.

Figure 3. Weekly and monthly patterns of average booking price.
Similar to the cancelation rate, we observe variations for the booking price across room types and booking channels (see Figure 4). However, there is less variation across booking channels than across room types. Note that we keep the same order of room types and booking channels as in Figure 2. By comparing Figures 2 and 4, we observe that a room type or booking channel with a higher cancelation rate does not necessarily have a higher booking price. In fact, the opposite seems to hold for room types—the most expensive room type with average nightly booking price of $491 (room type eight) has the lowest cancelation rate of only around 13%, while one of the least expensive room types with an average nightly booking price of $271 (room type one) has the highest cancelation rate of over 25%.

Figure 4. Average booking prices (with standard errors) for different room types and booking channels.
Relationship Between Booking Price and Cancelation Rate
Before conducting formal econometric analyses, we explore the relationship between booking prices and cancelation rates graphically. To do so, we focus on booking prices between $110 and $550, a range that excludes extreme prices and covers 97.7% of our data. Figure 5 plots the average cancelation rate with standard errors for each $55 booking price bin in this price range. This figure suggests that the cancelation rate increases steadily with booking prices, providing model-free evidence of the positive effect of booking prices on the cancelation rate.
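The binning behind this kind of model-free plot can be sketched in a few lines of Python. The function name and toy records below are hypothetical, and we assume half-open bins that start at $110 and exclude prices of $550 and above; the paper does not state how bin edges are handled.

```python
from collections import defaultdict


def cancelation_rate_by_bin(records, low=110, high=550, width=55):
    """records: iterable of (booking_price, canceled) pairs.

    Returns {bin_start: cancelation rate} for prices in [low, high).
    """
    totals = defaultdict(int)
    cancels = defaultdict(int)
    for price, canceled in records:
        if not (low <= price < high):
            continue  # drop prices outside the plotted range
        bin_start = low + int((price - low) // width) * width
        totals[bin_start] += 1
        cancels[bin_start] += int(canceled)
    return {b: cancels[b] / totals[b] for b in sorted(totals)}


# Toy records for illustration only; the last price falls outside the range.
toy = [(120, False), (150, True), (170, False), (200, True), (600, True)]
rates = cancelation_rate_by_bin(toy)
```

Standard errors for each bin could be added with the usual binomial formula, sqrt(r(1 - r)/n), where r is the bin's cancelation rate and n its number of bookings.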

Figure 5. Cancelation rate (with standard errors) versus booking price.
Empirical Model, Estimation, and Counterfactual Analysis
In this section, we first explain our empirical strategy. We then report and discuss the results of our empirical analysis. The Variable Operationalization section explains the operationalization of the variables. The Empirical Model section introduces the empirical model. The Estimation Results section reports the empirical results of the effect of booking prices on cancelation rates. The Robustness Checks and Mechanism Test section presents robustness checks and a mechanism test.
Variable Operationalization
To examine the relationship between booking prices and the hazard of a cancelation, we use a hazard model. Hazard models are used to understand how the risk of an event at time t (dependent variable) relates to a set of explanatory variables that are relevant to the event (independent and control variables). To estimate the hazard rate, hazard models account for both a time measure and a failure measure of the event (Mayo et al., 2022). We measure time as the number of days from the booking date until one of three types of events occurs in our data: cancelation, arrival, or end of data collection. 3 We treat a cancelation as a failure (failure measure = 1) and an arrival as a non-failure (failure measure = 0). We note that our data are right-censored. That is, we have observations that reach the end of our data-collection period without cancelation. These observations may be canceled at some time in the future, but the time of cancelation (if any) is not captured by our data. Survival models account for this right-censoring by combining uncensored and censored observations in the likelihood function. That is, instead of discarding information about censored observations, survival analysis includes that information up to the date of censoring.
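To make the role of censored observations in the likelihood concrete, the sketch below writes out the log-likelihood of a hazard model with a constant (exponential) baseline. It is a minimal illustration under our own assumptions, not the estimation routine used in the paper: a canceled booking contributes the log hazard plus the log survival probability, while any other booking contributes only the log survival probability up to its last observed day.

```python
import math


def exponential_loglik(beta, covariates, durations, events):
    """Log-likelihood of an exponential hazard model with right-censoring.

    beta:       coefficients; beta[0] is the constant term
    covariates: one list of covariate values per booking (constant excluded)
    durations:  days at risk for each booking
    events:     1 if the booking was canceled, 0 otherwise
    """
    total = 0.0
    for x, t, d in zip(covariates, durations, events):
        log_hazard = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
        # d * log h(t) is added only for cancelations; -h * t = log S(t)
        # is added for every booking, censored or not.
        total += d * log_hazard - math.exp(log_hazard) * t
    return total


# Toy data without covariates: two cancelations over 60 booking-days at risk.
durations = [10, 20, 30]
events = [1, 0, 1]
no_covariates = [[], [], []]
mle_constant = math.log(sum(events) / sum(durations))
```

With no covariates, the likelihood is maximized at a hazard equal to the number of cancelations divided by the total days at risk, which is why the censored booking still informs the estimate through its 20 days of exposure.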
Dependent Variable
Our dependent variable is HazardCancellation, which consists of two parts: (a) a binary variable indicating whether a cancelation occurred at a given time, and (b) the number of days elapsed from booking until cancelation, arrival, or censoring. As hazard models estimate the hazard rate of an event using time-to-event as part of the dependent variable, they require the assignment of start and end times for each unit of analysis. In our data, the start time for each observation corresponds to the booking date. The end time is determined by the event occurring in the observation (i.e., cancelation, arrival, or end of data collection). If the customer cancels the booking, the end time is equal to the cancelation date. If the customer arrives at the hotel, the end time is equal to the arrival date. If the customer has neither canceled nor arrived at the end of data collection (i.e., the observation is censored), the end time is equal to the date we collected the data.
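The start and end time rules above can be summarized in a short Python sketch. The function name and arguments are hypothetical, and we assume that a cancelation occurring after the end of data collection is simply recorded as missing.

```python
from datetime import date


def time_and_failure(booking_date, arrival_date, cancel_date, data_end):
    """Return (days from booking to end time, failure indicator) for one booking."""
    if cancel_date is not None:
        return (cancel_date - booking_date).days, 1   # cancelation: failure
    if arrival_date <= data_end:
        return (arrival_date - booking_date).days, 0  # arrival: non-failure
    return (data_end - booking_date).days, 0          # censored at data collection


data_end = date(2023, 10, 31)  # illustrative collection date
canceled = time_and_failure(date(2023, 1, 1), date(2023, 2, 1), date(2023, 1, 11), data_end)
arrived = time_and_failure(date(2023, 1, 1), date(2023, 2, 1), None, data_end)
censored = time_and_failure(date(2023, 10, 1), date(2023, 12, 1), None, data_end)
```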
Independent Variable
Our main independent variable is BookingPrice. This variable allows us to test whether the price of the room at the booking time is significantly associated with the customer’s decision to cancel the reservation later.
Control Variables
We control for a set of variables that could influence customers’ cancelation behavior. In addition to LOS and Lag, we control for Inventory, which refers to the number of rooms of the requested room type available for the DOA at the moment the booking is made. Building on our exploratory analysis in the Data Description section, we also include a set of dummy variables to control for the type of room (RoomType), booking channel (BookingChannel), month and day of the week of the DOA (DOAMonth and DOADay, respectively), and month and day of the week the booking was made (BookingMonth and BookingDay, respectively).
Empirical Model
Since we are interested in understanding what drives the hazard of cancelation, our empirical strategy involves survival analysis (Klein & Moeschberger, 2006). To move away from the proportionality assumption, 4 we use a parametric form for the baseline hazard rate. Specifically, we assume an exponential distribution for the baseline hazard rate, which presumes that the hazard rate is invariant to time (Klein & Moeschberger, 2006) and is one of the most parsimonious choices in parametric hazard modeling. To ensure that our results are not driven by our assumption of the underlying statistical distribution of the baseline hazard rate, we conduct various robustness tests where we change the distributional assumption in the Alternative Model Specifications section. Our model specification is given by
$$h(t \mid X_i) = \exp\left(\beta_0 + \beta_1\,\mathrm{BookingPrice}_i + \beta_2\,\mathrm{LOS}_i + \beta_3\,\mathrm{Lag}_i + \beta_4\,\mathrm{Inventory}_i + \delta'\,\mathrm{Controls}_i\right),$$
where the constant term $\beta_0$ determines the time-invariant baseline hazard, $\exp(\beta_0)$, and $\mathrm{Controls}_i$ collects the RoomType, BookingChannel, DOAMonth, DOADay, BookingMonth, and BookingDay dummy variables for booking $i$.
Estimation Results
Table 3 displays the estimation results from the hazard model evaluating the relationship between booking prices and the hazard of a cancelation. We find that BookingPrice is positively and significantly associated with the hazard of a cancelation (β = .003, p < .01). In the proportional hazard specification, a one-unit increase in a covariate multiplies the hazard rate by exp(β), holding the other covariates fixed, so the estimated coefficient implies that a $50 increase in the booking price raises the hazard of a cancelation by about 16%.
Table 3. Hazard of a Cancelation.
Note. Standard errors in parentheses.
*p < .10. **p < .05. ***p < .01.
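As a quick check on the magnitude of the BookingPrice estimate, the hazard multiplier for any price change can be computed by exponentiating the coefficient times the change. The sketch below uses the rounded coefficient of 0.003 reported in the text, so the results are approximate; the helper name is our own.

```python
import math


def hazard_multiplier(beta, delta):
    """Multiplicative change in the hazard for a delta-unit change in a covariate."""
    return math.exp(beta * delta)


BETA_PRICE = 0.003  # rounded BookingPrice coefficient reported in the text

multiplier_50 = hazard_multiplier(BETA_PRICE, 50)
percent_change_50 = (multiplier_50 - 1) * 100  # roughly 16

# The same helper applies to other hypothetical price changes.
for dollars in (25, 50, 100):
    change = (hazard_multiplier(BETA_PRICE, dollars) - 1) * 100
    print(f"${dollars} increase: hazard changes by about {change:.0f}%")
```

Because the effect is multiplicative, the implied change is not linear in the price difference: doubling the price increase more than doubles the percentage change in the hazard.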
Robustness Checks and Mechanism Test
In this section, we conduct robustness checks to test the consistency of our empirical results and run an additional analysis to test a potential mechanism to explain our main result. First, we validate that our findings are not driven by our estimation decisions, in particular, the parametric assumption of the exponential distribution for the baseline hazard and the decision to drop the observations for which the booking and the cancelation happen on the same day. Second, we alleviate endogeneity concerns across canceled and non-canceled bookings by using CEM. Third, we check the robustness of our findings to a different operationalization of the independent variable. Fourth, we conduct an additional analysis to check the consistency of our results across different room types and booking channels. Fifth, we investigate whether our results are driven by uncertainty resolution. Sixth, we explore a potential mechanism driving the results, that is, customers’ continued price search.
Alternative Model Specifications
First, we check the consistency of our results to changes in the parametric assumptions for the baseline hazard. Our main model assumes an exponential distribution. In this section, we use two alternative distributions commonly used in survival analysis: Weibull and Gompertz (Klein & Moeschberger, 2006; Mayo et al., 2022). In addition, we evaluate a logistic regression model as a means to confirm the results of the hazard model because it considers whether or not a cancelation occurs instead of the time to cancelation. For this analysis, we use an indicator that takes the value of 1 if the booking is canceled and 0 otherwise as a dependent variable. Finally, in our main analysis, we consider a time unit of days, and as a result, the survival analysis naturally drops the observations for which the booking and the cancelation happen on the same day. To check the robustness of our results to this estimation decision, we added a small lag of 0.5 to the survival times of all our observations and conducted the analysis with the updated data. By adding this small lag, we are able to include 1,290 more observations in our analysis.
Table 4 displays the results with these alternative model specifications. The results are consistent with those in Table 3, which indicates that our findings are robust to alternative model specifications and to the inclusion of a small lag to the survival times. 6 We also look at the shape parameters for the Weibull and Gompertz distributions. We find that the shape parameter p for the Weibull model is essentially 1 (ln(p) = 0.031, so p = exp(0.031) ≈ 1.031), and the γ for the Gompertz model is essentially 0 (0.009), which indicates that the baseline hazard rate is constant and that these models do not provide a better fit than the exponential model (Bakker, 2007). Taken together, these results suggest that using an exponential distribution for the baseline hazard is a reasonable assumption.
Table 4. Hazard of a Cancelation (Weibull and Gompertz) and Likelihood of a Cancelation (Logistic).
Note. Standard errors in parentheses.
*p < .10. **p < .05. ***p < .01.
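The reason a Weibull shape of 1 and a Gompertz γ of 0 both point to the exponential model can be seen from the hazard functions themselves. The sketch below uses the standard proportional-hazards parameterizations, which we assume match the ones estimated here, together with a hypothetical baseline hazard. It also treats the reported 0.031 as the estimated log of the Weibull shape parameter, which is our reading of the text.

```python
import math


def weibull_hazard(t, lam, p):
    """Weibull hazard h(t) = lam * p * t**(p - 1); p = 1 gives a constant hazard."""
    return lam * p * t ** (p - 1)


def gompertz_hazard(t, lam, gamma):
    """Gompertz hazard h(t) = lam * exp(gamma * t); gamma = 0 gives a constant hazard."""
    return lam * math.exp(gamma * t)


lam = 0.01                 # hypothetical baseline hazard per day
shape_p = math.exp(0.031)  # shape implied by a log-shape estimate of 0.031

# In the limiting cases both hazards equal lam at every t.
weibull_at_limit = [weibull_hazard(t, lam, 1) for t in (1, 10, 50)]
gompertz_at_limit = [gompertz_hazard(t, lam, 0) for t in (1, 10, 50)]
```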
Coarsened Exact Matching
Our models may suffer from omitted variable bias since we do not observe all the characteristics along which canceled and non-canceled bookings differ. We also must address the possibility that canceled and non-canceled bookings are systematically different, which may introduce selection bias in our analyses (Heckman, 1979). For instance, we do not observe the customers’ demographics or their travel purposes (e.g., business versus leisure), 7 weather conditions, economic or political conditions, or health and safety concerns, which are variables that may influence the cancelation of a room. To alleviate these endogeneity concerns, we apply CEM. CEM is a non-parametric approach that allows us to match canceled bookings with non-canceled bookings that have similar covariate distributions (Stuart, 2010). Thus, our goal is to obtain a subsample of bookings that are balanced in relevant characteristics (i.e., matching variables) and differ mainly in whether they are canceled or not.
We implement the CEM matching process following four steps (Iacus et al., 2011). First, we decide on the set of matching variables for which we aim to improve balance. Second, we partition the matching variables into strata. Third, we assign each booking to a stratum according to the values of its matching variables. Fourth, we remove observations in the strata that do not have at least one canceled and one non-canceled booking.
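The four steps can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the paper: the cutpoints, variable names, and toy bookings are hypothetical, and the sketch omits the stratum weights that CEM software typically attaches to matched observations.

```python
from collections import defaultdict


def coarsen(value, cutpoints):
    """Return the index of the bin that value falls into, given sorted cutpoints."""
    return sum(value >= c for c in cutpoints)


def cem_match(bookings, matching_spec):
    """Keep bookings in strata with at least one canceled and one non-canceled booking.

    matching_spec maps each matching variable (step 1) to its cutpoints (step 2);
    a value of None means the variable is matched exactly.
    """
    strata = defaultdict(list)
    for b in bookings:  # step 3: assign each booking to a stratum
        key = tuple(
            b[var] if cuts is None else coarsen(b[var], cuts)
            for var, cuts in matching_spec.items()
        )
        strata[key].append(b)
    matched = []
    for members in strata.values():  # step 4: prune unmatched strata
        if {m["canceled"] for m in members} == {True, False}:
            matched.extend(members)
    return matched


toy_bookings = [
    {"los": 1, "room_type": "A", "canceled": True},
    {"los": 2, "room_type": "A", "canceled": False},
    {"los": 5, "room_type": "A", "canceled": True},
    {"los": 1, "room_type": "B", "canceled": False},
]
spec = {"los": [3], "room_type": None}  # LOS split at 3 nights; room type exact
matched = cem_match(toy_bookings, spec)
```

In the toy data only the first two bookings survive, because they share a stratum (short stay, room type A) that contains both a canceled and a non-canceled booking.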
We use the following matching variables: LOS, Inventory, RoomType, DOAMonth, DOADay, BookingMonth, BookingDay, ExactDOA, and ExactBookingDate (the last two variables refer to the exact DOA and exact booking date, respectively). Balance across these covariates allows us to account for potential latent variables that may explain differences between canceled and non-canceled bookings. For instance, business travelers might cancel bookings differently than leisure travelers. We do not observe trip purpose in our data, but we know that business travelers are more likely to stay during weekdays, while leisure travelers are more likely to stay during weekends. By matching bookings with the same room type, DOA information, and LOS, we are likely to control for that omitted variable. Another possible concern is that bookings made a long time in advance might face more uncertainty than those made closer to the DOA. Thus, by matching bookings with the same booking date and DOA, we are likely to control for this inherent uncertainty. In doing so, CEM allows us to reasonably control for booking characteristics (e.g., trip purpose, uncertainty), providing a cleaner test of the relationship between booking prices and cancelations.
To assess the effectiveness of the matching procedure, we report the L1 imbalance measure, the standard imbalance metric used in CEM (Iacus et al., 2011). The L1 statistic measures the distance between the multivariate histograms of the treated and control groups across all matching variables. Its values range from 0 to 1, where 0 indicates perfect balance (identical distributions across groups), and 1 indicates maximal imbalance (no overlap across groups). Formally, L1 is computed as the sum of the absolute differences in the relative frequencies of treated and control units across all coarsened strata divided by two. In our context, the L1 imbalance decreases from 0.968 before matching to 0.066 after matching, indicating a substantial improvement in covariate balance due to the CEM procedure.
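The definition above translates directly into code. The sketch below is illustrative only: `l1_imbalance` is a hypothetical helper, and the three toy inputs are constructed to hit the two bounds and one intermediate value.

```python
import pandas as pd

def l1_imbalance(strata, treated):
    """L1 = 0.5 * sum over strata of |f_treated - f_control|."""
    strata = pd.Series(strata)
    treated = pd.Series(treated).astype(bool)
    f = strata[treated].value_counts(normalize=True)   # relative frequencies, treated
    g = strata[~treated].value_counts(normalize=True)  # relative frequencies, control
    # Align on the union of strata; a stratum absent from one group has frequency 0.
    return 0.5 * f.subtract(g, fill_value=0).abs().sum()

# Identical distributions across groups: perfect balance.
balanced = l1_imbalance(["a", "a", "b", "b"], [1, 0, 1, 0])
# No stratum shared by the two groups: maximal imbalance.
disjoint = l1_imbalance(["a", "a", "b", "b"], [1, 1, 0, 0])
# Partial overlap.
partial = l1_imbalance(["a", "a", "a", "b", "a", "b", "b", "b"],
                       [1, 1, 1, 1, 0, 0, 0, 0])
print(balanced, disjoint, partial)
```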
Our resulting sample consists of 426 canceled bookings matched with 731 non-canceled bookings. Using the CEM sample, we run the exponential hazard model and the logistic regression model and present the results in Table 5. We find that even after controlling for potential endogeneity with CEM, we still observe a positive and significant association between BookingPrice and both the hazard and the likelihood of a cancelation.
Hazard and Likelihood of a Cancelation After Coarsened Exact Matching (CEM).
Note. Standard errors in parentheses.
*p < .10. **p < .05. ***p < .01.
Alternative Operationalization of the Independent Variable
In our main analysis, we operationalize our independent variable with BookingPrice, which is a continuous variable capturing the nightly price for a given room. In this section, we use a categorical operationalization for the independent variable that focuses on the most expensive booking prices. Specifically, we introduce HighPrice, which is an indicator variable that takes the value of 1 for prices greater than $334 per night (which is the starting point of the fourth quartile for BookingPrice), and 0 otherwise. In addition, we use CEM with the same matching variables as in the Coarsened Exact Matching section to account for potential endogeneity across bookings in the fourth quartile of BookingPrice (treatment) and other bookings with lower prices (control). The resulting sample consists of 193 highly priced bookings and 248 controls. Table 6 shows the results for the models using HighPrice: the first column reports the model without CEM, and the second column reports the model with CEM. As our results are consistent with those in Table 3, we are confident that our findings are robust to a categorical operationalization of the independent variable.
Hazard of a Cancelation Using HighPrice.
Note. Standard errors in parentheses.
*p < .10. **p < .05. ***p < .01.
Results Across Room Types and Booking Channels
In this analysis, we check the consistency of our results across the different room types and booking channels. Tables 7 and 8 present the results for the room types and booking channels, respectively. The results from all models are consistent in direction, and most of them are also consistent in significance levels with those in Table 3, which indicates that our findings hold across room types and booking channels.
Hazard of a Cancelation Across Room Types.
Note. Standard errors in parentheses.
As very few bookings are canceled for Room Type 8 (only 22 out of 164), the regression fails to converge. Therefore, we combined the bookings of Room Type 8 with Room Type 7 as they are similar in their cancelation rates (see Figure 2A).
*p < .10. **p < .05. ***p < .01.
Hazard of a Cancelation Across Booking Channels.
Note. Standard errors in parentheses.
*p < .10. **p < .05. ***p < .01.
Results Accounting for Uncertainty of Booking in Advance
In this section, we examine whether our results are influenced by the inherent uncertainty of booking a hotel stay in advance. It is possible that the observed relationship between booking prices and cancelations is simply due to the timing of bookings. As the arrival date approaches, uncertainty about future conditions, such as weather, travel plans, or event attendance, is gradually resolved. When reservations are made far in advance under high uncertainty, customers may book to secure availability while retaining the option to revise their plans later. As uncertainty decreases and new information becomes available, some customers may learn that their initial plans are no longer necessary and cancel their reservations. In this way, cancelations arise not because of the booking price itself, but because early bookings were made under greater uncertainty that is resolved later. To test the robustness of our results against this uncertainty, we first examine the descriptive statistics of the lag variable (i.e., the number of days between the booking date and the DOA) and identify quartile cutoffs at 9, 24, and 52 days (with a minimum lag of 0 and a maximum of 467 days). We then estimate the hazard model of a cancelation on booking prices separately for each lag quartile. If uncertainty drives our results, we would expect the relationship between booking prices and cancelations to be stronger for longer lags and to weaken or disappear for short lags, specifically in the first quartile, where the lag is 9 days or fewer. Table 9 shows the results of this analysis.
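As a minimal sketch of how bookings can be assigned to lag quartiles, the snippet below bins the lag using the reported cutoffs. The labels Q1 to Q4, the helper `lag_quartile`, and the treatment of the interior cutoffs as upper-inclusive (matching the first quartile's "9 days or fewer") are assumptions made for illustration.

```python
import pandas as pd

# Cutoffs from the text: quartiles at 9, 24, and 52 days; observed range 0 to 467.
LAG_EDGES = [0, 9, 24, 52, 467]
LAG_LABELS = ["Q1", "Q2", "Q3", "Q4"]  # hypothetical labels

def lag_quartile(lag_days):
    """Assign each lag (in days) to its quartile; each cutoff belongs to the lower bin."""
    return pd.cut(lag_days, bins=LAG_EDGES, labels=LAG_LABELS, include_lowest=True)

# Hypothetical lags placed on and around each cutoff.
lags = pd.Series([0, 9, 10, 24, 25, 52, 53, 467])
quartiles = lag_quartile(lags)
print(pd.DataFrame({"lag": lags, "quartile": quartiles}))
```

The hazard model would then be fit on each of the four subsamples in turn.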
Hazard of a Cancelation for Lag Quartiles.
Note. Standard errors in parentheses.
*p < .10. **p < .05. ***p < .01.
Results indicate that the relationship between booking prices and cancelations holds in magnitude, direction, and significance across different lags. Therefore, we conclude that although uncertainty may be playing a role in our results (at the focal hotel and competitor hotels), it cannot completely explain the relationship between booking prices and cancelations. In short, our results are consistent across different levels of uncertainty in the bookings.
Testing for a Potential Mechanism: Continued Price Search
In this section, we test whether our main result of the positive effect of booking prices on the hazard of a cancelation might be driven by customers’ continued monitoring and searching for lower prices even after booking. More specifically, we control for bookings being canceled when the overall market price is more attractive. In other words, we test for the possibility that customers cancel because there is a drop in prices and they can get a better deal.
The first step to test this mechanism is to determine the market price available to customers over time after booking until cancelation or arrival. To this end, we collected pricing data, corresponding to the lowest rate among all room types, for eight competitors of the focal hotel. The pricing data were retrieved every day from 2023-4-11 to 2023-7-18 (henceforth observation dates) and cover prices for DOA between 2023-4-11 and 2024-7-16. Important for our study, we have access to the competitor prices not only for the DOA of the current observation date but also for DOA up to 1 year in the future. We observe the price not only at the time of cancelation but also in the days before the cancelation. Given the size of the competitor data, we match our main booking database (both canceled and non-canceled bookings) with the competitors’ prices using the following process. For each booking in our dataset, we create one observation for each date on which the reservation exists. For example, a booking created on 2023-4-15 with a cancelation date of 2023-4-20 will have six observations, one for each date, each of which we call a reference date. Then, we match each observation with the competitor prices using two criteria. First, the reference date of a booking needs to be the same as the observation date of a competitor’s price. Second, the DOA for the booking should be the same as the DOA for a competitor’s price. After matching, we retrieve the competitor prices for each day of the LOS of the booking, which allows us to capture the market prices for the whole duration of the stay. Finally, we calculate the median price across all competitors for the reference date and use this value as our measure for the overall MarketPrice. Following this process, we find the MarketPrice for 106,098 reference dates.
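The matching process can be illustrated with a simplified pandas sketch. It implements only the two matching criteria and the median across competitors; the retrieval of prices for each night of the stay is omitted for brevity. All column names other than DOA and MarketPrice (BookingID, BookingDate, EndDate, ObservationDate, Competitor, Price), the function name, and the toy prices are hypothetical.

```python
import pandas as pd

def market_price_by_reference_date(bookings, competitor_prices):
    """Median competitor price for each booking on each reference date."""
    rows = []
    for b in bookings.itertuples(index=False):
        # One observation per date on which the reservation exists
        # (EndDate is the cancelation date or, if not canceled, the arrival date).
        for ref in pd.date_range(b.BookingDate, b.EndDate, freq="D"):
            rows.append({"BookingID": b.BookingID, "ReferenceDate": ref, "DOA": b.DOA})
    panel = pd.DataFrame(rows)
    prices = competitor_prices.rename(columns={"ObservationDate": "ReferenceDate"})
    for col in ["ReferenceDate", "DOA"]:
        # Use one datetime resolution on both sides so the join keys compare cleanly.
        panel[col] = panel[col].astype("datetime64[ns]")
        prices[col] = prices[col].astype("datetime64[ns]")
    # Criterion 1: reference date equals observation date. Criterion 2: same DOA.
    merged = panel.merge(prices, on=["ReferenceDate", "DOA"], how="inner")
    return (merged.groupby(["BookingID", "ReferenceDate"], as_index=False)["Price"]
                  .median()
                  .rename(columns={"Price": "MarketPrice"}))

# The example from the text: booked on 2023-4-15, canceled on 2023-4-20.
bookings = pd.DataFrame({
    "BookingID": [1],
    "BookingDate": pd.to_datetime(["2023-04-15"]),
    "EndDate": pd.to_datetime(["2023-04-20"]),
    "DOA": pd.to_datetime(["2023-05-01"]),
})
obs_dates = pd.date_range("2023-04-15", "2023-04-20", freq="D")
competitor_prices = pd.DataFrame(
    [(d, pd.Timestamp("2023-05-01"), c, p)
     for d in obs_dates
     for c, p in [("H1", 200.0), ("H2", 220.0), ("H3", 260.0)]],
    columns=["ObservationDate", "DOA", "Competitor", "Price"],
)
market = market_price_by_reference_date(bookings, competitor_prices)
print(market)
```

The toy booking yields six reference dates, one for each day from booking to cancelation.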
Then, we calculate the variable DifferenceMarket as the difference between the BookingPrice and the MarketPrice. This difference lets us account for the magnitude of the discount available to customers who observe lower prices (compared to booking prices) at the time of cancelation. To test the potential effect of continued price search on the hazard of a cancelation, we run an additional hazard model where we include DifferenceMarket as an independent variable, as shown in Equation (2). Table 10 shows the regression results of this hazard model.
Hazard of a Cancelation With Continued Price Search.
Note. Standard errors in parentheses.
*p < .10. **p < .05. ***p < .01.
We find that, as expected, the difference between booking price and market price has a positive effect on the hazard of a cancelation. Thus, when the market offers customers better deals (compared to the booking price), the hazard of a cancelation increases. However, we observe that even after controlling for this potential continued price search behavior, the main effect of BookingPrice continues to be positive and significant, which suggests that our proposed mechanism only partially mediates the relationship between booking price and the hazard of a cancelation.
We formally test for the extent to which continued price search is a mediator for booking price and the hazard of a cancelation following the Preacher and Hayes (2008) bootstrapping procedure. We run 200 simulations to estimate the non-parametric bootstrap standard errors for the direct, indirect, and total effects. Figure 6 illustrates our results. We find that our proposed mechanism of continued price search mediates 27% (0.0006/0.0022) of the total effect of booking price on the hazard of a cancelation, which is a non-negligible proportion.
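The logic of the bootstrap can be sketched as follows. This is a deliberately simplified illustration, not our estimation procedure: it replaces the hazard model with linear regressions (the product-of-coefficients approach) and uses synthetic data with hypothetical effect sizes. Only the 200 replications and the reported ratio 0.0006/0.0022 come from the analysis above.

```python
import numpy as np

def ols(y, *regressors):
    """Least-squares coefficients of y on an intercept and the regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def effects(x, m, y):
    """Direct, indirect (a * b), and total effects of x on y through mediator m."""
    a = ols(m, x)[1]                 # effect of x on the mediator
    coefs = ols(y, x, m)
    direct, b = coefs[1], coefs[2]   # effect of x and of the mediator on y
    indirect = a * b
    return direct, indirect, direct + indirect

def bootstrap_se(x, m, y, n_boot=200, seed=0):
    """Non-parametric bootstrap standard errors for the three effects."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        draws.append(effects(x[idx], m[idx], y[idx]))
    return np.std(np.array(draws), axis=0, ddof=1)

# Synthetic data with hypothetical effect sizes (true mediated share = 0.2 / 0.5).
rng = np.random.default_rng(42)
n = 2000
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.3 * x + 0.4 * m + rng.normal(size=n)

direct, indirect, total = effects(x, m, y)
se = bootstrap_se(x, m, y)
share = indirect / total

# The mediated share reported in the text, computed the same way.
reported_share = 0.0006 / 0.0022
print(round(share, 2), round(reported_share, 2))
```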

Representation of total effect (left) and mediated effect of continued price search (right).
Counterfactual Analysis
To illustrate the potential effect of ignoring the relationship between prices and cancelation rates when setting prices, we conduct a counterfactual analysis in a simplified setting with one selling period and one room type. Results from this analysis indicate that the hotel may lose as much as 11% of revenue during the high season by ignoring the effect of pricing on cancelation. We provide the details of this analysis in the Appendix.
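The intuition can be conveyed with a stylized numerical sketch. It is not the model in the Appendix: the linear demand curve, the linear cancelation probability, and every parameter value below are hypothetical, so the resulting loss is illustrative and is not meant to reproduce the 11% figure.

```python
import numpy as np

A, B = 100.0, 0.2     # hypothetical linear demand: bookings = A - B * price
C0, C1 = 0.05, 0.001  # hypothetical cancelation probability: C0 + C1 * price

def realized_revenue(price):
    """Revenue from bookings that are not canceled, at a given nightly price."""
    bookings = np.maximum(A - B * price, 0.0)
    cancel_prob = np.clip(C0 + C1 * price, 0.0, 1.0)
    return price * bookings * (1.0 - cancel_prob)

prices = np.linspace(0.0, 500.0, 5001)

# A manager who treats the cancelation rate as a constant maximizes
# price * bookings, which gives A / (2 * B) whatever that constant is.
naive_price = A / (2 * B)

# A manager who accounts for price-dependent cancelations maximizes realized revenue.
informed_price = prices[np.argmax(realized_revenue(prices))]

loss = 1.0 - realized_revenue(naive_price) / realized_revenue(informed_price)
print(naive_price, informed_price, round(loss, 3))
```

In this stylized setting, the price that accounts for cancelations is lower than the naive price, because part of the revenue gained from a higher price is lost to additional cancelations.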
Discussion
Cancelations account for a significant proportion of hotel bookings. To date, the extant literature is mostly dedicated to understanding the exogenous drivers of customer cancelations, such as schedule changes, and coming up with strategies to mitigate their impact on hotel revenue. To the best of our knowledge, most revenue-management systems today also operate on similar assumptions. In this paper, we show empirically that cancelations are affected by an endogenous factor within the hotel’s control: booking prices. We explore the potential mechanism behind this effect and conduct a counterfactual analysis to estimate the impact of this effect on revenue.
Theoretical Implications
Our results have theoretical implications for revenue-management and service-cancelation research. First, we fill a gap in the literature on cancelation prediction by empirically quantifying the effect of booking prices on customer cancelations. Existing studies on cancelation mainly focus on the effect of exogenous factors such as DOA, booking channel, and customer types (e.g., Antonio et al., 2017; Falk & Vieru, 2018; Iliescu et al., 2008; Romero Morales & Wang, 2010). To the best of our knowledge, none has attempted to examine the effect of pricing on overall cancelation rates. Our study highlights that internal firm decisions may affect customer cancelation behaviors.
Second, our research extends the consideration of cancelation in the revenue-management literature. In the hotel industry, the primary tactic to mitigate the effect of cancelation is overbooking, in which hotels accept more reservations than their capacity. In this domain, cancelations are usually treated as exogenous and passive, driven by changes in customers’ state or need (Baker & Collier, 2003; Liberman & Yechiali, 1978; Subramanian et al., 1999; Talluri & van Ryzin, 2004; Weatherford & Bodily, 1992). Accordingly, overbooking policies have been extensively discussed through various lenses such as substitution (Gallego & Phillips, 2004; Karaesmen & van Ryzin, 2004; Zacharias & Pinedo, 2017) and strategic customer behavior (Wilson et al., 2006; Yılmaz et al., 2017), whereas pricing is largely taken as given (except for some recent studies such as Yılmaz et al., 2017). Our study extends the revenue-management literature by identifying the effect of pricing on customers’ cancelation behavior and highlighting the importance of this effect in pricing decisions.
Finally, we discover similar effects for perishable service products like hotel stays, paralleling the stream of research on the relationship between pricing and customer returns of physical goods (e.g., Anderson et al., 2008; Bandi et al., 2018; Samorani et al., 2019; Zhang et al., 2022). In this context, we have identified continued price search as a potential underlying mechanism for hotel cancelations, which echoes the findings in Powers and Jack (2015) concerning retail product returns.
Managerial Implications
Our findings carry crucial managerial implications. First, when managers estimate cancelation rates, it is advisable to expand the focus from exogenous variables such as DOA, channel, and customer types to include relevant endogenous variables such as local and competitor pricing. This helps practitioners gain a more comprehensive view of the factors that may drive fluctuations in bookings and a firmer grip on capacity control.
Since pricing is one of the most pivotal levers in revenue management, setting prices without recognizing their intrinsic relationship with cancelations can be detrimental to hotels. The conventional approach to managing cancelations is misguided because it treats cancelations as exogenous and does not consider the effect of pricing on cancelations. When the cancelation rate increases with the booking price, as established in our work, taking into account the effect of pricing on cancelations becomes important. Existing revenue-management practices focus primarily on demand management: prices are set to balance increased revenue with reduced demand. Our results suggest that price increases also need to be carefully balanced with the increased likelihood of cancelations. Since the current revenue-management practice in the hotel industry does not take this into consideration, there could be substantial missed revenue.
Accordingly, revenue managers should revisit their pricing strategies by considering the effect on cancelation at various price levels. Our counterfactual analysis indicates that ignoring this effect of pricing may lead to a substantial loss in expected revenue, especially in high seasons and for room types with high booking volumes. By quantifying the effect of pricing on cancelations using the approach outlined in our study, revenue managers can readily incorporate this information into their pricing models to update their strategies and increase revenue.
In addition, revenue managers should be mindful of strategic customer behaviors that may lead to continued price searches after reservation. In the long run, our study encourages the exploration of a variety of pricing and cancelation policies, from no-cancelation pricing to a price-match commitment, to accommodate the behavioral connection between pricing and cancelations.
Limitations and Future Work
Our work spearheads the examination of the aggregate effect of pricing on hotel cancelation rates and opens many avenues for future research. One potential direction is to obtain enriched data with individual customer characteristics similar to Passenger Name Record (PNR) data in the airline industry. Such data would help further alleviate the endogeneity concern currently addressed through CEM in our analysis. Although CEM provides a useful robustness check by improving balance on observable booking characteristics, it also carries inherent limitations: Coarsening decisions may introduce arbitrariness, and unmatched strata are dropped, leading to reduced sample size and results that may be sensitive to how strata are defined. To minimize these concerns in our current analysis, we rely primarily on naturally discrete matching variables that require minimal subjective coarsening, and we confirm that our findings are robust to alternative matching approaches. Future studies with richer data could rely less heavily on matching-based approaches and instead model customer heterogeneity directly, thereby overcoming these constraints and providing even cleaner identification.
Detailed individual customer data could also allow the study of customers’ cancel-and-rebook behavior at the same hotel. Similar to the insights that loyalty programs may have a positive effect on customer retention (e.g., Bolton et al., 2000; Verhoef, 2003), recurring customers might be less prone to cancelation than those who booked at the hotel for the first time. When applicable, one may also combine booking data with customer browsing history to characterize potential strategic customer behavior; for example, customers may postpone reservations in anticipation of future price changes. This would help overcome the lack of data footprints that currently hampers quantifying strategic customer behavior experimentally and empirically.
Presently, our data concern only one hotel, which operates with a full-refund cancelation policy and does not have a loyalty program. This allows us to characterize the effect of pricing on cancelation for the focal hotel with less ambiguity. In future studies, researchers may verify the robustness of these insights by collecting comprehensive data from multiple hotels through various direct and agency channels. This could allow managers to understand how booking channels, loyalty programs, and competitive environments may reshape customers’ behavior and the relationship between pricing and cancelation rates.
With increased data availability and visibility, one may drill down into the pricing–cancelation relationship even further and decompose its root causes, such as strategic customer behavior. We make a first attempt at understanding strategic customer behavior through the continued price search mechanism. Nonetheless, to better understand the strategic behavior related to bookings, such as opportunistic canceling and rebooking, upgrading, switching, and postponement, it is necessary to have richer and more detailed data on a focal hotel’s offering and bookings, and possibly the same from adjacent competitors.
Another avenue for future research is to examine how different refund policies affect hotel cancelation rates. While our study focuses on the relationship between pricing and cancelations within a single refund policy, future research could explore how prices associated with various refund policies may affect the cancelation rate differently. For instance, one may investigate whether and how strategically designed refund policies, such as a flexible, non-refundable booking that tolerates one change of dates, may reduce cancelations. Finally, we acknowledge that there might be other factors that contribute to the positive association between high booking prices and the hazard of a cancelation. For instance, high booking prices reduce consumer surplus, thus making the reservation more vulnerable to exogenous shocks. However, testing such additional factors may require data that are presently unattainable. Future research may tackle this challenge through field experiments.
Empirical findings in this and future studies will encourage researchers to revisit existing analytical models. For instance, one may incorporate the pricing–cancelation relationship into classic revenue-management models that are relevant to overbooking, capacity control, or pricing and enhance policy design accordingly.
Footnotes
Appendix
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Xiao Huang’s research is supported by the Natural Sciences and Engineering Research Council of Canada [NSERC RGPIN-2022-04671] and the Concordia University Research Chair program. Dan Zhang acknowledges with appreciation the research support provided by the Leeds School of Business at the University of Colorado Boulder, including funding through the MediaOne Professorship. Gloria Urrea gratefully acknowledges financial support provided by the Leeds School of Business.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, or publication of this article.
