Forecasting tourist arrivals using STL-XGBoost method

Abstract

Forecasting tourism demand in a timely manner is critical for ensuring the smooth operation of the tourism industry. Over time, time series models have been widely applied to estimate the number of tourists arriving. In this paper, we propose an XGBoost model for tourism demand forecasting based on STL seasonal decomposition. The first phase of the proposed model applies STL decomposition to preprocess the time series, separating it into two components: the seasonal and de-seasonal terms. In the second phase, the seasonal term is modeled and predicted with the Holt-Winters model. For the de-seasonal term, the ARIMA model is first employed to capture the residual part. Then, the XGBoost model is utilized to reconstruct both the de-seasonal term and its lag, along with the residual part obtained from the ARIMA model. By integrating the forecast outputs from both the Holt-Winters and XGBoost models, the final tourism demand predictions can be derived. The effectiveness of the proposed model is demonstrated using the tourist arrivals data in Macau from eight countries: United States, Germany, Malaysia, Philippines, India, Thailand, Italy and Korea (South Korea). The validation results indicate that the proposed model exhibits superior forecasting performance for time series data showing both seasonality and trend, while enhancing interpretability without increasing model complexity. The model outperforms five benchmark comparison models when assessed using the Symmetric Mean Absolute Percent Error (SMAPE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) metrics.

Keywords

STL; XGBoost; Holt-Winters; tourist arrivals forecast; time series

Introduction

Tourism demand forecasting research offers valuable insights for governments and businesses, helping them to better anticipate market trends, optimize resource allocation and development, improve tourism service quality, and boost revenue and employment opportunities. Furthermore, accurate forecasting supports the rationalization of marketing strategies and advertising plans to attract more tourists and promote sustainable tourism development.

In research on tourism demand forecasting, the majority of forecasting methods have traditionally been time series methods. Many scholars have favored traditional modeling methods for their outstanding explanatory power and simplicity. Commonly used models include ARIMA (Lim and McAleer, 2002), SARIMA (Chang and Liao, 2010), Prophet (Taylor and Letham, 2018), and Holt-Winters (Lim and McAleer, 2001), among others.

When these models are used to forecast tourist arrivals, only the linear aspect of the relevant demand characteristics is taken into consideration. That is, they assume a linear relationship between future values of the data and the lagged values and residuals. When the data exhibit a nonlinear structure, these models may not forecast accurately. They are most effective when the data lack complex nonlinear components.

In recent years, various nonlinear methods have been utilized for tourism demand forecasting. These methods can capture the nonlinear structure of tourist data, resulting in a significant enhancement in forecasting accuracy. Currently, the main methods center around support vector machines (Pai et al., 2005), random forests (Zhang and Tang, 2022), gradient boosting trees (Liu et al., 2020), and ensemble learning methods such as bagging (Athanasopoulos et al., 2018), boosting (Ozaslan et al., 2022) and stacking (Kumar et al., 2022).

Furthermore, more complex neural networks have been used for prediction, including the Artificial Neural Network (ANN) (Khashei and Bijari, 2010), the Convolutional Neural Network (CNN) and Long Short Term Memory Network (LSTM) (Ni et al., 2021), the faster KELM (Sun et al., 2019), and the ensemble deep learning approach (Sun et al., 2022). These models are capable of performing nonlinear and non-stationary forecasting of time series data, significantly enhancing forecasting accuracy.

Despite demonstrating promising accuracy for most time series data, these methods’ performance is highly dependent on the correct selection of hyperparameters, which are typically related to the data. Mis-specification of these parameters can lead to a substantial decrease in forecast accuracy for out-of-sample predictions. Additionally, these complex models are prone to overfitting, meaning they may learn both the underlying patterns and the noise in the data, which can impair their ability to generalize to unseen data. Another significant drawback of some of the approaches mentioned is the lack of interpretability and the “black-box” nature of these models.

On the other hand, to overcome the limitations of single predictive models and enhance their scalability and generalization, researchers have proposed hybrids of linear and nonlinear methods for predicting tourist arrivals. See the next section for a detailed literature review.

Data decomposition techniques have become a powerful tool in this field due to the unique seasonality and trends in tourist arrival datasets (Zhang et al., 2021). Frequency domain-based decomposition methods have demonstrated superior performance in handling seasonal and nonlinear time series data. Typically, the approach involves decomposing the original data into different components and then applying a combined forecasting strategy. The seasonal component is estimated and removed first, followed by the estimation of the remaining components. Among the standard decomposition methods, X12 (Cuccia and Rizzo, 2011), Empirical Mode Decomposition (EMD) (Fatema et al., 2022), and Seasonal-Trend-Loess (STL) (Gurnani et al., 2017) have been widely used. Researchers (Faraway and Chatfield, 1998; Nelson et al., 1999; Zhang and Qi, 2005) have investigated the effect of time series decomposition, particularly de-seasonalization, on modeling and forecasting performance. They found that decomposing the time series before applying the prediction models yields significantly better results and substantially reduces forecasting errors.

Most studies, including ours, primarily focus on time series decomposition for its specific advantages in interpretability while simultaneously improving predictive accuracy. In terms of interpretability, these algorithms, encompassing both classic and hybrid methods, typically generate the final aggregate prediction by simply summing up the components. Despite their claimed benefits, several remaining issues significantly restrict the further development of these algorithms. Classical models, which assume a linear relationship, are not well-suited to handle complex nonlinear problems. Moreover, the differencing employed in these methods often fails to satisfactorily address nonlinearity and seasonality. Hybrid methods usually model the seasonal and trend terms with a linear model for forecasting purposes while assuming that only the residual term contains nonlinear factors. However, the trend term itself, and not only the residual term, contains nonlinear as well as linear factors. Furthermore, when forecasting seasonal time series data, modeling the seasonal component together with the other components often leads to suboptimal results.

Against the above background, we propose a novel hybrid method for tourism demand forecasting. Specifically, we introduce an XGBoost-based approach that forecasts tourist arrivals using STL seasonal decomposition to address the limitations of existing time series forecasting models. XGBoost, which stands for Extreme Gradient Boosting, is a highly optimized implementation of the gradient boosting algorithm. Its competitive advantage lies in its superior balance between exploration and exploitation, making it more effective than alternative methods. Notably, XGBoost offers key features such as the incorporation of diverse regularization penalties to mitigate overfitting risks, and the ability to detect and learn from nonlinear data patterns. To extract the seasonal component from the raw data, the Seasonal-Trend-Loess (STL) method is introduced. Compared to other decomposition methods, such as X12 and EMD, STL offers several distinct advantages: (1) STL is robust to abnormal values (outliers), which enhances the accuracy of prediction. (2) STL has wide applicability, as it can handle time series with any seasonal frequency greater than one. (3) Being based on numerical methods, STL eliminates the need for parameter determination, making it easy to implement.

The application of hybrid models in tourism prediction is still in the stage of continuous exploration. Since the ensemble learning frameworks show promise as an effective method, this study aims to address these gaps by incorporating seasonal decomposition and introducing the interpretable ensemble learning method XGBoost to enhance available datasets and improve prediction performance.

In the first phase of our proposed model, STL decomposition is utilized to preprocess the time series data. This involves decomposing the time series into two components: the seasonal term and the de-seasonal term. During the second phase, we model and predict the seasonal term using the Holt-Winters model. For the de-seasonal term, we first employ the ARIMA model to capture the residual part. Then, the XGBoost model is utilized to reconstruct both the de-seasonal term and its lag, along with the residual part obtained from the ARIMA model. By combining the forecast results from the Holt-Winters model and the XGBoost model, we can obtain the final forecasting tourist arrivals. Furthermore, a direct strategy is used to implement multi-step-ahead forecasts.

The remainder of the paper is organized as follows. Section 2 provides a review of the literature on the forecasting of tourism demand using hybrid methods. Section 3 provides a brief overview of the basic modeling approaches, including Seasonal-Trend-Loess (STL), Auto-Regressive Integrated Moving Average (ARIMA), Seasonal ARIMA (SARIMA), Holt-Winters, and eXtreme Gradient Boosting (XGBoost). Section 4 presents the formulation of the proposed model and the description of the tourist arrivals datasets. Section 5 applies the proposed model to forecast tourist arrivals and compares its performance to other forecast models. Section 6 contains the concluding remarks.

Literature review

To enhance scalability and generalization, an increasing number of hybrid models have been proposed and widely applied in tourism demand forecasting. Liang (2014) proposed the SARIMA-GARCH model to forecast tourist arrivals in Taiwan. He et al. (2021) proposed a SARIMA–CNN–LSTM model for forecasting tourist arrivals. The model combines the SARIMA model with a deep neural network framework and integrates CNN and LSTM layers to identify linear and nonlinear data characteristics. The results indicated that the SARIMA–CNN–LSTM model outperforms the individual models in terms of forecast accuracy. Wu et al. (2021) proposed a novel hybrid approach, SARIMA + LSTM, to forecast daily tourist arrivals in Macau. SARIMA + LSTM combines the forecasting power of the SARIMA model with the capability of LSTM to further minimize residuals. The results show that SARIMA + LSTM is superior to other methods. Tsui and Balli (2017) forecast international tourist arrivals at eight major Australian airports using SARIMAX-EGARCH volatility models. The results indicated that the proposed hybrid model is effective in identifying the impact of both positive and negative shocks on the arrival of international tourists. Recently, Xing et al. (2022) proposed a novel adaptive multiscale ensemble (AME) learning approach that integrates variational mode decomposition (VMD) and least square support vector regression (LSSVR) for short-, medium-, and long-term seasonal and trend prediction of tourist arrivals. The proposed model demonstrates superior directional and forecasting accuracy compared to other benchmark models. Sun et al. (2022) proposed a bagging-based multivariate ensemble deep learning approach, B-SAKE, which integrates stacked autoencoders and kernel-based extreme learning machines to address the challenges of predicting tourist arrivals to Beijing from four countries. By utilizing historical tourist arrival data, economic variable data, and search intensity index (SII) data, the proposed method is superior to the baseline model in terms of level accuracy, directional accuracy, and statistical significance. He et al. (2022) proposed an ensemble learning-based forecasting model. Initially, the model produces multiple sub-models. Each sub-model reconstructs the prediction input by selecting a sequence of information to learn the time series features. A novel technique is introduced to aggregate the outputs of these sub-models, thereby improving the robustness of the prediction to nonlinear and seasonal features. To validate the effectiveness of the proposed framework, tourism demand data from the Chengdu Research Base of Giant Panda Breeding over the past 5 years is used as a case study. Bi et al. (2024) proposed a novel spatially dependent travel demand prediction model for diverse tourist sights. The model has three stages: selecting attractions, generating base predictors, and combining base predictors. In the first stage, a method based on multidimensional scaling is used to identify the relevant attractions and to determine the intensity of the spatial dependence between each pair of attractions. The second stage develops a hybrid base predictor that integrates LSTM networks with an autoregressive model. The LSTM networks capture the spatial dependence among the attractions, while the autoregressive model takes into account the scale of the tourist volume at each attraction. Finally, in the third stage, a strategy for combining these base predictors is proposed to mitigate the over-fitting problems associated with LSTM models and to improve forecast stability.

Basic models

In this section, the basic modeling approaches of the STL, Holt-Winters, ARIMA, SARIMA and XGBoost models for time series forecasting are briefly reviewed.

Seasonal-trend-loess (STL)

Developed by Cleveland et al., 1990, STL is a classical time series decomposition method that employs an algorithm based on locally weighted regression. This technique enhances data decomposition, especially for seasonality. Unlike other classical seasonal decomposition models such as X12 and SEATS, STL allows for flexible adjustment of the sizes of the seasonal and trend shift windows. It also improves the model’s robustness through outer loops, thereby enhancing outlier treatment. The STL method decomposes the time series Y_t into three components: the seasonal part S_t, the trend part T_t and the residual part R_t. The decomposition equation is as follows:

Y_{t} = S_{t} + T_{t} + R_{t}

(1)

Denoting the sum of the trend part and the residual part, T_t + R_t, by TR_t, the expression becomes

Y_{t} = S_{t} + TR_{t}

(2)
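To make the additive identity in Eq. (2) concrete, the following minimal sketch splits a toy series into a seasonal part and a de-seasonal part. It is not the loess-based STL procedure: it uses a plain seasonal-means decomposition as a stand-in, and the function name `seasonal_means_decompose` and the toy data are assumptions made only for illustration.

```python
from statistics import mean


def seasonal_means_decompose(y, period):
    """Split y into a seasonal part S and a de-seasonal part TR with Y = S + TR.

    NOTE: this is a simple seasonal-means stand-in, not STL; it only
    illustrates the additive identity of Eq. (2).
    """
    overall = mean(y)
    # Average deviation from the overall mean at each position in the cycle.
    offsets = [mean(y[i::period]) - overall for i in range(period)]
    seasonal = [offsets[t % period] for t in range(len(y))]
    deseasonal = [y_t - s_t for y_t, s_t in zip(y, seasonal)]
    return seasonal, deseasonal


# Hypothetical toy series: a period-4 pattern plus a slow upward drift.
y = [10 + 0.5 * t + [3, -1, -4, 2][t % 4] for t in range(16)]
S, TR = seasonal_means_decompose(y, period=4)
```

Adding `S` and `TR` element by element recovers `y`, which is the property the proposed model relies on when the two component forecasts are summed.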

Holt-Winters

The Holt-Winters seasonal model (Holt-Winters) is a widely used technique in time series analysis based on triple exponential smoothing. Originally proposed by Winters in 1960, this model has been refined by researchers such as Cipra, Romera, and Hyndman. The Holt-Winters method is particularly effective for time series data that exhibit fixed periods, such as seasonal patterns. By modeling the seasonal component explicitly, the method reduces seasonal noise and improves forecasting accuracy. This is especially useful when making multi-step-ahead forecasts for seasonal series. The update equations of the additive model are:

Trend update:

T_{t + 1} = β (F_{t + 1} - F_{t}) + (1 - β) T_{t}

(3)

Seasonal update:

Se_{t + 1} = γ (S_{t + 1} - F_{t + 1}) + (1 - γ) Se_{t + 1 - k}

(4)

Level update:

F_{t + 1} = α (S_{t + 1} - Se_{t + 1 - k}) + (1 - α) (F_{t} + T_{t})

(5)

h-step-ahead forecast:

\hat{S}_{t + h} = F_{t} + h T_{t} + Se_{t + h - k}

(6)

where k is the period length, S_t is the observed value of the series, F_t is the smoothed level, T_t is the trend, Se_t is the seasonal index, and α, β, γ ∈ (0, 1) are the smoothing parameters.
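The recursions can be sketched in a few lines of Python. This is a minimal illustration of additive Holt-Winters in its standard textbook form (level, trend and seasonal updates followed by an h-step-ahead forecast), not the authors' implementation; the function name and the simple initialisation scheme are assumptions.

```python
def holt_winters_additive(y, k, alpha, beta, gamma, h):
    """Additive Holt-Winters; returns forecasts for steps 1..h after the sample."""
    if len(y) < 2 * k:
        raise ValueError("need at least two full seasons for this initialisation")
    # Simple initialisation (an assumption; other schemes exist).
    level = sum(y[:k]) / k
    trend = (sum(y[k:2 * k]) - sum(y[:k])) / (k * k)
    season = [y[i] - level for i in range(k)]  # seasonal indices of the first cycle
    for t in range(k, len(y)):
        prev_level = level
        se_old = season[t % k]  # seasonal index from one period earlier
        level = alpha * (y[t] - se_old) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % k] = gamma * (y[t] - level) + (1 - gamma) * se_old
    n = len(y)
    # Forecast = last level + step * last trend + latest index for that phase.
    return [level + step * trend + season[(n + step - 1) % k]
            for step in range(1, h + 1)]


# Hypothetical purely periodic series with period 3 and no trend.
hw_forecast = holt_winters_additive([5, 1, 3] * 4, k=3,
                                    alpha=0.3, beta=0.1, gamma=0.2, h=4)
```

On this toy input the forecasts simply repeat the seasonal pattern, since the level stays constant and the trend stays at zero.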

ARIMA and SARIMA

ARIMA is a classical time series model that combines autoregressive and moving average processes with differencing. It is a member of the autoregressive moving average (ARMA) model family, which is commonly used as a benchmark tool for forecasting the demand for tourism services (Hu and Song, 2020; Li et al., 2017, 2020; Pan and Yang, 2017; Park et al., 2017; Jun et al., 2018; Zhang et al., 2020). The ARIMA(p, d, q) model is written as:

φ_{p} (B) {(1 - B)}^{d} TR_{t} = ϑ_{q} (B) a_{t}

(7)

where B is the lag operator, φ_p(B) is the autoregressive polynomial of order p, ϑ_q(B) is the moving average polynomial of order q, d is the order of differencing, and a_t is a white noise series.

Seasonal ARIMA (SARIMA) can be used to account for the seasonality in tourist arrival data. The SARIMA model, an extension of the ARIMA model outlined by Box and Jenkins, is primarily a linear model applied to univariate time series. Over time, the SARIMA model has been developed to incorporate both seasonal and non-seasonal components. The model can be denoted as SARIMA (p, d, q) (P, D, Q)_s, written as:

φ_{p} (B) Φ_{P} (B^{s}) {(1 - B)}^{d} {(1 - B^{s})}^{D} TR_{t} = ϑ_{q} (B) Θ_{Q} (B^{s}) a_{t}

(8)

where p, d, and q represent the autoregressive (AR) order, the differencing order, and the moving average (MA) order, respectively, while P, D, and Q represent the seasonal AR order, the seasonal differencing order, and the seasonal MA order, respectively. Generally, P and Q range from 0 to 3, D ranges from 0 to 1, and s represents the length of the fixed period, i.e. the frequency of seasonal fluctuations in the time series. B is the lag operator and B^s is the seasonal lag operator; φ and Φ are the autoregressive and seasonal autoregressive polynomials, ϑ and Θ are the moving average and seasonal moving average polynomials, and a_t is a white noise series. In this paper, we use the “auto.arima” function built in the “forecast” package of R statistical software to automatically select the optimal parameters for ARIMA/SARIMA models (Hyndman et al., 2020).

eXtreme gradient boosting (XGBoost)

XGBoost is a boosted tree algorithm proposed by Chen and Guestrin, 2016. This algorithm integrates multiple decision tree models using the gradient boosting method, ensuring interpretability while improving prediction accuracy. XGBoost combines several base models additively to create a strong learner. By incorporating dependencies among weak learners and utilizing sparse matrix storage for multi-threaded computation, XGBoost is significantly faster than other typical learners. It also effectively captures nonlinear trends and reduces variance through the inclusion of a regularization term, thereby preventing overfitting. The general expression for the model is:

\hat{TR}_{i} = \sum_{t = 1}^{k} f_{t} (x_{i})

(9)

Here, k represents the number of base learners, and f_t denotes the t-th base learner. The loss function, including the regularization term, is given by:

Loss = \sum_{i = 1}^{n} l (\hat{TR}_{i}, TR_{i}) + \sum_{t = 1}^{k} Ω (f_{t})

(10)

where Ω represents the regularization term. The objective function is approximated using a second-order Taylor expansion. By leveraging the additive structure of the model and the fact that the prediction of the first t − 1 trees is known when fitting tree t, we arrive at:

Loss^{(t)} ≈ \sum_{i = 1}^{n} [g_{i} f_{t} (x_{i}) + \frac{1}{2} h_{i} f_{t}^{2} (x_{i})] + Ω (f_{t}) + c

(11)

Here, g_i and h_i are the first- and second-order derivatives of $l (\hat{TR}_{i}^{(t - 1)}, TR_{i})$ with respect to $\hat{TR}_{i}^{(t - 1)}$, and the constant c collects the terms that do not depend on f_t, including the regularization of the earlier trees. By considering the definition of leaf nodes, sample sets, and regularization, the objective function can be transformed (up to the constant c) into:

Loss^{(t)} = \sum_{j = 1}^{T} [G_{j} W_{j} + \frac{1}{2} (H_{j} + λ) W_{j}^{2}] + δ T

(12)

where $G_{j} = \sum_{i \in I_{j}} g_{i}$, $H_{j} = \sum_{i \in I_{j}} h_{i}$, and I_j represents the set of samples on the jth leaf node. W_j represents the value assigned to the jth leaf node, and T is the number of leaf nodes. λ and δ are the regularization parameters.
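For reference, the leaf-wise objective is a sum of independent quadratics in the leaf values, so it can be minimised in closed form. Setting the derivative with respect to each W_j to zero gives the well-known result from Chen and Guestrin (2016), written here in the notation used above:

```latex
W_{j}^{*} = -\frac{G_{j}}{H_{j} + \lambda},
\qquad
Loss^{(t)*} = -\frac{1}{2} \sum_{j = 1}^{T} \frac{G_{j}^{2}}{H_{j} + \lambda} + \delta T
```

The second expression scores a given tree structure, which is what the algorithm compares when deciding whether a candidate split is worthwhile.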

Methodology

Proposed model

We construct an STL-XGBoost model, which combines the advantages of STL, Holt-Winters, ARIMA and XGBoost.

The Holt-Winters method is one of the most commonly used techniques in the Exponential Smoothing family of forecasting models. It is interpretable, requires fewer control parameters, and can be easily automated. Additionally, it adapts well to changes in trends and seasonal patterns in time series data as they occur. There are two variations of Holt-Winters, which differ in their treatment of seasonality: additive and multiplicative. These variants are more flexible in handling seasonality compared to the SARIMA model. As mentioned above, ARIMA is widely used in demand forecasting. It employs differencing to convert a non-stationary time series into a stationary one, and utilizes autocorrelations and moving averages of residual errors to forecast future values. It performs well on short term forecasts.

The specific flow chart is shown in Figure 1. The proposed model can be divided into five steps: 1) Input the original tourist arrivals time series. 2) Use STL decomposition to decompose the time series into two components: the seasonal term and the de-seasonal term (trend + residual terms). 3) Model the seasonal term with Holt-Winters to obtain its forecast. 4) Model the de-seasonal term with ARIMA first, then with XGBoost, to obtain the de-seasonal forecast. 5) Obtain the final tourist arrivals forecast by adding the Holt-Winters forecast to the XGBoost forecast.

Figure 1.

Flow chart for STL-XGBoost models.

It is important to note that the fitting phase of Step 4 consists of two stages. During the first stage, the residual series (e_t) is generated from the de-seasonal term (TR_t) using an autoregressive integrated moving average (ARIMA) model. In the second stage, three different situations were considered to propose three different models.

The first model uses the ARIMA residual series (e_t) as input for the XGBoost model to obtain the forecasting result. This model is denoted as STL-XGBoost (1).

The second model incorporates both the ARIMA residual series (e_t) and the de-seasonal series (TR_t) as input for the XGBoost model to obtain the forecasting result. This model is denoted as STL-XGBoost (2).

The third model includes the ARIMA residual series (e_t), the de-seasonal series (TR_t), and the lagged periodic series as inputs for the XGBoost model to obtain the forecasting result. This model is denoted as STL-XGBoost (3). The decision to include the lagged periodic series in the STL-XGBoost (3) model is based on the calculation of the Mutual Information Coefficient (MIC) (Kinney and Atwal, 2014) for the series. MIC is an effective correlation measure that is capable of capturing a wide range of both functional and non-functional relationships between variable pairs in datasets. It therefore provides a more comprehensive measure of dependency than measures that capture only linear or monotonic relationships.
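The three input configurations can be sketched as a feature-building step for a direct h-step-ahead regression. This is only an illustration under stated assumptions: the function name `build_features`, the use of TR_t as the regression target, and the reading of p, q and P as the numbers of de-seasonal lags, residual lags and seasonal lags are hypothetical choices, not details confirmed by the text.

```python
def build_features(tr, e, variant, p=1, q=1, P=1, period=12, h=1):
    """Return (X, y) for a direct h-step-ahead regression on the de-seasonal series.

    tr: de-seasonal series TR_t; e: ARIMA residuals e_t (same length as tr).
    Variant 1 uses q lags of e; variant 2 adds p lags of tr; variant 3 also
    adds P seasonal lags of tr. Assumes h <= period so that seasonal lags
    never look ahead of the forecast origin.
    """
    if variant not in (1, 2, 3):
        raise ValueError("variant must be 1, 2 or 3")
    if len(tr) != len(e):
        raise ValueError("tr and e must have the same length")
    # Earliest target index for which every requested lag exists.
    max_back = h + q - 1
    if variant >= 2:
        max_back = max(max_back, h + p - 1)
    if variant == 3:
        max_back = max(max_back, P * period)
    X, y = [], []
    for s in range(max_back, len(tr)):
        row = [e[s - h - i] for i in range(q)]            # residual lags
        if variant >= 2:
            row += [tr[s - h - i] for i in range(p)]      # recent de-seasonal lags
        if variant == 3:
            row += [tr[s - j * period] for j in range(1, P + 1)]  # seasonal lags
        X.append(row)
        y.append(tr[s])
    return X, y


# Hypothetical toy inputs, chosen so that each feature is easy to trace.
tr_toy = list(range(40))
e_toy = [100 + t for t in range(40)]
X3, y3 = build_features(tr_toy, e_toy, variant=3, p=2, q=1, P=1, period=12, h=1)
X1, y1 = build_features(tr_toy, e_toy, variant=1, q=2, h=2)
```

The resulting `X` and `y` could then be passed to any gradient-boosting regressor; building a separate feature set per horizon h corresponds to a direct multi-step strategy.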

The key innovations of this method are as follows: 1) It utilizes STL seasonal decomposition to split the original time series rather than directly forecasting the seasonal series, effectively resolving bias issues arising from seasonal series forecasting during modeling; 2) Different methods are employed to model the different components of the decomposition, leveraging the advantages of each model; 3) The inherent assumptions of the models, specifically the linearity and nonlinearity of the components, are taken into account to enhance prediction accuracy; 4) Instead of employing complex network models for prediction, only interpretable models are used, enhancing interpretability while improving prediction accuracy.

Data collection

To evaluate the performance of our proposed model, we conducted empirical studies using Macau as our research subject. Macau is a world-renowned tourist destination and a Chinese special administrative region, attracting a constant influx of international tourists on a daily basis. Due to its reunification with China and subsequent rapid economic growth, tourism has become a major industry in Macau, contributing to over 40% of the total output in the 20th century. Therefore, analyzing short-term tourist arrivals in Macau is vital for both the Macau government and businesses to gain a comprehensive understanding of future market trends and changes (Song and Witt, 2006). This, in turn, leads to improvements in the quality and standards of tourism services.

For this study, we collected monthly tourist arrival data in Macau from eight countries: United States, Germany, Malaysia, Philippines, India, Thailand, Italy and Korea (South Korea). The data spanned from January 2009 to December 2018 and consisted of a total of 120 observations for each country. The data were sourced from the official website of the Macau Government Tourist Office (https://macaotourism.gov.mo). As shown in Figure 2, there is considerable volatility in the arrivals from these countries, with seasonal and cyclical patterns, as well as trends and random fluctuations, arising from nonlinear dynamics.

Figure 2.

Time series plots of tourist arrivals from eight countries.

To train and evaluate our model, we split the original data set into a training set and a testing set at a ratio of 9:1. The training set consisted of data from January 2009 to December 2017, totaling 108 observations, while the test set encompassed data from January 2018 to December 2018, comprising 12 months of data. Additionally, we employed a time series cross-validation method for selecting the optimal hyper-parameters of the proposed model (Hasan et al., 2020; H. Assaad and Fayek, 2021). As time series data are made up of neighboring data points that are often highly dependent on each other, standard cross-validation is not appropriate. The hyper-parameters were tuned prior to the final training and testing of the model. To avoid introducing bias, a grid search with time series cross-validation was performed on the training set to identify the optimal combination of hyper-parameters. Grid search involves evaluating multiple combinations of hyper-parameters for the model. Given the length of the time series being studied, a repeated 5-fold time series cross-validation was used. In this approach, the training set expands with each consecutive, pre-defined split, while the validation window size remains constant and shifts sequentially until the fifth and final split. Repeated time series cross-validation is important for two reasons: first, it ensures robustness by assessing performance across different periods; second, it prevents information leakage from future data points by respecting chronological order. The selection of hyper-parameters was based on achieving high performance while ensuring generalization (i.e., avoiding overfitting). Thus, we quantified model performance by averaging prediction errors over these five splits. 
Based on the aforementioned, the data partitioning protocol employed in this study is as follows (as depicted in Figure 3): (1) employ time series split 5-fold cross-validation on the training set; (2) assess the model’s performance on each of the five validation splits for every iteration; (3) determine the hyper-parameters of the computational model based on minimizing averaged prediction errors; (4) evaluate the final performance of the proposed model on the unseen test set.

Figure 3.

5-fold time series cross validation.
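The expanding-window scheme described above can be sketched as a small index generator. This is a single-pass illustration, not the repeated procedure used in the study; the function name and the default validation window size (the series length divided by the number of splits plus one) are assumptions, since the exact window sizes are not stated in the text.

```python
def expanding_window_splits(n_obs, n_splits=5, val_size=None):
    """Yield (train_indices, val_indices) with an expanding training window.

    The validation window has a fixed size and moves forward in time, so
    every validation point lies after all of its training points.
    """
    if val_size is None:
        val_size = n_obs // (n_splits + 1)  # assumed default window size
    if val_size < 1 or n_splits * val_size >= n_obs:
        raise ValueError("series too short for the requested splits")
    for i in range(n_splits):
        val_end = n_obs - (n_splits - 1 - i) * val_size
        val_start = val_end - val_size
        yield list(range(val_start)), list(range(val_start, val_end))


# 108 training observations, as in the data set described above.
splits = list(expanding_window_splits(108))
train_sizes = [len(train) for train, _ in splits]
```

With 108 observations and five splits, the training windows grow in steps of 18 months while each validation window covers the following 18 months, so no future observation ever enters a training fold.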

Evaluation indicators

To comprehensively evaluate the performance of the proposed models, we employ a set of evaluation indicators that offer different perspectives. These indicators include Symmetric Mean Absolute Percent Error (SMAPE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). The mathematical formulas for these metrics can be found below.

SMAPE = \frac{1}{n} \sum_{i = 1}^{n} \frac{|y_{i} - \hat{y}_{i}|}{(|y_{i}| + |\hat{y}_{i}|) / 2} × 100%

(13)

RMSE = {(\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - \hat{y}_{i})}^{2})}^{\frac{1}{2}}

(14)

MAE = \frac{1}{n} \sum_{i = 1}^{n} |y_{i} - \hat{y}_{i}|

(15)

MAPE = \frac{1}{n} \sum_{i = 1}^{n} \frac{|y_{i} - \hat{y}_{i}|}{|y_{i}|}

(16)

where y_i is the actual number of tourist arrivals, $\hat{y}_{i}$ is the forecast number of tourist arrivals, and n is the number of testing samples.
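The four indicators translate directly into code. The sketch below follows Eqs. (13) to (16) as written, so SMAPE is returned in percent and MAPE as a fraction; the function names and the two-point toy data are hypothetical.

```python
import math


def smape(actual, forecast):
    """Symmetric MAPE in percent, Eq. (13). Undefined if a pair is (0, 0)."""
    terms = [abs(a - f) / ((abs(a) + abs(f)) / 2)
             for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)


def rmse(actual, forecast):
    """Root mean squared error, Eq. (14)."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast))
                     / len(actual))


def mae(actual, forecast):
    """Mean absolute error, Eq. (15)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """MAPE as a fraction, Eq. (16); multiply by 100 for percent."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


actual = [100.0, 200.0]      # hypothetical arrivals
forecast = [110.0, 180.0]    # hypothetical forecasts
```

SMAPE and MAPE are scale-free, which makes them comparable across source countries with very different arrival volumes, whereas RMSE and MAE are expressed in numbers of arrivals.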

Process and results

According to the workflow shown in Figure 1, the tourist arrivals data are first decomposed by STL decomposition. The decomposed components include the seasonal term, the trend term and the residual term, and the last two terms are combined to form the de-seasonal term, as shown in Figure 4. Figure 4 contains five subgraphs from top to bottom. The first subgraph is the actual tourist arrivals series (Y_t) (taking Malaysia as an example), and the second to fourth subgraphs are the seasonal (S_t), trend (T_t) and residual (R_t) terms produced by STL decomposition. The fifth subgraph is the de-seasonal term (TR_t = T_t + R_t).

Figure 4.

STL seasonal decomposition results.

For the seasonal term (S_t), the Holt-Winters model is fitted to obtain the seasonal forecasting result. For the de-seasonal term (TR_t), the ARIMA model is first fitted to generate the residual series (e_t). Then, three different input configurations are fed into the XGBoost model to obtain the de-seasonal forecasting result. STL-XGBoost (1) uses the ARIMA residual series (e_t) as input; STL-XGBoost (2) uses both the ARIMA residual series (e_t) and the original de-seasonal series (TR_t) as input; STL-XGBoost (3) uses the ARIMA residual series (e_t), the original de-seasonal series (TR_t), and the lagged periodic series as inputs. Here the Mutual Information Coefficient (MIC) is employed to evaluate whether the de-seasonal series (TR_t) exhibits periodic correlation (see Appendix Table 1A for the MIC results). The MIC values between the original de-seasonal series and its period lags are mostly larger than 0.25, which indicates significant intra-periodic and periodic correlations. Finally, the tourist arrivals forecast is obtained by adding the Holt-Winters model’s forecasting result to the XGBoost model’s forecasting result.

For modeling the de-seasonal term, we employed the Grid Search method to exhaustively search for the optimal ensemble learning structure that minimizes in-sample forecast errors, as measured by the Mean Absolute Error (MAE). During the search, a range of parameter values of the ensemble learning model is used to generate candidate models. The lower and upper bounds for these parameters are given in parentheses: hyper-parameter p: (1,8), hyper-parameter q: (1,8) and lag order parameter P: (1,3). It is worth noting that the optimal values of the hyper-parameters for the ensemble learning model change across datasets and forecast horizons. Tables 1–3 list the optimal values of the hyper-parameters p, q and P for 1-, 2- and 3-step-ahead forecasts, respectively. The hyper-parameters of XGBoost were also tuned with Grid Search. The optimal combination of hyper-parameter values was learning rate 0.2, max_depth 6, nrounds 50 and min_child_weight 1. All other hyper-parameters were left at their default values.
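An exhaustive search over the stated bounds can be sketched as follows. The helper `grid_search` and the scoring function `toy_score` are hypothetical; in practice the score would be the cross-validated MAE of the fitted model, and the bounds are assumed here to be inclusive.

```python
from itertools import product


def grid_search(score, grid):
    """Evaluate every combination in `grid`; return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current < best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Bounds from the text, assumed inclusive: p, q in 1..8 and P in 1..3.
lag_grid = {"p": range(1, 9), "q": range(1, 9), "P": range(1, 4)}
n_candidates = len(list(product(*lag_grid.values())))


def toy_score(params):
    """Hypothetical stand-in for a cross-validated MAE."""
    return abs(params["p"] - 2) + abs(params["q"] - 5) + abs(params["P"] - 1)


best, err = grid_search(toy_score, lag_grid)
```

Under these assumptions the grid has 8 × 8 × 3 = 192 candidates per country and horizon, which is small enough to evaluate exhaustively.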

Table 1.

Optimal values of hyper-parameters p, q, and P in eight countries for 1-step-ahead forecasts using STL-XGBoost (3).

Country	p	q	P
USA	1	8	2
Germany	1	4	3
Malaysia	8	3	3
Philippines	1	2	1
India	1	7	1
Thailand	8	7	2
Italy	3	3	1
Korea	2	2	1

Table 2.

Optimal values of hyper-parameters p, q, and P in eight countries for 2-step-ahead forecasts using STL-XGBoost (3).

Country	p	q	P
USA	1	5	3
Germany	5	2	1
Malaysia	6	7	3
Philippines	1	2	1
India	6	8	3
Thailand	8	7	2
Italy	2	5	1
Korea	2	6	1

Table 3.

Optimal values of hyper-parameters p, q, and P in eight countries for 3-step-ahead forecasts using STL-XGBoost (3).

Country	p	q	P
USA	1	5	3
Germany	1	2	2
Malaysia	6	7	3
Philippines	1	2	1
India	6	7	3
Thailand	8	8	2
Italy	3	3	1
Korea	2	2	1

We conducted a comparative analysis of our proposed STL-XGBoost (1), STL-XGBoost (2), and STL-XGBoost (3) models. To test their forecasting performance, h-step-ahead (h = 1, 2, 3) out-of-sample forecasts were generated over an out-of-sample period of 12 months. A preliminary in-sample estimation of all models was performed on the data from January 2009 to December 2017. Subsequently, the models were re-estimated recursively by adding one observation at a time up to December 2018.
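The recursive scheme can be illustrated with an expanding-window split generator. This is a sketch under stated assumptions: the function name is hypothetical, months are indexed from 0 (January 2009) to 119 (December 2018), and the way h-step-ahead targets are aligned with forecast origins is one plausible reading of the text, not a confirmed detail of the study.

```python
from typing import Iterator, Tuple


def expanding_window(n_total: int, n_initial: int, h: int) -> Iterator[Tuple[int, int]]:
    """Yield (train_end, target_index) pairs.

    The model is fitted on observations [0, train_end) and asked for an
    h-step-ahead forecast, i.e. the value at index train_end + h - 1.
    The training window grows by one observation per step.
    """
    if h < 1 or n_initial < 1 or n_initial + h > n_total:
        raise ValueError("need h >= 1 and n_initial + h <= n_total")
    for train_end in range(n_initial, n_total - h + 1):
        yield train_end, train_end + h - 1


# 108 in-sample months (Jan 2009 to Dec 2017) followed by 12 test months.
splits_h1 = list(expanding_window(n_total=120, n_initial=108, h=1))
```

Under this alignment the 1-step-ahead case yields 12 forecast origins, while longer horizons yield fewer targets inside the same 12-month window.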

Prediction performance indicators such as SMAPE, RMSE, MAE and MAPE were calculated for each model as these metrics are commonly used in tourism demand forecasting. Using SMAPE as an example, relative comparisons have been made on the basis of the improvement of Model A over Model B, which has been calculated as follows:

\text{Improvement} = \frac{\text{SMAPE}(B) - \text{SMAPE}(A)}{\text{SMAPE}(B)}

(17)
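As a worked instance of Equation (17), the snippet below recomputes two 1-step-ahead SMAPE improvements of STL-XGBoost(3) over STL-XGBoost(1) from the rounded values reported in Table 4. The function name is illustrative, and small differences from the tabulated percentages are expected because the inputs are already rounded.

```python
def improvement(metric_a: float, metric_b: float) -> float:
    """Relative improvement of Model A over Model B, as in Equation (17).

    Positive values mean Model A has the lower error; negative values
    mean Model A is worse than Model B.
    """
    if metric_b == 0:
        raise ValueError("metric_b must be non-zero")
    return (metric_b - metric_a) / metric_b


# USA: SMAPE 7.2213 for STL-XGBoost(3) against 10.4885 for STL-XGBoost(1).
usa = improvement(7.2213, 10.4885)
# Italy: SMAPE 9.6365 against 6.8773, so the improvement is negative.
italy = improvement(9.6365, 6.8773)
print(f"USA: {usa:.2%}, Italy: {italy:.2%}")
```

The same function applies unchanged to RMSE, MAE, and MAPE, since the improvement is a ratio and therefore independent of the metric's scale.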

The forecasting performance of STL-XGBoost (3) was compared with that of STL-XGBoost (1) and STL-XGBoost (2). The h-step-ahead (h = 1, 2, 3) forecasting accuracy and improvements are presented in Tables 4–6, respectively.

Table 4.

1-step-ahead out-of-sample forecasting accuracy measures.

Region	STL-XGBoost(3)	STL-XGBoost(1)	STL-XGBoost(2)	Improvement vs. STL-XGBoost(1)	Improvement vs. STL-XGBoost(2)	Holt-Winters	SARIMA	XGBoost	SARIMA-XGBoost	SARIMA(XGBoost)
SMAPE
USA	7.2213	10.4885	7.7576	31.1499%	6.9129%	15.0580	11.3494	9.4696	16.3939	9.0385
Germany	8.5812	12.6698	10.0986	32.2708%	15.0264%	30.3934	20.3982	9.5189	27.7727	11.7975
Malaysia	15.1580	27.3574	18.8693	44.5927%	19.6683%	35.2661	34.8065	19.3175	34.2159	20.3298
Philippines	8.4327	24.7038	8.7127	65.8648%	3.2135%	18.5261	16.2393	12.5729	29.4314	10.8925
India	13.4280	12.1717	13.5707	−10.3219%	1.0514%	31.8062	22.9747	14.3319	26.1203	12.3472
Thailand	15.0354	15.8803	17.9972	5.3201%	16.4569%	30.8641	28.5413	20.1409	28.4626	21.7734
Italy	9.6365	6.8773	7.4566	−40.1206%	−29.2341%	19.0460	18.8126	7.5870	5.4892	7.8001
Korea	13.9951	22.6107	10.7681	38.1041%	−29.9688%	11.5817	11.2051	17.0537	27.4234	16.1422
RMSE
USA	1383.75	2078.69	1516.37	33.4315%	8.7459%	3045.16	2314.23	2009.59	3342.61	1877.63
Germany	306.43	435.23	345.03	29.5940%	11.1881%	900.20	540.94	311.27	851.29	336.12
Malaysia	3270.30	5535.60	3667.43	40.9224%	10.8286%	8318.33	7865.90	4201.51	8658.14	4227.88
Philippines	3186.08	8441.69	3669.49	62.2578%	13.1737%	6332.82	4906.24	4567.41	10822.39	4274.47
India	2225.73	2287.27	2203.94	2.6905%	−0.9890%	5188.63	3821.88	1957.03	4344.79	1753.94
Thailand	2395.90	2545.33	3185.85	5.8711%	24.7958%	5195.66	5046.00	3062.76	5580.96	3600.53
Italy	145.54	110.10	125.63	−32.1980%	−15.8489%	277.78	277.24	116.36	234.96	113.25
Korea	11718.75	15864.59	9150.36	26.1327%	−28.0687%	11267.57	9743.60	13805.92	17937.77	12594.38
MAE
USA	1230.53	1749.86	1324.73	29.6782%	7.1109%	2527.05	1920.07	1637.22	2856.60	1549.78
Germany	221.20	331.73	251.77	33.3195%	12.1426%	774.35	498.26	246.74	690.81	285.96
Malaysia	2624.55	4463.69	3175.06	41.2023%	17.3388%	6627.31	6402.03	3395.88	6655.26	3672.47
Philippines	2281.86	7077.83	2429.43	67.7605%	6.0743%	4920.21	4158.79	3406.46	8700.74	3032.52
India	1605.94	1552.64	1610.77	−3.4324%	0.3002%	4118.39	2930.57	1652.85	3313.64	1462.04
Thailand	1797.44	2047.12	2481.28	12.1965%	27.5599%	4431.73	4059.32	2739.31	4164.92	2997.12
Italy	115.22	83.51	92.40	−37.9694%	−24.7021%	230.30	226.90	94.18	188.49	96.04
Korea	9858.88	13853.74	7511.97	28.8359%	−31.2424%	8492.76	7827.81	12142.45	15211.15	11519.33
MAPE
USA	0.0700	0.1026	0.0744	31.7660%	5.8897%	0.1422	0.1075	0.0915	0.1798	0.0878
Germany	0.0837	0.1322	0.0998	36.7047%	16.1190%	0.3241	0.2162	0.0947	0.2814	0.1142
Malaysia	0.1601	0.2967	0.2073	46.0455%	22.7728%	0.3878	0.3559	0.2056	0.3939	0.2062
Philippines	0.0840	0.2998	0.0824	71.9752%	−2.0010%	0.1838	0.1568	0.1254	0.3782	0.1066
India	0.1419	0.1284	0.1473	−10.5088%	3.6686%	0.3485	0.2472	0.1574	0.2807	0.1289
Thailand	0.1703	0.1745	0.2279	2.3875%	25.2671%	0.3853	0.3715	0.2288	0.3941	0.2642
Italy	0.0901	0.0682	0.0711	−32.1452%	−26.6648%	0.1821	0.1797	0.0745	0.1625	0.0782
Korea	0.1556	0.2114	0.1165	26.3724%	−33.5478%	0.1199	0.1193	0.1864	0.2335	0.1762

Table 5.

2-step-ahead out-of-sample forecasting accuracy measures.

Region	STL-XGBoost(3)	STL-XGBoost(1)	STL-XGBoost(2)	Improvement vs. STL-XGBoost(1)	Improvement vs. STL-XGBoost(2)	Holt-Winters	SARIMA	XGBoost	SARIMA-XGBoost	SARIMA(XGBoost)
SMAPE
USA	7.2213	9.4051	7.6804	23.2196%	5.9781%	14.5011	13.1345	14.3947	15.3338	10.9623
Germany	10.9048	13.9806	11.0035	22.0006%	0.8972%	29.4118	22.1110	26.6780	25.7773	13.2026
Malaysia	17.1105	31.7614	18.3120	46.1281%	6.5616%	34.8523	38.0096	33.4695	45.2095	18.9891
Philippines	8.6197	34.3534	7.6949	74.9087%	−12.0192%	17.1534	18.5148	14.4364	37.1930	11.7390
India	13.1618	17.6477	12.2596	25.4190%	−7.3588%	33.8677	21.4121	30.3164	29.0152	15.7951
Thailand	15.5356	15.3829	17.6418	−0.9924%	11.9387%	31.5893	27.5994	30.2104	19.4351	21.3992
Italy	7.5304	6.3865	8.1649	−17.9112%	7.7703%	17.8720	18.0445	19.1532	15.2785	6.2418
Korea	12.4536	13.2699	10.9602	6.1514%	−13.6262%	12.6313	13.4149	21.3414	22.5885	16.0815
RMSE
USA	1538.99	1972.26	1589.22	21.9680%	3.1605%	3025.71	2661.78	2978.24	2984.22	2196.05
Germany	331.03	413.63	373.85	19.9699%	11.4551%	893.74	595.50	836.27	818.29	378.40
Malaysia	3370.33	7336.63	3460.15	54.0616%	2.5959%	7965.97	8446.42	7740.74	10240.71	3905.10
Philippines	2876.50	11792.66	2863.06	75.6077%	−0.4697%	5616.14	6081.96	5194.89	13788.48	3827.73
India	2061.52	2454.80	2083.65	16.0209%	1.0623%	6025.83	3683.29	4954.78	4779.96	2045.92
Thailand	2481.45	2654.33	2987.93	6.5131%	16.9508%	5026.45	4759.47	5743.34	3186.25	3717.23
Italy	122.51	106.24	129.42	−15.3102%	5.3369%	275.06	274.80	296.49	225.47	107.40
Korea	10410.24	9721.91	9188.79	−7.0802%	−13.2928%	11826.89	10875.87	15363.91	15044.83	12593.65
MAE
USA	1260.16	1589.36	1317.89	20.7127%	4.3801%	2438.24	2200.34	2371.74	2569.18	1838.07
Germany	264.31	322.74	276.65	18.1068%	4.4620%	719.29	537.93	651.72	635.92	319.56
Malaysia	2924.74	5539.75	2968.69	47.2044%	1.4803%	6323.75	6951.67	6326.13	8460.17	3378.66
Philippines	2177.08	10298.48	1954.65	78.8601%	−11.3796%	4432.45	4674.73	3767.06	11503.82	3027.22
India	1640.97	2049.15	1513.31	19.9197%	−8.4355%	4574.75	2802.76	3913.74	3751.10	1724.32
Thailand	1812.26	2058.96	2279.76	11.9818%	20.5065%	4437.29	3862.49	4449.68	2541.12	2980.79
Italy	91.85	77.72	99.17	−18.1833%	7.3830%	217.20	218.91	232.21	184.61	80.11
Korea	8679.48	8001.65	7591.59	−8.4711%	−14.3302%	9135.54	9303.92	12650.17	13049.58	11239.37
MAPE
USA	0.0691	0.0889	0.0734	22.2368%	5.7840%	0.1361	0.1228	0.1359	0.1542	0.1025
Germany	0.1118	0.1300	0.1092	13.9923%	−2.4031%	0.3002	0.2380	0.2828	0.2485	0.1277
Malaysia	0.1890	0.3482	0.2050	45.7215%	7.7916%	0.3687	0.3795	0.3947	0.4817	0.1961
Philippines	0.0887	0.4316	0.0775	79.4453%	−14.4969%	0.1727	0.1721	0.1332	0.4949	0.1172
India	0.1379	0.1794	0.1302	23.1399%	−5.9236%	0.3713	0.2269	0.3125	0.3172	0.1693
Thailand	0.1768	0.1731	0.2209	−2.1276%	19.9548%	0.3912	0.3563	0.4237	0.2217	0.2779
Italy	0.0713	0.0644	0.0768	−10.7237%	7.2268%	0.1704	0.1720	0.1804	0.1609	0.0609
Korea	0.1368	0.1235	0.1187	−10.7444%	−15.2876%	0.1353	0.1439	0.1890	0.1988	0.1769

Table 6.

3-step-ahead out-of-sample forecasting accuracy measures.

Region	STL-XGBoost(3)	STL-XGBoost(1)	STL-XGBoost(2)	Improvement vs. STL-XGBoost(1)	Improvement vs. STL-XGBoost(2)	Holt-Winters	SARIMA	XGBoost	SARIMA-XGBoost	SARIMA(XGBoost)
SMAPE
USA	6.9127	11.6811	7.8054	40.8210%	11.4369%	13.7607	13.5158	15.6920	15.7572	10.6262
Germany	9.9986	13.0475	10.3313	23.3681%	3.2203%	29.8477	22.6676	27.2753	27.2829	11.9513
Malaysia	15.7672	20.7838	19.4966	24.1373%	19.1286%	32.4068	38.7878	33.3324	33.8066	17.8046
Philippines	8.7721	40.8676	7.1235	78.5353%	−23.1432%	16.7274	18.9581	16.4852	44.4648	12.6888
India	12.8339	15.9223	10.4314	19.3970%	−23.0309%	35.8938	20.5472	26.0169	32.1793	14.0745
Thailand	14.2912	15.2462	15.8693	6.2639%	9.9448%	30.2124	27.0640	35.7404	20.0332	21.3902
Italy	7.5636	6.6683	5.8731	−13.4267%	−28.7839%	17.3777	17.6243	17.2392	14.5250	6.3576
Korea	15.1817	13.5284	11.6970	−12.2217%	−29.7919%	12.2521	14.2674	27.3974	19.6448	17.9383
RMSE
USA	1425.73	2245.43	1565.40	36.5050%	8.9221%	2822.16	2641.69	3142.15	3226.26	2125.16
Germany	299.56	412.94	317.80	27.4574%	5.7415%	887.94	600.15	850.75	811.41	362.71
Malaysia	3063.37	4698.53	3643.83	34.8015%	15.9299%	7299.28	8417.34	7375.91	7782.46	4327.94
Philippines	2771.27	15030.44	2514.76	81.5623%	−10.2002%	5349.13	6539.43	5595.25	16872.01	3998.27
India	2125.57	2387.09	1941.48	10.9556%	−9.4823%	6833.40	3649.07	4575.52	4956.67	1925.24
Thailand	2242.33	2956.01	2595.31	24.1431%	13.6006%	4892.32	4750.36	6392.43	3991.36	3458.90
Italy	127.43	108.93	113.90	−16.9748%	−11.8768%	262.25	260.90	275.01	207.71	98.42
Korea	11921.53	9994.37	9473.83	−19.2825%	−25.8364%	12738.61	11452.63	17447.01	13172.87	13694.11
MAE
USA	1190.78	1922.35	1326.48	38.0561%	10.2302%	2295.56	2241.31	2559.49	2639.20	1768.10
Germany	243.15	313.53	250.67	22.4496%	3.0024%	714.86	538.63	661.89	651.05	287.42
Malaysia	2674.04	3719.89	3228.20	28.1152%	17.1664%	5807.02	6906.85	6146.27	6103.82	3277.12
Philippines	2177.20	13138.24	1763.70	83.4285%	−23.4453%	4273.76	4717.33	4222.15	14554.51	3181.76
India	1614.50	1964.55	1356.48	17.8182%	−19.0217%	5041.93	2748.45	3383.72	4250.38	1677.10
Thailand	1736.00	2037.50	2155.45	14.7974%	19.4598%	4281.07	3848.61	5396.52	2777.14	2925.74
Italy	90.49	80.90	73.17	−11.8566%	−23.6663%	208.84	211.39	206.04	173.71	75.11
Korea	10558.32	8187.35	7998.13	−28.9590%	−32.0099%	8907.52	9849.50	15411.82	11450.58	12528.55
MAPE
USA	0.0665	0.1096	0.0749	39.3616%	11.2114%	0.1302	0.1276	0.1506	0.1586	0.0993
Germany	0.1008	0.1264	0.1034	20.2680%	2.5718%	0.3087	0.2470	0.2822	0.2743	0.1158
Malaysia	0.1726	0.2118	0.2201	18.5012%	21.5546%	0.3528	0.3879	0.3602	0.3627	0.1750
Philippines	0.0911	0.5484	0.0719	83.3860%	−26.6336%	0.1702	0.1753	0.1590	0.6222	0.1329
India	0.1368	0.1738	0.1085	21.3008%	−26.0810%	0.4054	0.2167	0.2709	0.3665	0.1498
Thailand	0.1541	0.1917	0.1788	19.6025%	13.8271%	0.3698	0.3480	0.4776	0.2598	0.2568
Italy	0.0710	0.0663	0.0560	−7.1790%	−26.7999%	0.1671	0.1696	0.1634	0.1516	0.0653
Korea	0.1683	0.1267	0.1271	−32.8007%	−32.3469%	0.1355	0.1558	0.2351	0.1771	0.2002

The improvement results in Tables 4–6 show that, compared with STL-XGBoost(1), the STL-XGBoost(2) and STL-XGBoost(3) models improved the forecasting accuracy for most source countries under consideration across SMAPE, RMSE, MAE, and MAPE. Among them, the STL-XGBoost(3) model performs best, followed by the STL-XGBoost(2) model. For short-term (1-step-ahead) forecasts, the STL-XGBoost(3) model outperforms the others in most source markets, and the improvements mostly exceed 10%, indicating a substantial gain in short-term forecasting accuracy. For longer horizons (2- and 3-step-ahead), both STL-XGBoost(2) and STL-XGBoost(3) show even larger improvements over STL-XGBoost(1).

It is noteworthy that the STL-XGBoost(2) model is more accurate than the STL-XGBoost(3) model in the case of Korea. The time series plot of tourist arrivals from Korea reveals a consistent upward trend, with numbers rising from 10,000 to nearly 80,000 (see Figure 2). This pattern differs from those of the other countries under consideration. A similar finding is reported by Song and Witt (2006). In this scenario, STL-XGBoost(2) performs best primarily because of the strong trend in tourist arrivals from Korea. Consequently, when forecasting for Korea, greater emphasis should be placed on the trend, as disturbances have negligible influence. The 3-step-ahead forecasts of the STL-XGBoost(2) model are also more accurate than those of STL-XGBoost(3) for Italy. Tourist arrivals from Italy have remained relatively stable without significant disruptions; hence only the lag needs to be added to the residuals, and the periodic lag becomes unnecessary. Therefore, particularly for longer-horizon predictions, the STL-XGBoost(2) model without the periodic lag may outperform the STL-XGBoost(3) model with the periodic lag.

These comparisons confirm the positive role of including the ARIMA residual series, the de-seasonal series, and the lagged periodic series as inputs to the XGBoost model. Including them helps the model fit the data better and improves forecasting performance.

On the one hand, including only the ARIMA residual series as input to the XGBoost model may hinder prediction accuracy because the model learns only the nonlinearity of the residuals. On the other hand, both empirical observations and the MIC values suggest an association between the de-seasonal series and its periodic lag series within specific short-term periods, so it is crucial to consider these series. In other words, the forecast of the de-seasonal term is still affected by the periodic lag in short-term prediction. It is therefore essential to incorporate both the lag term and the periodic lag term, together with the ARIMA residual term, into the XGBoost forecasting model for accurate short-term predictions.

Figures 5 and 6 display the predicted tourist arrivals from the countries under consideration for the next 1 month based on STL-XGBoost(3), one of the STL-XGBoost models. The results clearly indicate that the predicted arrivals are quite close to the actual tourist arrivals in Macau. The directional shifts in the actual and predicted values are consistent: increases and decreases in the predicted values track the actual fluctuations in the tourist arrival time series. This holds true even for countries such as the United States, Philippines, and Thailand, which exhibit abrupt change points.

Figure 5.

Forecasts of tourist arrivals from the top four countries in Macau.

Figure 6.

Forecasts of tourist arrivals from the last four countries in Macau.

In order to show the forecasting performance of the proposed STL-XGBoost model more clearly, we present a comparative analysis of our proposed model, STL-XGBoost, with five other models. These include three benchmark models: SARIMA, Holt-Winters and XGBoost, as well as two hybrid models: SARIMA-XGBoost and SARIMA(XGBoost).

The SARIMA-XGBoost model was created by combining the SARIMA prediction and the XGBoost prediction, with the input being the SARIMA residuals. On the other hand, the SARIMA(XGBoost) model involved using the original tourist arrival data, its periodic lag and the SARIMA residuals as inputs for XGBoost prediction.

We also determined the optimal hyper-parameters for all the comparative models using the grid search method. Four evaluation indicators, namely SMAPE, RMSE, MAE, and MAPE, were employed to assess forecasting performance. The prediction task involved estimating tourist arrivals in Macau from eight countries: the United States, Germany, Malaysia, Philippines, India, Thailand, Italy, and Korea. The models were evaluated on their ability to predict tourist arrivals 1, 2, and 3 months in advance. The comparison results are also presented in Tables 4–6.
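For reference, the four indicators can be computed as in the sketch below. It assumes the commonly used definitions, with SMAPE expressed in percent and MAPE as a fraction so that the scales match those in Tables 4–6; the paper's own metric equations take precedence if they differ.

```python
import math
from typing import Sequence


def _check(actual: Sequence[float], forecast: Sequence[float]) -> int:
    """Validate the inputs and return the number of observations."""
    if len(actual) == 0 or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    return len(actual)


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE in percent; undefined if a pair is (0, 0)."""
    n = _check(actual, forecast)
    return 100.0 / n * sum(2.0 * abs(f - a) / (abs(a) + abs(f))
                           for a, f in zip(actual, forecast))


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean square error, in the units of the data."""
    n = _check(actual, forecast)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error, in the units of the data."""
    n = _check(actual, forecast)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction; actual values must be non-zero."""
    n = _check(actual, forecast)
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n
```

RMSE and MAE depend on the scale of each country's arrivals, which is why their magnitudes in Tables 4–6 differ so widely between, say, Italy and Korea, whereas SMAPE and MAPE are comparable across countries.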

The results in Tables 4–6 show that the proposed STL-XGBoost model, whether STL-XGBoost(1), STL-XGBoost(2), or STL-XGBoost(3), achieves the best forecasting performance relative to the five other models across all forecast horizons (h = 1, 2, 3) for the countries under consideration.

In particular, compared with the SARIMA model, STL-XGBoost(3) reduces SMAPE by 36.37%, 57.93%, 56.45%, 48.07%, 41.55%, 47.32%, and 48.78% for 1-step-ahead forecasts in the United States, Germany, Malaysia, Philippines, India, Thailand, and Italy, respectively; Korea is the exception. Compared with Holt-Winters and XGBoost, the SMAPE reductions of the STL-XGBoost models are also substantial. SARIMA and Holt-Winters are inferior to the STL-XGBoost model because these two benchmark models cannot efficiently capture nonlinear patterns and are more sensitive to outliers in tourism data. XGBoost performs worse because it cannot capture the seasonal components of tourism data.

The SARIMA-XGBoost hybrid model was used to predict seasonal characteristics. However, it does not consistently reduce prediction errors relative to the benchmark models: the results are mixed, with improvements in some cases and deteriorations in others, indicating that the model lacks stability. Therefore, a simple additive combination of linear and nonlinear models is not consistently better in most cases.

To address the impact of seasonal factors on the nonlinear model and avoid assuming a simple additive relationship between linear and nonlinear components, the SARIMA model was first used to derive residuals. In addition to these model-derived residuals, the original sequence and its periodic lag were included as inputs to the XGBoost model. This allowed the model to learn multiple nonlinear relationships, yielding the SARIMA(XGBoost) model. The prediction results show that the SARIMA(XGBoost) model generally performs better than benchmark models that rely solely on linear or nonlinear models, or on a simple combination of both. However, compared with our proposed STL-XGBoost models, the SARIMA(XGBoost) model performs worse in terms of prediction accuracy. Except for India, the SMAPE reduction of the STL-XGBoost models relative to SARIMA(XGBoost) is about 10%–30%.

Therefore, based on the comparison with the SARIMA-XGBoost and SARIMA(XGBoost) hybrid models, it is beneficial to extract and model the seasonal term separately. This approach results in overall smaller errors and improves prediction performance.

To thoroughly assess the predictive performance of the proposed STL-XGBoost model relative to the other models and to emphasize its boosting ability, we selected one of the STL-XGBoost models, STL-XGBoost(3), for comparison with the benchmark models using the Diebold-Mariano (DM) test. The DM test examines the statistical significance of the difference in out-of-sample predictive performance between STL-XGBoost(3) and each of the five other models across all forecast horizons. Tables 7–9 summarize the results: the values outside the parentheses are the DM test statistics, and the values inside the parentheses are the corresponding p-values. Notably, Tables 7–9 reveal a statistically significant improvement in approximately 70% of the comparisons when 0.1 is used as the cutoff for rejecting the null hypothesis. Moreover, among all the comparison models, STL-XGBoost(3) performs the best, and its advantage becomes more statistically significant as the number of forecasting steps increases. Comparing the STL-XGBoost(3) model with all benchmark models, we find that the prior data processing step, STL decomposition, is critical and can significantly improve forecasting accuracy.
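The DM statistic itself can be sketched in a few lines. The version below assumes a squared-error loss and a long-run variance built from autocovariances up to lag h − 1, and it approximates the two-sided p-value with the standard normal distribution. These are assumptions of the example: the loss function and reference distribution used for Tables 7–9 are not stated in this section, so the sketch is not expected to reproduce the tabulated p-values.

```python
import math
from typing import Sequence


def dm_statistic(errors_1: Sequence[float], errors_2: Sequence[float], h: int = 1) -> float:
    """Diebold-Mariano statistic under squared-error loss.

    A negative value means the first model has the smaller average loss.
    """
    if len(errors_1) != len(errors_2) or len(errors_1) <= h:
        raise ValueError("need equal-length error series longer than h")
    # Loss differential d_t = e1_t^2 - e2_t^2.
    d = [a * a - b * b for a, b in zip(errors_1, errors_2)]
    n = len(d)
    mean_d = sum(d) / n

    def autocov(k: int) -> float:
        return sum((d[t] - mean_d) * (d[t - k] - mean_d) for t in range(k, n)) / n

    # Long-run variance of d_t with autocovariances up to lag h - 1.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance estimate")
    return mean_d / math.sqrt(long_run_var / n)


def two_sided_p_normal(stat: float) -> float:
    """Two-sided p-value under a standard normal approximation (an assumption)."""
    return math.erfc(abs(stat) / math.sqrt(2.0))
```

With this sign convention, passing the STL-XGBoost(3) forecast errors as the first argument gives negative statistics whenever that model has the lower loss, which matches the predominantly negative entries in Tables 7–9.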

Table 7.

Diebold-Mariano (DM) test results of 1-step-ahead forecasts.

	DM test statistics (STL-XGBoost(3) vs.)
	Holt-Winters	SARIMA	XGBoost	SARIMA-XGBoost	SARIMA(XGBoost)
USA	−2.1299 (0.0566)	−2.3948 (0.0356)	−1.6157 (0.1344)	−2.7694 (0.0182)	−1.8523 (0.0910)
Germany	−3.1792 (0.0088)	−2.8896 (0.0147)	−0.1129 (0.9121)	−2.2815 (0.0434)	−0.5937 (0.5647)
Malaysia	−2.5650 (0.0263)	−2.2280 (0.0477)	−1.1478 (0.2754)	−2.5356 (0.0277)	−1.5609 (0.1468)
Philippines	−2.1822 (0.0517)	−2.8441 (0.0160)	−2.1218 (0.0574)	−2.5263 (0.0282)	−1.4922 (0.1638)
India	−2.0544 (0.0645)	−1.1379 (0.2793)	0.5156 (0.6163)	−1.2501 (0.2372)	0.9420 (0.3664)
Thailand	−2.8523 (0.0157)	−2.6191 (0.0239)	−1.3661 (0.1992)	−2.4069 (0.0348)	−1.9497 (0.0772)
Italy	−2.0699 (0.0628)	−1.9801 (0.0733)	1.0825 (0.3022)	−1.7048 (0.1163)	1.1126 (0.2896)
Korea	0.1339 (0.8959)	0.7719 (0.4565)	−0.9050 (0.3849)	−1.8200 (0.0961)	−0.3865 (0.7065)

Table 8.

Diebold-Mariano (DM) test results of the 2-step-ahead forecasts.

	DM test statistics (STL-XGBoost(3) vs.)
	Holt-Winters	SARIMA	XGBoost	SARIMA-XGBoost	SARIMA(XGBoost)
USA	−3.0171 (0.0066)	−3.4791 (0.0022)	−3.1288 (0.0051)	−3.6665 (0.0014)	−3.1086 (0.0053)
Germany	−3.4910 (0.0022)	−5.3829 (0.0000)	−3.0360 (0.0063)	−2.8487 (0.0096)	−1.0846 (0.2904)
Malaysia	−2.8864 (0.0088)	−3.3308 (0.0032)	−3.0642 (0.0059)	−4.2088 (0.0004)	−1.3800 (0.1821)
Philippines	−2.9977 (0.0069)	−2.5451 (0.0188)	−2.2540 (0.0350)	−4.6835 (0.0001)	−1.8490 (0.0786)
India	−3.2357 (0.0040)	−1.7465 (0.0953)	−2.8118 (0.0104)	−2.2460 (0.0356)	0.0840 (0.9339)
Thailand	−4.2035 (0.0004)	−2.9986 (0.0068)	−2.9544 (0.0076)	−1.2332 (0.2311)	−2.5544 (0.0185)
Italy	−3.3275 (0.0032)	−3.2745 (0.0036)	−3.3384 (0.0031)	−3.3142 (0.0033)	0.8805 (0.3886)
Korea	−0.6853 (0.5007)	−0.3083 (0.7609)	−1.6112 (0.1221)	−2.5582 (0.0183)	−1.5525 (0.1355)

Table 9.

Diebold-Mariano (DM) test results of the 3-step-ahead forecasts.

	DM test statistics (STL-XGBoost(3) vs.)
	Holt-Winters	SARIMA	XGBoost	SARIMA-XGBoost	SARIMA(XGBoost)
USA	−3.5218 (0.0014)	−4.2555 (0.0002)	−4.2624 (0.0002)	−3.6723 (0.0010)	−3.7629 (0.0008)
Germany	−3.7130 (0.0009)	−6.5235 (0.0000)	−3.3559 (0.0022)	−3.8795 (0.0006)	−2.6682 (0.0124)
Malaysia	−3.0735 (0.0046)	−3.5638 (0.0013)	−3.5420 (0.0014)	−3.5279 (0.0014)	−1.9977 (0.0552)
Philippines	−3.5202 (0.0014)	−2.2768 (0.0304)	−3.0349 (0.0050)	−5.9159 (0.0000)	−2.8290 (0.0084)
India	−3.2750 (0.0027)	−1.9619 (0.0594)	−2.4767 (0.0193)	−3.3580 (0.0022)	1.0139 (0.3190)
Thailand	−4.9053 (0.0000)	−3.7861 (0.0007)	−4.9537 (0.0000)	−1.5091 (0.1421)	−2.9443 (0.0063)
Italy	−3.5293 (0.0014)	−3.6390 (0.0011)	−3.1864 (0.0034)	−3.1777 (0.0035)	1.4156 (0.1675)
Korea	−0.6460 (0.5234)	0.3733 (0.7117)	−3.0877 (0.0044)	−0.8469 (0.4040)	−1.5360 (0.1354)

Discussion and conclusions

In this study, we proposed a forecasting model called STL-XGBoost to address the challenge of predicting tourist arrivals with seasonal and trend characteristics. The model effectively overcomes the limitations associated with non-stationary, non-linear, and seasonally complex time series models.

To achieve this, we employed STL seasonal decomposition to decompose the time series data into two components: the seasonal term and the de-seasonal term. By individually modeling these terms, we were able to effectively analyze and capture the seasonal patterns, trends and noise in the tourist arrivals data.

For the seasonal components, we utilized the Holt-Winters model, which has been widely used for modeling time series data with seasonal patterns. This allowed us to accurately capture the seasonal effects on tourist arrivals. For the de-seasonal components, we employed the ensemble learning model XGBoost. Unlike traditional non-linear models, XGBoost offers greater interpretability, speed, and stability. By considering both the lagging factors and the causal effects induced by these factors, our model mitigates the assumptions of a simplistic linear additive relationship as observed in the ARIMA model in the de-seasonal components.

Furthermore, our STL-XGBoost model exhibits scalability, generalizability, and wide applicability. It comprehensively considers the seasonal characteristics and the coexistence of stochastic trend and non-linear features, and it effectively handles the complexity and uncertainty of tourist arrivals data. It provides a more accurate and reliable forecasting tool for predicting tourist arrivals with seasonal and trend characteristics.

Although our proposed STL-XGBoost model has demonstrated satisfactory predictive performance, we relied solely on time series data for prediction and did not consider other factors that exert significant influence on tourist arrivals, such as numerical features of holiday effects and economic fluctuations or textual features of real-time policies. Moreover, the uncertainty surrounding various factors often exceeds researchers' initial assumptions. Additionally, empirical evidence reveals that the complexity of the tourism market cannot be fully addressed by a limited number of classical models. Therefore, future research on modeling should integrate a wide range of factors, including text-based variables and interaction effects, while employing more adaptable techniques to develop highly accurate predictive models capable of addressing the intricacies and uncertainties inherent in the real-world tourism market.

Supplemental Material

Supplemental Material - Forecasting tourist arrivals using STL-XGBoost method

Supplemental Material for Forecasting tourist arrivals using STL-XGBoost method by Minmin He and Xiyuan Qian in Tourism Economics

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Correction (May 2025):

In the published version of the article, there was an error in the article type listed in the header on the title page. It was previously listed as a “Review Article” but has now been corrected to “Research Article.” The article has been updated online to reflect this change.

Supplemental Material

Supplemental material for this article is available online.

Author biographies

Minmin He is a Master's student at the School of Mathematics, East China University of Science and Technology. Her research interests include tourism economics, tourism flow and seasonal time series analysis.

Xiyuan Qian, Ph.D., is a professor at the School of Mathematics, East China University of Science and Technology. His research interests include time series analysis, big data analysis and computational statistics. He has led more than five completed scientific research projects.

References

Assaad

Fayek

(2021) Predicting the price of crude oil and its fluctuations using computational econometrics: deep learning, lstm, and convolutional neural networks. Econometric Research in Finance 6: 119–137. DOI: 10.2478/erfin-2021-0006.

Athanasopoulos

Song

Sun

(2018) Bagging in tourism demand modeling and forecasting. Journal of Travel Research 57: 52–68.

Han

Yao

(2024) Collaborative forecasting of tourism demand for multiple tourist attractions with spatial dependence: a combined deep learning model. Tourism Economics 30: 361–388.

Chang

Liao

(2010) A seasonal arima model of tourism forecasting: the case of Taiwan. Asia Pacific Journal of Tourism Research 15: 215–221.

Chen

Guestrin

(2016) Xgboost: a scalable tree boosting system. In Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, 785–794.

Cleveland

McRae

, et al. (1990) Stl: a seasonal-trend decomposition. J. Off. Stat 6: 3–73.

Cuccia

Rizzo

(2011) Tourism seasonality in cultural destinations: empirical evidence from Sicily. Tourism Management 32: 589–595.

Faraway

Chatfield

(1998) Time series forecasting with neural networks: a comparative study using the air line data. Journal of the Royal Statistical Society - Series C: Applied Statistics 47: 231–250.

Fatema

Malik

Abd Halim

(2022) Hybrid approach combining emd, arima and Monte Carlo for multi-step ahead medical tourism forecasting. Journal of Intelligent and Fuzzy Systems 42: 1235–1251.

10.

Gurnani

Korke

Shah

, et al. (2017) Forecasting of sales by using fusion of machine learning techniques. In 2017 International Conference on Data Management, Analytics and Innovation (ICDMAI). IEEE, 93–101.

11.

Hasan

Kalipsiz

Akyokuş

(2020) Modeling traders’ behavior with deep learning and machine learning methods: evidence from bist 100 index. Complex 2020: 1–16. URL: https://api.semanticscholar.org/CorpusID:220523380.

12.

CWD

, et al. (2021) Using sarima–cnn–lstm approach to forecast daily tourism demand. Journal of Hospitality and Tourism Management 49: 25–33.

13.

Liu

Guo

, et al. (2022) Tourism demand forecasting considering environmental factors: a case study for Chengdu research base of giant panda breeding. Frontiers in Ecology and Evolution 10: 885171.

14.

Song

(2020) Data source combination for tourism demand forecasting. Tourism Economics 26: 1248–1265.

15.

Hyndman

Athanasopoulos

Bergmeir

, et al. (2020) Package ‘forecast’. Online. https://cran.r-project.org/web/packages/forecast/forecast.pdf.

16.

Jun

Yuyan

Lingyu

, et al. (2018) Modeling a combined forecast algorithm based on sequence patterns and near characteristics: An application for tourism demand forecasting. Chaos, Solitons & Fractals 108: 136–147.

17.

Khashei

Bijari

(2010) An artificial neural network (p, d, q) model for time series forecasting. Expert Systems with Applications 37: 479–489.

18.

Kinney

Atwal

(2014) Equitability, mutual information, and the maximal information coefficient. Proceedings of the National Academy of Sciences 111: 3354–3359.

19.

Kumar

Misra

Chan

(2022) Leveraging AI for advanced analytics to forecast altered tourism industry parameters: a COVID-19 motivated study. Expert Systems with Applications 210: 118628.

20.

Pan

Law

, et al. (2017) Forecasting tourism demand with composite search index. Tourism Management 59: 57–66.

21.

(2020) Forecasting tourism demand with multisource big data. Annals of Tourism Research 83: 102912.

22.

Liang

(2014) Forecasting models for Taiwanese tourism demand after allowance for mainland China tourists visiting Taiwan. Computers & Industrial Engineering 74: 111–119.

23.

Lim

McAleer

(2001) Forecasting tourist arrivals. Annals of Tourism Research 28: 965–977.

24.

Lim

McAleer

(2002) Time series forecasts of international travel demand for Australia. Tourism Management 23: 389–396.

25.

Liu

Sun

Liu

, et al. (2020) Generalized flight delay prediction method using gradient boosting decision tree. In: 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring), IEEE. pp. 1–5.

26.

Nelson

Hill

Remus

, et al. (1999) Time series forecasting using neural networks: should the data be deseasonalized first? Journal of Forecasting 18: 359–367.

27.

Wang

Zhang

, et al. (2021) Daily tourist flow forecasting using spca and cnn-lstm neural network. Concurrency and Computation: Practice and Experience 33: e5980.

28.

Ozaslan

Degirmenci

Karal

(2022) Tourism demand forecasting for Turkey by using adaboost algorithm. In 2022 Innovations in Intelligent Systems and Applications Conference (ASYU). IEEE, 1–5.

29.

Pai

Hong

Lin

(2005) Forecasting tourism demand using a multifactor support vector machine model. In Computational Intelligence and Security: International Conference, CIS 2005, Xi’an, China, December 15-19, 2005, Proceedings Part I. Springer,512–519.

30. Pan, Yang (2017) Forecasting destination weekly hotel occupancy with big data. Journal of Travel Research 56: 957–970.

31. Park, Lee, Song (2017) Short-term forecasting of Japanese tourist inflow to South Korea using Google Trends data. Journal of Travel & Tourism Marketing 34: 357–368.

32. Song, Witt (2006) Forecasting international tourist flows to Macau. Tourism Management 27: 214–224.

33. Sun, Wei, Tsui, et al. (2019) Forecasting tourist arrivals with machine learning and internet search index. Tourism Management 70: 1–10.

34. Sun, Guo Je., et al. (2022) Tourism demand forecasting: an ensemble deep learning approach. Tourism Economics 28: 2021–2049.

35. Taylor, Letham (2018) Forecasting at scale. The American Statistician 72: 37–45.

36. Tsui WHK, Balli (2017) International arrivals forecasting for Australian airports and the impact of tourism marketing expenditure. Tourism Economics 23: 403–428.

37. DCW, et al. (2021) Forecasting tourist daily arrivals with a hybrid SARIMA–LSTM approach. Journal of Hospitality & Tourism Research 45: 52–67.

38. Xing, Sun, et al. (2022) Seasonal and trend forecasting of tourist arrivals: An adaptive multiscale ensemble learning approach. International Journal of Tourism Research 24: 425–442.

39. Zhang (2005) Neural network forecasting for seasonal and trend time series. European Journal of Operational Research 160: 501–514.

40. Zhang, Tang (2022) PSO-weighted random forest for attractive tourism spots recommendation. Future Generation Computer Systems 127: 421–425.

41. Zhang, Muskat, et al. (2020) Group pooling for deep tourism demand forecasting. Annals of Tourism Research 82: 102899.

42. Zhang, Jiang, Wang, et al. (2021) A new decomposition ensemble approach for tourism demand forecasting: evidence from major source countries in Asia-Pacific region. International Journal of Tourism Research 23: 832–845.
