Tourism demand forecasting: An ensemble deep learning approach

Abstract

The availability of tourism-related big data increases the potential to improve the accuracy of tourism demand forecasting but presents significant challenges for forecasting, including curse of dimensionality and high model complexity. A novel bagging-based multivariate ensemble deep learning approach integrating stacked autoencoder and kernel-based extreme learning machine (B-SAKE) is proposed to address these challenges in this study. By using historical tourist arrival data, economic variable data, and search intensity index (SII) data, we forecast tourist arrivals in Beijing from four countries. The consistent results of multiple schemes suggest that our proposed B-SAKE approach outperforms the benchmark models in terms of level accuracy, directional accuracy, and even statistical significance. Both bagging and stacked autoencoder can effectively alleviate the challenges brought by tourism big data and improve the forecasting performance of the models. The ensemble deep learning model we propose contributes to tourism demand forecasting literature and benefits relevant government officials and tourism practitioners.

Keywords

bagging; economic variables; ensemble deep learning; search intensity index; stacked autoencoder; tourism demand forecasting

Introduction

Tourism demand forecasting plays a crucial role in tourism management. On the one hand, tourism resource planning based on accurate demand forecasting is of great significance for avoiding unnecessary losses, due to the perishability of tourism products (Chu, 2011; Law et al., 2019; Shen et al., 2008). On the other hand, tourism demand forecasting can effectively help the government and tourism practitioners guide tourists properly, and thus improve the service quality and tourists’ experience (Liu et al., 2018; Zhang, Wang, et al., 2020).

Tourist demand forecasting approaches, used by the majority of quantitative studies, include various time series, econometrics, artificial intelligence (AI) approaches, and the combinations of these approaches (Jiao and Chen, 2018; Song et al., 2019; Song and Li, 2008). AI approaches can not only effectively capture the nonlinear characteristics between variables but also require little specialized expertise as data-driven approaches. Therefore, more and more researchers develop various powerful AI approaches to further advance the literature of tourism demand forecasting (Song et al., 2019; Zhang, Wang, et al., 2020). Notably, deep learning has been the research hot spot in this field recently (Law et al., 2019; Lv et al., 2018; Zhang, Li, et al., 2020, 2021).

The availability of tourism big data is improving gradually. In addition to historical tourist arrival data, the data widely used in current tourism demand forecasting literature mainly include economic variable data and search intensity index (SII) data. The causal econometric approaches have revealed the most crucial economic variables that determine the demand for international tourism. Specifically, these economic variables are summarized by Athanasopoulos et al. (2017), including tourism prices in a destination relative to those in the origin country, tourism prices in competing destinations, tourists’ income, and exchange rates. Additionally, with the boom in web search technology, tourists seek travel information by using search engines before traveling. These search behaviors are aggregated statistically into SII data that can be used to measure tourists’ attention accurately (Bangwayo-Skeete and Skeete, 2015; Fesenmaier et al., 2010). A set of effective methods for key word selection and data aggregation to form the indicator has been gradually developed (Li et al., 2017; Yang et al., 2015). The application techniques of SII data have been preliminarily established.

According to the concept of “data-intensive forecasting” proposed by Bunn (1989), a way to further improve forecast accuracy is by making use of the availability of multiple information and computing resources. Analogously, Song et al. (2013) point out that combining forecasts considering different data has become one of the most important and effective ways to improve forecasting performance. Inspired by this modeling idea, this study incorporates historical tourist arrival data, economic variable data, and SII data mentioned above into the forecasting framework. Nevertheless, introducing large amounts of data also poses huge challenges for forecasting. First, tourism-related big data means many influential factors potentially affecting tourism demand. With the increase of potential features, sample data will become sparse in the feature space, which eventually leads to curse of dimensionality and affects the forecasting effect (Law et al., 2019). Second, the data with many explanatory variables also increase the complexity of the model, resulting in large variance and overfitting (Zhang, Li, et al., 2020, 2021). Feature engineering is an effective way to solve the above two problems, but traditional feature extraction requires a lot of expert knowledge and manual work (Law et al., 2019; Lv et al., 2018).

To address these challenges, a bagging-based multivariate ensemble deep learning model integrating stacked autoencoder and kernel-based extreme learning machine (B-SAKE) is proposed for tourism demand forecasting. Concretely, the deep learning technique extracts features automatically by mimicking the way the brain processes information and requires little domain knowledge or manual effort (Pouyanfar et al., 2018). Researchers have examined unsupervised feature learning applied in tourism demand forecasting. For example, Li et al. (2018) utilize principal component analysis (PCA) to reduce the dimension of data features, thus effectively reducing the redundant information in the data. Stacked autoencoder (SAE) is capable of learning nonlinear relationships and can be regarded as a more powerful nonlinear generalization of PCA. Bagging generates multiple data sets for training a set of models to improve the stability of forecasting and reduce variance effectively (Athanasopoulos et al., 2017; Inoue and Kilian, 2008), and its powerful performance has been demonstrated in many forecasting fields. Kernel-based extreme learning machine (KELM) not only has high computational efficiency but also offers better forecasting performance than extreme learning machine (ELM) because the random mapping in ELM is replaced by the kernel in KELM (Sun et al., 2019).

We conduct numerical experiments for Beijing international tourist arrivals. To verify the effectiveness of the models, we consider the cases of tourist arrivals in Beijing city from four origin countries including the United States, the United Kingdom, Germany, and France. In addition to its 67% market share in the United States, Google has more than 90% of the search market in the other three countries. SII data are more representative in these countries, and other data required in the models are publicly available. The results of the empirical study are fourfold: (1) Both bagging and SAE can improve the forecasting performance of the models. (2) Our proposed B-SAKE model is the most accurate in different forecasting schemes (i.e. one-step-ahead vs. multistep-ahead, and in-sample vs. out-of-sample) regarding the performance evaluation criteria including mean absolute percentage error (MAPE), normalized root mean square error (NRMSE), and directional symmetry (DS). (3) This forecasting model performs better than other benchmark models from the statistical perspective (Diebold-Mariano test and Pesaran-Timmermann test). (4) The consistency of our findings across the four countries we considered is encouraging.

The objective of this study is to propose a novel ensemble deep learning model that mitigates the curse of dimensionality and high model complexity caused by tourism big data, and to verify its forecasting accuracy and stability. The most relevant study is Zhang, Li, et al. (2021), who also point out that there may be overfitting problems in the deep learning model. They increase the data volume available for training through the decomposition method and improve the efficiency of feature extraction by designing a duo attention layer. Correspondingly, in the ensemble model we develop, bagging and SAE are responsible for implementing similar functions. Given its excellent forecasting performance and consistency in multiple forecasting cases, our proposed B-SAKE model contributes to tourism demand forecasting literature and benefits relevant government officials and tourism practitioners.

The rest of this study is organized as follows. The second section details the literature on tourism demand forecasting with SII data and tourism demand forecasting with deep learning. The third section introduces related methods and describes the conceptual framework of this study. The fourth section provides a case study on Beijing tourist arrivals and compares the results of our proposed B-SAKE model with those of benchmark models. Finally, conclusions and limitations are summarized in the fifth section.

Literature review

Tourism demand forecasting with SII data

Search engines can provide a time series index of the volume of queries that users submit in a specific geographic area (Choi and Varian, 2012; Padhi and Pati, 2017), which is referred to in the literature as SII data. Social psychologists note that the spatiotemporal frequency of specific search terms provided by web search engines can reflect the attention that specific user groups pay to an issue at a specific time and place (Lai et al., 2017). In terms of tourism, travelers seek relevant information through search engines regarding almost all aspects of the trip, including accommodations, transportation, attractions, and dining (Fesenmaier et al., 2010; Yang et al., 2015). Therefore, SII data, as a measure of tourists’ attention, have been widely used in tourism demand forecasting literature (Tang et al., 2020).

Choi and Varian (2012) first introduce Google Trends data to forecast visitor arrivals in Hong Kong, and the positive effect of Google Trends data in forecasting is demonstrated by using visitor arrival data from nine origin countries. However, they only consider the Google Trends index for “Vacation Destinations/Hong Kong,” and the way they aggregate the data results in information loss. Given these problems, Bangwayo-Skeete and Skeete (2015) propose a novel indicator for tourism demand forecasting for countries in the Caribbean, which is based on a composite search for “hotels and flights.” Yang et al. (2015) suggest that localized SII data should be selected by comparing the fitness and forecasting ability of Google Trends with those of the Baidu Index. The systematic search query and selection mechanism they develop is widely accepted by later literature (Law et al., 2019; Zhang, Li, et al., 2020). Wen et al. (2019) explore the possible nonlinear relationship between SII data and tourism demand and design a hybrid model that integrates a linear model and a nonlinear model to better exploit the forecasting power of SII data. Li et al. (2017) focus on SII data aggregation methods in the context of a large number of studies incorporating an increasing number of web search key words. They adopt a generalized dynamic factor model to process many key word variables, and the proposed method improves the forecast accuracy over those of two benchmark models: a traditional time series model and a model with an index created by PCA. In recent years, some scholars have paid attention to spurious patterns in Google Trends data, such as changes in search behavior and total search volume (Bokelmann and Lessmann, 2019), and the language and platform biases that inevitably result from using SII data (Dergiades et al., 2018), and they present corresponding improvement measures.

Tourism demand forecasting with deep learning

AI models have achieved successful applications in tourism demand forecasting. However, the vast majority of the AI models covered in the forecasting literature are shallow architectures, which have limited capabilities for exploring higher nonlinearities, particularly when the data have large-scale and unclear patterns (Lv et al., 2018; Zhao et al., 2017). In recent years, the few studies that develop different deep learning methods have shown their great potential in improving tourism demand forecasting performance.

Lv et al. (2018) propose a novel deep learning method called the stacked autoencoder with echo-state regression (SAEN) to forecast tourism demand. The proposed SAEN is applied in four different but representative tourism cases, and the forecasting results show that SAEN outperforms the benchmark models, including seasonal autoregressive integrated moving average (SARIMA), multiple linear regression (MLR), single hidden layer feedforward neural network (SLFN), support vector regression (SVR), echo state network (ESN), and long short-term memory (LSTM). Law et al. (2019) identify two challenges (feature engineering and lag order selection) that tourism demand forecasting may face when large amounts of search engine data are adopted. They present a deep network architecture for tourism demand forecasting based on LSTM, which not only overcomes the two challenges mentioned above but also significantly outperforms SVR and artificial neural network models. Zhang, Li, et al. (2021) mitigate the overfitting issue and improve tourism demand forecasting by introducing a decomposition method and improving the attention mechanism. In Zhang, Li, et al. (2020), a group-pooling strategy is designed to identify tourism destinations with similar data patterns, thus increasing the available data for the tourism deep learning model and further improving the forecasting accuracy. In conclusion, the above deep learning tourism demand forecasting studies are generally aware of the curse of dimensionality and high model complexity brought by tourism big data and have designed various forecasting frameworks to reduce possible overfitting and improve the forecasting accuracy. Given that the field is still at an exploratory stage, the ensemble deep learning framework proposed in this study is a promising approach. It is worth noting that the previous tourism deep learning forecasting studies ignore traditional explanatory variables such as economic variables. This study tries to make up for this omission and enrich the available data set to improve the forecasting performance.

Related methods

Stacked autoencoder

SAE, proposed by Bengio et al. (2006), is a successful application of the layer-wise strategy to the autoencoder. Structurally, SAE is a neural network made up of several layers of autoencoders. An autoencoder, whose output is expected to reconstruct its input, is an SLFN. The autoencoder’s structure is depicted in Figure 1, in which $X_I$, $H$, and $X_O$ are the input, hidden, and output layer vectors, respectively. In an autoencoder, “encoding” refers to the transformation from $X_I$ to $H$, and “decoding” refers to the transformation from $H$ to $X_O$. The autoencoder tries to approximate an identity function so that the output vector $X_O$ is close to the input vector $X_I$, and the hidden layer vector $H$ thereby provides a compressed, high-level representation of $X_I$. Consequently, the autoencoder is chosen to extract nonlinear features.

Figure 1.

The structure of an autoencoder.

Figure 2 shows the structure of an SAE, in which the hidden layer output of the previous autoencoder serves as the input of the next autoencoder. Given input data, SAE can learn effective representations from the original input and automatically filter out unrelated features. Training SAE directly on the entire structure is very time-consuming and may cause vanishing gradients, especially when the network is deep. Layer-wise training has therefore been central to deep neural network learning. Each autoencoder in SAE is first trained sequentially and in an unsupervised manner through a back propagation algorithm. After that, the updated parameters of the SAE are shared. The SAE parameters can reach locally optimal values after layer-wise training. Moreover, SAE is not sensitive to the raw input features, so it does not require manual feature extraction.

Figure 2.

Stacked autoencoder structure.
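
The layer-wise procedure described above can be illustrated with a minimal NumPy sketch. It is not the configuration used in this study: the sigmoid encoder, linear decoder, learning rate, layer sizes, and the function names `train_autoencoder`, `train_sae`, and `encode` are all assumptions made for illustration, and the input is random data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, epochs=500, lr=0.1, seed=0):
    """Train one autoencoder X_I -> H -> X_O by batch gradient descent.

    Assumed setup: sigmoid encoder, linear decoder, squared reconstruction
    error. Returns the encoder parameters (W, b) and the final reconstruction MSE.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(0.0, 0.1, size=(d, n_hidden))   # encoder weights
    b = np.zeros(n_hidden)                         # encoder bias
    V = rng.normal(0.0, 0.1, size=(n_hidden, d))   # decoder weights
    c = np.zeros(d)                                # decoder bias
    for _ in range(epochs):
        H = sigmoid(X @ W + b)                     # encoding
        err = H @ V + c - X                        # decoding minus input
        # Back propagation of the squared reconstruction error
        delta = (err @ V.T) * H * (1.0 - H)
        V -= lr * (H.T @ err) / n
        c -= lr * err.mean(axis=0)
        W -= lr * (X.T @ delta) / n
        b -= lr * delta.mean(axis=0)
    mse = float(np.mean((sigmoid(X @ W + b) @ V + c - X) ** 2))
    return (W, b), mse

def train_sae(X, layer_sizes, **kwargs):
    """Greedy layer-wise training: each autoencoder is fitted on the hidden
    output of the previous one. Returns the list of encoder parameters."""
    encoders, H = [], X
    for size in layer_sizes:
        (W, b), _ = train_autoencoder(H, size, **kwargs)
        encoders.append((W, b))
        H = sigmoid(H @ W + b)
    return encoders

def encode(X, encoders):
    """Pass the raw features through every trained encoder in turn."""
    H = X
    for W, b in encoders:
        H = sigmoid(H @ W + b)
    return H

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 8))             # 60 samples, 8 raw features (synthetic)
encoders = train_sae(X, layer_sizes=[5, 3])
features = encode(X, encoders)           # compressed representation
```

In this sketch the array `features` plays the role of the extracted representation that would be passed on to the forecasting model.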

Kernel ELM

ELM, proposed by Huang et al. (2006), is a SLFN. The ELM model has been widely used in many fields because of its high computational efficiency and generalization ability. The input weights and biases of the ELM model are randomly generated, and there is no need to adjust the hidden layer parameters. The output weights are obtained through a simple matrix computation, and this is why the ELM has high computing speed.

For $N$ samples $(x_{i}, y_{i})$, with $x_{i} \in \mathbb{R}^{n}$, $y_{i} \in \mathbb{R}^{m}$, and $i = 1, 2, \ldots, N$, let $h(x)$ and $Y$ be the activation function of the hidden layer and the output matrix, respectively. The typical ELM can be expressed as

$$Y = \left[\begin{array}{c} y_{1j} \\ y_{2j} \\ \vdots \\ y_{mj} \end{array}\right]_{m \times N} = \left[\begin{array}{c} \sum_{i=1}^{l} \beta_{i1} h(\omega_{i} x_{j} + b_{i}) \\ \sum_{i=1}^{l} \beta_{i2} h(\omega_{i} x_{j} + b_{i}) \\ \vdots \\ \sum_{i=1}^{l} \beta_{im} h(\omega_{i} x_{j} + b_{i}) \end{array}\right]_{m \times N}, \quad j = 1, 2, \ldots, N,$$

where $\omega$ is the weight between the input layer and the hidden layer, $l$ is the number of hidden layer nodes, $\beta$ represents the weight between the hidden layer and the output layer, and $b$ is the threshold of the hidden layer. The above equations can also be written as

$$H\beta = Y, \quad Y \in \mathbb{R}^{N \times m}, \quad \beta \in \mathbb{R}^{l \times m}, \quad H = H(\omega, b) = h(\omega x + b),$$

where $H$ is the output matrix of the hidden layer. The only unknown parameter is the output weight $\beta$, which can be solved by the ordinary least squares method. The solution of the above equation is defined as

$$\hat{\beta} = H^{\dagger} Y, \quad H^{\dagger} = H^{T}(HH^{T})^{-1},$$

where $H^{\dagger}$ represents the Moore-Penrose generalized inverse of the matrix $H$. Following ridge regression theory and the orthogonal projection method, $\hat{\beta}$ can be calculated by adding a positive penalty factor $1/C$ to the diagonal of $HH^{T}$:

$$\hat{\beta} = H^{T}\left(\frac{I}{C} + HH^{T}\right)^{-1} Y,$$

where $I$ is the identity matrix. Then the output function of the ELM can be written as

$$f(x) = H\hat{\beta} = HH^{T}\left(\frac{I}{C} + HH^{T}\right)^{-1} Y.$$

This method overcomes some disadvantages of the typical gradient-based learning algorithms, such as overfitting, local minima, and long computation times. The topological structure of ELM is given in Figure 3.

Figure 3.

The topological structure of ELM. ELM: extreme learning machine.
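
As a rough illustration of the regularized ELM solution above, the following NumPy sketch draws the input weights and thresholds at random and solves for the output weights in closed form. The `tanh` activation, the number of hidden nodes, the value of `C`, and the synthetic data are assumptions, not settings from this study; the scalar penalty is applied as `I / C`, with `I` the identity matrix.

```python
import numpy as np

def elm_fit(X, Y, n_hidden=20, C=100.0, seed=0):
    """Fit an ELM: random hidden layer, closed-form ridge output weights."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(size=(X.shape[1], n_hidden))   # random input weights, never trained
    b = rng.normal(size=n_hidden)                     # random hidden thresholds
    H = np.tanh(X @ omega + b)                        # hidden layer output, N x l
    N = H.shape[0]
    # beta = H^T (I/C + H H^T)^{-1} Y
    beta = H.T @ np.linalg.solve(np.eye(N) / C + H @ H.T, Y)
    return omega, b, beta

def elm_predict(X, omega, b, beta):
    return np.tanh(X @ omega + b) @ beta

# Synthetic regression problem, for illustration only
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(100, 2))
Y = (np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2).reshape(-1, 1)

omega, b, beta = elm_fit(X, Y)
train_rmse = float(np.sqrt(np.mean((elm_predict(X, omega, b, beta) - Y) ** 2)))
```

Only `beta` is estimated from the data; `omega` and `b` stay at their random initial values, which is why the fit reduces to a single linear solve.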

Huang (2014) proposed a kernel-based ELM (KELM). According to Mercer’s condition, the activation function $h(x)$ of the hidden layer is replaced by a kernel function. The output function of the KELM can be expressed as follows, where $I$ denotes the identity matrix:

$$f(x) = h(x)\hat{\beta} = \left[\begin{array}{c} k(x, x_{1}) \\ k(x, x_{2}) \\ \vdots \\ k(x, x_{N}) \end{array}\right]^{T} \left(\frac{I}{C} + HH^{T}\right)^{-1} Y.$$

In the above formula, we do not need to know the feature mapping $h (x)$ but can use its corresponding kernel function $k (x, x_{i})$ . This means that the kernel function can replace the random mapping of the ELM and make the output weights more stable. Thus, the KELM usually has a better generalization ability than the ELM. The flowchart of our proposed SAE-based KELM (SAKE) tourism demand forecasting approach is shown in Figure 4.

Figure 4.

SAKE flowchart. SAKE: stacked autoencoder with kernel-based extreme learning machine.
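
The KELM output function can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: the Gaussian (RBF) kernel, the values of `gamma` and `C`, and the sine-curve data are choices made here, and the kernel matrix `K` with entries $k(x_i, x_j)$ takes the place of $HH^{T}$.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """k(a, b) = exp(-gamma * ||a - b||^2) for every pair of rows of A and B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kelm_fit(X, Y, C=100.0, gamma=1.0):
    """Solve (I/C + K) alpha = Y, where K is the kernel matrix of the training inputs."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(np.eye(X.shape[0]) / C + K, Y)

def kelm_predict(X_new, X_train, alpha, gamma=1.0):
    """f(x) = [k(x, x_1), ..., k(x, x_N)] alpha."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Deterministic toy data: one period of a sine curve
X_train = np.linspace(0.0, 2.0 * np.pi, 40).reshape(-1, 1)
Y_train = np.sin(X_train)

alpha = kelm_fit(X_train, Y_train)
pred = kelm_predict(np.array([[np.pi / 2.0]]), X_train, alpha)
```

No random hidden layer appears in this version, so repeated fits on the same data give identical output weights.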

Bagging

Bagging (bootstrap aggregating) was originally developed by Breiman (1996) to improve unstable procedures by generating new learning sets. The purpose of bagging is to reduce the variance of forecasts and thus improve accuracy. Specifically, bagging generates additional samples by drawing from the original data with replacement and uses them to train the model. These additional samples are called resampling samples. Suppose that K samples are generated. For each resampling sample, the process described in the previous subsection for building the SAKE network is repeated, and forecasts are generated in each iteration. Accordingly, we now have K sets of forecasts instead of one. To obtain the final forecast, we aggregate these K forecasts by taking the average or the median; the resulting forecast variance is lower than that obtained from the original sample alone.

Bagging forecasting involves generating a large number of samples, called bootstrap samples. Let $y_{t}$ be the predictor vector at time $t$, with $y_{t} = (1, l_{t}')'$. Suppose $x_{T}$ is the latest observation, where $T$ is the number of in-sample observations. The in-sample data are arranged in a matrix of dimensions $(T - h) \times (1 + q)$, where $h$ is the forecast horizon and $q$ is the length of $y_{t}$:

$$B = \begin{bmatrix} x_{1+h} & y_{1}' \\ \vdots & \vdots \\ x_{T} & y_{T-h}' \end{bmatrix}.$$

We generate bootstrap sample $k$ by drawing blocks of $m$ rows from the matrix $B$ with replacement, which captures the dependence in the error term. It can be expressed as follows:

$$B^{(k)} = \begin{bmatrix} x_{1+h}^{(k)} & {y'}_{1}^{(k)} \\ \vdots & \vdots \\ x_{T}^{(k)} & {y'}_{T-h}^{(k)} \end{bmatrix}.$$

For each bootstrap sample, we perform model selection and estimate the model on $B^{(k)}$. We then apply the estimated model to the latest observations in the original matrix $B$ and obtain the forecast $\hat{x}_{T+h}^{(k)}$. We repeat the process for $k = 1, 2, \ldots, K$. The final forecast is then obtained as

$$\hat{x}_{T+h}^{(\mathrm{bag})} = \frac{1}{K} \sum_{k=1}^{K} \hat{x}_{T+h}^{(k)}.$$

For more details about bagging, please refer to Breiman (1996) and Athanasopoulos et al. (2017).
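
The block resampling and averaging steps can be sketched as follows. This is a simplified illustration, not the procedure used in the cited studies: the block length `m`, the number of replicates `K`, and the ordinary least squares learner (`ols_fit`, `ols_predict`) are stand-ins chosen here, and the data are synthetic. In the B-SAKE setting the stand-in learner would be replaced by the SAKE network.

```python
import numpy as np

def block_bootstrap(B, m, rng):
    """Resample the rows of B in blocks of m consecutive rows, with replacement."""
    n = B.shape[0]
    n_blocks = int(np.ceil(n / m))
    starts = rng.integers(0, n - m + 1, size=n_blocks)        # first row of each block
    rows = np.concatenate([np.arange(s, s + m) for s in starts])[:n]
    return B[rows]

def bagged_forecast(B, latest, fit, predict, K=100, m=4, seed=0):
    """Fit one model per bootstrap sample B^(k) and average the K forecasts."""
    rng = np.random.default_rng(seed)
    forecasts = []
    for _ in range(K):
        Bk = block_bootstrap(B, m, rng)
        model = fit(Bk[:, 1:], Bk[:, 0])      # columns 1.. are y_t', column 0 is x_{t+h}
        forecasts.append(predict(model, latest))
    forecasts = np.array(forecasts)
    return float(forecasts.mean()), forecasts

# Stand-in learner: ordinary least squares (an assumption, not the SAKE network)
def ols_fit(predictors, target):
    return np.linalg.lstsq(predictors, target, rcond=None)[0]

def ols_predict(coef, predictor_row):
    return float(predictor_row @ coef)

# Synthetic data: x = 2 + 3 l + noise, with y_t = (1, l_t)'
rng = np.random.default_rng(42)
l = rng.normal(size=60)
x = 2.0 + 3.0 * l + rng.normal(scale=0.1, size=60)
B = np.column_stack([x, np.ones(60), l])

bag, members = bagged_forecast(B, np.array([1.0, 1.0]), ols_fit, ols_predict, K=50)
```

Replacing `forecasts.mean()` with `np.median(forecasts)` would give the median aggregation mentioned earlier in this subsection.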

Multivariate forecasting

Multivariate forecasting, unlike univariate forecasting, takes both the autoregressive effect of the target series and the impact of exogenous variables into account. This can be denoted by

$$y(t + m) = f(s(y), s(x_{1}), \ldots, s(x_{c})),$$

where $y(t + m)$ is the value of the dependent variable $y$ at time $t + m$, $x_{1}, \ldots, x_{c}$ are the $c$ exogenous variables, and $s(x) = \{x(t), x(t - 1), \ldots, x(t - N_{x} + 1)\}$ denotes the $N_{x}$ most recent values of a variable $x$; $s(y)$ is defined analogously.
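
To make the input layout concrete, the sketch below builds the feature rows and targets for this formulation. It simplifies the notation by using one common lag length `n_lags` for the target and every exogenous series, whereas the text allows a separate $N_x$ per variable; the function name `make_supervised` and the toy series are illustrative.

```python
import numpy as np

def make_supervised(y, exog, n_lags, m=1):
    """Build (features, target) pairs for y(t+m) = f(s(y), s(x_1), ..., s(x_c)).

    y: array of shape (T,); exog: array of shape (T, c).
    Each feature row concatenates the n_lags most recent values of y and of
    every exogenous series, each ordered from time t back to t - n_lags + 1.
    """
    series = np.column_stack([y, exog])                     # shape (T, 1 + c)
    T = len(y)
    rows, targets = [], []
    for t in range(n_lags - 1, T - m):
        window = series[t - n_lags + 1 : t + 1][::-1]       # rows for t, t-1, ...
        rows.append(window.T.ravel())                       # s(y), s(x_1), ..., s(x_c)
        targets.append(y[t + m])
    return np.array(rows), np.array(targets)

# Toy series so that the layout is easy to read
y = np.arange(10.0)
exog = (100.0 + np.arange(10.0)).reshape(-1, 1)
features, target = make_supervised(y, exog, n_lags=3, m=2)
```

With these toy inputs, the first feature row holds the values of `y` at times 2, 1, 0 followed by the values of the exogenous series at the same times, and its target is `y` at time 4.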

Ensemble deep learning approach

Figure 5 illustrates the process of our proposed B-SAKE ensemble deep learning approach, which combines the advantages of bagging, SAE, and KELM. This approach is composed of the following five steps, which conform to the general process of ensemble learning (Cao et al., 2020; Qiu et al., 2017; Zhao et al., 2017).

Step 1: Data preprocessing: transform and partition the original data into the in-sample data set (the training set) and the out-of-sample data set (the test set).

Step 2: Bootstrapping: generate K copies of the in-sample data sets by bagging approach.

Step 3: Model training: train K SAKE models with each copy of the in-sample data sets independently.

Step 4: Individual forecasting: generate K forecasts by using the K trained SAKE models.

Step 5: Aggregation: take the mean value of the K forecasts as the final forecasting results.

Figure 5.

The process of B-SAKE ensemble deep learning approach. B-SAKE: bagging-based stacked autoencoder with kernel-based extreme learning machine.

Empirical study

This section provides a case study on Beijing tourist arrivals and compares the forecasting performance of our proposed B-SAKE with the benchmark models. “Data” section details the data sets involved in this study, including tourist arrival data, economic variables data, and SII data. “Performance evaluation criteria and statistic test” section describes the forecasting performance evaluation criteria and statistical tests. “Benchmarks and parameter settings” section introduces the benchmarks we choose and parameter settings. “Empirical results” and “Summary” sections give the forecasting results and reasonable interpretation.

Data

Tourist arrival data

In this study, tourism demand (typically measured as tourist arrivals) is investigated for forecasting purposes. We select monthly inbound tourist arrivals in Beijing city from January 2008 to December 2018 from four origin countries, the United States, the United Kingdom, Germany, and France, which are shown in Figure 6. The tourist arrivals show seasonality and volatility. The United States is the largest source of tourists for Beijing city among the four countries. Table 1 presents the statistical properties of the tourist arrival time series and indicates the differences in statistical features among the data sets. The tourist arrival data are obtained from the official website of the Beijing Municipal Bureau of Statistics (http://tjj.beijing.gov.cn/), which regularly publishes monthly tourist arrivals by nationality. The data sets are divided into the in-sample data set and the out-of-sample data set (see Figure 6). The in-sample data set, used for model training, covers January 2008 to December 2016 (108 months), while the out-of-sample data set, used for model testing, covers January 2017 to December 2018 (24 months). This roughly 80/20 division is consistent with common practice in machine learning.

Figure 6.

Tourist arrivals in Beijing city from four origin countries.

Table 1.

Statistical properties of the tourist arrival time series.

Country	Max	Min	Mean	Range	SD	Skewness	Kurtosis
US	91,456	22,577	57,711.03	68,879	16,313.76	−0.4101	2.2544
UK	25,015	6,590	14,619.63	18,425	4,558.85	−0.1018	2.1909
Germany	26,968	7,837	17,341.29	19,131	4,949.07	−0.1020	2.2174
France	18,730	5,581	11,757.76	13,149	3,626.12	0.0612	2.0372

Economic variable data

Income and price are the basic variables of economic demand theory (Crouch, 1992). Income-like and price-like variables that have been extensively validated in the international tourism demand literature (Athanasopoulos et al., 2017; Li et al., 2005; Song and Li, 2008) include (1) income level of tourists, (2) future income expectations of tourists, (3) prices of the tourism products in the destination, and (4) prices of the tourism products in the substitute destinations. It is when income and price are considered at the same time that the tourists’ price perception of transnational tourism products could be accurately measured. Instead of constructing additional predictors, deep learning provides the possibility of end-to-end learning by directly utilizing typical income-like and price-like variables. This method not only greatly reduces the manual work but also avoids the loss of forecast accuracy. Considering the availability and reliability of data, we follow Athanasopoulos et al. (2017) and choose the following economic variables.

Tourists’ income level is expected to influence tourism demand positively. It is usually measured in terms of gross domestic product (GDP) per capita,

$$\mathrm{GDPpc}_{i,t},$$

where $i = 1, 2, \ldots, n$ indexes the $n$ origin countries and $t$ denotes time. We adopt seasonally adjusted real GDP per capita in constant 2008 prices, computed with the expenditure approach and expressed in the currency of the origin country.

The interest rate spread reflects future economic activity and the business cycle (Anderson et al., 2007; Athanasopoulos et al., 2011; Athanasopoulos et al., 2017; Stock and Watson, 2012). The future income expectations of tourists can therefore be measured by the interest rate spread (IRS),

$$\mathrm{IRS}_{i,t} = \mathrm{LTGB}_{i,t} - \mathrm{STGB}_{i,t},$$

where $\mathrm{LTGB}_{i,t}$ is the long-term government bond rate and $\mathrm{STGB}_{i,t}$ is the short-term government bond rate of the origin country.

According to the law of demand, the demand for a tourism product is inversely related to its price. The impact of exchange rates also needs to be considered for international tourism. We use a price variable to measure this effect, defined as the ratio of the consumer price index (CPI) of the destination to that of the origin country, adjusted by the exchange rate,

$$P_{i,t} = \frac{\mathrm{CPI}_{CN,t} / \mathrm{EX}_{i,t}^{CNY}}{\mathrm{CPI}_{i,t}},$$

where $\mathrm{CPI}_{CN,t}$ and $\mathrm{CPI}_{i,t}$ represent the CPI of China and of the origin country $i$ at time $t$, respectively, and $\mathrm{EX}_{i,t}^{CNY}$ is the exchange rate between the Chinese yuan and the currency of the origin country $i$. The CPIs are adjusted to constant 2008 prices using the dollar.

Furthermore, the demand for a tourism product is also affected by the prices of competing tourism products. International tourism demand literature indicates that similarities in climate, culture, and geography are indicative of substitute destinations (Kumar et al., 2020; Seetaram, 2012; Song et al., 2003). China, South Korea, and Japan are East Asian countries with similar climates; all are influenced by Confucian culture, and all have long histories and a large number of places of interest. Especially for European and American tourists, these three countries are typical destinations of Oriental culture tours (Noh and Vogt, 2013). Therefore, the substitute prices are defined as

$$S_{i,t}^{KR} = \frac{\mathrm{CPI}_{KR,t}}{\mathrm{EX}_{i,t}^{KRW}} \quad \text{and} \quad S_{i,t}^{JP} = \frac{\mathrm{CPI}_{JP,t}}{\mathrm{EX}_{i,t}^{JPY}},$$

where $KRW$ and $JPY$ are the Korean won and the Japanese yen, respectively.
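
As a small worked illustration of the three constructed variables, the sketch below evaluates the formulas for a single month. All input numbers are made up for illustration and are not taken from the data set, and the exchange rates are assumed to be quoted as units of the destination currency per unit of the origin currency, which the text does not specify.

```python
def interest_rate_spread(ltgb, stgb):
    """IRS = long-term minus short-term government bond rate."""
    return ltgb - stgb

def relative_price(cpi_cn, cpi_origin, ex_cny):
    """P = (CPI_CN / EX^CNY) / CPI_origin."""
    return (cpi_cn / ex_cny) / cpi_origin

def substitute_price(cpi_sub, ex_sub):
    """S = CPI of the substitute destination divided by its exchange rate."""
    return cpi_sub / ex_sub

# Hypothetical values for one origin country and one month
irs = interest_rate_spread(ltgb=2.5, stgb=1.0)
p = relative_price(cpi_cn=120.0, cpi_origin=110.0, ex_cny=6.0)
s_kr = substitute_price(cpi_sub=105.0, ex_sub=1050.0)
```

Applied month by month, the same functions would produce the IRS, own-price, and substitute-price series that enter the model alongside GDP per capita.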

The data of economic variables mentioned above are publicly accessible and can be downloaded via Wind (https://www.wind.com.cn).

SII data

Following Yang et al. (2015) and Li et al. (2020), we choose 24 basic search key words in Google Trends based on the destination and various dimensions of tourism planning, including tour, lodging, recreation, traffic, dining, and shopping. The basic search key words related to Beijing tourism are listed in Table 2 with their corresponding dimensions. We then search for the basic key words in a specific origin country and iteratively use the recommended key words as the search key words for the next round. We repeat this process until no new key words appear in the recommended list. Finally, we obtain 51, 45, 38, and 33 key words for the United States, the United Kingdom, Germany, and France, respectively.
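
The iterative expansion described above amounts to collecting every key word reachable from the basic ones. A minimal sketch is given below; the `related` dictionary is a hypothetical stand-in for the recommendation list returned by the search tool, its links are invented for illustration, and no real Google Trends call is made.

```python
def expand_keywords(seeds, recommend):
    """Return the seeds plus every key word reachable through recommendations."""
    found = list(seeds)
    seen = set(seeds)
    frontier = list(seeds)
    while frontier:                       # stop once a round yields no new key words
        next_round = []
        for kw in frontier:
            for rec in recommend(kw):
                if rec not in seen:
                    seen.add(rec)
                    found.append(rec)
                    next_round.append(rec)
        frontier = next_round
    return found

# Hypothetical recommendation links, for illustration only
related = {
    "Beijing travel": ["China travel", "Beijing tourism"],
    "China travel": ["Beijing travel", "Great Wall"],
    "Beijing food": ["Peking duck"],
    "Peking duck": ["Duck recipes"],
}

keywords = expand_keywords(
    ["Beijing travel", "Beijing food"],
    lambda kw: related.get(kw, []),
)
```

Tracking the `seen` set guarantees termination even when two key words recommend each other.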

Table 2.

Basic search key words related to Beijing tourism.

Dimension	Key words	Dimension	Key words	Dimension	Key words
Tour	Beijing maps	Lodging	Beijing hotels	Recreation	Beijing bar
	Beijing travel		Beijing resorts		Beijing show
	Beijing weather		Beijing restaurant		Beijing night life
	Beijing travel agency		Beijing accommodation		Beijing recreation
Traffic	Beijing subway	Dining	Peking duck	Shopping	Dashilan Street
	Beijing flights		Beijing food		Panjiayuan Center
	Beijing airports		Beijing snack		Beijing shopping
	Beijing airlines		Beijing food guide		Beijing shopping guide

We calculate the Pearson correlation coefficient between tourist arrivals and key words with different lag periods. Four correlation coefficients are calculated for each key word, namely the correlations between the visitor volume in the current period and the search query volume in the same month and 1 to 3 months earlier. For each key word, we keep the lag with the highest correlation coefficient; the results are presented in Table 3. To obtain the appropriate key words, we use 0.7 as the threshold; in other words, we select the key words with a correlation coefficient greater than 0.7. The optimal lag order of most key words is 1, indicating that tourists retrieve travel-related information one month in advance, which is consistent with our intuition.
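
The selection rule can be sketched as follows. The sketch assumes that the four coefficients correspond to lags of 0 to 3 months, and it uses synthetic series in which one key word leads arrivals by exactly one month; the function names and the data are illustrative.

```python
import numpy as np

def best_lag(arrivals, searches, max_lag=3):
    """Return (lag, r) with the highest Pearson correlation between arrivals in
    month t and search volume in month t - lag, for lag = 0, ..., max_lag."""
    best = (0, -np.inf)
    for lag in range(max_lag + 1):
        a = arrivals[lag:]
        s = searches[: len(searches) - lag]
        r = float(np.corrcoef(a, s)[0, 1])
        if r > best[1]:
            best = (lag, r)
    return best

def select_keywords(arrivals, search_table, threshold=0.7, max_lag=3):
    """Keep the key words whose best lagged correlation exceeds the threshold."""
    chosen = {}
    for kw, series in search_table.items():
        lag, r = best_lag(arrivals, series, max_lag)
        if r > threshold:
            chosen[kw] = (lag, r)
    return chosen

# Synthetic data: "signal" leads arrivals by one month, "noise" is unrelated
rng = np.random.default_rng(0)
z = rng.normal(size=121)
arrivals = z[:-1] + rng.normal(scale=0.1, size=120)
search_table = {"signal": z[1:], "noise": rng.normal(size=120)}

chosen = select_keywords(arrivals, search_table)
```

In this toy setting only the leading series passes the 0.7 threshold, and it does so at a lag of one month.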

Table 3.

Maximum correlation coefficient of search key words.

Country	Key words	Lag order	Country	Key words	Lag order
US	Beijing travel	3	UK	China travel	2
	Beijing travel	3		Beijing travel	2
	Beijing weather	2		Beijing travel agency	2
	China travel	2		Beijing travel agency	2
	China travel	2		Beijing airlines	1
	Beijing airlines	1		Beijing airlines	1
	Beijing airlines	1		Beijing flights	1
	Beijing flights	1		Beijing airports	1
	Beijing airports	1		Beijing hotels	1
	Beijing subway	1		Beijing restaurant	1
	Beijing hotels	1		Peking duck	1
	Beijing restaurant	1		Duck recipes	1
	Beijing restaurant	1		Beijing shopping	1
	Peking duck	1		Great Wall	1
	Peking duck	1		Beijing maps	1
	Duck recipes	1
	Beijing shopping	1
	Great Wall	1
	Forbidden city	1
Germany	Beijing travel	3	France	Beijing tourism	3
	Beijing maps	2		Beijing travel	2
	Beijing maps	2		Beijing weather	2
	Beijing weather	2		Beijing flights	1
	China travel	2		Beijing airports	1
	Peking duck	1		Beijing hotels	1
	Peking duck	1		Beijing shopping	1
	Beijing shopping	1		Peking duck	1
	Great Wall	1		Great Wall	1
	Beijing airlines	1		Forbidden city	1
	Beijing flights	1
	Beijing airports	1
	Beijing hotels	1
	Beijing restaurant	1

Performance evaluation criteria and statistical tests

To evaluate and compare the forecasting performance of models, we adopt multiple error criteria commonly used in recent tourism demand forecasting literature (Law et al., 2019; Sun et al., 2019; Zhang, Li, et al., 2020, 2021), including MAPE, NRMSE, and DS. The specific formulas are written as follows:

$$\mathrm{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{x_{t} - \hat{x}_{t}}{x_{t}} \right| \times 100\%,$$

$$\mathrm{NRMSE} = \frac{1}{\bar{x}} \sqrt{\frac{1}{N} \sum_{t=1}^{N} (x_{t} - \hat{x}_{t})^{2}} \times 100\%,$$

$$\mathrm{DS} = \frac{1}{N-1} \sum_{t=2}^{N} d_{t} \times 100\%, \quad d_{t} = \begin{cases} 1, & \text{if } (x_{t} - x_{t-1})(\hat{x}_{t} - x_{t-1}) > 0 \\ 0, & \text{otherwise} \end{cases}$$

where $N$ is the number of observations in the data set, $x_{t}$ and $\hat{x}_{t}$ represent the true value and the forecast value at time $t$, respectively, and $\bar{x}$ is the mean of the true values. MAPE and NRMSE measure level accuracy: the smaller the MAPE and NRMSE are, the better the level forecasting performance. DS measures directional accuracy: the higher the DS is, the better the directional forecasting performance.
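The three criteria can be computed directly from paired actual and forecast values, as in the sketch below. NRMSE is normalized here by the mean of the actual values, which is an assumption about the normalizing term; the sample numbers are invented for illustration.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def nrmse(actual, forecast):
    """RMSE divided by the mean of the actual values (assumed), in percent."""
    n = len(actual)
    rmse = math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)
    return 100.0 * rmse / (sum(actual) / n)


def ds(actual, forecast):
    """Directional symmetry: share of correctly predicted directions, in percent."""
    hits = sum(
        1
        for t in range(1, len(actual))
        if (actual[t] - actual[t - 1]) * (forecast[t] - actual[t - 1]) > 0
    )
    return 100.0 * hits / (len(actual) - 1)


# Illustrative values only; the third forecast gets the direction wrong.
actual = [100.0, 110.0, 105.0, 120.0]
forecast = [98.0, 108.0, 112.0, 118.0]
scores = (mape(actual, forecast), nrmse(actual, forecast), ds(actual, forecast))
```

Note that DS compares the forecast with the previous actual value, not with the previous forecast, so a forecast can be close in level and still count as a directional miss.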

To rule out the possibility that the results depend on the specific data values in the sample, the Diebold-Mariano (DM) test and the Pesaran-Timmermann (PT) test are employed to assess the statistical significance of all models in level forecasting and directional forecasting, respectively. In the DM test, the MAPE is used as the loss function, and thus the null hypothesis is that the MAPE of the test model is not less than that of the benchmark. The null hypothesis is rejected when the p value of the DM statistic is less than the significance level. In the PT test, the null hypothesis is that the true and forecast values are independently distributed. Similarly, the directional forecasting ability of different models can be evaluated statistically by comparing the PT statistics and the corresponding p values. Details of the DM test and the PT test can be found in Diebold and Mariano (1995) and Pesaran and Timmermann (1992).
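A one-sided DM test with an absolute percentage error loss can be sketched as below. This is a textbook-style version under stated assumptions (a normal approximation, and a long-run variance built from autocovariances up to `horizon - 1`); it is not necessarily the exact variant used in this study, and the function name and sample data are hypothetical.

```python
import math


def dm_test(actual, forecast_test, forecast_bench, horizon=1):
    """Return (DM statistic, one-sided p value).

    A negative statistic with a small p value favors the test model.
    """
    # Loss differential: absolute percentage error of test minus benchmark.
    d = [
        abs((a - f1) / a) - abs((a - f2) / a)
        for a, f1, f2 in zip(actual, forecast_test, forecast_bench)
    ]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, horizon))
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance estimate")
    stat = d_bar / math.sqrt(long_run_var / n)
    p_value = 0.5 * (1.0 + math.erf(stat / math.sqrt(2.0)))  # lower tail
    return stat, p_value


# Invented series: the test model misses by 1, the benchmark by 5 or 6.
actual = [100.0 + 5 * i for i in range(12)]
test_fc = [a + (1 if i % 2 else -1) for i, a in enumerate(actual)]
bench_fc = [a + (6 if i % 2 else -5) for i, a in enumerate(actual)]
stat, p_value = dm_test(actual, test_fc, bench_fc)
```

For multistep horizons the simple variance estimate can turn out non-positive in small samples, which is why the sketch raises an error instead of returning a misleading statistic.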

Benchmarks and parameter settings

To evaluate the forecasting performance of the B-SAKE model in different forecasting schemes (one-step-ahead vs. multistep-ahead and in-sample vs. out-of-sample), we formulate 10 benchmark models, including univariate time series models, econometric models, and AI models, of which the latter two are multivariate. Considering the seasonality and periodicity of tourism demand data, seasonal naive (SN), SARIMA, and seasonal exponential smoothing (SES) are chosen as univariate benchmarks. Because SII data and economic variable data are additionally introduced, we also adopt a variant of SARIMA, namely SARIMAX (Tsui and Balli, 2015), which incorporates both seasonal influences and external variables. The autoregressive distributed lag (ARDL) model is another common multivariate econometric model for tourism demand forecasting. The multilayer perceptron (MLP) and KELM models, among the most popular AI techniques, are widely used in the forecasting literature. We add an SAE network for dimension reduction to KELM to construct the SAKE model. In addition, we consider bagging-based (B-based) AI models, including B-MLP, B-KELM, and B-SAKE.

Parameter specification is crucial for model performance. We tune the parameters by minimizing in-sample forecasting errors. The appropriate parameters of the SES, SARIMA, SARIMAX, and ARDL models are selected according to Akaike’s information criterion. The numbers of hidden neurons of MLP and KELM are determined by trial and error. The Gaussian kernel function is adopted in the KELM model. The number of bootstrap samples for bagging is set to 100, following Inoue and Kilian (2008).
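The bagging setting can be illustrated with a minimal sketch: draw 100 bootstrap resamples, fit a base learner on each, and average the predictions. The base learner here is a one-variable least-squares line used purely as a stand-in for MLP, KELM, or SAKE, and plain resampling with replacement is an assumption, since the text only fixes the number of bootstrap samples.

```python
import random


def fit_line(xs, ys):
    """Least-squares line y = a + b * x; a stand-in base learner."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: fall back to the mean response
        return lambda x: mean_y
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def bagging_predict(xs, ys, x_new, fit=fit_line, n_bootstrap=100, seed=0):
    """Average the predictions of models fitted on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_bootstrap):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        model = fit([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(model(x_new))
    return sum(predictions) / len(predictions)


# Noise-free toy data, so every resample recovers the same line.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
prediction = bagging_predict(xs, ys, x_new=10)
```

With noisy data the individual models would differ across resamples, and the averaging step is what reduces the variance of the final forecast.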

Empirical results

Forecast evaluations

We adopt dynamic forecasting with rolling windows. For both the in-sample data and the out-of-sample data, the first 12 observations are used to fit the model and forecast the next value. In one-step-ahead forecasting, the window rolls forward one step each time, while in multistep-ahead forecasting, it rolls forward the corresponding number of steps each time. After obtaining the dynamic forecasts, we use the predicted and actual tourist arrivals for each origin country to calculate the average value of each evaluation criterion (MAPE, NRMSE, and DS). The multistep-ahead forecasting in this study includes 3-month-ahead and 6-month-ahead forecasting. To verify the effectiveness of bagging and SAE in dealing with overfitting and improving forecasting performance, we also compare in-sample forecasting with out-of-sample forecasting.
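The rolling-window scheme can be sketched as follows, assuming a 12-observation window that advances by the forecast horizon at each step. The `fit_predict` callable is a hypothetical interface, and a seasonal naive rule is used as the example forecaster on a synthetic, perfectly seasonal series.

```python
def rolling_forecasts(series, fit_predict, window=12, horizon=1):
    """Return (actual, predicted) lists from a rolling-window evaluation.

    `fit_predict(history, horizon)` is a hypothetical callable returning
    the forecast for the value `horizon` steps after the end of `history`.
    """
    actual, predicted = [], []
    start = 0
    while start + window + horizon <= len(series):
        history = series[start:start + window]
        predicted.append(fit_predict(history, horizon))
        actual.append(series[start + window + horizon - 1])
        start += horizon  # roll forward by the number of steps forecast
    return actual, predicted


def seasonal_naive(history, horizon, season=12):
    """Forecast the value observed one season before the target period."""
    return history[len(history) - season + horizon - 1]


# Synthetic monthly series with an exact 12-month pattern.
series = [i % 12 for i in range(36)]
actual_1, predicted_1 = rolling_forecasts(series, seasonal_naive, horizon=1)
actual_3, predicted_3 = rolling_forecasts(series, seasonal_naive, horizon=3)
```

The resulting pairs of actual and predicted values are what the evaluation criteria are then computed on.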

We evaluate the forecasting performance of our proposed B-SAKE model and the 10 benchmark models under the above forecasting schemes using the MAPE, NRMSE, and DS criteria. The results are presented in Tables 4–6 and can be summarized as follows. (1) B-SAKE is the most accurate approach compared with the benchmark models in terms of the MAPE, NRMSE, and DS criteria. (2) The B-based models generally outperform the original models in forecast accuracy, which verifies bagging’s superiority in tourism demand forecasting. (3) As expected, SAKE, which combines SAE and KELM, has better forecasting performance than KELM. SAE automatically and effectively identifies data characteristics through dimensionality reduction. (4) The univariate models (i.e. SN, SARIMA, and SES) are the worst benchmark models, followed by the econometric models (i.e. ARDL and SARIMAX). This may be because these models cannot capture the nonlinear patterns of tourism data as effectively as the AI models. In addition, the univariate models fail to take advantage of the information in tourism big data. (5) Although the performance of almost all models degrades from in-sample forecasting to out-of-sample forecasting, bagging and SAE substantially mitigate this problem, especially in one-step-ahead forecasting.

Table 4.

Forecasting performance of different models: 1-month-ahead forecasting.

Countries	Models	In-sample MAPE (%)	In-sample NRMSE (%)	In-sample DS (%)	Out-of-sample MAPE (%)	Out-of-sample NRMSE (%)	Out-of-sample DS (%)
US	SN	5.703	6.413	54.17	6.140	6.942	45.83
	SARIMA	5.435	6.149	56.25	6.035	6.904	45.83
	SES	5.512	6.016	55.21	5.967	6.833	50.00
	ARDL	5.633	6.375	54.17	6.014	6.891	45.83
	SARIMAX	5.413	6.019	55.21	5.935	6.843	54.17
	MLP	3.915	4.263	72.92	4.017	4.586	62.50
	B-MLP	2.116	2.968	78.13	2.873	3.514	66.67
	KELM	2.019	2.875	77.08	2.637	3.142	70.83
	B-KELM	1.586	2.037	79.17	1.781	2.364	75.00
	SAKE	0.758	0.921	86.46	0.895	1.002	83.33
	B-SAKE	0.493	0.614	97.92	0.539	0.684	100.00
UK	SN	5.489	5.906	52.08	5.902	6.624	41.67
	SARIMA	5.406	5.833	55.21	5.839	6.603	50.00
	SES	5.434	5.897	54.17	5.691	6.205	45.83
	ARDL	5.417	5.934	55.21	5.685	6.153	50.00
	SARIMAX	5.321	5.873	57.29	5.765	6.521	58.33
	MLP	3.874	4.301	73.96	4.115	4.453	66.67
	B-MLP	2.106	2.829	77.08	2.704	3.437	70.83
	KELM	2.113	2.807	76.04	2.638	3.012	75.00
	B-KELM	1.602	2.115	78.13	1.876	2.364	79.17
	SAKE	0.927	1.206	87.50	1.049	1.258	87.50
	B-SAKE	0.583	0.705	95.83	0.639	0.794	95.83
Germany	SN	5.358	5.792	53.13	5.709	6.433	50.00
	SARIMA	5.307	5.705	52.08	5.635	6.391	45.83
	SES	5.144	5.689	55.21	5.640	6.384	54.17
	ARDL	5.172	5.703	54.17	5.436	6.048	58.33
	SARIMAX	5.217	5.638	56.25	5.601	6.332	54.17
	MLP	3.746	4.105	75.00	4.015	4.307	66.67
	B-MLP	2.019	2.736	79.17	2.507	3.275	75.00
	KELM	2.005	2.693	78.13	2.439	2.982	75.00
	B-KELM	1.403	2.012	80.21	1.631	2.204	79.17
	SAKE	0.809	0.994	88.54	0.947	1.143	87.50
	B-SAKE	0.514	0.621	100.00	0.548	0.683	95.83
France	SN	5.557	5.971	52.08	5.984	6.610	45.83
	SARIMA	5.510	5.943	54.17	5.938	6.506	45.83
	SES	5.489	5.814	55.21	6.045	6.493	50.00
	ARDL	5.478	5.831	56.25	6.135	6.507	45.83
	SARIMAX	5.447	5.906	54.17	5.907	6.492	54.17
	MLP	3.896	4.251	71.88	4.206	4.517	62.50
	B-MLP	2.258	2.906	76.04	2.844	3.352	66.67
	KELM	2.165	2.884	76.04	2.608	3.108	70.83
	B-KELM	1.652	2.067	78.13	1.835	2.216	75.00
	SAKE	0.944	1.305	85.42	1.106	1.295	79.17
	B-SAKE	0.613	0.745	93.75	0.701	0.785	91.67

Note: MAPE: mean absolute percentage error; NRMSE: normalized root mean square error; DS: directional symmetry; SN: seasonal naïve; SES: seasonal exponential smoothing; ARDL: autoregressive distributed lag; MLP: multilayer perceptron; B-MLP: bagging-based MLP; KELM: kernel-based extreme learning machine; B-KELM: bagging-based KELM; SAKE: stacked autoencoder with KELM; B-SAKE: bagging-based SAKE. Bold font indicates the unique highest forecasting accuracy among the models.

Table 5.

Forecasting performance of different models: 3-month-ahead forecasting.

Countries	Models	In-sample MAPE (%)	In-sample NRMSE (%)	In-sample DS (%)	Out-of-sample MAPE (%)	Out-of-sample NRMSE (%)	Out-of-sample DS (%)
US	SN	5.637	6.115	51.06	6.136	7.025	45.83
	SARIMA	5.609	5.986	52.13	6.038	6.943	50.00
	SES	5.532	6.044	53.19	5.941	6.896	45.83
	ARDL	5.612	6.034	52.13	5.894	6.793	50.00
	SARIMAX	5.501	5.943	53.19	5.942	6.886	54.17
	MLP	3.896	4.128	57.45	4.109	4.593	58.33
	B-MLP	2.207	3.146	60.64	2.894	3.571	62.50
	KELM	2.158	3.053	65.96	2.729	3.203	62.50
	B-KELM	1.703	2.214	70.21	1.834	2.385	66.67
	SAKE	0.895	1.102	79.79	0.912	1.175	79.17
	B-SAKE	0.584	0.739	90.43	0.613	0.794	87.50
UK	SN	5.537	5.969	52.13	6.025	6.691	45.83
	SARIMA	5.568	5.903	53.19	5.847	6.635	50.00
	SES	5.506	5.932	55.32	5.701	6.306	50.00
	ARDL	5.510	6.054	56.38	5.891	6.511	45.83
	SARIMAX	5.408	5.886	55.32	5.802	6.613	54.17
	MLP	3.940	4.395	60.64	4.124	4.485	62.50
	B-MLP	2.251	3.058	68.09	2.718	3.445	66.67
	KELM	2.236	2.984	69.15	2.695	3.101	66.67
	B-KELM	1.742	2.205	73.40	1.925	2.397	70.83
	SAKE	1.025	1.374	82.98	1.139	1.402	79.17
	B-SAKE	0.675	0.793	91.49	0.725	0.884	87.50
Germany	SN	5.402	5.806	50.00	5.711	6.451	45.83
	SARIMA	5.339	5.810	55.32	5.697	6.402	54.17
	SES	5.196	5.701	53.19	5.643	6.419	50.00
	ARDL	5.279	5.816	53.19	5.690	6.481	50.00
	SARIMAX	5.321	5.759	52.13	5.711	6.425	45.83
	MLP	3.853	4.256	59.57	4.096	4.396	58.33
	B-MLP	2.127	2.858	63.83	2.617	3.364	62.50
	KELM	2.207	2.801	68.09	2.545	3.022	66.67
	B-KELM	1.526	2.106	72.34	1.726	2.307	70.83
	SAKE	0.915	1.145	78.72	1.021	1.253	75.00
	B-SAKE	0.608	0.733	89.36	0.657	0.760	83.33
France	SN	5.563	5.996	52.13	6.105	6.705	45.83
	SARIMA	5.526	5.958	50.00	6.098	6.674	45.83
	SES	5.491	5.893	51.06	6.049	6.637	50.00
	ARDL	5.562	5.906	50.00	6.197	6.642	45.83
	SARIMAX	5.514	6.004	51.06	6.014	6.583	50.00
	MLP	3.926	4.296	58.51	4.310	4.594	54.17
	B-MLP	2.338	2.988	64.89	2.896	3.412	58.33
	KELM	2.253	2.903	67.02	2.711	3.213	62.50
	B-KELM	1.751	2.175	71.28	1.905	2.305	70.83
	SAKE	1.031	1.412	77.66	1.217	1.309	75.00
	B-SAKE	0.709	0.816	88.30	0.819	0.901	83.33

Table 6.

Forecasting performance of different models: 6-month-ahead forecasting.

Countries	Models	In-sample MAPE (%)	In-sample NRMSE (%)	In-sample DS (%)	Out-of-sample MAPE (%)	Out-of-sample NRMSE (%)	Out-of-sample DS (%)
US	SN	6.391	6.902	51.65	6.397	7.102	45.83
	SARIMA	6.385	6.733	50.55	6.409	7.041	50.00
	SES	6.403	6.845	52.75	6.534	6.905	54.17
	ARDL	6.374	6.891	52.75	6.516	7.019	54.17
	SARIMAX	6.204	6.913	51.65	6.585	6.963	50.00
	MLP	5.873	5.943	56.04	6.036	6.147	54.17
	B-MLP	4.161	4.207	59.34	4.543	4.601	58.33
	KELM	4.035	4.352	63.74	4.167	4.375	62.50
	B-KELM	3.256	3.106	69.23	3.402	3.321	66.67
	SAKE	1.758	1.701	78.02	1.808	1.905	75.00
	B-SAKE	1.142	1.235	87.91	1.236	1.343	83.33
UK	SN	6.406	6.844	49.45	6.597	6.905	45.83
	SARIMA	6.394	6.708	51.65	6.493	6.794	50.00
	SES	6.251	6.695	50.55	6.581	6.742	45.83
	ARDL	6.296	6.710	51.65	6.574	6.717	45.83
	SARIMAX	6.385	6.749	52.75	6.601	6.835	54.17
	MLP	5.741	5.810	58.24	5.904	6.024	58.33
	B-MLP	4.359	4.361	62.64	4.498	4.535	62.50
	KELM	4.336	4.458	67.03	4.441	4.601	62.50
	B-KELM	3.104	3.267	72.53	3.517	3.358	66.67
	SAKE	1.837	1.901	81.32	1.934	2.043	79.17
	B-SAKE	1.206	1.348	89.01	1.319	1.455	83.33
Germany	SN	6.154	6.385	49.45	6.309	6.512	45.83
	SARIMA	6.121	6.279	49.45	6.238	6.496	45.83
	SES	5.986	6.114	51.65	6.115	6.545	50.00
	ARDL	5.991	6.106	50.55	6.102	6.504	50.00
	SARIMAX	6.103	6.258	50.55	6.214	6.359	45.83
	MLP	5.507	5.654	57.14	5.639	5.832	54.17
	B-MLP	4.124	4.235	60.44	4.147	4.353	58.33
	KELM	4.048	4.209	64.84	4.265	4.298	62.50
	B-KELM	2.943	3.036	70.33	3.107	3.104	66.67
	SAKE	1.714	1.854	76.92	1.851	1.985	70.83
	B-SAKE	1.025	1.146	86.81	1.138	1.268	79.17
France	SN	6.401	6.512	49.45	6.654	6.826	45.83
	SARIMA	6.390	6.504	50.55	6.617	6.794	45.83
	SES	6.381	6.485	51.65	6.595	6.816	50.00
	ARDL	6.349	6.478	50.55	6.605	6.874	45.83
	SARIMAX	6.269	6.437	51.65	6.585	6.731	45.83
	MLP	5.543	5.610	56.04	5.836	5.836	50.00
	B-MLP	4.301	4.411	61.54	4.501	4.602	54.17
	KELM	4.258	4.374	63.74	4.394	4.589	58.33
	B-KELM	3.043	3.165	69.23	3.452	3.293	58.33
	SAKE	1.804	1.987	75.82	1.991	2.120	70.83
	B-SAKE	1.267	1.359	85.71	1.368	1.487	79.17

In particular, our proposed B-SAKE model achieves the best forecast accuracy in all four origin countries, as shown in bold in the tables. Taking out-of-sample 1-month-ahead forecasting for the United States as an example, B-SAKE reduces MAPE by 91.22%, 91.07%, 90.97%, 91.04%, 90.92%, 86.58%, 81.24%, 79.56%, 69.74%, and 39.78% relative to SN, SARIMA, SES, ARDL, SARIMAX, MLP, B-MLP, KELM, B-KELM, and SAKE, respectively. The corresponding reductions in NRMSE are 90.15%, 90.09%, 89.99%, 90.07%, 90.00%, 85.09%, 80.54%, 78.23%, 71.07%, and 31.74%, respectively. In terms of DS, B-SAKE achieves 84–118% better directional forecasts than the univariate models or econometric models and 20–60% better directional forecasts than the other AI models. From in-sample forecasting to out-of-sample forecasting, the MAPE of SAKE increases by 18.07%, whereas that of B-SAKE increases by only 9.33%, so bagging roughly halves the accuracy loss. Analogously, SAE reduces the accuracy loss in MAPE from 30.61% for KELM to 18.07% for SAKE. These results clearly illustrate that the B-SAKE model is a highly promising forecasting approach.
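The percentages quoted above can be reproduced from the US rows of Table 4, as the short sketch below shows. The helper names are hypothetical; the inputs are the reported out-of-sample and in-sample MAPE values for 1-month-ahead forecasting.

```python
def reduction(benchmark, proposed):
    """Percentage reduction of an error measure relative to a benchmark."""
    return 100.0 * (benchmark - proposed) / benchmark


def accuracy_loss(in_sample, out_of_sample):
    """Percentage increase in error from in-sample to out-of-sample."""
    return 100.0 * (out_of_sample - in_sample) / in_sample


# Out-of-sample MAPE, US, 1-month-ahead (Table 4).
us_mape_out = {
    "SN": 6.140, "SARIMA": 6.035, "SES": 5.967, "ARDL": 6.014,
    "SARIMAX": 5.935, "MLP": 4.017, "B-MLP": 2.873, "KELM": 2.637,
    "B-KELM": 1.781, "SAKE": 0.895,
}
b_sake_out = 0.539

mape_reductions = {m: reduction(v, b_sake_out) for m, v in us_mape_out.items()}

# In-sample versus out-of-sample MAPE for the same rows of Table 4.
loss_kelm = accuracy_loss(2.019, 2.637)
loss_sake = accuracy_loss(0.758, 0.895)
loss_b_sake = accuracy_loss(0.493, 0.539)
```

The same two helpers apply unchanged to the NRMSE columns and to the other origin countries.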

Statistical tests

To further verify the level and directional forecasting performance of the B-SAKE model from a statistical perspective, the DM test and the PT test are employed to test the statistical significance of all the models on the out-of-sample data. Tables 7–12 report the results of the DM test and the PT test for the different forecasting horizons. The numbers outside the brackets in the tables are the DM statistics or PT statistics, while the numbers inside the brackets are the corresponding p values.

Table 7.

The DM test results for B-SAKE versus univariate forecasting models.

Countries	Horizons	B-SAKE vs. SN	B-SAKE vs. SARIMA	B-SAKE vs. SES
US	1-month-ahead	−5.0233 (0.0000)	−4.8936 (0.0000)	−5.0716 (0.0000)
	3-month-ahead	−4.9467 (0.0000)	−4.9156 (0.0000)	−4.9813 (0.0000)
	6-month-ahead	−4.9825 (0.0000)	−4.9307 (0.0000)	−4.8815 (0.0000)
UK	1-month-ahead	−5.0038 (0.0000)	−5.1033 (0.0000)	−5.0147 (0.0000)
	3-month-ahead	−4.9140 (0.0000)	−5.0168 (0.0000)	−4.8985 (0.0000)
	6-month-ahead	−5.0237 (0.0000)	−4.9841 (0.0000)	−4.7963 (0.0000)
Germany	1-month-ahead	−5.0468 (0.0000)	−5.1136 (0.0000)	−4.9615 (0.0000)
	3-month-ahead	−5.0129 (0.0000)	−4.9681 (0.0000)	−4.8352 (0.0000)
	6-month-ahead	−4.9107 (0.0000)	−4.7936 (0.0000)	−4.8034 (0.0000)
France	1-month-ahead	−4.9658 (0.0000)	−5.0140 (0.0000)	−4.9633 (0.0000)
	3-month-ahead	−4.8933 (0.0000)	−4.9026 (0.0000)	−4.8724 (0.0000)
	6-month-ahead	−4.9115 (0.0000)	−4.8952 (0.0000)	−4.8046 (0.0000)

Note: SN: seasonal naïve; SES: seasonal exponential smoothing; B-SAKE: bagging-based stacked autoencoder with kernel-based extreme learning machine.

Table 8.

The DM test results for multivariate forecasting models in 1-month-ahead forecasting.

Countries	Models	B-SAKE	SAKE	B-KELM	KELM	B-MLP	MLP	SARIMAX
US	SAKE	−1.9873 (0.0234)
	B-KELM	−2.0358 (0.0209)	−1.9461 (0.0258)
	KELM	−2.3879 (0.0085)	−2.0133 (0.0220)	−1.8742 (0.0305)
	B-MLP	−2.9367 (0.0017)	−2.5381 (0.0056)	−1.8943 (0.0340)	−1.8627 (0.0291)
	MLP	−3.8749 (0.0001)	−3.4106 (0.0003)	−2.2033 (0.0014)	−2.1782 (0.0147)	−1.8658 (0.0310)
	SARIMAX	−4.5037 (0.0000)	−4.2917 (0.0000)	−4.0143 (0.0000)	−3.9741 (0.0000)	−3.2859 (0.0005)	−2.9576 (0.0016)
	ARDL	−5.036 (0.0000)	−4.8351 (0.0000)	−4.4380 (0.0000)	−4.2108 (0.0000)	−3.3605 (0.0003)	−3.0452 (0.0012)	−1.2025 (0.1146)
UK	SAKE	−1.9560 (0.0252)
	B-KELM	−1.9983 (0.0228)	−1.9562 (0.0252)
	KELM	−2.2749 (0.0115)	−2.1247 (0.0168)	−1.8868 (0.0296)
	B-MLP	−2.8907 (0.0019)	−2.4359 (0.0074)	−1.8856 (0.0297)	−1.8963 (0.0290)
	MLP	−3.6358 (0.0001)	−3.4267 (0.0003)	−2.3748 (0.0088)	−2.2041 (0.0138)	−1.8742 (0.0305)
	SARIMAX	−4.4937 (0.0000)	−4.2341 (0.0000)	−4.1045 (0.0000)	−3.9648 (0.0000)	−3.3058 (0.0005)	−2.8937 (0.0019)
	ARDL	−4.9875 (0.0000)	−4.7359 (0.0000)	−4.5027 (0.0000)	−4.1530 (0.0000)	−3.4502 (0.0003)	−3.1026 (0.0010)	1.0367 (0.1499)
Germany	SAKE	−1.9843 (0.0236)
	B-KELM	−2.0687 (0.0193)	−1.9687 (0.0245)
	KELM	−2.2975 (0.0108)	−2.1589 (0.0154)	−1.8963 (0.0290)
	B-MLP	−2.9954 (0.0014)	−2.3946 (0.0083)	−1.8756 (0.0304)	−1.9354 (0.0265)
	MLP	−3.7025 (0.0001)	−3.5085 (0.0002)	−2.4019 (0.0082)	−2.2241 (0.0131)	−1.8965 (0.0289)
	SARIMAX	−4.5136 (0.0000)	−4.3541 (0.0000)	−4.0157 (0.0000)	−4.0156 (0.0000)	−3.4523 (0.0003)	−2.9632 (0.0015)
	ARDL	−5.1033 (0.0000)	−4.8921 (0.0000)	−4.5906 (0.0000)	−4.4765 (0.0000)	−3.5019 (0.0002)	−3.0015 (0.0013)	0.8933 (0.1858)
France	SAKE	−1.8994 (0.0288)
	B-KELM	−1.9536 (0.0254)	−1.8713 (0.0307)
	KELM	−2.1254 (0.0168)	−2.0145 (0.0220)	−1.8623 (0.0313)
	B-MLP	−2.7412 (0.0031)	−2.3657 (0.0090)	−1.7843 (0.0372)	−1.9602 (0.0250)
	MLP	−3.5896 (0.0002)	−3.3215 (0.0004)	−2.2250 (0.0130)	−2.2247 (0.0131)	−1.8690 (0.0308)
	SARIMAX	−4.3582 (0.0000)	−4.1598 (0.0000)	−4.0236 (0.0000)	−3.8543 (0.0001)	−3.2968 (0.0005)	−2.8036 (0.0025)
	ARDL	−4.9024 (0.0000)	−4.8356 (0.0000)	−4.4381 (0.0000)	−4.1063 (0.0000)	−3.4106 (0.0003)	−3.0536 (0.0011)	−1.2315 (0.1091)

Note: ARDL: autoregressive distributed lag; MLP: multilayer perceptron; B-MLP: bagging-based MLP; KELM: kernel-based extreme learning machine; B-KELM: bagging-based KELM; SAKE: stacked autoencoder with KELM; B-SAKE: bagging-based SAKE.

Table 9.

The DM test results for multivariate forecasting models in 3-month-ahead forecasting.

Countries	Models	B-SAKE	SAKE	B-KELM	KELM	B-MLP	MLP	SARIMAX
US	SAKE	−1.9925 (0.0232)
	B-KELM	−2.1426 (0.0161)	−1.9895 (0.0233)
	KELM	−2.2567 (0.0120)	−2.1103 (0.0174)	−1.8803 (0.0300)
	B-MLP	−2.8893 (0.0019)	−2.6177 (0.0044)	−1.9011 (0.0286)	−1.8511 (0.0321)
	MLP	−3.6254 (0.0001)	−3.4526 (0.0003)	−2.0219 (0.0216)	−1.9416 (0.0261)	−1.7913 (0.0366)
	SARIMAX	−4.4029 (0.0000)	−4.2103 (0.0000)	−4.0014 (0.0000)	−3.8916 (0.0000)	−3.2341 (0.0006)	−2.6011 (0.0046)
	ARDL	−4.9112 (0.0000)	−4.8357 (0.0000)	−4.5069 (0.0000)	−4.3706 (0.0000)	−3.3968 (0.0003)	−3.1025 (0.0010)	0.7419 (0.2291)
UK	SAKE	−1.8365 (0.0331)
	B-KELM	−1.9141 (0.0278)	−1.8854 (0.0297)
	KELM	−2.1056 (0.0176)	−2.0112 (0.0222)	−1.8103 (0.0351)
	B-MLP	−2.7319 (0.0031)	−2.5133 (0.0060)	−1.8995 (0.0287)	−1.8356 (0.0332)
	MLP	−3.4103 (0.0003)	−3.2608 (0.0006)	−2.1236 (0.0169)	−2.0143 (0.0220)	−1.8041 (0.0356)
	SARIMAX	−4.3291 (0.0000)	−4.1890 (0.0000)	−4.0126 (0.0000)	−3.7859 (0.0001)	−3.2210 (0.0006)	−2.5961 (0.0047)
	ARDL	−4.8760 (0.0000)	−4.7019 (0.0000)	−4.6135 (0.0000)	−4.2013 (0.0000)	−3.5088 (0.0002)	−2.9930 (0.0014)	−0.9715 (0.1656)
Germany	SAKE	−1.8511 (0.0321)
	B-KELM	−1.9954 (0.0230)	−1.8453 (0.0325)
	KELM	−2.1063 (0.0176)	−2.0043 (0.0225)	−1.8510 (0.0321)
	B-MLP	−2.6917 (0.0036)	−2.3120 (0.0104)	−1.9013 (0.0286)	−1.8363 (0.0332)
	MLP	−3.4011 (0.0003)	−3.3017 (0.0005)	−2.2163 (0.0133)	−2.1247 (0.0168)	−1.8142 (0.0348)
	SARIMAX	−4.2985 (0.0000)	−4.1066 (0.0000)	−3.9859 (0.0000)	−3.7956 (0.0001)	−3.3018 (0.0005)	−2.5385 (0.0056)
	ARDL	−4.9019 (0.0000)	−4.8033 (0.0000)	−4.6351 (0.0000)	−4.3044 (0.0000)	−3.4890 (0.0002)	−3.1036 (0.0010)	−1.0304 (0.1514)
France	SAKE	−1.7995 (0.0360)
	B-KELM	−1.8563 (0.0317)	−1.7956 (0.0363)
	KELM	−1.9863 (0.0235)	−1.9983 (0.0228)	−1.8152 (0.0347)
	B-MLP	−2.2106 (0.0135)	−2.1183 (0.0171)	−1.8025 (0.0357)	−1.8564 (0.0317)
	MLP	−3.3142 (0.0005)	−3.1745 (0.0008)	−2.1195 (0.0170)	−2.0016 (0.0227)	−1.7974 (0.0361)
	SARIMAX	−4.1107 (0.0000)	−4.0135 (0.0000)	−3.8983 (0.0000)	−3.6968 (0.0001)	−3.1163 (0.0009)	−2.4101 (0.0080)
	ARDL	−4.7814 (0.0000)	−4.6539 (0.0000)	−4.5361 (0.0000)	−4.3811 (0.0000)	−3.6013 (0.0002)	−3.0017 (0.0013)	−1.3407 (0.0900)

Table 10.

The DM test results for multivariate forecasting models in 6-month-ahead forecasting.

Countries	Models	B-SAKE	SAKE	B-KELM	KELM	B-MLP	MLP	SARIMAX
US	SAKE	−1.8013 (0.0358)
	B-KELM	−2.0527 (0.0201)	−1.8416 (0.0328)
	KELM	−2.2015 (0.0139)	−2.0234 (0.0215)	−1.7983 (0.0361)
	B-MLP	−2.7749 (0.0028)	−2.4517 (0.0071)	−1.8546 (0.0318)	−1.7634 (0.0389)
	MLP	−3.4135 (0.0003)	−3.2635 (0.0006)	−2.0034 (0.0226)	−1.8125 (0.0350)	−1.7011 (0.0445)
	SARIMAX	−4.2011 (0.0000)	−4.1063 (0.0000)	−3.9568 (0.0000)	−3.7412 (0.0001)	−3.0253 (0.0012)	−2.2103 (0.0135)
	ARDL	−4.8126 (0.0000)	−4.7418 (0.0000)	−4.6502 (0.0000)	−4.5130 (0.0000)	−3.5367 (0.0002)	−2.9813 (0.0014)	−0.8933 (0.1858)
UK	SAKE	−1.7413 (0.0408)
	B-KELM	−1.8041 (0.0356)	−1.7998 (0.0359)
	KELM	−2.0143 (0.0220)	−1.9951 (0.0230)	−1.7654 (0.0387)
	B-MLP	−2.5639 (0.0052)	−2.3674 (0.0090)	−1.8123 (0.0350)	−1.7741 (0.0380)
	MLP	−3.3691 (0.0004)	−3.1231 (0.0009)	−2.1526 (0.0157)	−1.9526 (0.0254)	−1.6979 (0.0448)
	SARIMAX	−4.2141 (0.0000)	−4.0142 (0.0000)	−3.9896 (0.0000)	−3.6362 (0.0001)	−2.9958 (0.0014)	−2.2036 (0.0138)
	ARDL	−4.9035 (0.0000)	−4.6733 (0.0000)	−4.4305 (0.0000)	−4.2638 (0.0000)	−3.5340 (0.0002)	−3.1025 (0.0010)	0.9146 (0.1802)
Germany	SAKE	−1.7958 (0.0363)
	B-KELM	−1.9011 (0.0286)	−1.7896 (0.0368)
	KELM	−2.0034 (0.0226)	−1.9568 (0.0252)	−1.7963 (0.0362)
	B-MLP	−2.4968 (0.0063)	−2.2691 (0.0116)	−1.8856 (0.0297)	−1.7985 (0.0360)
	MLP	−3.2103 (0.0007)	−3.1029 (0.0010)	−2.0367 (0.0208)	−2.0014 (0.0227)	−1.7042 (0.0442)
	SARIMAX	−4.1953 (0.0000)	−4.0183 (0.0000)	−3.8561 (0.0001)	−3.6345 (0.0001)	−3.0152 (0.0013)	−2.3671 (0.0090)
	ARDL	−4.7826 (0.0000)	−4.6501 (0.0000)	−4.3908 (0.0000)	−4.3006 (0.0000)	−3.5906 (0.0002)	−2.9713 (0.0015)	−1.0207 (0.1537)
France	SAKE	−1.7013 (0.0444)
	B-KELM	−1.8354 (0.0332)	−1.7142 (0.0432)
	KELM	−1.8964 (0.0290)	−1.9648 (0.0247)	−1.7969 (0.0362)
	B-MLP	−2.1243 (0.0168)	−2.0356 (0.0209)	−1.9153 (0.0277)	−1.8015 (0.0358)
	MLP	−3.1425 (0.0008)	−3.0142 (0.0013)	−2.0148 (0.0220)	−1.9686 (0.0245)	−1.7126 (0.0434)
	SARIMAX	−4.0968 (0.0000)	−3.9163 (0.0000)	−3.7853 (0.0001)	−3.5964 (0.0002)	−2.9354 (0.0017)	−2.2012 (0.0139)
	ARDL	−4.7048 (0.0000)	−4.5389 (0.0000)	−4.4026 (0.0000)	−4.2518 (0.0000)	−3.4810 (0.0002)	−2.9647 (0.0015)	−0.7354 (0.2310)

Table 11.

The PT test results for univariate forecasting models.

Countries	Horizons	SN	SARIMA	SES
US	1-month-ahead	1.7114 (0.0870)	1.6938 (0.0903)	1.7933 (0.0729)
	3-month-ahead	1.6120 (0.1070)	1.6022 (0.1091)	1.7352 (0.0827)
	6-month-ahead	1.1963 (0.2316)	1.2039 (0.2286)	1.3001 (0.1936)
UK	1-month-ahead	1.8033 (0.0713)	1.8304 (0.0672)	1.8935 (0.0583)
	3-month-ahead	1.7509 (0.0800)	1.7429 (0.0814)	1.7536 (0.0795)
	6-month-ahead	1.2015 (0.2296)	1.1633 (0.2447)	1.1520 (0.2493)
Germany	1-month-ahead	1.7468 (0.0807)	1.7345 (0.0828)	1.8365 (0.0663)
	3-month-ahead	1.6352 (0.1020)	1.6516 (0.0986)	1.6917 (0.0907)
	6-month-ahead	1.1957 (0.2318)	1.1623 (0.2451)	1.1134 (0.2655)
France	1-month-ahead	1.7412 (0.0816)	1.8036 (0.0713)	1.7992 (0.0720)
	3-month-ahead	1.6438 (0.1002)	1.6513 (0.0987)	1.6933 (0.0904)
	6-month-ahead	1.1530 (0.2489)	1.1033 (0.2699)	1.1525 (0.2491)

Note: SN: seasonal naïve; SES: seasonal exponential smoothing.

Table 12.

The PT test results for different models.

Country	Horizons	ARDL	SARIMAX	MLP	B-MLP	KELM	B-KELM	SAKE	B-SAKE
US	1-month-ahead	1.8536 (0.0638)	1.9856 (0.0471)	2.2013 (0.0277)	2.9856 (0.0028)	3.1025 (0.0019)	3.8842 (0.0001)	4.3568 (0.0000)	4.9158 (0.0000)
	3-month-ahead	1.7905 (0.0734)	1.8335 (0.0667)	2.0109 (0.0443)	2.7992 (0.0051)	2.8519 (0.0043)	3.6539 (0.0003)	4.0985 (0.0000)	4.5985 (0.0000)
	6-month-ahead	1.3033 (0.1925)	1.2096 (0.2264)	1.9803 (0.0477)	2.3981 (0.0165)	2.4913 (0.0127)	3.3981 (0.0007)	3.7251 (0.0002)	4.0856 (0.0000)
UK	1-month-ahead	1.9346 (0.0530)	1.9921 (0.0464)	2.2103 (0.0271)	2.9936 (0.0028)	3.1127 (0.0019)	3.8911 (0.0001)	4.3602 (0.0000)	4.9011 (0.0000)
	3-month-ahead	1.8031 (0.0714)	1.8435 (0.0653)	2.0127 (0.0441)	2.8024 (0.0051)	2.8893 (0.0039)	3.6694 (0.0002)	4.1025 (0.0000)	4.5894 (0.0000)
	6-month-ahead	1.1940 (0.2325)	1.2109 (0.2259)	1.9934 (0.0462)	2.3845 (0.0171)	2.5005 (0.0124)	3.4038 (0.0007)	3.7358 (0.0002)	4.0952 (0.0000)
Germany	1-month-ahead	1.9036 (0.0570)	1.9735 (0.0484)	2.1980 (0.0279)	2.8913 (0.0038)	3.0106 (0.0026)	3.7569 (0.0002)	4.2011 (0.0000)	4.7351 (0.0000)
	3-month-ahead	1.7935 (0.0729)	1.8037 (0.0713)	2.0014 (0.0453)	2.7971 (0.0052)	2.7985 (0.0051)	3.5050 (0.0005)	4.0023 (0.0001)	4.3958 (0.0000)
	6-month-ahead	1.1987 (0.2306)	1.2003 (0.2300)	1.9802 (0.0477)	2.2021 (0.0277)	2.4801 (0.0131)	3.3912 (0.0007)	3.6918 (0.0002)	3.9965 (0.0001)
France	1-month-ahead	1.8951 (0.0581)	1.9825 (0.0474)	2.1833 (0.0290)	2.7395 (0.0062)	3.0012 (0.0027)	3.6028 (0.0003)	4.2125 (0.0000)	4.6981 (0.0000)
	3-month-ahead	1.7510 (0.0799)	1.7953 (0.0726)	1.9952 (0.0460)	2.7102 (0.0067)	2.6528 (0.0080)	3.4167 (0.0006)	3.8996 (0.0001)	4.2896 (0.0000)
	6-month-ahead	1.1432 (0.2530)	1.1952 (0.2320)	1.9733 (0.0485)	2.1537 (0.0313)	2.3341 (0.0196)	3.2533 (0.0011)	3.5853 (0.0003)	3.9561 (0.0001)

According to the DM test results (Tables 7–10), when B-SAKE is the test model, all the DM statistics are no greater than −1.7013, with corresponding p values no greater than 0.0444, which means that the B-SAKE model outperforms the benchmark models at the 95% confidence level. This indicates the superiority of the B-SAKE model. Specifically, when B-SAKE is tested against the univariate models and econometric models, its superiority is confirmed at close to the 100% confidence level (the reported p values are 0.0000). Furthermore, Tables 8–10 show that the level forecasting performance improves successively from MLP to B-MLP, KELM, B-KELM, SAKE, and B-SAKE across the forecasting horizons. B-based models and SAE-based models outperform the original models at the 95% confidence level, which demonstrates the effectiveness of bagging and SAE.

The results of the PT test are displayed in Tables 11 and 12. The forecasts of the B-KELM, SAKE, and B-SAKE models reject the null hypothesis in both one-step-ahead forecasting and multistep-ahead forecasting at close to the 100% confidence level (p values of at most 0.0011), which indicates the strong performance of these three models in directional forecasting. B-SAKE performs best in all forecasting schemes, followed by SAKE, B-KELM, KELM, B-MLP, and MLP. The univariate models and econometric models are almost ineffective in multistep-ahead forecasting, where their p values all exceed 0.05 and, at the 6-month horizon, exceed 0.19. Bagging and SAE can significantly improve performance in directional forecasting.
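As a reading aid for Tables 7–12, the sketch below converts a test statistic into a p value under a standard normal approximation. The reported values appear consistent with a one-sided (lower-tail) p value for the DM statistics and a two-sided p value for the PT statistics; this is an observation drawn from the tabulated numbers, not a convention stated in the text.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def one_sided_p(stat):
    """Lower-tail p value, as the DM tables appear to report."""
    return normal_cdf(stat)


def two_sided_p(stat):
    """Two-sided p value, as the PT tables appear to report."""
    return 2.0 * (1.0 - normal_cdf(abs(stat)))


# Statistics taken from Table 10 (France, B-SAKE vs. SAKE) and
# Table 11 (US, SN, 1-month-ahead).
p_dm = one_sided_p(-1.7013)
p_pt = two_sided_p(1.7114)
```

Both computed values agree with the tabulated p values (0.0444 and 0.0870) to within rounding.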

Summary

In this section, we train the models and conduct numerical experiments employing the multidimensional data related to tourism demand, including historical data on tourist arrivals in Beijing from four countries, economic variable data, and SII data. The forecasting results of our proposed B-SAKE model and benchmark models are compared through different forecasting schemes. In summary, some interesting implications are obtained as follows.

The proposed B-SAKE model achieves the highest forecast accuracy in terms of MAPE, NRMSE, and DS and outperforms the benchmark models in the DM test and the PT test, followed by the other AI models, whereas the univariate models and econometric models rank last.

The univariate models and econometric models are poorly suited to forecasting nonlinear, uncertain, and irregular tourism data, while nonlinear AI models have significant advantages.

As an ensemble approach, bagging can effectively mitigate overfitting and improve forecast accuracy through model averaging. The SAE is used to construct the deep learning network, which performs feature recognition effectively.

This study analyzes the forecasting performance of the models based on tourist arrival data in Beijing from four countries and reaches largely consistent conclusions, which illustrates the effectiveness and robustness of this model framework.

Discussion

To make use of tourism big data, a bagging-based multivariate ensemble deep learning model integrating SAE and KELM is proposed for tourism demand forecasting. The data applied in this article include historical data on tourist arrivals in Beijing, economic variable data, and SII data. The empirical results indicate that our proposed B-SAKE model substantially outperforms the benchmark models in different forecasting schemes (one-step-ahead vs. multistep-ahead and in-sample vs. out-of-sample). In particular, we analyze the cases of forecasting tourist arrivals from four countries and reach consistent conclusions. Moreover, bagging and SAE in our ensemble deep learning framework effectively mitigate the overfitting problem and improve forecasting accuracy by increasing the data volume and improving the feature extraction efficiency, respectively.

The ensemble deep learning model we propose has significant implications for relevant government officials and tourism practitioners. This accurate and reliable forecasting model helps these stakeholders design and implement effective policies and strategies to meet the potential needs of tourists, thus improving the quality of tourism services and the competitiveness of destinations. In addition, this ensemble deep learning framework can be applied to other complex forecasting problems, for instance, passenger flow forecasting, electric load forecasting, and financial market forecasting.

A limitation of this study is that other newly developed deep learning models in the tourism forecasting literature are not used as benchmark models. This is because the details of model construction have not been clearly stated in previous literature, and few data sets have been shared to support comparability. Fortunately, Zhang, Li, et al. (2021) have noticed this issue and released their data set on GitHub. Additionally, it is worth noting that weather, safety factors, and online comment data have been taken into consideration in the tourism demand literature (Chen et al., 2015; Ghaderi et al., 2017; Sohrabi et al., 2020). These data can theoretically be incorporated into our proposed B-SAKE model to further improve the forecast accuracy. It is also meaningful to explore the preprocessing of these types of data and the construction of indicators related to tourism demand.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research work was partly supported by the National Natural Science Foundation of China under Grants No. 71988101 and No. 71642006, the Fundamental Research Funds for the Central Universities under Grant No. SK2021007.

ORCID iD

Shaolong Sun

Yanzhao Li

Shouyang Wang

References

Anderson, Athanasopoulos and Vahid (2007) Nonlinear autoregressive leading indicator models of output in G-7 countries. Journal of Applied Econometrics 22: 63–87.

Athanasopoulos, Hyndman, Song, et al. (2011) The tourism forecasting competition. International Journal of Forecasting 27: 822–844.

Athanasopoulos, Song and Sun (2017) Bagging in tourism demand modeling and forecasting. Journal of Travel Research 57: 52–68.

Bangwayo-Skeete and Skeete (2015) Can Google data improve the forecasting performance of tourist arrivals? Mixed-data sampling approach. Tourism Management 46: 454–464.

Bengio, Lamblin, Popovici, et al. (2006) Greedy layer-wise training of deep networks. Advances in Neural Information Processing Systems 19: 153–160.

Bokelmann and Lessmann (2019) Spurious patterns in Google Trends data: An analysis of the effects on tourism demand forecasting in Germany. Tourism Management 75: 1–12.

Breiman (1996) Bagging predictors. Machine Learning 24: 123–140.

Bunn (1989) Forecasting with more than one model. Journal of Forecasting 8: 161–166.

Cao, Wan, Zhang, et al. (2020) Hybrid ensemble deep learning for deterministic and probabilistic low-voltage load forecasting. IEEE Transactions on Power Systems 35: 1881–1897.

Chen C-M, Lin Y-C, et al. (2015) Weather uncertainty effect on tourism demand. Tourism Economics 23: 469–474.

Choi and Varian HAL (2012) Predicting the present with Google Trends. Economic Record 88: 2–9.

Chu F-L (2011) A piecewise linear approach to modeling and forecasting demand for Macau tourism. Tourism Management 32: 1414–1420.

Crouch (1992) Effect of income and price on international tourism. Annals of Tourism Research 19: 643–664.

Dergiades, Mavragani and Pan (2018) Google Trends and tourists’ arrivals: Emerging biases and proposed corrections. Tourism Management 66: 108–120.

Diebold and Mariano (1995) Comparing predictive accuracy. Journal of Business & Economic Statistics 13: 253–263.

Fesenmaier, Xiang, Pan, et al. (2010) A framework of search engine use for travel planning. Journal of Travel Research 50: 587–601.

Ghaderi, Saboori and Khoshkam (2017) Does security matter in tourism demand? Current Issues in Tourism 20: 552–565.

Huang G-B (2014) An insight into extreme learning machines: Random neurons, random features and kernels. Cognitive Computation 6: 376–390.

Huang G-B, Zhu Q-Y and Siew C-K (2006) Extreme learning machine: Theory and applications. Neurocomputing 70: 489–501.

Inoue and Kilian (2008) How useful is bagging in forecasting economic time series? A case study of U.S. consumer price inflation. Journal of the American Statistical Association 103: 511–522.

Jiao and Chen (2018) Tourism forecasting: A review of methodological developments over the last decade. Tourism Economics 25: 469–492.

Kumar, Patel, et al. (2020) Modelling inbound international tourism demand in small Pacific Island countries. Applied Economics 52: 1031–1047.

Lai, Lee, Chen, et al. (2017) Research on web search behavior: How online query data inform social psychology. Cyberpsychology, Behavior, and Social Networking 20: 596–602.

Law, Fong DKC, et al. (2019) Tourism demand forecasting: A deep learning approach. Annals of Tourism Research 75: 410–423.

Li, Song and Witt (2005) Recent developments in econometric modeling and forecasting. Journal of Travel Research 44: 82–99.

Li, Chen, Wang, et al. (2018) Effective tourist volume forecasting supported by PCA and improved BPNN using Baidu index. Tourism Management 68: 116–126.

Li, Li, Pan, et al. (2021) Machine learning in internet search query selection for tourism forecasting. Journal of Travel Research 60(6): 1213–1231.

Li, Pan, Law, et al. (2017) Forecasting tourism demand with composite search index. Tourism Management 59: 57–66.

Liu, Wang, et al. (2018) Hot topics and emerging trends in tourism forecasting research: A scientometric review. Tourism Economics 25: 448–468.

Lv S-X, Peng and Wang (2018) Stacked autoencoder with echo-state regression for tourism demand forecasting using search query data. Applied Soft Computing 73: 119–133.

Noh and Vogt (2013) Modelling information use, image, and perceived risk with intentions to travel to East Asia. Current Issues in Tourism 16: 455–476.

Padhi and Pati (2017) Quantifying potential tourist behavior in choice of destination using Google Trends. Tourism Management Perspectives 24: 34–47.

Pesaran and Timmermann (1992) A simple nonparametric test of predictive performance. Journal of Business & Economic Statistics 10: 461–465.

Pouyanfar, Sadiq, Yan, et al. (2018) A survey on deep learning: Algorithms, techniques, and applications. ACM Computing Surveys 51(5): 1–36.

Qiu, Ren, Suganthan, et al. (2017) Empirical mode decomposition based ensemble deep learning for load demand time series forecasting. Applied Soft Computing 54: 246–255.

Seetaram (2012) Immigration and international inbound tourism: Empirical evidence from Australia. Tourism Management 33: 1535–1543.

Shen and Song (2008) An assessment of combining tourism demand forecasts over different time horizons. Journal of Travel Research 47: 197–207.

Sohrabi, Raeesi Vanani, Nasiri, et al. (2020) A predictive model of tourist destinations based on tourists’ comments and interests using text analytics. Tourism Management Perspectives 35: 100710.

Song and Li (2008) Tourism demand modelling and forecasting: A review of recent research. Tourism Management 29: 203–220.

Song, Gao and Lin (2013) Combining statistical and judgmental forecasts via a web-based tourism demand forecasting system. International Journal of Forecasting 29: 295–310.

Song, Qiu RTR and Park (2019) A review of research on tourism demand forecasting: Launching the Annals of Tourism Research curated collection on tourism demand forecasting. Annals of Tourism Research 75: 338–362.

Song, Wong KKF and Chon KKS (2003) Modelling and forecasting the demand for Hong Kong tourism. International Journal of Hospitality Management 22: 435–451.

Stock and Watson (2012) Generalized shrinkage methods for forecasting using many predictors. Journal of Business & Economic Statistics 30: 481–493.

Sun, Wei, Tsui K-L, et al. (2019) Forecasting tourist arrivals with machine learning and internet search index. Tourism Management 70: 1–10.

Tang, Zhang, et al. (2020) A novel BEMD-based method for forecasting tourist volume with search engine data. Tourism Economics. DOI: 10.1177/1354816620912995.

Tsui WHK and Balli (2015) International arrivals forecasting for Australian airports and the impact of tourism marketing expenditure. Tourism Economics 23: 403–428.

Wen, Liu and Song (2019) Forecasting tourism demand using search query data: A hybrid modelling approach. Tourism Economics 25: 309–329.

Yang, Pan, Evans, et al. (2015) Forecasting Chinese tourist volume with search engine data. Tourism Management 46: 386–397.

Zhang, Li, Muskat, et al. (2020) Group pooling for deep tourism demand forecasting. Annals of Tourism Research 82: 102899.

Zhang, Li, Muskat, et al. (2021) Tourism demand forecasting: A decomposed deep learning approach. Journal of Travel Research 60(5): 981–997.

Zhang, Wang, Sun, et al. (2020) Knowledge mapping of tourism demand forecasting research. Tourism Management Perspectives 35: 100715.

Zhao (2017) A deep learning ensemble approach for crude oil price forecasting. Energy Economics 66: 9–16.