Abstract
This article employs a mixed logit model with bounded random parameters to analyze the return preferences of tourists visiting the city of Lisbon, Portugal. The results show that accommodation characteristics and destination attributes (accommodation range, events, food quality, expected weather, beach, overall quality, reputation, and safety) have a positive and significant impact on the probability of returning to the city. The model comparison shows that the mixed logit model with bounded random parameters outperforms the traditional mixed logit model and leads to more accurate conclusions.
The aim of this article is to analyze the covariates of repeat visitation to Lisbon, the Portuguese capital city. We employ a mixed logit model with bounded parameters that takes into account the possible heterogeneity in the attributes of the tourists interviewed. The mixed logit is a commonly applied model in the contemporary tourism literature (Correia, Barros, & Silvestre, 2006, 2007; H. Kim & Gu, 2003; W. Kim & Arbel, 1998; Nicolau & Más, 2005, 2006). However, despite its ability to overcome the homogeneity hypothesis of the standard logit model and, more importantly, the independence of irrelevant alternatives property, the model suffers from a limitation that is addressed by the mixed logit model with bounded parameters. In fact, a main problem with the mixed logit model has been specifying a functional form for the distribution of the parameters. The most commonly used form is the normal distribution, but there are a number of circumstances in which this may be regarded as inappropriate, particularly with questionnaire data. In later sections, we provide more details about the mixed logit model with bounded random parameters, which can account for these problems (Train, 2003).
In testing the model, we use a unique application. Lisbon is one of Portugal’s main tourist regions. It currently occupies ninth place in the ranking of the world’s most popular cities for congress tourism (International Congress and Convention Association, 2006). At a national level, Lisbon, with its blend of historic and modern cultures, its proximity to the ocean, and its climate, has recently overtaken the Algarve region (sun and sand, golf, etc.) as the country’s first destination for overseas visitors (Tourism Statistics, 2005). Tourist numbers to the Lisbon region experienced strong growth from the mid-1990s following the selection of the city for a range of high-profile cultural and sporting events. Lisbon was the European Capital of Culture in 1994 and then the venue for the World Expo ’98. In 2004, Lisbon hosted the Euro 2004 International Soccer Finals, in which 15 other European countries participated, bringing thousands of visitors to Portugal. This event was an excellent opportunity to promote tourism in Lisbon. More recently, with the advent of low-cost airlines flying into the city’s airport and the increasing number of cruise trips, Lisbon has become an attractive venue for international events, seminars, and conferences.
Thus, due to its importance, Lisbon presents a rich context for our analysis. As already highlighted, the model used in this study also contributes to the literature and expands the use of mixed logit models in tourism research (Train & Sonnier, 2003). Tourism questionnaires are characterized by variables that are bounded, for example, on the positive side of the statistical distribution (e.g., income and price); thus, the adoption of a stated choice experiment that takes this boundedness into account is justified.
The article is organized as follows. The next section presents a review of the related literature, followed by the methodology. The hypotheses are then presented, followed by the empirical framework and the sources of data. We then present the results, followed by the discussion and conclusions of the main findings.
Literature Review
Several past studies have analyzed the sources of repeat visitation (Baker & Crompton, 2000; Baloglu, 2001; Bigné, Sánchez, & Sánchez, 2001; Bowen, 2001; Caneen, 2003; Court & Lupton, 1997; Gallarza & Gill, 2006; Juaneda, 1996; Kozak, 2001; Kozak & Rimmington, 2000; Milman & Pizam, 1995; Oh, 1999; Pritchard, 2003; Um, Chon, & Ro, 2006). Generally, most studies agree that repeat visitation is affected by factors such as the reputation or quality of a particular destination (Barros, Butler, & Correia, 2010; Kozak, 2001; Kozak & Rimmington, 2000; Woodside & MacDonald, 1994). The intention to return can also be influenced by the number of previous visits, comfort, or familiarity with a particular destination (Iso-Ahola & Mannell, 1987). A more favorable destination image can also be a driver of future visits (Ashworth & Goodall, 1988; Bigné et al., 2001; Cooper, Fletcher, Gilbert, & Wanhill, 1993; Lee, Lee, & Lee, 2005). Other factors identified as important in explaining a visitor’s intention to return include the attributes and facilities of a particular destination.
The literature is also rich in terms of methodologies proposed to analyze repeat visitation. Stynes and Peterson (1984), for instance, proposed a logit model to estimate recreational choices, whereas Kockelman and Krishnamurthy (2004) proposed a multivariate negative binomial model for trip demand functions derived from an indirect underlying translogarithmic utility function. Both time and money budgets were incorporated into the model structure via an effective or generalized budget constraint. A nested logit model of trip mode and destination was used to calculate the effective prices for each trip proposed via nested logsum expressions.
Multinomial and nested logit models can be developed from a rigorous behavioral theory of utility maximization. However, standard multinomial logit models require discretization of choices (e.g., peak vs. no-peak travel times, trip vs. no trip); this neglects the cardinality and continuity that characterize many travel choices, such as the time of day and the number of trips made (Kockelman & Krishnamurthy, 2004). Examples of the multinomial logit model include Luzar, Diagne, EcGan, and Henning (1998), who investigated the socioeconomic and psychographic factors that influence tourists’ decisions to participate in nature-based tourism, and Morley (1994), who assessed the independent effects of tourists from Kuala Lumpur, Malaysia, to Australia with eight contexts where the prices of the Sydney alternative varied. More in line with the present research, Nicolau and Más (2005, 2006) introduced a mixed logit model to analyze Spanish tourism, and Correia et al. (2006) and Correia, Barros, and Santos (2007) used the mixed logit to analyze golf tourism. With the present research, we extend the application of the logit model one step further, adopting the mixed logit model with bounded parameters (Boxall & Adamowicz, 2002; Hess, Bierlaire, & Polak, 2005).
The Mixed Logit Bounded Parameter Model
Let us assume that a person n faces a choice among J alternatives in each of T time periods. J can be as small as 2, and T can be as small as 1. The person’s utility from alternative j in period t is
$$U_{njt} = \beta_n' X_{njt} + \varepsilon_{njt}$$
where $U_{njt}$ defines the utility derived from the choice (j = 0, 1) made by individual n, who completes the questionnaire in period t. $\beta_n$ is a vector of random parameters of individual-specific and alternative-specific attributes. $X_{njt}$ is a vector of individual-specific and destination attributes. $\varepsilon_{njt}$ is an unobserved random term in the individual’s utility of returning to the visited destination, and it is independently distributed with a Gumbel distribution. Neither $\beta_n$ nor $\varepsilon_{njt}$ is observed, and the assumption that $\varepsilon_{njt}$ is distributed IID extreme value leads to the logit model. The $\beta_n$ are assumed to vary across the population and are drawn from some distribution: $\beta_n \sim F(b, \phi)$. Thus, the unobserved parameters can be considered to have two elements: the mean of the distribution and the stochastic distribution around the mean.
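The link between extreme value errors and the logit formula can be checked numerically. The following is a minimal sketch, assuming a binary choice (return vs. not return) and a hypothetical systematic utility value; the function names are illustrative and do not come from the article. It simulates choices with Gumbel errors and compares the simulated share of returns with the logit probability.

```python
import math
import random


def gumbel_draw(rng):
    """Draw from the standard Gumbel (type I extreme value) distribution."""
    u = rng.random()
    while u == 0.0:  # random() lies in [0, 1); skip 0 to avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def simulated_return_share(v_return, n_draws=20000, seed=1):
    """Share of simulated choices in which 'return' has the higher utility.

    v_return is a hypothetical systematic utility of returning; the
    utility of not returning is normalized to zero.
    """
    rng = random.Random(seed)
    returns = 0
    for _ in range(n_draws):
        u_return = v_return + gumbel_draw(rng)
        u_not_return = 0.0 + gumbel_draw(rng)
        if u_return >= u_not_return:
            returns += 1
    return returns / n_draws


def logit_probability(v_return):
    """Binary logit probability of returning for systematic utility v_return."""
    return 1.0 / (1.0 + math.exp(-v_return))
```

For example, `simulated_return_share(0.8)` should lie close to `logit_probability(0.8)`, with the gap shrinking as `n_draws` grows.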
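In the mixed logit approach, the stochastic elements are separated; hence, the utility from option j for person n is
$$U_{njt} = b' X_{njt} + \eta_{njt} + \varepsilon_{njt}, \qquad \eta_{njt} = (\beta_n - b)' X_{njt}$$
This model assumes that $\varepsilon_{njt}$ is IID extreme value over alternatives with a mean of zero, while $\eta_{njt}$ can take on a number of alternative functional forms.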
In particular, the joint posterior distribution used in the estimation is obtained by taking into account that person n chooses alternative i in period t if $U_{nit} \geq U_{njt}\ \forall j \neq i$. Denote person n’s chosen alternative in period t as $y_{nt}$, the person’s sequence of choices over the T time periods as $y_n = (y_{n1}, \ldots, y_{nT})$, and the set of $y_n\ \forall n$ as Y. Conditional on $\beta_n$, the probability of person n’s sequence of choices is the product of the standard logit formulas:
$$L(y_n \mid \beta_n) = \prod_{t} \frac{\exp(\beta_n' x_{n y_{nt} t})}{\sum_{j} \exp(\beta_n' x_{njt})}$$
where $x_{n y_{nt} t}$ is the value of x associated with the selected choice, $y_{nt}$, in period t.
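The unconditional probability, the integral of $L(y_n \mid \beta_n)$ over all values of $\beta_n$ weighted by the density of $\beta_n$, is
$$L(y_n \mid b, \phi) = \int L(y_n \mid \beta_n)\, \varphi(\beta_n \mid b, \phi)\, d\beta_n$$
where $\varphi(\beta_n \mid b, \phi)$ is the normal density with mean b and variance $\phi$.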
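Because this integral has no closed form, it is usually approximated by averaging the conditional probability over draws of the coefficients. The sketch below illustrates that idea under simplifying assumptions that are not taken from the article: the coefficients are drawn from independent normals (a diagonal variance), and all function names and data layouts are hypothetical.

```python
import math
import random


def logit_prob(beta, x_alternatives, chosen):
    """Standard logit probability of the chosen alternative given beta."""
    utilities = [sum(b * x for b, x in zip(beta, xs)) for xs in x_alternatives]
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - max_u) for u in utilities]
    return exps[chosen] / sum(exps)


def sequence_prob(beta, choice_tasks):
    """Product of logit probabilities over one person's choice tasks.

    choice_tasks is a list of (x_alternatives, chosen_index) pairs, where
    x_alternatives holds one attribute vector per alternative.
    """
    prob = 1.0
    for x_alternatives, chosen in choice_tasks:
        prob *= logit_prob(beta, x_alternatives, chosen)
    return prob


def simulated_mixed_prob(b, sd, choice_tasks, n_draws=2000, seed=0):
    """Average sequence_prob over draws beta ~ N(b, diag(sd^2)).

    The independent-normal mixing distribution is an assumption made
    only to keep this illustration short.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        beta = [rng.gauss(m, s) for m, s in zip(b, sd)]
        total += sequence_prob(beta, choice_tasks)
    return total / n_draws
```

With the standard deviations set to zero, the simulated probability collapses to the standard logit probability evaluated at b, which is a convenient sanity check.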
Priors for both b and $\phi$ are required for implementation. The prior on b is normal with mean zero and an extremely large variance, so that it is almost flat: $k(b) \sim N(b_0, r_0)$. The prior on $\phi$ is inverted Wishart: $k(\phi) \sim IW(K, I)$, where I is the K-dimensional identity matrix. This is a conjugate prior, which has the advantage of providing a distribution that is easy to draw from while not affecting the results at convergence.
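The joint posterior distribution on $\beta_n\ \forall n$, b, and $\phi$ is
$$K(\beta_n\ \forall n, b, \phi \mid Y) \propto \prod_{n} L(y_n \mid \beta_n)\, \varphi(\beta_n \mid b, \phi)\, k(b, \phi)$$
where $k(b, \phi)$ is the prior on b and $\phi$.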
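Draws from the joint posterior are obtained by Gibbs sampling (Train, 2003): a draw of the mean b is taken conditional on $\phi$ and $\beta_n\ \forall n$, as if these were known; then a draw of $\phi$ is taken conditional on b and $\beta_n\ \forall n$; and finally a draw of $\beta_n\ \forall n$ is taken conditional on b and $\phi$. With N denoting the number of respondents, the resulting three conditional posteriors are
$$K(b \mid \phi, \beta_n\ \forall n) = N(\bar{\beta}, \phi / N), \qquad \bar{\beta} = \frac{1}{N} \sum_{n} \beta_n$$
$$K(\phi \mid b, \beta_n\ \forall n) = IW\left(K + N, \frac{K I + N \bar{S}}{K + N}\right), \qquad \bar{S} = \frac{1}{N} \sum_{n} (\beta_n - b)(\beta_n - b)'$$
$$K(\beta_n \mid b, \phi, y_n) \propto L(y_n \mid \beta_n)\, \varphi(\beta_n \mid b, \phi)$$
The sequence of draws from these conditional posteriors converges to a draw from the joint posterior.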
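The two conjugate steps of this sampler can be sketched in a few lines. The sketch below is an illustration under stated assumptions rather than the estimation code used in the article: it treats a single scalar coefficient (K = 1), takes the prior on b as flat, uses the inverted Wishart prior with K degrees of freedom and identity scale, and holds the individual coefficients fixed, because the draw of the individual coefficients requires a Metropolis-Hastings step in Train (2003) that is omitted here. All function names are hypothetical.

```python
import math
import random


def draw_mean(betas, phi, rng):
    """Draw b given phi and the individual coefficients.

    Under a flat prior the conditional posterior is normal, centered on
    the average coefficient with variance phi / N.
    """
    n = len(betas)
    beta_bar = sum(betas) / n
    return rng.gauss(beta_bar, math.sqrt(phi / n))


def draw_variance(betas, b, rng, prior_df=1):
    """Draw phi given b and the individual coefficients (scalar case).

    Scalar inverted Wishart with prior_df + N degrees of freedom and
    scale (prior_df * 1 + sum of squared deviations) / (prior_df + N).
    """
    n = len(betas)
    df = prior_df + n
    sum_sq = sum((beta - b) ** 2 for beta in betas)
    scale = (prior_df * 1.0 + sum_sq) / df
    # Average df squared normals with variance 1 / scale, then invert.
    s = sum((rng.gauss(0.0, 1.0) / math.sqrt(scale)) ** 2 for _ in range(df)) / df
    return 1.0 / s


def gibbs_mean_variance(betas, n_iter=200, seed=0):
    """Alternate the two conjugate draws, holding the betas fixed."""
    rng = random.Random(seed)
    phi = 1.0  # arbitrary starting value
    b_draws, phi_draws = [], []
    for _ in range(n_iter):
        b = draw_mean(betas, phi, rng)
        phi = draw_variance(betas, b, rng)
        b_draws.append(b)
        phi_draws.append(phi)
    return b_draws, phi_draws
```

In a full sampler the individual coefficients would be redrawn at every iteration, so that the three layers inform one another; holding them fixed here only isolates the conjugate updates.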
The model estimation involves identifying the distribution parameters for the assumed functional form (the mean and the variance for a two-parameter distribution). Hensher and Greene (2003) note that
the distribution choice is essentially an arbitrary approximation to the real world. We select a specific distribution because we have a sense that the empirical truth is somewhere in the chosen distribution domain. All distributions have, in practice, at least two major deficiencies—typically with respect to sign and length of tail. (p. 146)
The most common distributions are the normal and log-normal distributions (Bhat, 1998, 2000; Hensher & Greene, 2003; Revelt & Train, 1997). The normal distribution implies that there will be some individuals who have, with some finite if small probability, extreme positive and extreme negative valuations of an attribute. In some cases, such as money, theory posits that preferences should be nonnegative, and assuming a normal distribution violates this restriction. The log-normal distribution addresses this problem, since it can be restricted to either the positive or the negative sign. However, this advantage is compromised by its long tail, which tends to generate a large range of implausible values, and by its zero probability mass at zero. Therefore, bounded distributions were proposed to overcome these implausible distributional assumptions.
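The contrast between these distributional assumptions can be illustrated by transforming the same latent normal draws in three ways. The sketch below assumes, for illustration only, a Johnson SB type logistic transformation as the bounded case, which is one of the transformations of normals discussed by Train and Sonnier (2003); the function names, parameter values, and bounds are hypothetical and are not the article's estimates.

```python
import math
import random


def normal_coef(z, mean, sd):
    """Normally distributed coefficient: unbounded in both directions."""
    return mean + sd * z


def lognormal_coef(z, mean, sd):
    """Log-normal coefficient: strictly positive, with a long right tail."""
    return math.exp(mean + sd * z)


def johnson_sb_coef(z, mean, sd, lower, upper):
    """Bounded coefficient: logistic transform of a latent normal.

    The result always lies between lower and upper, so neither the sign
    problem of the normal nor the long tail of the log-normal arises.
    """
    latent = mean + sd * z
    return lower + (upper - lower) / (1.0 + math.exp(-latent))


def compare_ranges(n_draws=1000, seed=3):
    """Return the (min, max) of each transformation on shared normal draws."""
    rng = random.Random(seed)
    zs = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    normal = [normal_coef(z, 0.5, 1.0) for z in zs]
    lognormal = [lognormal_coef(z, 0.5, 1.0) for z in zs]
    bounded = [johnson_sb_coef(z, 0.5, 1.0, 0.0, 2.0) for z in zs]
    return {
        "normal": (min(normal), max(normal)),
        "lognormal": (min(lognormal), max(lognormal)),
        "bounded": (min(bounded), max(bounded)),
    }
```

Calling `compare_ranges()` shows the normal draws crossing zero, the log-normal draws reaching far into the right tail, and the bounded draws staying inside the chosen interval.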
Research Hypotheses
The tourism return choice can be explained by several factors. The theoretical framework supporting the present research follows Fishbein and Ajzen’s (1980) theory of reasoned action (Baker & Crompton, 2000), as applied in management and economics research, and the role theory of tourism behavior (Cohen, 1972; Pearce, 1982; Yannakis & Gibson, 1992) from the perspectives of sociology and ethnography. Both theories take into account different variables to explain tourism choice, namely, destination attributes and travel characteristics. The tourist is regarded as a rational individual who decides to visit a location according to its attributes, conditioned by previous experience (Howard & Seth, 1969). This assumption highlights the importance of travel characteristics and destination attributes in the returning choice.
The survey questionnaire therefore gathered data pertaining to (a) accommodation characteristics, measured by accommodation range and accommodation rate; (b) travel characteristics, measured by travel time and travel cost; (c) destination attributes (events, food quality, nightlife, expected weather, beach and safety); (d) destination reputation; and (e) number of previous visits. Using the survey data on these characteristics, we tested the following hypotheses:
Hypothesis 1 (Accommodation range): A higher accommodation range gives tourists more choice flexibility. It also helps in increasing destination quality as well as satisfying tourists from different social and market segments (Woodside & MacDonald, 1994). Thus, we hypothesize that return choice is a positive function of the accommodation range.
Hypothesis 2 (Accommodation rate): An increase in price is traditionally hypothesized to have a negative effect on tourism demand (Kozak, 2001; Varian, 1987). This is especially true in competitive markets where tourists search for higher value products at cheaper prices. Thus, we hypothesize that return choice is a negative function of accommodation rate.
Hypothesis 3 (Travel time and travel costs): It is a traditional hypothesis in tourism models (Cook & Frechtling, 1976; Martínez Espiñeira & Amoako Tuffour, 2008) that trip demand is negatively related to travel time and travel cost. We hypothesize that the same applies to our present context.
Hypothesis 4 (Destination attributes): Woodside and Lysonski (1989) argue that repeat visitation can be enhanced through tourists’ enjoyment and contacts with the attributes of a particular destination (Wanhill & Lundtorp, 2001). The variables used here to test this hypothesis are events, food quality, nightlife, expected weather, beach, and safety.
Hypothesis 5 (Overall quality): Repeat visitation can be a result of the overall quality of a particular destination (Alegre & Cladera, 2006; Chi & Qu, 2008), since destinations with higher quality can also maintain client loyalty. Thus, we hypothesize that return choice is a positive function of overall quality.
Hypothesis 6 (Reputation): The reputation of a particular destination (Ledesma, Navarro, & Perez Rodriguez, 2005) affects repeat visitation, as it is a direct reflection on the quality of tourism products. We build here on this argument and we hypothesize that return choice is a positive function of the reputation of a particular destination.
Hypothesis 7 (Number of previous visits): Several studies in the past demonstrated that the number of previous visits is positively related to repeat visitation as it enhances the tourists’ awareness and familiarity with a particular destination (Court & Lupton, 1997; Kozak & Rimmington, 2000; Milman & Pizam, 1995). Thus, we hypothesize here that return choice is a positive function of the number of previous visits.
Empirical Framework
The present research analyzes the tourism return choice among alternative destination characteristics. Stated choice experiments were designed to elicit the responding tourist’s choice from among alternative destination attributes.
Choice Modeling
Choice modeling is a questionnaire-based approach that observes the choices made by individuals. There are two types of choice modeling: revealed preference models, which collect choices in real markets, and stated preference models, which collect choices in hypothetical markets. The present research adopts the stated preference approach to answer why tourists choose to return to Lisbon. Revealed preference models have been used in all cited mixed logit articles. The choices are described in vignettes or scenarios, which are widely used in marketing research to elicit preferences for product characteristics. A product can be seen as a combination of different attributes that can have different characteristics. Vignettes can be best described as hypothetical situations that can be used to elicit preferences, judgments, or anticipated behavior (McFadden et al., 2005). A consumer’s decision about buying a tourism product may, for instance, depend on characteristics such as price and quality (accommodation rate). In many applications, it is impossible to gather data that contain sufficient sets of individuals’ choices. The main advantage of using vignettes over standard survey questions is that they allow multiple attributes to be taken into account and varied randomly. This allows the relative importance of characteristics to be estimated, and by presenting characteristics on a vignette simultaneously, the tendency to give socially desirable answers is reduced. A sample vignette is given in the appendix.
Experimental Design
A questionnaire that contained questions on travel behavior, travel motivation, and choice experiments was designed on the basis of previous studies (Barros et al., 2010; Correia et al., 2007). It was pretested on economics students at the Instituto Superior de Economia e Gestão, Technical University of Lisbon. Furthermore, the ordinal organization of constructs, ranging from more peripheral to more central dimensions of meaning, was taken into consideration in the study. However, recognizing that laddering frequently does not produce constructs that qualify as superordinate, and given that the variables in the regression we intended to estimate were by hypothesis independent, we adopted the following ordinality: variables are ordered from antecedent to subsequent in the tourist’s decision process, in the order presented in Table 1. Independence among the variables was nevertheless assumed throughout the process.
Choice Attributes and Levels and Characteristics of the Variables
In the experiment, each individual tourist was asked to indicate his or her choice in a series of choice tasks with consecutive questionnaires, each one with possible hypothetical scenarios, in which hotel and site characteristics were described in terms of important contextual attributes. The respondents therefore ranked the hypothetical scenarios presented to them (as is usual in choice surveys), evaluating in this way their own experiences and the resulting satisfaction levels under distinct possible scenarios. Taking into account the fact that the respondents’ actual experiences may have influenced their intention to return, the aim was to evaluate this return intention. The consecutive scenarios presented to them restrict the tendency of the interviewee to declare that he/she will return when in fact he/she does not intend to return.
The combination of all attributes and levels would result in a very large number of choice sets, which would be impractical to present as a whole to respondents. To overcome this obstacle while maintaining statistical efficiency, each choice experiment describing hypothetical contextual attributes was defined by four attributes. Each attribute was defined by variable levels, resulting in a fractional factorial design, thus ensuring that all choice alternatives occurred with equal frequency without having to use all theoretically possible combinations (full factorial). A constant base alternative, the intention not to revisit Lisbon, was added to each scenario. Accordingly, 15 vignettes with distinct combinations of the attributes were presented to tourists. For each main decision, the attribute information shown in Table 1 was listed. The choice experiments were designed to provide as wide a variation in each attribute and as little covariance among attributes as possible, while maintaining plausibility.
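The saving offered by a fractional design can be seen in a small example. The sketch below is purely illustrative and does not reproduce the design used in the survey: it assumes four attributes with three levels each (a hypothetical choice, since the actual levels are given in Table 1) and builds a standard nine-run orthogonal fraction of the 81-run full factorial.

```python
from collections import Counter
from itertools import combinations, product

N_LEVELS = 3       # hypothetical number of levels per attribute
N_ATTRIBUTES = 4   # four attributes per choice experiment


def full_factorial():
    """Every combination of levels across the four attributes (3^4 runs)."""
    return list(product(range(N_LEVELS), repeat=N_ATTRIBUTES))


def fractional_design():
    """Nine-run orthogonal fraction for four three-level attributes.

    The first two attributes take all nine level pairs; the third and
    fourth are derived from them by arithmetic modulo 3.
    """
    return [
        (a, b, (a + b) % N_LEVELS, (a + 2 * b) % N_LEVELS)
        for a, b in product(range(N_LEVELS), repeat=2)
    ]


def is_balanced(design):
    """Check that each level of each attribute appears equally often."""
    expected = len(design) // N_LEVELS
    for col in range(N_ATTRIBUTES):
        counts = Counter(run[col] for run in design)
        if any(counts[level] != expected for level in range(N_LEVELS)):
            return False
    return True


def is_orthogonal(design):
    """Check that every pair of attributes shows each level pair once."""
    for c1, c2 in combinations(range(N_ATTRIBUTES), 2):
        pairs = {(run[c1], run[c2]) for run in design}
        if len(pairs) != N_LEVELS ** 2:
            return False
    return True
```

Here nine scenarios replace 81 while every level still appears equally often and no two attributes are correlated across scenarios, which is the property the fractional design relies on.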
Data
The survey was conducted in March 2008, with the support of the Lisbon Tourism Authority (an institution supported by the Lisbon Municipal Council). The population for the survey consisted of tourists staying in city hotels. The hotels were randomly selected, as were the hotel guests, who were approached on the basis of their room numbers. The initial aim was to obtain 1,500 questionnaires from tourists visiting Lisbon in March 2008 (during the Easter vacation period), each tourist being invited by the hotel management to complete the questionnaire with the aid of interviewers. The active support of the Lisbon Tourism Authority was fundamental to gaining the agreement of the randomly selected hotels to participate in the study. A total of 1,495 tourists took part in the survey; of this number, only 1,035 were retained for the analysis, while 440 were rejected due to incomplete forms or response errors. Thus, the sample response rate was 68%. Most respondents were the oldest in their family and had traveled more than once to the city.
The survey consisted of 15 scenarios, each describing a hypothetical tourism package, as displayed in Table 1. The participant was asked to evaluate the information provided by each scenario relative to his/her intention to return to Lisbon. The second part of the survey obtained socioeconomic information on the individual.
Results
Based on the data gathered, a mixed logit model with bounded parameters was fitted. The dependent variable is the observed dichotomous variable (return), which classifies the individuals according to whether they declared that they would consider returning to Lisbon (=1) or would not return (=0), and the explanatory variables are those listed in Table 2. The mixed logit model with normal distribution is presented as a reference for comparative purposes. There was no prior view of the distribution for most of the variables.
Table 2. Estimation Results for the Mixed Logit and the Mixed Logit With Bounded Parameters
Statistically significant.
We use the deviance information criterion (DIC) to compare the traditional mixed logit model and the bounded mixed logit model. DIC is a measure of the goodness of fit of an estimated model; models with a smaller DIC have a better fit. Based on this criterion (Table 2), it is clear that the bounded mixed logit with alternative distributions presents a superior fit. A comparison of the log-likelihoods also indicates that the bounded mixed logit model is a better fit.
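For readers unfamiliar with the criterion, the following sketch shows how DIC is computed under its usual definition: the posterior mean deviance plus the effective number of parameters, where the latter is the mean deviance minus the deviance at the posterior mean of the parameters. The function names are illustrative, and the log-likelihood values in the usage note are hypothetical and do not come from Table 2.

```python
def deviance(log_likelihood):
    """Deviance is minus twice the log-likelihood."""
    return -2.0 * log_likelihood


def dic(loglik_draws, loglik_at_posterior_mean):
    """Return (DIC, effective number of parameters).

    loglik_draws: log-likelihood evaluated at each retained posterior draw.
    loglik_at_posterior_mean: log-likelihood evaluated at the posterior
    mean of the parameters.
    """
    mean_deviance = sum(deviance(ll) for ll in loglik_draws) / len(loglik_draws)
    deviance_at_mean = deviance(loglik_at_posterior_mean)
    p_d = mean_deviance - deviance_at_mean  # effective number of parameters
    return mean_deviance + p_d, p_d
```

With hypothetical log-likelihood draws of -105, -103, and -104 and a value of -102 at the posterior mean, the mean deviance is 208, the effective number of parameters is 4, and the DIC is 212; of two models fitted to the same data, the one with the smaller DIC is preferred.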
The posterior means of the model parameters are presented in Table 2. Small standard deviations indicate that a particular parameter estimate is significant, that is, that the variable has an impact on the likelihood of returning to Lisbon. The probability of returning increases significantly with accommodation range, events, food quality, expected weather, beach, overall quality, nightlife, reputation, and safety, and so we accept Hypotheses 1 and 4. The impact of these variables is significant in both models. The probability of returning decreases with accommodation rate and travel cost, but not significantly in the mixed logit model. However, these variables become significant in the mixed logit model with bounded parameters. We thus accept Hypotheses 2 and 3. The overall quality and reputation variables, which are insignificant in the classic logit model, become statistically significant in the mixed logit model with bounded parameters; this is a critical finding. Thus, we accept Hypotheses 5 and 6. Finally, we accept Hypothesis 7, since the number of previous visits increases the probability of a further return to Lisbon.
Conclusion and Policy Implications
In this article, a mixed logit model and a mixed logit model with bounded parameters were used to determine the variables that affect the decision of tourists to return to Lisbon. Assuming that tourists are heterogeneous, the models were estimated on the basis of the probability that a tourist will return to Lisbon. The findings point to a significant association between the probability of returning to Lisbon and the exogenous variables. A key finding was that the overall quality and reputation variables, which are statistically insignificant in the logit model, turned out to be statistically significant in the mixed logit model, validating previous results by Alegre and Juaneda (2006). This finding draws attention to variables that lack a probability density at zero (a particular problem in questionnaire data, as some answers are positively defined in the choice attributes). From the results it was also clear that the effects of destination attributes (accommodation range, events, food quality, expected weather, beach, overall quality, reputation, and safety) are positive and statistically significant, validating previous research in the area (Alegre & Juaneda, 2006).
Thus, this study makes two unique contributions to the tourism literature. First, it presents a mixed logit model with bounded parameters, which enables real data to be adequately fitted to statistical distributions. In questionnaire data, this is of the utmost importance, since the variables are, for the most part, non-normal. Second, we provide a better understanding of the importance of the tourist-return hypothesis and the decision process involved.
The results seem to be in line with the main tourism policies adopted in Portugal, which currently focus on four main issues: (a) increased funding for new hotel construction and hotel development; (b) a strong focus on improving the country’s reputation through extensive marketing campaigns in foreign destinations, especially in traditional inbound countries such as the United Kingdom, France, Germany, the Netherlands, and Italy; (c) a strong focus on recruiting human capital of excellence and developing market niches; and (d) increased investments in new regulations that protect the health and safety of tourists.
Thus, the results provide further incentive to the Portuguese government to focus on the above issues. The findings also highlight areas where policies should be focused in the future, especially in terms of travel cost and travel time, which have a negative impact on repeat tourism. For instance, one measure that the government needs to adopt is the reduction of airport taxes. Spain, for example, has lower airport taxes and to a great extent competes with Portugal with similar services and products. The reduction of airport taxes, therefore, would permit lower traveling costs and hence a more competitive position in comparison to Spain. On travel time, the transport minister should think twice about the construction of the new airport planned outside Lisbon. The present airport location is only 10 minutes from downtown and this is a main competitive characteristic of the city. Thus, if the decision to build the new airport goes ahead, more careful and effective policies are needed to control the travel time and facilitate the movement of tourists between the airport and the main city.
Finally, and despite all the policies adopted by the government to promote Lisbon and improve the accommodation quality, more work is needed to improve these important characteristics. Government agencies, for instance, have generally been ineffective in promoting Lisbon as a tourism destination and providing the infrastructure that tourism needs to flourish, so the much-needed dynamism in the tourism sector is more likely to come from private organizations than public ones. In the accommodation sector, more funding and assistance are needed from the government to boost the accommodation quality, which is still very low in many areas of the city compared with other competing destinations. In addition, future policies should focus on promoting nightlife activity in the city. Security should also be maintained and promoted further to help the tourists.
Appendix
(i) Stated preferences start by defining the attributes of the questions put to the respondent. These questions are the variables used in the model. In order to control the dimension of the alternative combinations, the questions were split into two types. Questions 1 to 5 adopted the stated preference procedure. Questions 6 to 12 were inserted only in the first vignette and therefore followed the revealed preference procedure. The variables/attributes are displayed below.
(ii) Stated preferences (SP) questions and levels.
(iii) An orthogonal design was first established for Questions 1 to 14. Based on this orthogonal design, the vignettes/scenarios were established.
(iv) A sample vignette/scenario
Therefore each vignette was a combination of each attribute defined by the orthogonal design. Only meaningful relations were retained from the initial orthogonal design and therefore each respondent answered 15 vignettes with alternative combinations. Each respondent selected a column of each vignette/scenario. In total, we have 1,035 questionnaires corresponding to 69 respondents.
