Abstract
Temporal transferability of model parameters is a critical issue, especially in developing countries, where data and resources for transport model development are extremely limited. This study investigates the temporal transferability of vehicle ownership models, with special emphasis on the effect of model structure on temporal transferability. The performance of potential updating methods for making the models more transferable is also compared. Household survey data collected in Dhaka, Bangladesh in 2005 and 2010 are used for this purpose. Different forms of random utility and count regression models of car, motorcycle, and bicycle ownership have been developed using income, household size, and the numbers of workers, children, and licensed drivers as explanatory variables. The temporal transferability of each model between the two time periods has been compared rigorously using statistical tests. Results indicate that the multinomial logit model has better temporal transferability than the count regression models. In relation to model updating, the combined transfer estimation method is found to perform better than Bayesian updating. The findings can provide useful guidance when applying a pre-existing model in the context of a developing country.
The economic growth of a developing country is very often highly interlinked with the growth of its transport sector. In recent years, economic growth has facilitated rapid urbanization in most countries of the developing world followed by exponential increase in ownership and use of private vehicles ( 1 ). This, in turn, has increased the demand for transport infrastructure and services and in many cases intensified the negative transport externalities such as air pollution, high energy consumption, and loss of lives from accidents ( 2 ). These highlight the extreme importance of robust travel-demand models that can be used for informing and guiding policy decisions directed at sustainable planning, control, and management of transport services and infrastructure. Development of travel-demand models however often requires significant resources which are not readily available in developing countries given the financial constraints. Transferability of the models across time, either in the original form or with limited updating, offers an economic solution to this problem.
Several studies have investigated temporal transferability of travel behavior but mostly using data from developed countries ( 3 – 8 ). However, developing countries typically have very different transport contexts and travel behavior. For example, in 2013, the per capita motorized vehicle ownership was about 0.52 in the UK and 0.058 in China ( 9 ). Further, the transport and economic landscape are changing at much faster rates in the developing countries. China, for example, is expected to have a 13–17% per year increase in car ownership till 2020 ( 10 ), whereas recent data indicates that many of the European countries have almost reached their “peak car” levels ( 11 ). This warrants the need for detailed research in the context of developing countries on vehicle ownership models, their temporal transferability and measures to improve temporal transferability.
The particular research questions we investigate are as follows:
Which model structure best explains the vehicle-ownership decision in the context of a developing country where the car-ownership level is very low compared with the total population?
How does the performance of different model structures compare in relation to temporal transferability?
Which candidate method has the best performance in improving the temporal transferability?
Disaggregate data collected from Dhaka, the capital of Bangladesh and one of the fastest growing megacities in the world, have been used to investigate these research questions. Dhaka already hosts more than 18 million people and attracts 300,000 to 400,000 new migrants every year from different parts of the country ( 12 ). To meet the mobility demands of the rapidly growing population, the number of vehicles is increasing at an alarming rate. According to the Bangladesh Road Transport Authority (BRTA), the number of newly registered vehicles in Dhaka rose from 21,471 in 2004 to 95,743 in 2015 ( 13 ). The rapidly changing demography and transport scenario of the city (further detailed in this paper under Results) make it an interesting test bed for research on temporal transferability. Further, despite the very high growth rate, vehicle ownership levels in Dhaka are among the lowest in the world, with more than 90% of households not owning any vehicle (car, motorcycle, or bicycle). From a modelling perspective, this excessive occurrence of zeros in the dependent variable poses additional challenges and prompts us to investigate the most appropriate model structure for predicting vehicle ownership in the context of very low ownership levels.
In this research, vehicle ownership models are estimated using household survey data from two time periods: 2005 ( 14 ) and 2010 ( 15 ). The models are disaggregate and are estimated at the household level, taking into consideration the communal nature of travel decisions ( 16 ). To account for the excessive zeros in the vehicle ownership data (which are typical in the developing world), different model structures have been estimated and compared in relation to goodness of fit and temporal transferability. Potential methods to improve temporal transferability have also been tested. The rest of this paper is organized in the following sequence: data, methodology, results, and conclusions.
Data
The data used in this study were obtained from two household surveys conducted in the Dhaka Metropolitan Area, Bangladesh in 2005 and 2010. The area definitions were the same in both years, though the 2010 sample (18,084 households) is much bigger than the 2005 one (655 households). The surveys, originally conducted for developing the Strategic Transport Plan for Dhaka ( 14 ) and the Dhaka Urban Transport Strategy ( 15 ), respectively, used the same questionnaire and identical stratified sampling strategies. The socio-demographic data collected in the surveys included household income, total number of persons per household, and household composition (e.g., number of children, workers, students, and licensed drivers). Table 1 presents a comparison of the household-level demographics in the two datasets, and more detailed statistical analyses are presented in Figures 1 and 2.
Table 1. Comparison of the Two Datasets
Note: The GDP per capita in Dhaka changed from 485.21 USD to 762.81 USD during this period ( 9 ). The exchange rates were 1 USD = 84.10 BDT (2005) and 1 USD = 69.53 BDT (2010).

Figure 1. Vehicle ownership distribution.

Figure 2. Demographic distribution and vehicle ownership: (a) vehicle ownership and number of workers, (b) vehicle ownership and household size, and (c) vehicle ownership and licensed drivers.
As seen in Table 1, there are some similarities between the demographics in the two datasets as well as differences. For instance, the average number of licensed drivers is significantly higher in the 2010 sample and the average incomes are higher as well. The differences are not unexpected though, given the increase in GDP in this period.
Figure 1 shows the detailed comparison of the type of vehicle ownership in the two datasets. As seen in the figure, the total percentage of households owning at least one car is very low (around 6%), out of which less than 1% own two or more cars. Bicycle ownership is also very low among the surveyed households, with no household owning more than one bicycle. It may be noted that, though Dhaka is a flat city, bicycles are not popular because of safety issues (there are no designated cycle lanes) and security issues (bicycles can easily be stolen in the absence of proper bicycle racks). Also, it is considered culturally improper for women to ride bicycles.
In relation to vehicle ownership among different demographic groups, car and motorcycle ownership is in most cases highest among households with two or more full-time working members, the exception being car ownership in 2005 (Figure 2a). This is expected, as an increase in workers reflects an increase in household income. However, though vehicle ownership is low for households with no full-time working members, some such households in the 2010 dataset do report owning vehicles. It is suspected that these households may have one or more members working part-time or members who have retired from their jobs. Vehicle ownership rates are higher for larger households (Figure 2b). This is expected, as mobility needs increase with the number of people in a household. The number of license holders in a household is only weakly correlated with the number of cars (Figure 2c). This is not unexpected, given that most of the cars in Dhaka are chauffeur driven and it is common to own a car without having a driving license, or to have a driving license but not own a car (e.g., when working as a chauffeur, which is a low-income profession).
Methodology
The candidate model structures, methods for testing transferability, and updating the model parameters (using limited data) are discussed in this section. For all cases, the state of the art is presented first followed by details of the selected methods.
Model Structures
Vehicle ownership is typically modelled using ordered and unordered discrete choice models or count regression models.
Among the unordered models, multinomial logit (MNL) and nested logit (NL) models have been widely used for their ease of analysis and availability of estimation software, both in medium and long term (see 17 for further details). In these models, following the principles of utility maximization, the decision maker chooses the alternative that provides the greatest satisfaction. Therefore, for a given set of alternatives, the probability of household n choosing alternative i, given choice set
where
The MNL structure assumes that the error term is independently and identically (Gumbel) distributed across households. The probability of household n selecting vehicle ownership alternative i, is therefore expressed as
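As a minimal sketch of the standard MNL choice probability, the snippet below computes exp(V_in) divided by the sum of exp(V_jn) over the alternatives in the choice set. The symbol V_in (systematic utility of alternative i for household n), the function name, the alternative labels, and the utility values are all assumptions made for illustration; they are not taken from the estimated models.

```python
import math


def mnl_probabilities(utilities):
    """Return MNL choice probabilities for one household.

    `utilities` maps each alternative to its systematic utility V_in.
    """
    # Subtract the largest utility for numerical stability; the shift cancels out.
    v_max = max(utilities.values())
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    total = sum(exp_v.values())
    return {alt: e / total for alt, e in exp_v.items()}


# Hypothetical utilities, for illustration only (not estimates from this study).
example_utilities = {"no_vehicle": 0.0, "motorcycle_or_bicycle": -2.0, "car": -3.0}
probs = mnl_probabilities(example_utilities)
```

Because only utility differences matter in the logit formula, adding a constant to every utility leaves the probabilities unchanged.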
Ordered models are based on the assumption that a latent intangible variable represents a household’s propensity to own vehicles and the probabilities of owning a certain number of vehicles are then obtained by matching specific ranges of the values of the latent variable to the corresponding numbers using
where
Among the count regression models, the Poisson and the negative binomial regression models are some of the most commonly used to estimate and analyze count data, though their applications have been primarily in the context of crash frequency (see 18 for a comprehensive review) and trip generation (e.g., 19–22), with a few applications for vehicle ownership decisions (e.g., 23, 24).
In Poisson regression, it is assumed that the number of occurrences (k) of the dependent variable y has a Poisson distribution given the independent variables X1, X2, …, Xn as
It assumes
where
X1, X2,…, Xn = household characteristics in the context of vehicle ownership,
N = sample size,
Pearson’s goodness-of-fit test is used to check whether the model is appropriate for the data distribution. A p-value below 0.05 indicates that the data are significantly different from a Poisson distribution.
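The Poisson set-up described above can be sketched as follows, assuming the usual log-linear link between the mean and the household characteristics and the usual Pearson chi-square statistic. The function names, coefficient values, and data are hypothetical, and the exact form of the test used in the study may differ.

```python
import math


def poisson_mean(coefficients, features):
    """Log-linear mean: lambda = exp(b0 + b1*x1 + ... + bn*xn)."""
    intercept, *slopes = coefficients
    return math.exp(intercept + sum(b * x for b, x in zip(slopes, features)))


def poisson_pmf(k, lam):
    """P(y = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def pearson_statistic(observed, fitted_means):
    """Pearson chi-square statistic: sum of (y - mu)^2 / mu.

    Under the Poisson assumption the variance equals the mean mu.
    """
    return sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted_means))


# Hypothetical coefficients (intercept, one slope) and one household feature.
lam = poisson_mean([-3.0, 0.5], [2.0])
p_zero_cars = poisson_pmf(0, lam)
```

With a small mean such as this one, most of the probability mass falls on zero, which is the situation the low ownership levels in the data produce.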
Whereas the Poisson model assumes the mean and variance are equal, the negative binomial model takes into account the possibility of over-dispersion in the data because of large differences between the observed mean and variance (see 25 for details). The negative binomial regression model takes the form
where
To investigate model appropriateness (i.e., to test whether the negative binomial model, which has one additional parameter, is superior to the Poisson model), a likelihood ratio test is performed and the likelihood ratio (LR) is compared with the chi-square distribution using
$$LR = -2\,[LL(P) - LL(NB)]$$
where LL(P) and LL(NB) are the log-likelihoods of the Poisson model and the negative binomial model respectively and use of the negative binomial model is justified if LR is greater than the critical chi-square value at k degree of confidence.
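A short sketch of this likelihood ratio comparison is given below. It assumes one degree of freedom (the single extra dispersion parameter) and a 5% significance level, for which the chi-square critical value is 3.841; the log-likelihood values are hypothetical.

```python
# Chi-square critical value for 1 degree of freedom at the 5% level (assumed here).
CHI2_CRITICAL_1DF_5PCT = 3.841


def likelihood_ratio(ll_poisson, ll_negbin):
    """LR = -2 * [LL(P) - LL(NB)]."""
    return -2.0 * (ll_poisson - ll_negbin)


def negative_binomial_justified(ll_poisson, ll_negbin, critical=CHI2_CRITICAL_1DF_5PCT):
    """True when LR exceeds the critical chi-square value."""
    return likelihood_ratio(ll_poisson, ll_negbin) > critical


# Hypothetical log-likelihoods, for illustration only.
lr_value = likelihood_ratio(-1250.0, -1240.0)
```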
Variants of these models are zero-inflated Poisson and zero-inflated negative binomial models, which address the issues associated with excessive zeroes in the data and are expressed respectively as
The Vuong test ( 26 ) is carried out to compare the zero-inflated models with their simpler, non-zero-inflated counterparts.
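To illustrate how zero inflation changes the count distribution, the sketch below uses one common parameterization of the zero-inflated Poisson model: with probability pi_zero a household is a structural zero, and otherwise its count follows a Poisson distribution with mean lam. The symbols pi_zero and lam and the numerical values are assumptions; the parameterization used in the study may differ.

```python
import math


def zip_pmf(k, lam, pi_zero):
    """Zero-inflated Poisson probability of observing count k."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise either structurally or from the Poisson part.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


# Hypothetical values: 80% structural zeros, Poisson mean of 0.5 otherwise.
p_zero = zip_pmf(0, 0.5, 0.8)
```

Setting pi_zero to zero recovers the ordinary Poisson probabilities, which is why the zero-inflated model can be compared against the simpler one.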
Transferability
Temporal transferability of the individual parameters is checked by testing whether or not there is a significant difference between the parameter estimates of equivalent variables in the two time periods ( 27 ). Minimum and maximum t-ratio values of –1.96 and 1.96, corresponding to the 95% confidence level, are taken as the critical values.
where
Global measures of model transferability are also obtained using the transferability test statistic (TTS) ( 27 – 29 ).
where
The transferability test statistic follows a chi-squared distribution with degrees of freedom equal to the number of parameters estimated and its value should be less than the critical chi-square value at the chosen level of significance for good transferability.
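The two transferability checks can be sketched as below, assuming their commonly used forms: a t-ratio for the difference between two independently estimated coefficients, and a TTS equal to minus twice the difference between the application-context log-likelihood evaluated at the transferred parameters and at the locally estimated parameters. Function names and numerical inputs are hypothetical, and the critical chi-square value must be supplied for the relevant degrees of freedom.

```python
import math


def parameter_difference_t(beta_a, se_a, beta_b, se_b):
    """t-ratio for the difference between two independent coefficient estimates."""
    return (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)


def parameter_is_transferable(t_ratio, critical=1.96):
    """A coefficient is treated as transferable if |t| does not exceed the critical value."""
    return abs(t_ratio) <= critical


def transferability_test_statistic(ll_transferred, ll_local):
    """TTS = -2 * [LL_appl(beta_trans) - LL_appl(beta_appl)]."""
    return -2.0 * (ll_transferred - ll_local)


def model_is_transferable(tts, critical_chi2):
    """The model is treated as transferable if TTS is below the critical chi-square value."""
    return tts < critical_chi2


# Hypothetical estimates from two survey years, for illustration only.
t_income = parameter_difference_t(0.9, 0.3, 0.5, 0.4)
tts_example = transferability_test_statistic(-5100.0, -5000.0)
```

A coefficient can pass the individual test while the model as a whole fails the TTS check, because the TTS reflects the joint effect of all parameter differences on the fit.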
Model Updating
Findings from previous studies indicate that temporal transferability of a model is improved by updating the model parameters with some information from the application context (e.g., 30, 31). Several updating approaches have been suggested in the literature. Among these, the two most widely used are explained below:
Bayesian Updating
The Bayesian updating process follows Bayes’ theorem, in which prior information about the model is combined with a random sample from the application context to obtain updated information that reduces uncertainty in prediction ( 32 ). The parameters estimated with the data from the estimation context can be used as the prior information in this case, using the formula
where βtrans and βappl are the vectors of parameters of the originally estimated model and the application context model respectively and σtrans and σappl are corresponding vectors of standard deviations.
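A minimal sketch of Bayesian updating is shown below, assuming the commonly used precision-weighted form in which each coefficient is averaged with weights equal to the inverse of its variance. The function name and all numerical values are hypothetical.

```python
import math


def bayesian_update(beta_trans, sd_trans, beta_appl, sd_appl):
    """Precision-weighted average of two coefficient vectors, element by element.

    Returns the updated coefficients and their updated standard deviations.
    """
    updated_beta, updated_sd = [], []
    for b_t, s_t, b_a, s_a in zip(beta_trans, sd_trans, beta_appl, sd_appl):
        w_t = 1.0 / s_t ** 2  # precision of the transferred estimate
        w_a = 1.0 / s_a ** 2  # precision of the application-sample estimate
        updated_beta.append((w_t * b_t + w_a * b_a) / (w_t + w_a))
        updated_sd.append(math.sqrt(1.0 / (w_t + w_a)))
    return updated_beta, updated_sd


# Hypothetical single coefficient: the application estimate is more precise.
beta_upd, sd_upd = bayesian_update([1.0], [1.0], [2.0], [0.5])
```

The updated value lies between the two estimates and is pulled toward whichever one has the smaller standard deviation.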
Combined Transfer Estimation
The combined transfer estimation method ( 33 ) acknowledges the variations between parameters because of long time gaps and other differences between the estimation and application contexts such that the updated parameters are estimated as
where α = βtrans–βappl and α′ = βappl–βtrans
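The idea behind combined transfer estimation can be illustrated with a simplified, coefficient-by-coefficient sketch. It assumes the commonly cited form in which the variance of the transferred estimate is inflated by the squared transfer bias before the two estimates are precision-weighted. The full method works with covariance matrices and the outer product of the bias vector; treating each coefficient independently is a simplification made here for illustration, and all names and values are hypothetical.

```python
def combined_transfer_estimate(beta_trans, sd_trans, beta_appl, sd_appl):
    """Scalar (diagonal) sketch of combined transfer estimation.

    The transferred variance is inflated by alpha^2, where
    alpha = beta_trans - beta_appl is the estimated transfer bias.
    """
    updated = []
    for b_t, s_t, b_a, s_a in zip(beta_trans, sd_trans, beta_appl, sd_appl):
        alpha = b_t - b_a
        w_t = 1.0 / (s_t ** 2 + alpha ** 2)  # down-weighted for transfer bias
        w_a = 1.0 / s_a ** 2
        updated.append((w_t * b_t + w_a * b_a) / (w_t + w_a))
    return updated


# Hypothetical single coefficient with equal standard deviations in both contexts.
cte_beta = combined_transfer_estimate([1.0], [0.5], [2.0], [0.5])
```

With equal standard deviations, a plain precision-weighted average would give 1.5; the bias term moves this sketch's result closer to the application-context estimate of 2.0.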
It may be noted that though there are simpler methods such as updating only the constants of the model or using scaling of the model parameters, based on the exploratory analysis results, they were not deemed to be appropriate in these cases and have not been tested rigorously.
Results
Model Coefficients
A review of previous studies on vehicle ownership (car ownership in particular) shows that private vehicle ownership decisions are affected by both socio-demographics and urban form. For example, it has been reported that a household’s decision when purchasing a first car is primarily based on socio-economic factors (income, age of household members, value of time, etc.), whereas the decision to purchase a second car (or more) is largely based on traffic network, efficiency, and transit level-of-service parameters ( 34 , 35 ). In the context of developing countries, it has also been reported that good transit services decrease the tendency of households to own more motorcycles ( 36 ).
However, in this study, there were no significant changes in the transport and urban landscape between the two waves. For instance, there were no significant improvements or investments in public transport, nor were any new major roads constructed within the city. Rather, the economic landscape has undergone major improvements. This motivated us to focus on the socio-demographic variables. The parameters of all models were estimated using the maximum likelihood technique. Variables were retained based on their statistical significance. However, for consistency, variables found to be statistically significant in any one of the models were retained in the other models as well.
The results of the MNL model are presented in Table 2. The estimated results indicate that vehicle ownership decisions are significantly affected by income and the number of license holders in the household. These findings are intuitive and in agreement with the literature. For example, previous studies in developing countries also indicate that household income is one of the major determinants of car and motorcycle ownership ( 37 ). This indirectly implies that vehicle ownership will continue to increase as fast as per capita income in developing countries until saturation is reached ( 9 ). Among other household characteristics, the number of workers is found to have a positive impact on vehicle ownership, but the parameter is not significantly different from zero in the 2005 data. In the 2010 data, however, it is significantly different from zero at the 95% level of confidence. An increase in the number of workers per household has been reported to be positively correlated with vehicle ownership in previous studies as well. For example, in Chennai city in India, ownership of two-wheeled vehicles increased with the increase of young workers by 25% and indirectly encouraged car ownership as well since it translated to increased income ( 38 ). Increase in household size, however, is found to have a negative impact on vehicle ownership. This agrees with the suggestion of Zegras and Gakenheimer ( 37 ) that a decrease in household size could encourage vehicle ownership, because smaller households are likely to have fewer dependents (and therefore lower expenses and more savings) to facilitate vehicle ownership. It may be noted that the coefficients of household size and number of workers have been found to be statistically different from zero only for 2010. The effect of the number of children has not been found to be significantly different from zero in either year and is not included in the final model.
Table 2. MNL Model Estimation Results
Note: ASC = Alternative Specific Constant.
It may be noted that market-segmentation tests have also been performed, but the coefficients were not found to be different for different segments of income, household size, and number of workers.
Estimation of the Poisson model (separately for the number of cars and the number of motorcycles/bicycles) with the 2005 data (Table 3) shows trends similar to those of the MNL model in relation to the effect of income, but interestingly the effect was found to be statistically insignificant for motorcycles/bicycles in the 2005 dataset. The number of driving-license holders was significant in the car ownership model for both years. The number of workers and household size were found to be statistically significant in the 2010 dataset only (for both cars and motorcycles/bicycles). Interestingly, Pearson’s goodness-of-fit test showed that the 2005 data are not significantly different from a Poisson distribution, whereas the 2010 data are.
Table 3. Poisson and Negative Binomial Regression Model Estimation Results
The negative binomial model results were very similar to the Poisson model, both in relation to magnitude of the coefficients and statistical significance. The goodness-of-fit measures indicated slight improvement over the Poisson model though the very small values of alpha ruled out over-dispersion for the car ownership model for both years and motorcycle/bicycle ownership model in 2005. However, both the LR test result and the alpha estimate indicated that for the 2010 motorcycle/bicycle ownership data, there is significant over-dispersion. The model is hence retained for the transferability analysis.
The results of the zero-inflated negative binomial (ZINB) models were substantially different from those of the Poisson and negative binomial models. The Vuong test for appropriateness of the ZINB model indicates that it is indeed better suited than the negative binomial model for the 2010 data and for 2005 car ownership. This is evident in the z-values, which are statistically significant at the 95% level of confidence, as shown in Table 3. It may be noted that the coefficients in this case predict the occurrence of zeros, so their signs have the opposite interpretation to those in the previous two models. Although all variables significant in other models were retained for the sake of conformity, no variables other than income were found to be statistically significant.
Assessing Temporal Transferability
According to the t-statistics for the parameter-difference test shown in Table 2, most of the coefficients of the MNL model are found to be transferable between 2005 and 2010. The only exception is the number of licensed-driver dummy variables. However, despite most variables proving transferable, the model itself is not transferable between 2005 and 2010, as indicated by the transferability test statistic of 4713.70, which is much greater than the critical chi-square value.
The findings of the Poisson model for car ownership were somewhat similar to those of the MNL model, with the coefficient of the number of licensed-driver dummy being significantly different in the two years. In addition, the coefficient of the number of workers was also found to differ significantly, which originates from the t-statistic being significant in 2010 and insignificant in 2005. The model as a whole was, however, not temporally transferable, as the TTS value of 7180.97 is much greater than the critical chi-square value.
For the Poisson and negative binomial models for motorcycles/bicycles, the differences between the parameters were all found to be statistically insignificant, but the TTS values, 269.6 and 276.46 respectively, are much greater than the critical chi-square value.
For the ZINB model, all parameters of the car ownership model except the high-income dummy are found to be transferable across time, as all values of the z-statistic are below 1.96. For the motorcycle/bicycle ownership model, the high-income dummy was however not found to be transferable. The models are not transferable as a whole, and the TTS values are significantly larger than the critical chi-square value.
Improving Temporal Transferability
As mentioned earlier in this paper, two methods of updating have been tested: Bayesian updating and combined transfer estimation.
For the Bayesian method, three small samples of 3,616 households each were drawn randomly from the application data, i.e., the 2010 data. The sample size was one-fifth of the entire 2010 sample, a size recommended by Koppelman, Kuah, and Wilmot ( 39 ), as cited in Santoso and Tsunokawa ( 40 ), as suitable for updating procedures. Three random samples were used to eliminate any bias and check for consistency in the data. The models were run using each sample, and the resulting parameter estimates were used to calculate updated estimates with the Bayesian updating formula. The updated model for each sample was then tested for transferability. The updating procedure was then repeated with another set of random samples, each one-third of the entire 2010 sample, to check the effect of a bigger sample size on the resulting model transferability.
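The sampling step can be sketched as follows, assuming simple random sampling without replacement from a list of household identifiers. The function name, the use of integer identifiers, and the fixed seed are assumptions made for illustration; the study's actual sampling procedure is not described beyond what is stated above.

```python
import random


def draw_update_samples(household_ids, divisor, n_samples=3, seed=0):
    """Draw n_samples random subsamples, each 1/divisor of the full sample."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    size = len(household_ids) // divisor
    return [rng.sample(household_ids, size) for _ in range(n_samples)]


# Hypothetical identifiers standing in for the 18,084 households of the 2010 survey.
all_households = list(range(18084))
fifth_samples = draw_update_samples(all_households, divisor=5)
third_samples = draw_update_samples(all_households, divisor=3)
```

Each subsample is drawn independently, so the three samples may overlap with one another while containing no duplicates internally.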
Similarly, the models were again examined for improved temporal transferability following updating of parameters by the combined transfer estimation method. Modified parameter estimates were calculated using Equation 13. Improvements that were sought were reductions in the TTS values.
The results are summarized in Table 4. As the table shows, model updating improves temporal transferability, as the TTS values of the updated models are much lower. It may be noted, though, that even after updating, none of the models resulted in TTS values smaller than the critical chi-square values. The combined transfer estimation approach shows better performance in improving the TTS in all model forms. This is expected because the transfer bias in combining parameters is taken into consideration in this method.
Table 4. Comparison of Model Performance
Among the different model forms, the MNL model shows better improvement than all other models.
Conclusions
Different forms of random utility and count regression models have been rigorously tested in this paper in the context of vehicle ownership in Dhaka. The aim was to contrast the values of the coefficients across different structures, investigate which models are more temporally transferable, and assess the performances of candidate parameter updating methods.
The key findings are listed below:
The model form results in some, but not substantial, differences in sensitivities towards different influencing factors. For example, in almost all models, the income levels are found to be statistically significant.
In relation to transferability, most coefficients are individually transferable, but the models are not transferable as a whole, as the TTS values were above the critical chi-square values.
Updating methods result in reductions in TTS values, but they are still above the critical chi-square values.
Of the model structures explored by this study, the MNL model was found to be more temporally transferable, both before and after updating.
Of Bayesian updating and combined transfer estimation, the latter yields the larger improvement in temporal transferability.
It may be noted that 2005 and 2010 are not too far apart and there have not been any significant changes in the urban or transport landscape in this period. Our results are therefore more on the conservative side. But the results serve as a proof of concept that updating of estimated models for temporal transferability is indeed a practical way for developing countries to make better travel-demand forecasts without the encumbrance of extensive new data collection and model estimation. The findings are expected to be of utmost practical importance to transport planners working in developing countries where very often it is not possible to collect detailed data on a frequent basis because of resource constraints.
For future research, we recommend further investigation into the performance of other updating methods on temporal transferability and testing temporal transferability of more advanced model structures, such as the mixed logit. Future studies on temporal transferability in the context of developing countries could also examine the use of the more flexible predictive tests, such as model elasticity, to check the sensitivity of the model to variations in input variables and the relative error measure to compare parameter values between the estimation and transfer context ( 6 ) to complement statistical tests of transferability as used by this study. This is relevant since models found statistically not transferable may still prove useful in forecasting with a reliable degree of practical accuracy.
Footnotes
Acknowledgements
Flavia Anyiko is funded by the UK Commonwealth Scholarship Commission.
Author Contributions
CC proposed the idea and guided the analyses and model development. FA conducted the main data analyses and model development. Both authors contributed to writing the paper.
The Standing Committee on Traveler Behavior and Values (ADB10) peer-reviewed this paper (19-01893).
