Abstract
This study aims to identify the incidence patterns of the most common infectious diseases, including acute diarrhea, pyrexia of unknown origin, hemorrhagic conjunctivitis, and pneumonia, in the 7 provinces of northeastern Thailand, based on individual hospital case records of infectious disease routinely reported from 1999 to 2004. Log-linear regression analysis with age-group, season, and district as factors was used, with data from all 4 diseases as outcomes combined into 1 model. Results confirmed that the highest incidence of each infectious disease occurred in children younger than 5 years, with particularly high rates for diarrhea. In addition, the burden of pyrexia of unknown origin was found to be lower in districts bordering Laos, and the incidence rates were higher from April to June in 1999-2001 and 2004 and from July to September in 2002-2003. Higher incidence rates also occurred in most rural districts of Loei and Udon Thani provinces.
Infectious diseases constitute a large burden of illness and remain a major public health problem. They account for about a quarter of deaths worldwide, killing more than 13 million people annually. 1 They are caused by pathogenic microorganisms, such as bacteria, viruses, parasites, or fungi; the diseases can be spread, directly or indirectly, from one person to another. 2
In Thailand, acute diarrhea (Acute D), pyrexia of unknown origin (PUO), food poisoning, pneumonia (PN), and hemorrhagic conjunctivitis (HC) are the most common infectious diseases among inpatient and outpatient hospital cases. In 2004, the morbidity rates of these diseases were 1858 cases of Acute D, 294 cases of PUO, 247 cases of food poisoning, 218 cases of PN, and 165 cases of HC per 100 000. 3 These were the top 5 morbidity rates among infectious diseases under surveillance in Thailand. 3 However, these figures do not give a comprehensive picture of the disease epidemiology at the district level.
Understanding their spatial and temporal distribution is essential for policy intervention. When resources are limited, it may be necessary to phase in health programs rather than target all areas simultaneously.4,5
Most studies of spatial and temporal disease patterns have focused on modeling a single disease. However, several diseases share overlapping risk factors for the relevant sites, so there are benefits in analyzing them jointly.6,7 From the hospital administration point of view, it also makes sense to group all common infectious diseases together: even though their etiologies differ, it may be reasonable to assume that the hospital burden is the same for all diseases in the group.
In this article, we describe an appropriate statistical method to assess the spatial and temporal patterns of the joint incidence rates of the 4 most common infectious diseases in northeastern Thailand.
Methods
Data Management
Data for the present study were routinely collected from the registry of hospital-diagnosed infectious disease cases over the period 1999 to 2004 by the Ministry of Public Health. Nearly all subjects were Thai citizens, but all were treated equally in the study. Most of the cases were suspected cases (ie, they were not confirmed from lab results). 3 Data for each year are available in computer files with records for individual cases and fields comprising characteristics of the subject and the disease, including dates of sickness and diagnosis; the subject’s age, gender, and address; and the severity of the illness including date of death for mortality cases. Preliminary analysis indicated that the hospital burden of PUO showed a different incidence pattern between districts bordering Laos and nonborder districts.
We fitted a model allowing different PUO incidence rates in districts bordering Laos and nonborder districts. Age was divided into 4 groups—0 to 4 years, 5 to 14 years, 15 to 39 years, and 40 years and older—with cases for each disease and age-group selected only if their incidence rates were substantial (at least 3 per 1000, as shown in Table 1 in boldface). Thus, counts were aggregated for combinations of disease type, age-group, and Laos border district location for PUO (13 demographic-disease groups); quarterly season (24 periods from March quarter 1999 to December quarter 2004); and district (95 regions). The 7 provinces studied were Loei, Udon Thani (Udon), Nong Khai (NK), Sakon Nakhon (SK), Nakhon Phanom (NP), Mukdaharn (Muk), and Amnatcharoen (Am). Incidence rates were computed as the number of cases per 1000 residents in each demographic-disease group, quarter, and district.
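The aggregation described above can be sketched in a few lines. This is only an illustration, not the authors' code (the analysis was done in R): the record layout, the function names `quarter_index` and `incidence_per_1000`, the toy records, and the assumption that the population depends on group and district but not on quarter are all hypothetical.

```python
from collections import Counter
from itertools import product

def quarter_index(year, month, first_year=1999):
    """Number quarters 1..24, starting with the March quarter of `first_year`."""
    return (year - first_year) * 4 + (month - 1) // 3 + 1

def incidence_per_1000(records, population, groups, quarters, districts):
    """Cases per 1000 residents for every (group, quarter, district) cell.

    records:    iterable of (group, year, month, district) tuples, one per case
    population: dict mapping (group, district) -> residents
                (assumed constant over quarters in this sketch)
    """
    counts = Counter(
        (group, quarter_index(year, month), district)
        for group, year, month, district in records
    )
    # Iterate over the full grid so that cells without cases get a rate of 0.
    return {
        (g, q, d): 1000 * counts[(g, q, d)] / population[(g, d)]
        for g, q, d in product(groups, quarters, districts)
    }

# Hypothetical toy data: two groups, two quarters, one district.
records = [
    ("D 0-4", 1999, 2, "District A"),
    ("D 0-4", 1999, 3, "District A"),
    ("D 0-4", 1999, 5, "District A"),
    ("PN 0-4", 1999, 4, "District A"),
]
population = {("D 0-4", "District A"): 500, ("PN 0-4", "District A"): 500}
rates = incidence_per_1000(
    records, population, ["D 0-4", "PN 0-4"], [1, 2], ["District A"]
)

# Size of the full table described in the text:
# 13 demographic-disease groups x 24 quarters x 95 districts.
n_cells = 13 * 24 * 95
```

Building the table over the full grid, rather than only over observed combinations, keeps the zero-count cells that the log-linear model later has to handle.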
Table 1. Annual Incidence Rates per 1000 by Disease and Age Group
Abbreviations: D, acute diarrhea; PUO, pyrexia of unknown origin in nonborder districts; PUO*, pyrexia of unknown origin in Laos border districts; HC, hemorrhagic conjunctivitis; PN, pneumonia.
Statistical Methods
We considered negative binomial and log-linear regression models for describing the relation between the outcome and determinant variables.
The negative binomial model is an extension of the Poisson regression model where overdispersion is present in the data. Data are overdispersed when the variance exceeds the mean,8-10 and this commonly occurs in disease modeling because of spatial clustering. If $\lambda_{ijt}$ denotes the mean incidence rate in the combination of demographic-disease group $i$, district $j$, and period $t$, an additive model with this distribution has mean $\lambda_{ijt}$, where
In this model, $\alpha_i$, $\beta_j$, and $\gamma_t$ are demographic-disease group, district, and period effects, respectively, which sum to zero, and $\mu$ is a constant encapsulating the overall incidence rate. The variance of this distribution is $\lambda_{ijt}(1 + \lambda_{ijt}/\theta)$, with the Poisson model arising as the special case in the limit as $\theta \to \infty$.
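The displayed equation for this model is not reproduced in the text above. Under the assumptions of a log link (the usual choice for negative binomial regression) and a mean expressed on the rate scale, one form consistent with the symbols defined here would be the first line below. This is a reconstruction, not the authors' stated equation: if the mean were instead the expected count, a log-population offset would appear on the right-hand side. The outcome symbol $Y_{ijt}$ is introduced only for illustration. The second line expands the stated variance to show why the Poisson model is the limiting case.

```latex
\begin{align*}
\ln \lambda_{ijt} &= \mu + \alpha_i + \beta_j + \gamma_t,
  \qquad \sum_i \alpha_i = \sum_j \beta_j = \sum_t \gamma_t = 0, \\
\operatorname{Var}(Y_{ijt}) &= \lambda_{ijt}\Bigl(1 + \frac{\lambda_{ijt}}{\theta}\Bigr)
  = \lambda_{ijt} + \frac{\lambda_{ijt}^{2}}{\theta}
  \;\longrightarrow\; \lambda_{ijt} \quad \text{as } \theta \to \infty .
\end{align*}
```

The extra term $\lambda_{ijt}^{2}/\theta$ is what lets the variance exceed the mean; small values of $\theta$ correspond to strong overdispersion.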
The alternative additive log-linear regression model for the incidence rates with normally distributed errors has mean
where $n^*_{ijt}$ is a simple modification of the count $n_{ijt}$ in a cell to ensure that the incidence rates can be log-transformed. To cater for zero counts in the model, we replaced them by a specified fixed constant between 0 and 1 before log-transformation. The model fit is assessed by plotting deviance residuals against normal quantiles for the negative binomial model and similarly by plotting standardized residuals against normal quantiles for the log-linear model.
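A minimal sketch of the zero-count adjustment and an additive fit on the log scale follows. It assumes the working response is the log of the adjusted count divided by the population (the population term is an assumption, since the displayed equation is not reproduced here) and uses 0.5 as the replacement constant. The helper names and toy counts are hypothetical. The closed-form fit is valid only for a complete, balanced table; the general case needs a regression routine such as the one the authors used in R.

```python
import math
from itertools import product

def adjusted_count(n, replacement=0.5):
    """Return n*: the count itself, or `replacement` when the count is zero."""
    return replacement if n == 0 else n

def log_rate(n, population, per=1000):
    """Log of the incidence rate per `per` residents, using the adjusted count."""
    return math.log(per * adjusted_count(n) / population)

def fit_additive(y, levels):
    """Least-squares fit of y[i, j, t] = mu + alpha_i + beta_j + gamma_t.

    Valid only for a complete, balanced table (one value per cell), where the
    sum-to-zero estimates reduce to marginal means minus the grand mean.
    """
    cells = list(product(*levels))
    mu = sum(y[c] for c in cells) / len(cells)
    effects = []
    for axis, axis_levels in enumerate(levels):
        effect = {}
        for level in axis_levels:
            values = [y[c] for c in cells if c[axis] == level]
            effect[level] = sum(values) / len(values) - mu
        effects.append(effect)
    return mu, effects

# Hypothetical counts: 2 groups x 2 districts x 2 quarters, 1000 residents per cell.
levels = [["g1", "g2"], ["d1", "d2"], [1, 2]]
counts = {
    ("g1", "d1", 1): 12, ("g1", "d1", 2): 8,
    ("g1", "d2", 1): 6,  ("g1", "d2", 2): 4,
    ("g2", "d1", 1): 3,  ("g2", "d1", 2): 2,
    ("g2", "d2", 1): 1,  ("g2", "d2", 2): 0,  # zero count, replaced by 0.5
}
y = {cell: log_rate(n, 1000) for cell, n in counts.items()}
mu, (alpha, beta, gamma) = fit_additive(y, levels)
```

Here `alpha`, `beta`, and `gamma` each sum to zero by construction, matching the constraint placed on the effects in the text.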
It is also informative to plot observed counts and appropriately scaled incidence rates against corresponding fitted incidence rates based on the models. The models give adjusted incidence rates for each factor of interest, obtained by suppressing the subscripts in Equations (1) and (2) corresponding to the other factors and replacing these terms with constants satisfying the condition that the expected number of cases based on the adjusted incidence rates matches the total observed. Sum contrasts11,12 were used to obtain confidence intervals (CIs) for comparing the adjusted incidence rates within each factor with the overall incidence rate. An advantage of these CIs is that they provide a simple criterion for classifying levels of a factor into 3 groups according to whether each corresponding CI exceeds, crosses, or is less than the overall mean.
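One way to read the scaling condition for adjusted incidence rates is sketched below. It assumes the adjusted rate for a level is the exponential of a single constant plus that level's effect, with the constant chosen so that the implied number of cases equals the observed total. This is an interpretation of the description above, not the authors' code, and the function name `adjusted_rates` and the input values are hypothetical.

```python
import math

def adjusted_rates(effects, population, total_cases, per=1000):
    """Adjusted incidence rates per `per` residents for the levels of one factor.

    effects:     dict level -> estimated effect on the log scale
    population:  dict level -> residents (or person-years) at risk in that level
    total_cases: observed total number of cases over all levels

    The constant c is chosen so that sum(population * rate / per) == total_cases.
    """
    denom = sum(population[k] * math.exp(effects[k]) for k in effects)
    c = math.log(per * total_cases / denom)
    return {k: math.exp(c + effects[k]) for k in effects}

# Hypothetical inputs for a factor with two levels.
effects = {"a": 0.4, "b": -0.4}
population = {"a": 2000, "b": 3000}
rates = adjusted_rates(effects, population, total_cases=60)
implied_cases = sum(population[k] * rates[k] / 1000 for k in rates)
```

The ratio between any two adjusted rates depends only on the difference of their effects; the constant only fixes the overall level.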
All statistical analyses, graphs, and maps were made using the R program.13,14
Results
Preliminary Results
During the study period from January 1, 1999, to December 31, 2004, 530 899 cases of Acute D, 23 519 cases of PUO in districts bordering Laos, 144 160 cases of PUO in nonborder districts, 88 150 cases of HC, and 59 300 cases of PN were reported to have been diagnosed at district hospitals in the 7 provinces.
Table 1 shows the annual incidence rates per 1000 for the various levels of the demographic-disease group factor. The annual incidence rates ranged from 0.29 for PN in the age-group 15 to 39 years to 71.3 for Acute D in the age-group 0 to 4 years.
The time series plots (Figure 1) show a common seasonal pattern with annual peaks for all 4 diseases in the middle of the year and lower rates around December of each year, but with little evidence of any trend in Acute D, PUO, HC, or PN. The rates have a similar pattern for each age-group, with higher rates in the younger age-groups for all diseases.

Figure 1. Monthly disease incidence rates in northeastern Thailand
Statistical Analysis
The results of the statistical analysis are presented in Figures 2 to 4. For each model, Figure 2 shows plots of observed counts versus fitted counts in the left panel, a similar plot of annual incidence rates in the middle panel, and a residuals plot in the right panel. The residuals plot for the negative binomial model (for which the dispersion parameter estimate was 0.83 with a standard deviation of 1.25) shows a poor fit. In contrast, the log-linear model (with zero counts replaced by 0.5) fitted the data extremely well, so in further analysis we used this model.

Figure 2. Plots of observed versus fitted counts and annual incidence rates and residuals plots for negative binomial (upper panel) and log-linear (lower panel) models
Figure 3 shows the adjusted annual incidence rates for each factor based on the log-linear model. The dotted horizontal lines on each graph represent the overall mean annual incidence rate for all diseases combined (12.3 per 1000). It can be seen that diarrhea for children aged 0 to 4 years had the highest incidence (78.4, 95% CI = 74.9-82.1) among demographic-disease groups. The annual incidence rate of PUO was clearly lower in Laos border districts in each demographic-disease group. The annual incidence rate of HC in children aged 0 to 4 years (5.7 per 1000) was higher than in children aged 5 to 14 years (3.0 per 1000). In addition, the annual incidence rate of PN of children aged 0 to 4 years in this area was 9.8 per 1000 (95% CI = 9.3-10.2), which was lower than the overall mean.

Figure 3. Annual incidence per 1000 for each factor adjusted for other factors
Two different seasonal patterns were identified. During 1999-2001 and 2004, the incidence peaked in April to June (16.1, 95% CI = 15.0-17.3), whereas the lowest incidence occurred in October to December (7.1, 95% CI = 6.7-7.7). For 2002-2003, the incidence peaked in July to September (14.3, 95% CI = 13.3-15.3), with the lowest point in October to December (6.9, 95% CI = 6.4-7.4). The district incidence rates ranged from 3.2 (95% CI = 2.8-3.7) in Sakon Nakhon city to 46.3 (95% CI = 40.1-53.4) in Nongsaeng of Udon Thani province. The annual incidence rates for the 4 infectious diseases were generally higher than average in the districts of Loei and Udon Thani provinces, but lower than average in all districts of Nakhon Phanom province.
Figure 4 shows a thematic map of the study area. Districts are classified according to whether their CIs exceed, cross, or are less than the overall mean. Higher incidence was found in 2 provinces: in the west and south of Loei (8 of 12 districts) and in the west and east of Udon Thani (13 of 20 districts). All 12 districts of Nakhon Phanom had lower than average incidence for all 4 diseases.

Figure 4. Thematic map of annual incidence rates, 1999 to 2004
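The three-way classification behind the thematic map can be written as a short rule. The sketch below applies it to two district intervals and the overall mean quoted in the text; the third interval is made up to show the middle class, and the function name `classify` is hypothetical.

```python
def classify(ci_low, ci_high, overall_mean):
    """Label a level by where its confidence interval sits relative to the overall mean."""
    if ci_low > overall_mean:
        return "above"
    if ci_high < overall_mean:
        return "below"
    return "crosses"

OVERALL_MEAN = 12.3  # overall annual incidence per 1000 reported in the text

# Two intervals quoted in the text, plus one made-up interval for the middle class.
intervals = {
    "Nongsaeng (Udon Thani)": (40.1, 53.4),
    "Sakon Nakhon city": (2.8, 3.7),
    "hypothetical district": (11.0, 13.5),
}
labels = {
    name: classify(low, high, OVERALL_MEAN)
    for name, (low, high) in intervals.items()
}
```

Because the rule uses only the interval end points, the same function applies to demographic-disease groups and seasons as well as districts.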
Discussion
This study used an appropriate statistical method to assess the spatial and temporal patterns of incidence of the 4 most common infectious diseases requiring hospitalization in northeastern Thailand. For these data, the negative binomial model gave a poor fit, whereas the log-linear model fitted extremely well. A further advantage of the log-linear regression model is that it can be easily adapted to give different burdens for different diseases, simply by weighting the number of cases accordingly. For example, the average lengths of stay in hospital for each disease could be taken as weights. This approach has been used in a more complex statistical model by Bůžková and Lumley. 15
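The weighting idea can be illustrated with the case totals reported in the Results (the PUO total combines border and nonborder districts). The weights below are made-up placeholders standing in for average lengths of stay, not values from the study, and the function name `weighted_burden` is hypothetical.

```python
def weighted_burden(cases, weights):
    """Multiply each disease's case count by its burden weight."""
    return {disease: cases[disease] * weights[disease] for disease in cases}

# Case totals for 1999 to 2004 as reported in the text.
cases = {
    "Acute D": 530899,
    "PUO": 23519 + 144160,  # border plus nonborder districts
    "HC": 88150,
    "PN": 59300,
}

# Hypothetical weights (for example, average days in hospital); not from the study.
weights = {"Acute D": 2.0, "PUO": 3.0, "HC": 1.0, "PN": 5.0}

burden = weighted_burden(cases, weights)
```

The weighted counts would then replace the raw counts before the rates are log-transformed, leaving the rest of the model unchanged.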
The demographic-disease group pattern demonstrated that a higher morbidity burden of the 4 infectious diseases occurred in children aged less than 5 years, with the highest incidence for diarrhea. This finding is consistent with recent studies reporting that childhood diarrhea is a common cause of illness in both developed and developing countries.16-18 In addition, a study by Scallan et al, 19 covering Australia, Canada, Ireland, and the United States, reported that diarrheal disease incidence among children aged 0 to 4 years was higher than for any other age-group.
The lower morbidity rates of HC, PN, and PUO (compared with Acute D) found in this study of northeastern Thailand are mirrored by National Notifiable Disease Surveillance reports for the whole of Thailand covering the years 2002 to 2006. 3 Even though these 3 infectious diseases have much lower morbidity than Acute D, they still regularly feature in the top 10 list of all diseases in Thailand.
When we fitted models separately to the data for each disease, there was a downward trend for PN, but the trend disappeared when we analyzed the diseases jointly.
Our results showed lower incidence of pyrexia of unknown origin in districts bordering Laos, and it is of interest to seek reasons for this finding in further studies.
Our findings showed that the highest incidence for all 4 diseases occurred from April to June in 4 of the 6 years, possibly because of the seasonal change from winter to summer. In addition, peaks in the rainy season (July to September) in 2002-2003 could be because of the higher overall rainfall and severe flooding that occurred in several provinces of Thailand in the study region, particularly in 2002. 20 Group immunity dynamics in human societies along with etiological agent dynamics, in particular environments, might also be relevant to our findings.
This region has widely varying regional incidences for the 4 diseases. Higher incidence mostly occurred in rural districts of Loei and Udon Thani provinces. These districts are mainly mountainous, and higher altitude could possibly be associated with increased risk of illness. 21
A limitation of this study is that the surveillance data from the Ministry of Public Health are known to underreport major infectious diseases because some private hospitals are not included. 3
The co-occurrence of the 4 diseases in our study suggests the need to strengthen strategies for the integrated management of people’s illness at home and through the community. It is possible to include treatments and disease prevention knowledge for diarrhea, pyrexia of unknown origin, hemorrhagic conjunctivitis, and pneumonia in existing strategies. Health authorities should provide disease information and implement preventive programs at least 3 months prior to the anticipated disease season. Particularly for children who are in high-incidence age-groups, the Ministry of Public Health and the Ministry of Education would be well advised to set up strong strategies for disease control by cooperatively instigating a “Health Promoting Schools Project” for schools throughout the country.
In conclusion, these 4 diseases seem to have similar patterns, with high incidence for children aged less than 5 years, especially for diarrhea in rural districts of Loei and Udon Thani provinces. The model can be extended to use different burdens for different diseases. The findings of this study can be used by health authorities to help design effective prevention programs in specific areas where the disease burden is relatively high.
Acknowledgements
We thank the provincial officers of the Ministry of Public Health for providing the data. This study was funded by the Graduate School, Prince of Songkla University. We are grateful to Prof Don McNeil for his assistance, and we thank the reviewers for their valuable suggestions.
