Abstract
Background:
The Mild Behavioral Impairment Checklist (MBI-C), a screening scale for neuropsychiatric symptom evaluation, facilitates Alzheimer’s disease (AD) screening. However, its validity and reliability for use as an AD screening tool have not been determined.
Objective:
To develop an AD screening scale suitable for the Chinese population.
Methods:
The MBI-C was translated into Chinese and back-translated with the original author’s consent. Forty-six AD patients, attending the Xuanwu hospital memory clinic, and 50 sex- and education-matched controls from the community underwent a full neuropsychological evaluation, including MBI-C assessment. Among them, 15 AD patients were evaluated repeatedly, and eight were evaluated simultaneously by two different clinicians, to assess MBI-C reliability.
Results:
The MBI-C demonstrated good internal consistency reliability, test–retest reliability, and inter-rater reliability. Its optimal cutoff point was 6/7 for identifying AD dementia, with a sensitivity of 86.96% and specificity of 86.00%, and its detection rate for moderate–severe AD dementia was higher than that of the Neuropsychiatric Inventory Questionnaire (NPI-Q). Pearson’s correlation coefficients ranged from 0.702 to 0.831, indicating content validity. Seven factors were extracted during principal component analysis, with a cumulative contribution of 70.55%. Moreover, the Pearson’s correlation coefficient was 0.758, indicating its criterion validity. The MBI-C could also distinguish AD dementia severity. MBI-C scores were significantly negatively correlated with MMSE and MoCA scores, and positively correlated with ADL scores.
Conclusion:
This study showed that the Chinese version of MBI-C has high reliability and validity, and could replace the NPI-Q for AD dementia screening in the Chinese population.
INTRODUCTION
Dementia is the most common neurodegenerative disease of aging worldwide; 20% of patients with dementia live in China, and dementia has become one of the major diseases affecting the quality of life and health in this country [1]. According to a previous population-based cross-sectional survey, by the end of 2009, the prevalence of dementia was 5.1% and the morbidity of Alzheimer’s disease (AD), the most common type of dementia, reached 3.21% in Chinese elderly [2].
To ensure early treatment, it is clinically important to screen for acquired cognitive decline, the appearance of neuropsychiatric symptoms, and decreased activities of daily living, which are the common clinical standards for the diagnosis of dementia [3, 4]. Cognitive decline screening scales, such as the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA), and scales that screen the ability to perform activities of daily living, such as the Activities of Daily Living scale (ADL), are the most commonly used clinical screening tools for dementia; however, these also have shortcomings that limit their application [5–7]. Thus, there remains a need for a tool that facilitates early detection of dementia and identification of the at-risk population [8–10].
Neuropsychiatric symptoms (NPS) have been viewed as typical manifestations of dementia; these include the three main clinical symptoms (agitation, psychosis, and mood disorders), among others [11]. As neurodegenerative and vascular damage occurs earlier than the onset of clinical symptoms, NPS can appear in individuals with dementia even before they demonstrate cognitive decline [12]. Several studies have estimated the prevalence of NPS in individuals with dementia and have shown that NPS are observed in 75% to 100% of cases during the course of their disease, irrespective of AD-related or other types of dementia [13–15]. Numerous studies have also shown that NPS are highly correlated with the occurrence of AD and have even been regarded as predictors of progression to severe AD and death [16]. However, the rate of recognition of pre-existing psychiatric symptomatology in the clinic is still low, and to date only a few studies have used an NPS scale as the screening standard [17, 18].
Mild behavioral impairment (MBI) is defined by the NPS Professional Interest Area of the International Society to Advance Alzheimer’s Research and Treatment, and can be used in the assessment of older individuals who are at high risk for dementia and who exhibit NPS without cognitive decline [19]. MBI involves five domains, i.e., impairments of motivation, emotional regulation, impulse control, social appropriateness, and perception or thought content, which implies the possibility of using MBI for diagnosing pre-dementia [19]. Thus, Professor Zahinoor Ismail and colleagues [20] developed the Mild Behavioral Impairment Checklist (MBI-C) for screening at-risk populations. The MBI-C, as a screening scale for MBI, is completed by a family member, a close informant, or a clinician [20]. It assesses symptoms persisting for more than 6 months [20]. There has been worldwide interest in the reliability and validity of the MBI-C, and it has been demonstrated that the total MBI-C score is sensitive for detecting MBI, based on assessments of individuals diagnosed with mild cognitive impairment (MCI) and subjective cognitive decline (SCD) [21–23].
AD, as the most common type of dementia, often has mild symptoms related to cognition and behavior in its early stages [24, 25]. Currently, AD dementia patients are screened based only on cognitive symptoms, using tools such as the MMSE and MoCA, but the effectiveness of screening for behavioral impairment is often overlooked [18]. The most widely used behavioral screening tool is the Neuropsychiatric Inventory Questionnaire (NPI-Q), which is used together with cognitive assessment scales (such as the MMSE and MoCA) and a daily living ability assessment scale to screen for dementia. It is also the most publicized scale, utilizing the same screening index as the MBI-C to screen for AD [26]. However, the validity of the NPI-Q is less than ideal [27, 28].
Because the MBI-C targets non-dementia patients with MBI, it is more comprehensive and accurate, including many more items than the widely used NPI-Q; it could therefore be assumed that the MBI-C is more effective than the NPI-Q in screening for individuals with AD with NPS. However, there have been no international studies on this topic. The MBI-C is also particularly practical for screening the AD population in China. In many remote areas, it is difficult for patients with early AD or MCI to receive a standardized cognitive evaluation because there are too few doctors per capita. As of 2009, more than half of the elderly in China had a low education level (no education or only primary education), which makes it difficult for them to complete common cognitive screening scales such as the MMSE, MoCA, and Alzheimer’s disease rating scale. The MBI-C can be completed by relatives and is little affected by education level, which makes it well suited to the current situation in China. To develop a new dementia screening scale with a higher rate of detection of individuals with early dementia that is suitable for use in Chinese individuals, we translated the MBI-C into Chinese and verified its reliability and validity in population-based screening for AD, in comparison to the NPI-Q.
MATERIALS AND METHODS
Description of MBI-C
The original MBI-C was translated into Chinese by two researchers, and was back-translated by another two researchers to develop a Chinese version of the MBI-C, with the consent of the original author, Professor Zahinoor Ismail. The MBI-C was completed by clinicians, and included 34 items in five domains: 1) interest, motivation, and drive; 2) mood or anxiety symptoms; 3) the ability to delay gratification and control behavior, impulses, oral intake, and/or changes in reward; 4) following societal norms and having social graces, tact, and empathy; 5) strongly held beliefs and sensory experiences. Only symptoms present in the last 6 months were scored as “yes”, and their severity was rated (1 to 3 points). Finally, the score of each domain and the total score were calculated.
Description of NPI-Q
The NPI is considered one of the most effective outcome measures for behavioral and emotional symptoms in dementia patients [29, 30]. The NPI-Q, validated by Kaufer et al. [26], is a shortened version of the NPI for use by clinicians; previous studies in various countries have shown that its reliability and validity are as good as those of the NPI [31, 32]. The NPI-Q evaluates 12 neuropsychiatric disturbances commonly found in individuals with dementia: delusions, hallucinations, agitation, dysphoria, anxiety, apathy, irritability, euphoria, disinhibition, aberrant motor behavior, night-time behavior disturbances, and appetite and eating abnormalities [26]. Unlike the MBI-C, it is mainly applied to the evaluation of hospital patients and does not focus on the duration of symptoms, but it includes an evaluation of the degree of distress experienced by caregivers.
Subjects
Fifty control subjects were recruited from retired elderly individuals who were family members of the patients or who lived in various communities in Beijing; these individuals met the following MMSE score thresholds for their education level: 1) illiterate (>17 points); 2) primary school level (>20 points); 3) middle school level and above (>24 points). Their ages ranged from 46 to 82 years (mean±SD: 60.5±10.3 years). In addition, we recruited 46 AD dementia patients from the memory clinic in Xuanwu hospital, whose ages ranged from 42 to 84 years (65.3±10.4 years). These individuals had been diagnosed with AD in accordance with the IWG-2 criteria, and patients with a Clinical Dementia Rating scale (CDR) score <1 were excluded [33]. According to their CDR grading, 29 patients with mild AD dementia, 15 patients with moderate AD dementia, and two patients with severe AD dementia were included. All subjects were conscious and had no history of mental illness or serious physical disease. We obtained written informed consent from the individuals themselves or their families for participation in this survey.
Methods
All subjects were evaluated using the MBI-C, NPI-Q, MMSE, MoCA, CDR, and ADL scales, in order to calculate the validity of the MBI-C and compare it with the NPI-Q. Among them, eight subjects were simultaneously assessed with the MBI-C by two clinicians. For these evaluations, one evaluator was randomly selected as the examiner and the other as the observer; these clinicians independently scored the MBI-C to test the inter-rater reliability. Moreover, 15 subjects were repeatedly assessed at an interval of 24 h in order to detect the test–retest reliability of the MBI-C.
Statistical analysis
Data were analyzed using the Statistical Package for the Social Sciences (SPSS for PC-version 22, SPSS Inc., Chicago, IL, USA). All figures were drawn with GraphPad Prism 7 (Graphpad Software, La Jolla, CA, USA). Cronbach’s alpha coefficient was calculated to verify the internal consistency of the MBI-C. Intraclass correlation coefficients (ICCs) were calculated to evaluate the test–retest reliability and the inter-rater reliability. Receiver operating characteristic (ROC) curves were drawn to analyze the sensitivity and specificity of the MBI-C, and to determine its cutoff points, as compared with the NPI-Q. Pearson’s correlation coefficients were used to evaluate the content validity. The Kaiser–Meyer–Olkin (KMO) test and Bartlett’s test were used to determine the feasibility of factor analysis. Principal component analysis with varimax (maximum variance) rotation was used to test the construct validity. Pearson’s correlation coefficients were calculated to assess the criterion validity and the correlation between the MMSE, MoCA, ADL, and MBI-C scores. Multiple independent sample rank-sum tests with Bonferroni correction were used to assess the feasibility of using the MBI-C for distinguishing the degree of dementia. The level of statistical significance was set at p < 0.05.
RESULTS
Demographics and psychometric value
Demographic characteristics, MMSE, MoCA, ADL, NPI-Q, and MBI-C domain scores and total scores are presented in Table 1. AD dementia patients were significantly older than controls (p < 0.05), but sex distribution and duration of education did not differ significantly between patient and control groups. On all scales, including the MMSE, MoCA, ADL, NPI-Q, and MBI-C, the control group performed better than the AD dementia group.
Demographic characteristics and psychometric results
AD, Alzheimer’s disease; MMSE, Mini-Mental State Examination; MoCA, Montreal Cognitive Assessment; ADL, Activities of Daily Living; NPI-Q, Neuropsychiatric Inventory-Questionnaire; MBI-C, Mild Behavioral Impairment Checklist. *chi-square test for categorical variables; rank sum test for continuous variables.
Reliability analysis
Internal consistency reliability
In this study, Cronbach’s alpha coefficient was used to determine the internal consistency of the MBI-C. Cronbach’s alpha coefficient was 0.936, i.e., greater than 0.9, overall. For the individual domains, the alpha value was as follows: 1) “interest, motivation, and drive” was 0.878; 2) “mood or anxiety symptoms” was 0.837; 3) “the ability to delay gratification and control behavior, impulses, oral intake, and/or changes in reward” was 0.863; 4) “following societal norms and having social graces, tact, and empathy” was 0.664; 5) “strongly held beliefs and sensory experiences” was 0.824. These results indicated a good internal consistency.
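For readers who want to reproduce this kind of figure, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the toy scores are hypothetical and are not the study data; the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of subjects, each a list of k item scores."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across subjects.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of each subject's total score.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 3 subjects rated on 2 items (NOT study data).
toy_scores = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy_scores), 3))
```

The same function can be applied to the items of a single domain to obtain a per-domain coefficient.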
Test–retest reliability
The test–retest reliability ICC was 0.841, which was greater than 0.7, suggesting good test–retest reliability.
Inter-rater reliability
Two clinicians simultaneously completed the MBI-C for each of the eight subjects, and the ICC was computed to assess the inter-rater reliability. The high ICC (0.991) indicated excellent reliability.
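The text does not state which ICC model was used. As an illustration only, the sketch below assumes the two-way random-effects, absolute-agreement, single-measure form, often written ICC(2,1), computed from the usual ANOVA mean squares. The function name and toy ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table of n subjects (rows) by k raters or sessions (columns)."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares for subjects, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical toy data: 4 subjects scored by 2 raters (NOT study data).
toy_ratings = [[4, 5], [10, 9], [7, 7], [15, 16]]
print(round(icc_2_1(toy_ratings), 3))
```

For test–retest reliability, the same calculation applies with the two assessment sessions taking the place of the two raters.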
Validity of MBI-C
Optimal cutoff points of MBI-C and NPI-Q for detecting AD dementia
First, for distinguishing AD from controls, ROC analysis showed that the area under the curve (AUC) of the MBI-C was 0.900 (95% confidence interval [CI]: 0.834–0.965), while that of the NPI-Q was 0.830 (95% CI: 0.749–0.912); the difference between the two AUCs was not statistically significant (p = 0.052; Fig. 1). The cutoff point was 6/7 for the MBI-C and 2/3 for the NPI-Q, both of which yielded good sensitivity and specificity for discriminating individuals with AD dementia from controls (Table 2).

Receiver operating characteristic curves for detecting AD dementia using the MBI-C and NPI-Q.
Optimal cutoff points and validity of the MBI-C and NPI-Q
MBI-C, Mild Behavioral Impairment Checklist; NPI-Q, Neuropsychiatric Inventory Questionnaire; PLR, Positive likelihood ratio; NLR, Negative likelihood ratio.
Moreover, for distinguishing patients with mild AD dementia from controls, the ROC curves of the MBI-C and the NPI-Q (Supplementary Figure 1) had AUCs of 0.852 and 0.794, respectively; this difference was not statistically significant (p = 0.247). For distinguishing patients with moderate–severe AD dementia from controls, the ROC curves of the MBI-C and NPI-Q (Supplementary Figure 2) yielded AUCs of 0.981 and 0.891, respectively; this difference was statistically significant (p = 0.014).
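An AUC can be read as the probability that a randomly chosen patient scores higher than a randomly chosen control, with ties counted as one half. The sketch below computes it that way from raw scores; it is an illustration under that definition, not the SPSS procedure used in the study, and the function name and toy scores are hypothetical.

```python
def auc_rank(case_scores, control_scores):
    """AUC as P(case score > control score), counting ties as 0.5."""
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical toy scores (NOT study data).
toy_cases = [5, 7, 9, 12, 15]
toy_controls = [0, 1, 2, 4, 8]
print(auc_rank(toy_cases, toy_controls))
```

Comparing two AUCs measured on the same subjects, as done here for the MBI-C and NPI-Q, requires a paired test in addition to this point estimate.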
Content validity
The Pearson’s correlation coefficients between individual domain scores and the total score of the MBI-C ranged from 0.702 to 0.831 (all p < 0.05), which suggested that the scale has good content validity.
Construct validity
Construct validity was assessed using exploratory factor analysis (EFA), and Pearson’s correlation tests were used for domain and total MBI-C scores (Supplementary Table 1). We found four items with item-scale correlations below 0.40 with the whole scale, as follows: 1) Does the person display sexually disinhibited or intrusive behavior, such as touching (themselves/others), hugging, groping, etc., in a manner that is out of character or that may cause offence?; 2) Has the person recently developed trouble regulating smoking, alcohol, drug intake, or gambling, or started shoplifting?; 3) Has the person started talking openly about very personal or private matters not usually discussed in public?; 4) Does the person now talk to strangers as if they are familiar, or intrude on their activities? We next constructed an MBI-C EFA loading table, and determined the feasibility of factor analysis using the KMO test and Bartlett’s test. The results showed that the KMO value for the 96 subjects was 0.737, and the Bartlett test value was 2469.511 (df = 561, p < 0.001), demonstrating that factor analysis was feasible.
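As a consistency check on the reported degrees of freedom: Bartlett's test of sphericity on a correlation matrix of p variables has p(p − 1)/2 degrees of freedom, so the 34 MBI-C items give

```latex
\[
df = \frac{p(p-1)}{2} = \frac{34 \times 33}{2} = 561 ,
\]
```

which agrees with the df reported for the Bartlett test.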
Principal component analysis and maximum variance rotation were then used to test the validity of structure. Ultimately, we extracted seven principal components, whose cumulative contribution rate was 70.55% (Table 3).
Total Variance Explained
Extraction Method: Principal Component Analysis.
Criterion validity
Criterion validity refers to the relationship between the research tool and other measurement standards. The higher the correlation coefficient, the better the validity of the research tool. In this study, the NPI-Q, which has high reliability and validity, was adopted as the corresponding measurement index, and the MBI-C score was positively correlated with the NPI-Q score (r = 0.758, p < 0.01).
Distinguishing dementia and the degree of dementia
Figure 2 shows the ability of the MBI-C to distinguish the degree of dementia as defined by CDR scores. Subjects were divided into the control group (CDR0, n = 50), mild AD group (CDR1, n = 29), and moderate–severe AD group (CDR2 or CDR3, n = 17) according to CDR scores. There were significant differences in MBI-C scores between the groups. MBI-C scores were significantly lower in the control group than in the mild AD group (U = –4.728, p < 0.001), and significantly lower in the mild AD group than in the moderate–severe AD group (U = 2.525, p = 0.035).

Relationship between the group divided according to clinical dementia rating scale scores and MBI-C performance. The boxes show the limits of the 25th and 75th percentile; the line in the box shows the median; the bottom and upper horizontal lines show the most extreme performance of each group.
Correlation between scales
In this study, the Chinese version of the MMSE, MoCA, and ADL, which have high reliability and validity, were regarded as the screening indicators for AD dementia patients and the correlations of the MBI-C with these scales were assessed. The MBI-C score was negatively correlated with the total score of the MMSE and MoCA (r = –0.641, p < 0.01; r = –0.623, p < 0.01), and was positively correlated with the ADL score (r = 0.742, p < 0.01).
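The direction of these correlations follows from how Pearson's r is defined. A minimal sketch of the calculation is given below; the function name and the paired toy scores are hypothetical and only illustrate a negative relationship of the kind reported between MBI-C and MMSE scores.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores for 5 subjects (NOT study data).
toy_mbic = [2, 5, 9, 14, 20]
toy_mmse = [29, 27, 22, 18, 12]
print(round(pearson_r(toy_mbic, toy_mmse), 3))  # negative: higher MBI-C, lower MMSE
```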
DISCUSSION
Currently, early screening for dementia is a key issue in neuroscience, but cognitive impairment is often used as the screening standard, while the behavioral impairment of patients is often ignored. However, identifying dementia patients by behavioral disorders is crucial, especially in China, where the number of doctors per capita and the education level of the elderly are low, and the MBI-C is a newly developed tool for this purpose. Because the MBI-C is a subjective checklist that can be completed by patients’ families or other informed individuals within 8 minutes, it can be highly effective in detecting mild behavioral disorders and reducing the rate of missed diagnoses, particularly for the detection of dementia patients who develop only NPS at first, without showing obvious cognitive decline. To date, only two papers, from the original research group, have reported that the total MBI-C score is sensitive for detecting MBI in individuals with MCI and SCD [22, 23]. Thus, there is a need for further research on the feasibility of using the MBI-C in clinical practice. To study the applicability of the MBI-C in China, we translated the MBI-C into Chinese, with the consent of the original author, and sought to investigate these issues.
AD is the most common type of dementia, with long preclinical and prodromal phases (20 years). It has an estimated prevalence of 10–30% among individuals aged 65 years and older, with an incidence of 1–3% [34]. NPS was observed to be highly prevalent in AD patients; the most common NPS was apathy, with an overall prevalence of 49%, followed by depression, aggression, anxiety, and sleep disorder, the pooled prevalence estimates of which were 42%, 40%, 39%, and 39%, respectively [35]. In our study of AD patients, we calculated the reliability and validity of the MBI-C, in comparison with those of the NPI-Q, a common clinical NPS scale, to verify the feasibility of using MBI-C as an AD screening tool.
For reliability analysis, this study adopted Cronbach’s alpha coefficient to evaluate internal consistency; it was greater than 0.8 in both scales, and most of the MBI-C domains. Cronbach’s alpha coefficient for the domain of following societal norms and having social graces, tact, and empathy was only 0.664, which may be related to the few social activities of the elderly in China, which makes it difficult to detect their social decline. Moreover, the most common mental behavioral disorders in people with AD include apathy, irritability, and agitation; however, the decline in social compliance is rarely mentioned. Hence, a relatively low Cronbach’s alpha coefficient is plausible [36]. Additionally, the ICC for test–retest reliability was 0.841 and that for inter-rater reliability was 0.991. These results suggested that the Chinese version of the MBI-C has good reliability for the assessment of MBI in individuals with AD dementia.
The validity of screening scales can be affected by demographic variables. It has been shown in previous studies that age, education level, and population origin affect the prevalence of some NPS of AD. In this study, there was no statistically significant difference in sex distribution and education years, but the age of the AD dementia group was significantly higher than that of the control group, which may have impacted the validity results.
We considered sensitivity and specificity, based on ROC analysis, when determining the optimal cutoff points of the MBI-C and NPI-Q and compared the rates of detection of individuals with AD using those two scales. Cutoff values were determined at the maximum value of (sensitivity + specificity – 1); a cutoff of 6/7 on the MBI-C could detect 86.96% of cases with AD dementia with a specificity of 86.00%, while a cutoff of 2/3 on the NPI-Q could detect 76.09% of cases with AD dementia with a specificity of 76.00%. In addition, both the positive and negative likelihood ratios of the MBI-C were better than those of the NPI-Q, probably mainly due to the contribution of moderate–severe AD. The ROC curves of the MBI-C and NPI-Q for detecting mild AD and moderate–severe AD were also drawn separately. These results showed that both the MBI-C and NPI-Q had a good rate of detection of patients with mild AD and moderate–severe AD; their ability to detect patients with mild AD was essentially the same, while the detection rate of the MBI-C for moderate–severe AD was significantly higher than that of the NPI-Q. The lack of a difference in the detection rate of mild AD between the MBI-C and NPI-Q may be because the behavioral symptoms of AD patients mostly appear in the middle and late stages. Given that the MBI-C could better detect the NPS of moderate–severe AD patients than could the NPI-Q, and that its sensitivity and specificity in population-based AD screening were also excellent, we propose that it can replace the NPI-Q as a clinical screening scale; however, it cannot replace the cognitive screening scale. Considering that the MBI-C contains more comprehensive and specific behavior disorder items than the NPI-Q, we intend to conduct further research on patients with frontotemporal dementia, whose behavioral disorder is obvious at an early stage, as the MBI-C is likely to yield a much better detection rate than the NPI-Q.
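The cutoff rule described above, maximizing (sensitivity + specificity − 1), can be sketched as follows. The function names and toy scores are hypothetical, and the sketch assumes that a cutoff written as 6/7 means scores of 7 or more screen positive. The likelihood ratios use the standard definitions, PLR = sensitivity / (1 − specificity) and NLR = (1 − sensitivity) / specificity.

```python
def sens_spec(case_scores, control_scores, threshold):
    """Sensitivity and specificity when score >= threshold is screen-positive."""
    true_pos = sum(s >= threshold for s in case_scores)
    true_neg = sum(s < threshold for s in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)


def youden_cutoff(case_scores, control_scores):
    """Observed score that maximizes sensitivity + specificity - 1."""
    candidates = sorted(set(case_scores) | set(control_scores))
    return max(
        candidates,
        key=lambda t: sum(sens_spec(case_scores, control_scores, t)) - 1,
    )


def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity


# Hypothetical toy scores (NOT study data).
toy_cases = [5, 7, 9, 12, 15]
toy_controls = [0, 1, 2, 4, 8]
print(youden_cutoff(toy_cases, toy_controls))

# Likelihood ratios implied by the sensitivity and specificity reported in the text.
print(likelihood_ratios(0.8696, 0.8600))  # MBI-C at 6/7
print(likelihood_ratios(0.7609, 0.7600))  # NPI-Q at 2/3
```

With the reported values, the MBI-C gives a positive likelihood ratio of about 6.2 and a negative likelihood ratio of about 0.15, compared with about 3.2 and 0.31 for the NPI-Q.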
Factor analysis was performed to evaluate the structure validity of the Chinese version of the MBI-C. Seven factors were extracted from the original scale, with a cumulative variance contribution rate of 70.552%. We found four items that had weak associations with the whole scale, as follows: 1) Sexual disinhibition: Compared with frontotemporal dementia and other types of dementia, the incidence of sexual disinhibition in AD was relatively low, and it also usually occurred in the middle and late stages. The patients included in this study were mainly mildly to moderately affected, which may be the reason for the item’s weak association with the whole scale. 2) Substance use: As Chinese people have a culture of smoking and drinking as part of their social lifestyle, smoking and drinking are not strongly regulated, and thus they are not likely to be considered abnormal by family members. In addition, drug intake is strictly controlled and gambling is illegal in China; thus, these behaviors rarely show up in the population. Therefore, we believe that cultural differences are the main reason for this low item-scale correlation, rather than the fact that the subjects are AD patients. 3) and 4) Talking openly about private matters and talking to strangers as if they are familiar: These items belong to the domain of following societal norms, which had the lowest internal consistency. Because elderly Chinese individuals have fewer social activities and seldom participate in public events or converse with strangers, cultural differences may be an important reason for this finding. The most common mental behavioral disorders in individuals with AD include apathy, irritability, and agitation, but the decline in social compliance is rarely mentioned, which may be another reason for this finding. Taken together, although culture is one of the factors, the low item-scale correlations are highly related to the low symptom occurrence rate in the AD population.
In general, aberrant motor behavior and appetite/eating disturbances can reliably differentiate AD from other dementias, whereas increased disinhibition and impulsivity, excessive substance use, pathological gambling, and embarrassing social behavior are typical behaviors of frontotemporal dementia; further studies on the feasibility of using the MBI-C to screen for frontotemporal dementia would therefore be of value [37, 38]. Moreover, it is necessary to expand the sample size for confirmatory factor analysis to optimize the items of the scale and further improve the structural validity of this scale in the Chinese population.
The results of this study showed that the five domains of the MBI-C all made good contributions to the total score and that the MBI-C was significantly correlated with the NPI-Q, which demonstrated the good content validity and criterion validity of the MBI-C. The present research results showed that the Chinese version of the MBI-C has high validity as a screening tool for AD dementia. At the same time, the study also confirmed the significance of the MBI-C in differentiating the severity of dementia through multiple independent sample rank-sum tests and Bonferroni correction. Since behavioral disorders were more common in patients with moderate–severe AD and the number of patients with severe AD enrolled in the study was too small (only two cases), we combined subjects with CDR scores of 2 or 3 into a single group for analysis. The results showed that the MBI-C score of healthy controls was significantly lower than that of patients with mild AD, while the MBI-C score of patients with mild AD was also significantly lower than that of the patients with moderate–severe AD, which opens up the possibility of using the MBI-C to grade dementia severity.
We also demonstrated that the MBI-C was related to other scales used to screen for AD dementia. To some extent, the higher the MMSE or MoCA score, the lower the MBI-C score; and the higher the ADL score, the higher the MBI-C score. These results were confirmed using linear regression analysis, which indicated that the performance of the MBI-C was related to the cognitive level and the ability to conduct ADLs, which further supported the use of the MBI-C as an auxiliary tool to screen for AD dementia patients.
The main limitation of this study is the small sample size, which increases sampling error. Moreover, age, as one of the major factors influencing behavioral symptoms, was not controlled and may also have adversely affected our results. In future, we intend to explore the feasibility of using the MBI-C to screen frontotemporal dementia populations.
Our study showed that compared with the NPI-Q, the Chinese version of the MBI-C had higher reliability and validity, and was more sensitive and specific for the screening of AD patients, particularly moderate–severe AD patients. Our results indicate that the MBI-C can be developed as a new dementia screening scale, providing a high detection rate, which is an extension of the original MBI-C function.
ACKNOWLEDGMENTS
The authors thank Professor Zahinoor Ismail, the original author of the MBI-C scale, for his permission to translate the scale into Chinese language and conduct a non-commercial study.
This work was supported by the National Natural Science Foundation of China (No. 81470074, 81571294, 81601099), clinical funding from the Beijing Municipal Science and Technology Committee (Z141107002514 117), and Beijing Municipal Government Funding (PXM 2017026283 000002).
