Boston Naming Test as a Screening Tool for Early Postoperative Cognitive Dysfunction in Elderly Patients After Major Noncardiac Surgery

Abstract

Purpose

The Boston naming test (BNT), a simple, fast, and easily administered neuropsychological test, has been shown to be useful in assessing language function. This study investigated whether the BNT could serve as a screening tool for early postoperative cognitive dysfunction (POCD).

Methods

This prospective observational cohort study included 132 major noncardiac surgery patients and 81 nonsurgical controls. All participants underwent a mini-mental state examination (MMSE) and BNT 1 day before and 7 days after surgery. Early POCD was assessed by reliable change index and control group results.

Results

Seven days after surgery, among 132 patients, POCD was detected in 30 (22.7%) patients (95% CI, 15.5%-30.0%) based on MMSE, and 45 (34.1%) patients (95% CI, 26.3%-41.9%) were found with postoperative language function decline based on BNT and MMSE. Agreement between the BNT spontaneous naming and MMSE total scoring was moderate (Kappa .523), and the sensitivity of BNT spontaneous naming for detecting early POCD was .767. Further analysis showed that areas under receiver operating characteristics curves (AUC) did not show statistically significant differences when BNT spontaneous naming (AUC .862) was compared with MMSE language functional subtests (AUC .889), or non-language functional subtests (AUC .933).

Conclusion

This study indicates the feasibility of implementing the BNT spontaneous naming test to screen early POCD in elderly patients after major noncardiac surgery.

Keywords

Boston naming test, mini-mental state examination, postoperative cognitive dysfunction, language function

Introduction

Postoperative cognitive dysfunction (POCD) is a subtle disorder of intellectual function that occurs after full recovery of consciousness and persists far beyond the expected effects of anesthetics.¹ As a mild neurocognitive disorder of unspecified etiology, POCD is purported to encompass acute or persistent deficits in attention, concentration, language function, learning, and memory after surgery.^2,3 According to the international study of postoperative cognitive dysfunction (ISPOCD1), POCD was present in 25.8% of elderly patients 7 days after major noncardiac surgery.⁴ POCD at this time point is usually defined as early POCD⁵ and leads to remarkably decreased working capacity and increased long-term mortality.⁶

Despite its high prevalence and association with adverse outcomes, clinicians often overlook POCD. The symptoms are usually not dramatic enough to gain attention, while neuropsychological diagnostic tests require specialized training and are time-consuming. Therefore, a simple, fast, and easily implemented bedside screening tool is crucial for POCD assessment and diagnosis in clinical practice. Several articles noted that specific neuropsychological tests could be useful for screening mild cognitive impairment or decline.^7,8 We therefore sought a simple neuropsychological test for POCD screening that is easy to implement and has good reliability and validity.

The Boston naming test (BNT) is a widely utilized neuropsychological test that consists of a group of black and white line drawings of animate and inanimate items and is sensitive in detecting compromised lexical retrieval abilities and aphasia through visual confrontation naming.⁹ The 30-item BNT used in this study, modified for the Chinese population, has been proven valid in assessing cognitive disorders.^10-13 Patients in our study are in relatively severe condition, which hinders their reading or writing capacity; the BNT might therefore be suitable for identifying POCD in these patients at the bedside.

The main objective of this study was to evaluate the validity and sensitivity of the BNT for identifying early POCD (7 days after surgery) after major noncardiac surgery in elderly Chinese patients. A secondary aim was to gain a further understanding of language capacity status in general POCD research.

Methods

Participants and Design

This was a prospective observational cohort study of elderly patients conducted in XX Hospital between May 2017 and May 2019. The hospital ethics committee approved the study (approval number PJ-NBEY-KY-2017-003-01), and written informed consent was obtained from all subjects before enrollment. This is a sub-study of a trial registered at the Chinese Clinical Trial Registry (chictr.org.cn) (identifier ChiCTR-ROC-17010610).

Eligible patients were ≥60 years old and scheduled to undergo elective major noncardiac surgeries, including open radical gastrointestinal, urology, or thoracic surgeries, and total hip or knee arthroplasty.

Exclusion criteria included: difficulty in comprehension (ie, inadequate Mandarin, blindness, and deafness); concomitant diseases that may lead to severe complications (ASA ≥ Ⅳ); pre-existing neurological or clinically identified cerebral disorders (ie, history of brain injury or surgery, stroke, Alzheimer’s disease, schizophrenia, and Parkinson’s disease); preoperative MMSE score indicating dementia (≤17 for illiterate, ≤20 for those with 1-6 years of education, and ≤24 for those with 7 or more years of education); surgery cancellation or approach change (ie, switching into laparoscopic or other non-major surgery approaches); and test interruption due to any causes.

Family members of patients (not limited to the enrolled patients) in the hospital were recruited as controls. These individuals met similar inclusion and exclusion criteria except surgery-related criteria. Controls who underwent unexpected surgery during the trial were excluded.

All individuals underwent neuropsychological tests at the same time points. Patients enrolled in this study received general perioperative care. Monitoring, anesthetic technique, and postoperative analgesia were prescribed at the discretion of the attending anesthesiologist.

Neuropsychological Testing

A fixed research staff trained under a neuropsychologist’s supervision was responsible for carrying out the tests. Education, medical history, and current medication were collected as demographic data before tests. All patients were tested in a quiet environment 1 day before surgery (baseline) and 7 days after surgery, without family accompanying. The MMSE and BNT were performed in order between 1 pm and 5 pm.

The Chinese version of MMSE we employed is modified from the English version of MMSE¹⁴ and has proven reliability and validity among illiterate or minimally educated elderly Chinese.^14,15 Modifications are as follows: dates are presented with both the lunar and Roman calendars due to the preference for the lunar calendar of the Chinese elderly. Orientation items and registration items were altered according to Chinese idiomatic expressions. The calculation item was modified into purchase balances due to a poor understanding of “subtraction.” The phrase in the repetition item is a Chinese tongue twister. The reading item is kept unchanged; however, illiterate subjects often scored zero on it because they were unable to read the instructions. The writing item was changed into reading a sentence with the same scoring method. The naming test, three-step command, and figure copying items are unchanged. In our study, besides the MMSE total score (MMSE-Total), the MMSE language functional subtests (MMSE-Lang, full score = 9) and non-language functional subtests (MMSE-non-Lang, full score = 21) were studied and analyzed.

Compared with the MMSE, the BNT focuses mainly on measuring language ability and verbal fluency. For spontaneous naming (BNT-Spon), patients were asked to name the presented objects, with a response time of 20 seconds. The number of correct answers was taken as the BNT-Spon score. If the answer was incorrect, a previously established semantic cue was provided, with an additional 20 seconds to name the object. The decision to give a semantic cue in a standard BNT test is subjective, based on the examiner’s opinion that the subject may not perceive the item properly. The number of correct answers after a semantic cue was recorded as the BNT-Sem (n) score, and the BNT-Sem (%) score was BNT-Sem (n) divided by the number of semantic cues provided. If the naming was still incorrect, a selective cue was provided; this is an auditory task in which the subject has to pick the correct item out of 3 choices. The number of correct answers after a selective cue was recorded as the BNT-Selec (n) score, and the BNT-Selec (%) score was BNT-Selec (n) divided by the number of selective cues provided. Note that the selective cue is exclusive to the Chinese version of BNT.¹⁰
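The scoring rules described in this subsection can be sketched in Python. This is a minimal illustration rather than the scoring procedure actually used in the study: the function name `score_bnt`, the outcome labels, and the simplifying assumption that every spontaneous failure receives a semantic cue and every semantic failure receives a selective cue are all made for the example.

```python
from collections import Counter

# Hypothetical outcome labels for one BNT item (not taken from the test manual):
#   "spontaneous" - named correctly without any cue
#   "semantic"    - named correctly only after the semantic cue
#   "selective"   - chosen correctly only after the 3-choice selective cue
#   "none"        - still incorrect after the selective cue
VALID = {"spontaneous", "semantic", "selective", "none"}


def score_bnt(outcomes):
    """Return BNT-Spon, BNT-Sem and BNT-Selec scores for one administration.

    Assumption: a semantic cue follows every spontaneous failure and a
    selective cue follows every semantic failure.
    """
    unknown = set(outcomes) - VALID
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    counts = Counter(outcomes)
    sem_n = counts["semantic"]
    selec_n = counts["selective"]
    selec_cues = selec_n + counts["none"]  # items that reached the selective cue
    sem_cues = sem_n + selec_cues          # items that reached the semantic cue

    def pct(correct, cues):
        # The percentage is undefined when no cue of that type was given.
        return 100.0 * correct / cues if cues else None

    return {
        "BNT-Spon": counts["spontaneous"],
        "BNT-Sem (n)": sem_n,
        "BNT-Sem (%)": pct(sem_n, sem_cues),
        "BNT-Selec (n)": selec_n,
        "BNT-Selec (%)": pct(selec_n, selec_cues),
    }


# Hypothetical 30-item administration, for illustration only.
example = ["spontaneous"] * 19 + ["semantic"] + ["selective"] * 5 + ["none"] * 5
scores = score_bnt(example)
```

Returning `None` when no cue was given keeps an undefined percentage distinct from a true 0%, which matters when change scores are later computed from these values.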

Calculation of POCD

The reliable change index (RCI) was used to diagnose POCD and postoperative language function decline.^1,4,16-19 To calculate the RCI of one test for an individual patient, the baseline score (X₁) was subtracted from the postoperative score at 7 days (X₂), giving ΔX for each patient on each neuropsychological test. The same was done in the controls, giving ΔXc. The mean of ΔXc on that test was then subtracted from ΔX to eliminate practice effects. To create a Z-score for the test, this result was then divided by the SD of ΔXc to account for natural variation in test performance.¹⁸ POCD in an individual patient was defined as a Z-score equal to or less than −1.96 in MMSE-Total. In the same way, postoperative language function decline and other cognitive function decline were defined as Z-scores equal to or less than −1.96 in the corresponding tests.
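In symbols, the calculation described above is Z = (ΔX − mean(ΔXc)) / SD(ΔXc), with decline flagged when Z ≤ −1.96. A minimal Python sketch follows. The function names, the sample values, and the use of the sample SD (n − 1 denominator) are assumptions for illustration; the text specifies only "SD".

```python
from statistics import mean, stdev

Z_CUTOFF = -1.96  # threshold stated in the text


def rci_z(baseline, day7, control_changes):
    """Z-score for one patient on one test.

    baseline, day7  : the patient's scores X1 and X2
    control_changes : day-7 minus baseline changes observed in the controls
    """
    delta_x = day7 - baseline
    # Sample SD (n - 1 denominator) is an assumption of this sketch.
    return (delta_x - mean(control_changes)) / stdev(control_changes)


def is_decline(z):
    """True when the Z-score meets the decline criterion (Z <= -1.96)."""
    return z <= Z_CUTOFF


# Hypothetical control changes and two hypothetical patients.
control_changes = [1, 0, 2, -1, 1, 0, 1, 2, 0, -1]
z_drop = rci_z(baseline=26, day7=22, control_changes=control_changes)
z_stable = rci_z(baseline=26, day7=26, control_changes=control_changes)
```

Because the mean control change is subtracted first, a patient whose score is unchanged can still receive a negative Z-score when controls improve on retest, which is how the practice effect is removed.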

Data Analysis

PASS 11.0 (NCSS, LLC. Kaysville, Utah, USA) and “tests for paired sensitivities” were used to calculate sample size.²⁰ Based on previous studies of POCD incidence on post-surgery day 7 in elderly patients undergoing noncardiac surgery,^4,16,17,19 we set the prevalence at .3. Test 1 was the MMSE, taken as the POCD diagnostic tool, with sensitivity set to .883.²¹ Test 2 was the BNT, whose effectiveness for POCD screening was to be explored, with sensitivity set to .7. The proportion discordant was set at .2. A sample size of 153 patients was thus generated (power of 90% and α = .05, two-tailed) and was increased to 170 to allow for a 10% dropout rate. The control group was made large enough to reduce variability and allow matching.

Group comparisons were made using an independent t-test for continuous variables, Mann-Whitney U test for ranked data, and chi-square or Fisher exact test for dichotomous data. The type I error rate was controlled using the Holm-Bonferroni step-down procedure or Benjamini-Hochberg procedure²² for multiple comparisons.
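For readers unfamiliar with the Benjamini-Hochberg adjustment, the following Python sketch shows how adjusted P-values are obtained from a family of raw P-values. The function name and the input values are hypothetical, and the sketch covers only the Benjamini-Hochberg procedure, not Holm-Bonferroni.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values never
    # decrease as the raw P-value increases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for a family of three comparisons.
raw = [0.002, 0.509, 0.109]
adj = benjamini_hochberg(raw)
```

Each raw P-value is multiplied by m divided by its rank, so the smallest of three P-values is tripled while the largest is left unchanged.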

The proportions of specific positive and negative agreement among BNT and MMSE and their subtests were analyzed using an unweighted Cohen’s Kappa with 95% CI. The positive agreement index estimated the conditional probability of a positive diagnosis on one test given a positive diagnosis on the other; likewise, the negative agreement for a negative diagnosis. In the present study, the positive diagnosis means the test’s Z-score ≤ −1.96, while the negative diagnosis means the test’s Z-score > −1.96. These indices are analogous to sensitivity and specificity in the presence of a gold standard classification.

To further compare the predictive value of BNT and MMSE’s subtests for POCD, each test’s score difference (day 7 score - baseline score) was calculated. Afterward, receiver operating characteristics (ROC) curves, including the correspondent areas under the curve (AUC), were calculated and compared using MedCalc 19 (MedCalc Software bvba, Ostend, Belgium). Other analyses were conducted by IBM SPSS Statistics 20 (IBM Corp, Zurich, Switzerland). All hypothesis tests were two-tailed. A P-value of less than .05 was taken to indicate statistical significance.
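The AUC for a change score can be read as the probability that a randomly chosen POCD patient shows a lower change score than a randomly chosen patient without POCD, with ties counted as one half. The Python sketch below computes this quantity directly. It is an illustration on hypothetical data, not a reproduction of the MedCalc analysis, and it does not implement the statistical comparison between AUCs.

```python
def auc_lower_is_positive(pos_scores, neg_scores):
    """AUC for a score where LOWER values indicate the positive class.

    Uses the Mann-Whitney formulation: the share of (positive, negative)
    pairs in which the positive case has the lower score, ties counting 1/2.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical change scores (day 7 score - baseline score).
pocd = [-4, -3, -3, -1]
no_pocd = [-1, 0, 1, 2, 2, -2]
auc = auc_lower_is_positive(pocd, no_pocd)
```

The pairwise loop is quadratic in the group sizes, which is acceptable for cohorts of this size and keeps the definition easy to verify by hand.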

Results

Demographic and Preoperative Medical Data

A total of 473 patients scheduled for elective major noncardiac surgery were screened. Patient flow through the trial is summarized in Figure 1. Over the same period, we recruited 97 family members of patients as controls. Three controls were excluded due to baseline dementia (according to baseline MMSE scoring), 2 due to surgery during the study, and 11 because they did not return for the second assessment, leaving 81 controls for Z-score calculation.

Figure 1.

Study flow. Note: MMSE, mini-mental state examination.

The demographic characteristics and medical history of patients (n = 132) and controls (n = 81) are shown in Table 1. Height differed between the 2 groups (patients, 165.80 cm; controls, 163.22 cm), while there was no difference in body mass index. There were also differences in years of education and smoking history between the 2 groups. After adjusting for multiple comparisons, the proportion of ASA I was higher in the control group than in the surgical group, mainly due to the surgical group’s more thorough medical history and examination data; this probably contributed to the differences between the 2 groups.

Table 1.

Patient and Control Demographics and Medical History at Baseline.

Measures	Patients (n = 132)	Controls (n = 81)	Test Statistic	P-Value
Age, year	70.96 (5.47)	69.61 (5.77)	t = −1.715	.088
Gender, male/female	60/72	41/40	χ² = .537	.464
Height, cm	165.80 (8.48)	163.22 (7.86)	t = −2.210	.028
Weight, kg	61.71 (10.34)	60.86 (9.17)	t = −.606	.545
Body mass index	22.38 (2.93)	22.99 (2.78)	t = 1.493	.137
Education, year	5.0 (3.0-6.0)	5.0 (5.0-8.0)	Z = −1.964	.050
ASA status
Ⅰ	10 (7.6%)	18 (22.2%)	χ² = 9.431	.006^a
Ⅱ	81 (61.4%)	46 (56.8%)	χ² = .436	.509^a
Ⅲ	41 (31.1%)	17 (21.0%)	χ² = 2.570	.164^a
Diabetes	39 (29.5%)	16 (19.8%)	χ² = 2.513	.113
Hypertension	65 (49.2%)	33 (40.7%)	χ² = 1.461	.227
History of smoking	28 (21.2%)	30 (37.0%)	χ² = 6.344	.012
Prior surgery	51 (38.6%)	40 (49.4%)	χ² = 2.369	.124

Note: Continuous variables are presented as mean (SD), except education, which is reported as median (interquartile range); categorical variables are presented as number (%).

Abbreviation: ASA, American Society of Anesthesiologists.

^aAdjusted P-values according to the Benjamini-Hochberg procedure for multiple testing correction based on controlling the false discovery rate.

Neuropsychological Test Scores at Baseline

The results of the neuropsychological testing of patients and controls at baseline are shown in Table 2. There was no apparent difference between the patients and controls on baseline neuropsychological tests except for the BNT-Selec (n).

Table 2.

MMSE and BNT Score of Patients and Controls at Baseline.

Tests Score	Patients (n = 132)	Controls (n = 81)	Test Statistic	P-Value
MMSE-Total	25.49 (3.09)	25.71 (3.02)	t = −.491	.624^a
MMSE-Name	2.0 (2.0-2.0)	2.0 (2.0-2.0)	Z = 1.810	.210^a
MMSE-Lang	7.41 (1.58)	7.08 (1.43)	t = 1.508	.201^a
BNT-Spon	18.79 (4.38)	17.98 (3.92)	t = 1.404	.162
BNT-Sem (n)	1.0 (.0-2.0)	1.0 (.0-2.0)	Z = .092	.926^a
BNT-Sem (%)	8.01 (.00-20.00)	8.33 (.00-33.33)	Z = −1.620	.210^a
BNT-Selec (n)	5.0 (3.0-6.0)	4.0 (3.0-5.0)	Z = 3.300	.002^a
BNT-Selec (%)	45.45 (40.00-50.00)	45.00 (27.78-50.00)	Z = 1.465	.143^a

Note: Continuous data are presented as mean (SD) or median (interquartile range).

Abbreviations: MMSE, mini-mental state examination; BNT, Boston naming test; MMSE-Total, the MMSE total scoring; MMSE-Name, the MMSE naming subtest scoring; MMSE-Lang, the MMSE language functional subtests scoring including MMSE-Name; BNT-Spon, the BNT spontaneous naming scoring; BNT-Sem (n), the correct counts of BNT after providing semantic cue; BNT-Sem (%), BNT-Sem (n) divided by the times providing semantic cue; BNT-Selec (n), the correct counts of BNT after providing selective cue; BNT-Selec (%), BNT-Selec (n) divided by the times providing selective cue.

^aAdjusted P-values according to the Benjamini-Hochberg procedure for multiple testing correction.

Effects of Anesthesia and Surgery on Neuropsychological Test Scores

According to MMSE total scoring, 30 (22.7%) of the 132 patients in the surgical group (95% CI, 15.5%-30.0%) were classified as having POCD on post-surgery day 7. Over the same interval, cognitive decline was found in 2 (2.5%) of the 81 controls (95% CI, −1.0% to 5.9%). The change scores of each neuropsychological test were obtained by subtracting baseline values from the score on post-surgery day 7 for each subject (Table 3). A negative score difference indicated cognitive function deterioration, while a positive score difference indicated cognitive function improvement.
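The interval reported for the control group extends below 0%, which can happen with symmetric intervals built around a small proportion. The method used for the reported intervals is not stated, so the Python sketch below is only an illustration of the general point: it contrasts a symmetric normal-approximation (Wald) interval with the Wilson score interval, which stays within 0 to 1. The function names are hypothetical, and neither function is claimed to reproduce the reported values.

```python
from math import sqrt

Z95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def wald_interval(k, n, z=Z95):
    """Symmetric normal-approximation interval for k events out of n."""
    p = k / n
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half


def wilson_interval(k, n, z=Z95):
    """Wilson score interval for k events out of n; bounded by 0 and 1."""
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Counts from the text: 2 of 81 controls showed cognitive decline.
controls_wald = wald_interval(2, 81)
controls_wilson = wilson_interval(2, 81)
```

The Wilson interval is asymmetric around the observed proportion, which is what allows it to respect the natural limits of a proportion when counts are small.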

Table 3.

Change Scores of Neuropsychological Assessment at 7 Days in Surgery Group.

Change Scores	POCD (n = 30)	No POCD (n = 102)	Test Statistic	P-Value
MMSE-Total	−3.23 (1.72)	1.09 (3.02)	t = 12.725	<.001^a
MMSE-Name	.0 (.0-.0)	.0 (.0-.0)	Z = .546	1.000^a
MMSE-Lang	−1.47 (1.22)	.36 (.87)	t = 7.630	<.001^a
MMSE-non-Lang	−1.76 (1.25)	.73 (1.33)	t = 9.107	<.001^a
BNT-Spon	−3.43 (1.48)	.78 (2.62)	t = 11.251	<.001
BNT-Sem (n)	.5 (.0-2.0)	.0 (.0-1.0)	Z = .536	.592^a
BNT-Sem (%)	3.04 (.00-16.27)	4.48 (.00-13.47)	Z = .269	1.000^a
BNT-Selec (n)	−1.0 (−2.0 to 1.0)	−1.0 (−2.0 to 1.0)	Z = −.017	.987^a
BNT-Selec (%)	−13.02 (−21.39 to 5.47)	−5.51 (−14.29 to 15.95)	Z = −2.536	.022^a

Note: Continuous data are presented as mean (SD) or median (interquartile range).

Abbreviations: MMSE, mini-mental state examination; BNT, Boston naming test; MMSE-Total, the MMSE total scoring; MMSE-Name, the MMSE naming subtest scoring; MMSE-Lang, the MMSE language functional subtests scoring including MMSE-Name; MMSE-non-Lang, the MMSE non-language functional subtests scoring; BNT-Spon, the BNT spontaneous naming scoring; BNT-Sem (n), the correct counts of BNT after providing semantic cue; BNT-Sem (%), BNT-Sem (n) divided by the times providing semantic cue; BNT-Selec (n), the correct counts of BNT after providing selective cue; BNT-Selec (%), BNT-Selec (n) divided by the times providing selective cue.

^aAdjusted P-values according to the Benjamini-Hochberg procedure for multiple testing correction.

The MMSE-Lang and MMSE-non-Lang change scores of POCD patients were significantly lower than those of patients without POCD, but the MMSE-Name change scores did not differ significantly between the 2 groups. For the BNT, POCD patients had significantly lower BNT-Spon and BNT-Selec (%) change scores, while there was no difference in the remaining BNT scores.

According to Z-score, the incidences of postoperative language function decline, indicated by different tests, are shown in Figure 2. The incidence of postoperative language function decline diagnosed by BNT-Spon was significantly higher than that of MMSE-Lang.

Figure 2.

Comparison of the incidence of postoperative language function decline with different diagnostic methods. Note: MMSE, mini-mental state examination; BNT, Boston naming test; MMSE-Lang, the MMSE language functional subtests scoring; BNT-Spon, the BNT spontaneous naming scoring. The significance level was adjusted to .017 according to Bonferroni correction.

Comparisons of the BNT and MMSE

The proportions of specific positive and negative agreement among BNT and MMSE and their subtests are shown in Table 4. Agreement between BNT-Spon and MMSE-Total was moderate (Kappa .523). The BNT-Spon identified an extra 18 abnormal cognition subjects compared with MMSE-Total and an extra 32 compared with MMSE-Lang. In addition, the agreement between BNT-Spon and MMSE-non-Lang was fair (Kappa .303); meanwhile, the positive agreement index, recognized as sensitivity, was as high as 84.6% (95% CI, 65.0%-104.2%).

Table 4.

Comparison of BNT Spontaneous Naming and MMSE, and Its Subtests.

		MMSE
		Total -	Total +	Lang -	Lang +	Non-Lang -	Non-Lang +
BNT	Spon -	84	7	87	4	89	2
BNT	Spon +	18	23	32	9	30	11
Kappa (95% CI)		.523 (.343 to .684)		.216 (.054 to .378)		.303 (.146 to .464)
PA (95% CI)		76.7% (60.6% to 92.7%)		69.2% (40.2% to 98.3%)		84.6% (65.0% to 104.2%)
NA (95% CI)		82.4% (74.8% to 89.9%)		73.1% (65.0% to 81.2%)		74.8% (67.0% to 82.6%)

Note: Data are presented as number or ratio (95% CI).

+ indicates positive result with Z-score ≤ −1.96; − indicates negative result with Z-score > −1.96.

Abbreviations: MMSE, mini-mental state examination; BNT, Boston naming test; PA, index of positive agreement between the 2 tests, analogous to sensitivity; NA, index of negative agreement between the 2 tests, analogous to specificity; Total, the MMSE total scoring; Lang, the MMSE language functional subtests scoring; non-Lang, the MMSE non-language functional subtests scoring; Spon, the BNT spontaneous naming scoring.
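The point estimates in Table 4 can be recomputed from the cell counts. The Python sketch below is an illustration with hypothetical function and argument names. It treats the MMSE-based classification as the reference, so that positive agreement is the share of MMSE-positive patients who are also BNT-Spon positive; this reading reproduces the tabulated Kappa, PA, and NA values, but the confidence intervals are not recomputed here.

```python
def agreement_2x2(both_neg, ref_pos_only, test_pos_only, both_pos):
    """Cohen's kappa, positive agreement and negative agreement.

    both_neg      : test -, reference -
    ref_pos_only  : test -, reference +
    test_pos_only : test +, reference -
    both_pos      : test +, reference +
    """
    n = both_neg + ref_pos_only + test_pos_only + both_pos
    observed = (both_neg + both_pos) / n
    test_neg, test_pos = both_neg + ref_pos_only, test_pos_only + both_pos
    ref_neg, ref_pos = both_neg + test_pos_only, ref_pos_only + both_pos
    # Agreement expected by chance from the marginal totals.
    expected = (test_neg * ref_neg + test_pos * ref_pos) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    pa = both_pos / ref_pos  # P(test + | reference +)
    na = both_neg / ref_neg  # P(test - | reference -)
    return kappa, pa, na


# Cell counts from Table 4 (test = BNT-Spon, reference = MMSE measure).
vs_total = agreement_2x2(84, 7, 18, 23)
vs_lang = agreement_2x2(87, 4, 32, 9)
vs_non_lang = agreement_2x2(89, 2, 30, 11)
```

Rounded to three decimals, `vs_total` gives a Kappa of .523 with PA of .767 and NA of .824, matching the first column group of the table.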

Pearson correlation analysis showed that the change scores of MMSE-Total and BNT-Spon were significantly correlated (Pearson’s r = .923, P < .001). ROC curve analysis was used to further evaluate and compare each subtest change score for predicting POCD (Figure 3). The AUC of BNT-Selec (%) change scores was significantly smaller than that of the other 3 subtests’ change scores (all P < .001). There was a significant difference between MMSE-non-Lang and BNT-Total change scores in AUC (z statistic = 2.178, P = .0294).

Figure 3.

Receiver operating characteristic (ROC) curves for change scores of BNT and MMSE subtests. Note: MMSE, mini-mental state examination; BNT, Boston naming test; MMSE-Lang, the MMSE language functional subtests scoring; MMSE-non-Lang, the MMSE non-language functional subtests scoring; BNT-Spon, the BNT spontaneous naming scoring; BNT-Total, the sum of BNT-Spon scoring and BNT semantic cued correct counts; BNT-Selec (%), the BNT selective cued correct counts divided by the times providing selective cue.

Discussion

Ever since the ISPOCD1 study,⁴ researchers have reached a consensus that a neuropsychological test battery and a reliable change index-based Z-score are suitable for POCD assessment and diagnosis.^1,2,18 POCD in an individual was usually defined as a Z-score equal to or less than −1.96 in at least 2 different tests, or a combined Z-score equal to or less than −1.96.^16,17,19 Most conclusions in this study are also based on Z-scores. Based on MMSE scoring, we found that POCD incidence reached 22.7% (95% CI, 15.5%-30.0%). Considering that MMSE and BNT are known to be influenced by age, gender, and education,^9,23 the controls enrolled were also matched to the surgical group on these indicators. Consequently, there was no apparent difference between the patients and controls in most demographic characteristics and baseline neuropsychological test values.

Based on MMSE-Lang and BNT-Spon scoring, the incidence of postoperative language function decline on post-surgery day 7 was 34.1% (95% CI, 26.3%-41.9%) in patients going through major noncardiac surgery. This value was similar to that diagnosed by BNT-Spon alone but significantly higher than that diagnosed by MMSE-Lang alone, indicating to a certain extent that MMSE is less sensitive than BNT-Spon in predicting postoperative language function decline. Note that the incidence of 34.1% was diagnosed by BNT-Spon or MMSE-Lang positive; the incidence would fall to 6.8% (95% CI, 2.5%-11.2%) if the diagnosis were defined as both tests positive.
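The two incidence figures in this paragraph follow from the Lang columns of Table 4 by inclusion-exclusion. A short Python check is given below; the variable names are chosen for illustration, and only the point estimates are recomputed.

```python
n_patients = 132

# Counts taken from the Lang columns of Table 4.
bnt_spon_pos = 32 + 9   # BNT-Spon positive (MMSE-Lang negative + positive)
mmse_lang_pos = 4 + 9   # MMSE-Lang positive (BNT-Spon negative + positive)
both_pos = 9            # positive on both tests

# Inclusion-exclusion: patients positive on at least one of the two tests.
either_pos = bnt_spon_pos + mmse_lang_pos - both_pos

either_rate = 100 * either_pos / n_patients  # "either test positive" definition
both_rate = 100 * both_pos / n_patients      # "both tests positive" definition
```

Here `either_pos` is 45 patients (34.1%), while requiring both tests to be positive leaves 9 patients (6.8%).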

As revealed in our study, BNT-Spon and BNT-Selec (%) change scores were significantly lower in POCD patients. Considering that BNT-Selec (%) is exclusive to the Chinese version and that the ROC analysis showed its poor predictive value for POCD (AUC = .615, 95% CI, .497-.733, P = .056), we chose BNT-Spon as the main indicator of BNT, consistent with the indicator used in other studies.^24,25 The agreement between BNT-Spon and MMSE total scores was moderate (Kappa = .523), indicating that BNT-Spon can be used as a simple screening tool for early POCD. Moreover, there was fair agreement (Kappa = .303) between BNT-Spon and MMSE-non-Lang on diagnosis results, suggesting a correlation between the 2 tests.

Some reports in the literature suggested that tests composed of subtests covering various cognitive dimensions had quite good construct validity.²⁶ Our unpublished data also suggest that BNT correlated significantly with the Benton Visual Retention Test and the Symbol Digit Modalities Test. In this study, ROC curve analysis found that BNT-Spon had sensitivity and specificity similar to those of MMSE-Lang and MMSE-non-Lang in diagnosing POCD. Since some researchers also use BNT-total (the sum of BNT spontaneous naming and semantic cued naming scores) as the main indicator,^23,27 further analysis found that, in contrast to BNT-Spon, BNT-total was not consistent with MMSE-non-Lang, indicating that BNT-total is more specific for language function assessment. All the preoperative data collected in this study (n = 264) were analyzed. BNT-Spon correlated significantly with the MMSE total score (Pearson’s r = .576, P < .001), MMSE-Lang (Pearson’s r = .576, P < .001), and MMSE-non-Lang (Pearson’s r = .467, P < .001).

MMSE, the most recognized clinical cognitive screening scale, is often used as a “gold standard” to validate novel or improved neuropsychological tests.^28,29 This study included positive patients diagnosed by MMSE-Lang and MMSE-non-Lang in POCD patients diagnosed by MMSE-Total. The study found that 13 cases with no language function decline were identified as POCD, while only 2 cases were not detected by BNT-Spon. All indicators show that MMSE has limited sensitivity and specificity in diagnosing POCD. Above all, combined neuropsychological tests, such as BNT with MMSE, are an effective cognitive evaluation strategy.

BNT does not rely on surgery type and has local versions^23,30,31 suitable for the lower education level of the elderly group,^23,27,28 minimizing language interference.³¹ However, we found that the Chinese version of the 30-item BNT also has certain defects. For non-dementia subjects, the correct answer rate for items such as “tree” and “pencil” is as high as 100%, while cultural differences made “harp” and “icehouse” much harder to name, with a correct rate under 10%. Therefore, there is room for optimization of the Chinese version of BNT. Additionally, some items were answered correctly by chance. A method to detect this kind of false response is also needed. Perhaps adding other test content could reduce the probability of such responses going undetected. Future studies are needed to verify these hypotheses.

As is known, BNT is more suitable for specialized language function assessment than MMSE. However, the results of this study preliminarily confirm that BNT and MMSE have good consistency in the evaluation of comprehensive cognitive function. The reason may not be BNT’s excellent reliability and validity in overall cognitive assessment but rather the limitations of BNT in assessing mild cognitive impairment such as POCD. Therefore, it can be speculated that single cognitive dimension neuropsychological tests, such as auditory verbal learning tests, trail-making tests, and symbol digit modalities tests, have similar effects in POCD clinical studies, but further clinical research is needed to confirm this.

According to the recommendations, early postoperative cognitive impairment should be classified as “delayed neurocognitive recovery within 30 days after surgery.”¹ However, the time point we chose to assess cognition was post-surgery day 7, consistent with previous clinical studies of POCD. For relatively long-term cognitive function (ie, 3 months after surgery), the screening effect of BNT needs further study.

Conclusion

Many consensus meetings have proposed using neuropsychological test batteries for POCD assessment and diagnosis.¹ However, the universal practical application of those tests faces obstacles due to culture and language factors, especially in non-English-speaking regions and among less educated elderly populations, who are also at a high risk of POCD. Therefore, developing a set of neuropsychological tests that can be promoted globally and is suitable for low-education populations is crucial. The BNT may fit this need, since our study supports its use for early POCD screening, but further research is needed.

Footnotes

Authors’ Contributions

Xiaojie Zhai designed the study. Bo Meng, Xiaoyu Li, and Ruichun Wang recruited participants and executed study procedures. Bo Lu and Bo Meng analyzed the data. Zhang Chen and Xiaojie Zhai wrote the paper under the supervision of Bo Meng and Junping Chen. All authors critically reviewed the paper.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Natural Science Foundation of Zhejiang Province, China (No. LQ21H090004), Zhejiang Provincial Public Service and Application Research Foundation, China (No. LGF22H090012), Ningbo Science and Technology Service Technology Foundation, China (No. 2020F040), and Medical Scientific Research Foundation of Zhejiang Province, China (No. 2020PY023).

Trial Registration

Chictr.org.cn, ChiCTR-ROC-17010610

ORCID iDs

Chen Zhang

Xiaoyu Li

Xiaojie Zhai

Ruichun Wang

Chen Junping

References

1. Evered, Silbert, Knopman, et al. Recommendations for the nomenclature of cognitive change associated with anaesthesia and surgery. Anesthesiology. 2018;129(5):872-879.

2. Needham, Webb, Bryden. Postoperative cognitive dysfunction and dementia: what we need to know and do. Br J Anaesth. 2017;119(1):115-125.

3. Skvarc, Berk, Byrne, et al. Post-Operative Cognitive Dysfunction: an exploration of the inflammatory hypothesis and novel therapies. Neurosci Biobehav Rev. 2018;84:116-133.

4. Moller, Cluitmans, Rasmussen, et al. Long-term postoperative cognitive dysfunction in the elderly ISPOCD1 study. ISPOCD investigators. International Study of Post-Operative Cognitive Dysfunction. Lancet. 1998;351(9106):857-861.

5. Glumac, Kardum, Sodic, Supe-Domic, Karanovic. Effects of dexamethasone on early cognitive decline after cardiac surgery: a randomised controlled trial. Eur J Anaesthesiol. 2017;34(11):776-784.

6. Steinmetz, Christensen, Lund, Lohse, Rasmussen; ISPOCD Group. Long-term consequences of postoperative cognitive dysfunction. Anesthesiology. 2009;110(3):548-555.

7. Bentvelzen, Aerts, Seeher, Wesson, Brodaty. A comprehensive review of the quality and feasibility of dementia assessment measures: the dementia outcomes measurement suite. J Am Med Dir Assoc. 2017;18(10):826-837.

8. Van der Hoek, Nieuwenhuizen, Keijer, Ashford. The MemTrax test compared to the Montreal cognitive assessment estimation of mild cognitive impairment. J Alzheimers Dis. 2019;67(3):1045-1054.

9. Kaplan, Goodglass, Weintraub, Goodglass. Boston naming test. Philadelphia: Lea & Febiger; 1983.

10. Cheung, Chan. Confrontation naming in Chinese patients with left, right or bilateral brain damage. J Int Neuropsychol Soc. 2004;10(1):46-53.

11. Lam, Kwok, Chan, et al. Long term neurocognitive impact of low dose prenatal methylmercury exposure in Hong Kong. Environ Int. 2013;54:59-64.

12. Lin, Chen, Lin, et al. Confrontation naming errors in Alzheimer's disease. Dement Geriatr Cogn Disord. 2014;37(1-2):86-94.

13. Yang, Shen, et al. Cognitive characteristics in Chinese non-demented PD patients based on gender difference. Transl Neurodegener. 2018;7:16.

14. Katzman, Zhang, Ouang-Ya-Qu, et al. A Chinese version of the Mini-Mental State Examination; impact of illiteracy in a Shanghai dementia survey. J Clin Epidemiol. 1988;41(10):971-978.

15. Meyer, Huang, Chowdhury, Quach. Adapting mini-mental state examination for dementia screening among illiterate or minimally educated elderly Chinese. Int J Geriatr Psychiatry. 2003;18(7):609-616.

16. Culley, Flaherty, Fahey, et al. Poor performance on a preoperative cognitive screening test predicts postoperative complications in older orthopedic surgical patients. Anesthesiology. 2017;127(5):765-774.

17. Evered, Scott, Silbert, Maruff. Postoperative cognitive dysfunction is independent of type of surgery and anesthetic. Anesth Analg. 2011;112(5):1179-1185.

18. Rasmussen, Larsen, Houx, et al. The assessment of postoperative cognitive function. Acta Anaesthesiol Scand. 2001;45(3):275-289.

19. Silbert, Evered, Scott. Incidence of postoperative cognitive dysfunction after general or spinal anaesthesia for extracorporeal shock wave lithotripsy. Br J Anaesth. 2014;113(5):784-791.

20. Fine. On sample size for sensitivity and specificity in prospective diagnostic accuracy studies. Stat Med. 2004;23(16):2537-2550.

21. Berger, Schenning, Brown, et al. Best practices for postoperative brain health: recommendations from the fifth international perioperative neurotoxicity working group. Anesth Analg. 2018;127(6):1406-1413.

22. Benjamini, Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Methodol. 1995;57(1):289-300.

23. Leite, Miotto, Nitrini, Yassuda. Boston Naming Test (BNT) original, Brazilian adapted version and short forms: normative data for illiterate and low-educated older adults. Int Psychogeriatr. 2017;29(5):825-833.

24. Baldo, Arevalo, Patterson, Dronkers. Grey and white matter correlates of picture naming: evidence from a voxel-based lesion analysis of the Boston Naming Test. Cortex. 2013;49(3):658-667.

25. Fong, Hshieh, Wong, et al. Neuropsychological profiles of an elderly cohort undergoing elective surgery and the relationship between cognitive performance and delirium. J Am Geriatr Soc. 2015;63(5):977-982.

26. Kelley, Ranes, Estrada, Grandizio. Evaluation of the military functional assessment program: preliminary assessment of the construct validity using an archived database of clinical data. J Head Trauma Rehabil. 2015;30(4):E11-E20.

27. Jefferson, Wong, Gracer, Ozonoff, Green, Stern. Geriatric performance on an abbreviated version of the Boston naming test. Appl Neuropsychol. 2007;14(3):215-223.

28. Calero, Arnedo, Navarro, Ruiz-Pedrosa, Carnero. Usefulness of a 15-item version of the Boston Naming Test in neuropsychological assessment of low-educational elders with dementia. J Gerontol B Psychol Sci Soc Sci. 2002;57(2):187-191.

29. Gross, Jones, Fong, Tommet, Inouye. Calibration and validation of an innovative approach for estimating general cognitive performance. Neuroepidemiology. 2014;42(3):144-153.

30. Aniwattanapong, Tangwongchai, Supasitthumrong, et al. Validation of the Thai version of the short Boston Naming Test (T-BNT) in patients with Alzheimer’s dementia and mild cognitive impairment: clinical and biomarker correlates. Aging Ment Health. 2019;23(7):840-850.

31. Medvedev, Sheppard, Monetta, Taler. The BNT-38: applying rasch analysis to adapt the Boston Naming Test for use with English and French Monolinguals and Bilinguals. J Speech Lang Hear Res. 2019;62(4):909-917.