Speech Recognition in Background Noise of Cochlear Implant Patients

Abstract

OBJECTIVES: Speech recognition in quiet and at 2 levels of background noise was evaluated in adult cochlear implant patients using the Spectral Peak strategy (Nucleus 22 or Nucleus 24 patients) or the Continuous Interleaved Sampling or Simultaneous Analog Stimulation strategy (Clarion patients).

PATIENTS AND METHODS: Ninety-six patients were tested with the City University of New York Sentences presented at 70 dB SPL in quiet and at signal-to-noise ratios (SNR) of +10 and +5 dB. Patients were scored on the number of words perceived correctly.

RESULTS: Scores differed significantly between each pair of conditions (P < 0.05): 82% words correct in quiet, 73% correct at an SNR of +10 dB, and 47% correct at an SNR of +5 dB. Linear regression analysis found no significant correlation between test score and age at implantation or time using the implant. A weak negative correlation was found between years of profound hearing loss and score.

CONCLUSION: Competing noise interferes with comprehension of connected speech for most cochlear implant patients.

During the past 25 years, clinical data have accumulated that clearly demonstrate that adults and children with severe to profound hearing loss achieve substantial benefits from cochlear implants (CIs). 1 As implant technology has advanced, these benefits have steadily increased. Initially, hearing results in patients with CIs consisted of improved lip reading and the ability to detect sound. Now, with advanced speech processing strategies and implant design, patients often achieve open-set speech recognition.

Currently, 3 companies manufacture devices, each with different ways to process speech and having the ability to “custom fit” the patient. Nucleus 24 (and, in the past, Nucleus 22) is manufactured by Cochlear Corporation and has been approved by the Food and Drug Administration (FDA) for adults and children since June 1998. This device has 22 intracochlear electrodes and 2 extracochlear grounds, which enable the audiologist to use different current pathways to custom fit the patient. Until recently, Spectral peak (SPEAK) has been the only speech processing strategy available; however, Continuous Interleaved Sampling (CIS) and Advanced Combination Encoder (ACE) are 2 new strategies currently available. The FDA approved the Clarion multistrategy cochlear implant for adults in 1995 and children in 1997. This device uses 8 intracochlear electrode channels (ie, 16 electrodes). Clarion's speech processing strategies include CIS, Simultaneous Analog Stimulation (SAS), and Paired Pulsatile Stimulation (PPS). The third cochlear implant system available in the United States is the MED-EL Combi-40 Plus, which was approved by the FDA in August 2001. This device uses 12 intracochlear electrode pairs and 2 extracochlear grounds. CIS and Number of Maxima (N of M) speech processing strategies are available.

Steady advances in implant technology have enabled more patients to achieve higher levels of speech understanding. Despite this, CI recipients are more susceptible to the effects of background noise on speech recognition than are normal-hearing listeners. 2–7 Possible causes for the reduced performance in noise include reduced spectral resolution from the limited number and location of electrodes in the cochlea, small dynamic range, abnormal loudness growth, information- and redundancy-reducing signal processing of the cochlear implant system, and having only monaural input to the central auditory system. 8–10

Table 1.

Demographic information of the 96 postlingually deafened adults

Gender	Females (n = 52)	Males (n = 44)
Operated ear	Left (n = 46)	Right (n = 50)
Age at implantation (y)	Mean, 53.0; SD, 12.8	Range, 16–90
Years of profound hearing loss	Mean, 6.25; SD, 7.76	Range, 0.25–45
Years of cochlear implant use	Mean, 3.47; SD, 2.42	Range, 0.5–12
Cause of hearing loss	Idiopathic	(n = 64)
	Meniere's	(n = 8)
	Ototoxicity	(n = 7)
	Otosclerosis	(n = 5)
	Meningitis	(n = 4)
	Familial	(n = 3)
	Acoustic trauma	(n = 1)
	Head trauma	(n = 1)
	Osteogenesis imperfecta	(n = 1)
	Neurosyphilis	(n = 1)
	Cochlear hydrops	(n = 1)

Better speech understanding in noise is a goal in the research and development of cochlear implant systems. More effective processing of the speech within a noisy environment preceding the CI speech coding is one method being tried to improve performance in noise. 8 Increasing the number of electrodes or more effectively using the existing electrodes to enhance the spectral channels has been proposed. 9 Armstrong et al 11 demonstrated that combining the CI with a conventional hearing aid in the contralateral ear can improve listening in noise. In addition, improved speech processing strategies, such as SPEAK, CIS, and ACE used by Nucleus 22 and 24 recipients and CIS, SAS, and PPS used by Clarion patients, may allow improved perception in noise due to the redundancy of the acoustic information in a signal. 6,12–14

MATERIALS AND METHODS

Fifty-six Nucleus 22, 10 Nucleus 24, and 30 Clarion users were tested for speech recognition ability in quiet and in noise. As part of the cochlear implant preoperative evaluation, the implant audiologist and surgeon discussed all available options with the patient. The patient then selected the device to be implanted. All patients had full electrode insertions at surgery. Nucleus 22 and 24 recipients used programs with 16 to 20 electrodes activated. Clarion recipients were programmed with 7 or 8 channels. Patients had between 6 months and 12 years of implant experience at the time of testing. Composite demographic characteristics are listed in Table 1.

All Nucleus patients used the SPEAK speech processing strategy, which became generally available for use in 1994. The Nucleus 22 system uses 20 filter banks with center frequencies from 250 to 10,000 Hz to analyze acoustic signals. The Nucleus 24 system uses fast Fourier transformation for frequency analysis. The frequency bands with the greatest spectral energy content, called maxima, are selected, and the corresponding electrodes, which represent the frequency range for each filter bank, are chosen for stimulation in a pulsatile, sequential fashion. Speech sounds such as vowels, which contain a greater amount of acoustic energy, may be represented by as many as 10 electrode pairs. Conversely, presentation of a voiceless consonant sound may result in stimulation of 3 or 4 electrode pairs. 15
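The maxima selection described above can be illustrated with a minimal sketch. It assumes that each analysis frame yields one energy value per frequency band and that bands are chosen by simply ranking those energies. The function name `select_maxima`, the `floor` threshold, and the cap of 10 are illustrative assumptions, not the manufacturer's implementation.

```python
def select_maxima(band_energies, max_maxima=10, floor=0.0):
    """Return indices of the bands with the greatest energy in one frame.

    band_energies: one energy value per analysis band (hypothetical input).
    max_maxima:    upper limit on bands selected per frame (assumed cap).
    floor:         bands at or below this energy are never selected
                   (hypothetical parameter for illustration).
    """
    # Keep only bands that carry energy above the floor.
    candidates = [(energy, idx) for idx, energy in enumerate(band_energies)
                  if energy > floor]
    # Rank by descending energy; ties go to the lower band index.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    # Report the selected bands in band order, as a list of indices that
    # would be mapped to electrodes and stimulated one after another.
    return sorted(idx for _, idx in candidates[:max_maxima])


# A frame with a few strong bands (vowel-like) selects more bands than
# a frame in which few bands exceed the floor (consonant-like).
frame = [0.1, 0.9, 0.05, 0.7, 0.0, 0.3]
print(select_maxima(frame, max_maxima=3))
print(select_maxima(frame, floor=0.5))
```

In this sketch the number of selected bands varies from frame to frame only through the floor and the cap; the actual rule used by the processor is not described in the text.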

Twenty-five Clarion patients were programmed with the CIS strategy, which provides rapid, sequential stimulation of up to 8 channels. The acoustic signal is processed by a series of filter banks with the filtered waveform envelope signal represented via a particular channel according to that channel's assigned frequency range. Both dominant and nondominant areas of acoustic energy are included in the processed signal. 16
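The sequential, non-overlapping character of CIS stimulation can be sketched as follows. This is a simplified illustration that assumes the filter-bank envelopes have already been computed and are supplied as one list of samples per channel. The function name `interleave_pulses` and the tuple layout are hypothetical.

```python
def interleave_pulses(envelopes):
    """Build a pulse schedule in which only one channel fires per time slot.

    envelopes: list of per-channel envelope sequences of equal length
               (hypothetical precomputed filter-bank outputs).
    Returns a list of (time_slot, channel, amplitude) tuples.
    """
    if not envelopes:
        return []
    n_frames = len(envelopes[0])
    if any(len(channel) != n_frames for channel in envelopes):
        raise ValueError("all channels must have the same number of frames")

    schedule = []
    slot = 0
    for frame in range(n_frames):
        # Within each frame, every channel is visited in turn, so every
        # channel's envelope is represented but pulses never coincide.
        for channel, samples in enumerate(envelopes):
            schedule.append((slot, channel, samples[frame]))
            slot += 1
    return schedule


# Two channels, two frames: channels alternate across four time slots.
print(interleave_pulses([[1, 2], [3, 4]]))
```

Unlike a maxima-selecting scheme, every channel is stimulated in every frame here, which is why both dominant and nondominant areas of acoustic energy are represented.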

Table 2.

Group test performance in quiet, in SNR +10 dB, and in SNR +5 dB for City University of New York Sentence test (n = 96)

	Recorded speech, 70 dB SPL
	No background noise (quiet)	8-Talker babble, 60 dB SPL (SNR, +10 dB)	8-Talker babble, 65 dB SPL (SNR, +5 dB)
Mean score	82.1	73.04	47.36
SD	23.05	26.19	27.76
Minimum score	0	0	0
Maximum score	102	102	91
Median score	91	81	52.5
dB SPL, Decibel sound pressure level; SNR, signal-to-noise ratio.

The remaining Clarion patients used SAS, a very high rate processing strategy in which the digitally processed speech signal is changed to analog form for simultaneous presentation to all the electrodes. As with CIS, both dominant and nondominant areas of acoustical energy are included. 17

Patients were tested in a calibrated Tracoustics Acoustical Enclosure Model RS-252. The City University of New York (CUNY) Sentences (lists 1–72; Cochlear Corporation recordings) were used to assess speech recognition ability. Test materials were presented via a GSI-16 audiometer and Fisher CR-W681 and Sony TC FX170 cassette decks. Six lists of the CUNY Sentences were administered at 70-dB sound pressure level (SPL), with 2 lists presented in quiet, 2 with competing 8-talker babble at 60-dB SPL (signal-to-noise ratio of +10 dB), and 2 with 8-talker babble at 65-dB SPL (signal-to-noise ratio of +5 dB). Both presentation order and lists were randomized. For all measures, patients were instructed to repeat any of the test stimuli they understood and to guess if unsure. No feedback or correction of the response was given. Patients whose speech was not readily understood by the tester were asked to write their responses. Sentences were scored by the total number of words correctly understood, and the scores for the 2 lists per condition were averaged. The minimum score was 0, and the maximum score was 102. No normative data are available for age-matched cohorts. None of the patients were tested with a hearing aid in the nonimplanted ear.
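The arithmetic behind the test conditions and the per-condition score can be written out as a short sketch. The function names are illustrative; the levels and the 0 to 102 score range are taken from the protocol described above.

```python
SPEECH_LEVEL_DB_SPL = 70   # presentation level of the recorded sentences
MAX_LIST_SCORE = 102       # maximum number of words correct per list


def snr_db(speech_db_spl, noise_db_spl):
    """Signal-to-noise ratio in dB as the difference between two levels."""
    return speech_db_spl - noise_db_spl


def condition_score(list_scores):
    """Average the words-correct scores of the 2 lists given per condition."""
    if len(list_scores) != 2:
        raise ValueError("each condition uses exactly 2 lists")
    for score in list_scores:
        if not 0 <= score <= MAX_LIST_SCORE:
            raise ValueError("list score outside the 0 to 102 range")
    return sum(list_scores) / len(list_scores)


# Babble at 60 and 65 dB SPL against speech at 70 dB SPL.
print(snr_db(SPEECH_LEVEL_DB_SPL, 60))
print(snr_db(SPEECH_LEVEL_DB_SPL, 65))
# Hypothetical list scores for one patient in one condition.
print(condition_score([90, 80]))
```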

Before testing, patients were questioned about their implant equipment, and a sound field audiogram was obtained. If necessary, malfunctioning components were replaced and/or the speech processor program was adjusted before testing. Nucleus 22 patients were asked to set the speech processor to “Normal” or “N” with the sensitivity dial at the patient's most comfortable listening level. Nucleus 24 patients used their most frequently used speech processor program with the sensitivity set at “8” and the volume adjusted to the most comfortable level. Patients were not allowed to activate the autosensitivity setting. Clarion patients also used their most frequently used program with the sensitivity set at “10:00” on the dial and the volume set to the most comfortable listening level. None of the patients had any other medical issues that might have affected speech comprehension.

Descriptive statistics, including mean, SD, and median values, were obtained for age, duration of hearing loss, and duration of cochlear implant use (see Table 1). Two-sided t tests were used to compare scores obtained in quiet, at +10-dB signal-to-noise ratio (SNR), and at +5-dB SNR. The correlation coefficient was calculated for test score and age at implantation, years of hearing loss, and duration of CI use. Statistical significance was set at P < 0.05. Data were analyzed using Primer of Biostatistics.

Fig 1

Patient score distribution in quiet, City University of New York Sentence test (n = 96).

RESULTS

Performance in quiet on the CUNY Sentences averaged 82.1. With increasing noise, performance decreased to a mean of 73.04 at SNR +10 dB and fell further to 47.36 at SNR +5 dB (Table 2). The differences in performance at quiet versus SNR +10 dB (P = 0.012) and versus SNR +5 dB (P < 0.001) were both statistically significant. In addition, performance at SNR +10 dB was better than that at SNR +5 dB (P < 0.001).
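These pairwise comparisons can be checked approximately from the summary statistics in Table 2. The following is a minimal sketch, assuming an unpaired, pooled-variance, two-sided t test. The text does not state whether a paired or unpaired test was used, and a paired analysis would require per-patient scores that are not reported. The function names are illustrative, and the P value is obtained by numerically integrating the t density rather than with the software used in the study.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided P value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3          # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Unpaired pooled-variance t test from means, SDs, and group sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return t, df, two_sided_p(t, df)


# Mean and SD per condition from Table 2 (n = 96 in each condition).
quiet, snr10, snr5 = (82.1, 23.05), (73.04, 26.19), (47.36, 27.76)
for label, a, b in [("quiet vs +10", quiet, snr10),
                    ("quiet vs +5", quiet, snr5),
                    ("+10 vs +5", snr10, snr5)]:
    t, df, p = pooled_t_from_summary(a[0], a[1], 96, b[0], b[1], 96)
    print(f"{label}: t = {t:.2f}, df = {df}, P = {p:.4f}")
```

Under these assumptions the quiet versus SNR +10 dB comparison yields a P value close to the reported 0.012, and both comparisons involving SNR +5 dB fall well below 0.001.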

Linear regression analyses revealed no significant correlation between age at cochlear implantation and test score or between time using the cochlear implant and test score. There was, however, a weak negative correlation between years of profound hearing loss and score in all 3 test conditions (Table 3). For example, the mean performance of patients with profound sensorineural hearing loss of <1 year (n = 17) was 91.12 in quiet, 83.59 in SNR +10 dB, and 55.18 in SNR +5 dB, whereas the mean performance of patients with profound sensorineural hearing loss of ≥14 years (n = 13) was 60.62 in quiet, 62.40 in SNR +10 dB, and 21.15 in SNR +5 dB.
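The significance of a correlation coefficient follows from its magnitude and the sample size through the standard t transformation. As a worked check, assuming the values in Table 3 are Pearson coefficients computed on all 96 patients (the text does not name the coefficient), the quiet-condition value for years of profound hearing loss, R = −0.300, gives:

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  = \frac{-0.300\,\sqrt{94}}{\sqrt{1-0.300^{2}}}
  \approx \frac{-2.909}{0.954}
  \approx -3.05,
  \qquad \mathrm{df} = n - 2 = 94
```

A |t| of about 3.05 on 94 degrees of freedom corresponds to a two-sided P of roughly 0.003, in line with the value reported for that condition. The same relation shows why coefficients of similar size to those for age at implantation and time using the implant do not reach significance at this sample size.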

Figs 1 to 3 demonstrate how the score distribution changes with the different levels of background noise. Even though there is a reduction in mean scores with increased noise, 9 (9%) of 96 patients were able to score better than 81 at SNR +5 dB (Fig 3).

Because of the disproportionate number of patients using SPEAK (n = 66), compared with CIS (n = 25) and SAS (n = 5), the speech processing strategy was not analyzed as to its performance in noise. As more implanted patients begin using CIS, SAS, and other newer strategies, they will be compared with the patients using SPEAK. This analysis of speech processing strategies and performance in noise will be the topic of future studies.

DISCUSSION

As shown by the performance scores in quiet, most patients do very well with their cochlear implant under this optimum listening condition. Two thirds of the patients scored ≥ 80 on the CUNY Sentences, and only 3 patients scored ≤20. With added background noise, the scores decreased, but >50% scored 80 or better at SNR +10 dB, and 9% scored at least 80 at SNR +5 dB. These results in quiet and in noise are similar to recent results described by others. 5,6,11

Fig 2

Patient score distribution in signal-to-noise ratio of +10 dB, City University of New York Sentence test (n = 96).

Performance, especially in noise, has shown progressive improvement as the speech processing strategies have advanced. In 1984, McCabe et al 18 found that patients performed no better than chance (ie, 25% correct) on the 4-Choice Spondee test when tested at SNR +10 dB. With a newer speech processing strategy (ie, F0F1F2), performance increased to 90% to 95% on the 4-Choice test at SNR +10 dB and 59% to 66% at SNR +5 dB. 2,19 Further studies have shown multipeak (MPEAK) strategy to be better than F0F1F2, and SPEAK to be better than MPEAK, especially in noisy conditions. 5,6,14

Unlike MPEAK and earlier strategies, SPEAK does not rely on extraction of speech features; rather, it uses a spectral analysis of the signal. The signal is digitally processed through a bank of 20 filters. The output is scanned, and the corresponding electrodes, which represent the frequency range for each filter bank, are chosen for stimulation in a pulsatile, sequential fashion. Another advanced strategy, CIS, was compared with SPEAK and found to be less susceptible to noise with certain test conditions. 6 CIS also does not rely on formant estimation; rather, the acoustic signal is processed by filter banks with the filtered waveform envelope signal represented by a specific channel, which provides rapid, sequential stimulation of up to 8 channels. These newer processors are faster, have better software and expanded memory, and provide redundant acoustic information, which may explain their increased performance. 20

As advances in speech coding strategies have resulted in better understanding in noise, so have several preprocessing algorithms. 3,4,8 By enhancing the SNR, these programs clarify the incoming speech signal and reduce the amount of nonspeech sound that reaches the filters. A gain of 4 to 6 dB in SNR can be seen with noise suppression programs. 3,8 Also, addressing specific limitations inherent to cochlear implants may improve patients' performance. These include reduced spectral resolution, small dynamic range, abnormal loudness growth, information- and redundancy-reducing signal processing, and monaural input. 8–10

Fig 3

Patient score distribution in signal-to-noise ratio of +5 dB, City University of New York Sentence test (n = 96).

Table 3.

Linear regression analysis results for test score in quiet, in SNR +10 dB, and in SNR +5 dB and age at implantation, time using cochlear implant, and years of profound hearing loss for City University of New York Sentence test (n = 96)

Variable	Quiet	SNR +10 dB	SNR +5 dB
Age at implantation (y)	R = −0.062	R = −0.043	R = 0.129
	P = 0.551	P = 0.675	P = 0.209
Time using cochlear implant (y)	R = 0.033	R = 0.029	R = 0.036
	P = 0.751	P = 0.780	P = 0.744
Years of profound hearing loss	R = −0.300	R = −0.322	R = −0.327
	P = 0.003*	P = 0.001*	P = 0.001*
SNR, Signal-to-noise ratio.

*Statistically significant.

The clinical parameters associated with better performance in noise were analyzed. Age at implantation and time spent using the implant were not predictive of performance. A negative correlation, however, was found between the length of profound hearing loss and performance, which has been described by others. 13 The longer a patient goes with profound hearing loss, the worse the performance becomes. This is not a very strong correlation but does give support to the clinical impression that patients do better if they are implanted sooner rather than later. Perhaps changes in anatomy and/or physiology, which occur over time in the absence of acoustical stimulation, may adversely affect a patient's ability to use the CI. Further investigation is needed to clarify the relationship of length of hearing loss to cochlear implant performance.

In examining performance in the 3 test conditions, the distribution of scores and median score give additional information about the overall performance within each trial. The variability in individual performance within each group was demonstrated. Despite lower mean scores with SNR +5 dB, some patients scored better than the mean score in quiet. Conversely, in quiet, some patients scored very low. The parameters that contribute to the extreme variability between individual adult CI users need further evaluation. Patient performance using newer strategies such as ACE and PPS also should be assessed.

CONCLUSION

Significant decreases in performance occur as increasing levels of background noise are added. Because patients routinely encounter background noise in real life, future implant systems will need to address this susceptibility. A weak negative correlation between length of profound hearing loss and performance was also found.

The authors thank Xianxi Ge, MD, for statistical analysis of the data and Rhonda Foley for help with manuscript preparation.

References

1. Schindler. Personal reflections on cochlear implants. Ann Otol Rhinol Laryngol 1999(suppl 177);108:4–7.
2. Dowell, Seligman, Blamey, et al. Speech perception using a two-formant 22-electrode cochlear prosthesis in quiet and in noise. Acta Otolaryngol (Stockh) 1987;104:439–46.
3. Hochberg, Boothroyd, Weiss, et al. Effects of noise and noise suppression on speech perception by cochlear implant users. Ear Hear 1992;13:263–71.
4. Weiss. Effects of noise and noise reduction processing on the operation of the Nucleus-22 cochlear implant processor. J Rehabil Res Dev 1993;30:117–28.
5. Muller-Deile, Schmidt, Rudert. Effects of noise on speech discrimination in cochlear implant patients. Ann Otol Rhinol Laryngol 1995;166:303–6.
6. Kiefer, Muller, Pfenningdorff, et al. Speech understanding in quiet and in noise with the CIS speech-coding strategy (MED EL Combi-40) compared to the MPEAK and SPEAK strategies (Nucleus). Adv Otorhinolaryngol 1997;52:286–90.
7. Hochmair-Desoyer, Schulz, Moser, et al. The HSM sentence as a tool for evaluating the speech understanding in noise of cochlear implant users. Am J Otol 1997;18:S83.
8. Hamacher, Doering, Mauer, et al. Evaluation of noise reduction systems for cochlear implant users in different acoustic environment. Am J Otol 1997;18:S46–S49.
9. Shannon, Wang. Effects of noise and spectral resolution on vowel and consonant recognition: acoustic and electric hearing. J Acoust Soc Am 1998;104:3586–96.
10. Zeng F-G, Galvin III. Amplitude mapping and phoneme recognition in cochlear implant listeners. Ear Hear 1999;20:60–74.
11. Armstrong, Pegg, James, et al. Speech perception in noise with implant and hearing aid. Am J Otol 1997;18:S140–1.
12. McKay, Vandali, McDermott, et al. Speech processing for multichannel cochlear implants: variations of the spectral maxima sound processor strategy. Acta Otolaryngol (Stockh) 1994;114:52–8.
13. Battmer, Reid, Lenarz. Performance in quiet and in noise with the Nucleus Spectra 22 and the Clarion CIS/CA cochlear implant devices. Scand Audiol 1997;26:240–6.
14. Liu, Huang W-H, et al. Evaluation of coding strategies under noisy environment by stimulating electrodes. Adv Otorhinolaryngol 1997;52:100–2.
15. Nucleus Technical Reference Manual. Cochlear Corporation, 1996, pp 11–13.
16. Tyler (ed): Cochlear Implants: Audiological Foundations. Singular Publishing, 1993.
17. Clarion Device Fitting Manual, pp 13–17.
18. McCabe, Tyler, Gantz, et al. Preliminary assessment of the Los Angeles, Vienna and Melbourne cochlear implants. Acta Otolaryngol (Stockh) 1984;411(suppl):247–53.
19. Franz BK-HG, Dowell, Clark, et al. Recent developments with the Nucleus 22-electrode cochlear implant: a new two formant speech coding strategy and its performance in background noise. Am J Otol 1987;8:516–8.
20. Battmer, Feldmeier, Kohlenberg, et al. Performance of the new Clarion speech processor 1.2 in quiet and in noise. Am J Otol 1997;18:146.