Abstract
Objectives
This study aims to investigate the effect of lexical tone and sex on nasalance in Cantonese, a lexical tone language, and to identify both statistical and clinical implications.
Design
Forty Cantonese-speaking adults were recruited; 72 stimulus words based on 12 syllables across 3 syllable types (CV, CVV, CVC) and 6 lexical tones were constructed. Each stimulus was repeated 6 times in each of 3 trials, resulting in 1296 productions, presented in randomized order, for each participant. Acoustic data were annotated and mean nasalance was extracted using the NasometerTM software.
Results
A 3-way mixed analysis of variance with sex as the between-subjects factor and syllable type and tone as the within-subjects factors showed a significant 3-way interaction effect, F(5.491,647.953) = 6.759, P < .001, partial η2 = .054, a significant 2-way interaction effect of syllable type × tone, F(5.491,647.953) = 57.524, P < .001, partial η2 = .328, and significant main effects, for example, sex, F(1,118) = 12.078, P = .001, partial η2 = .093. Females had higher nasalance than males across all syllable types. Tone 1 words and CVC syllable type yielded the highest nasalance values.
Conclusions
Study findings show an effect of lexical tone and syllable type on nasalance for males and females. The theory of transpalatal acoustic transmission is used to explain and discuss study findings. Clinical implications pertain to the potential need for separate normative databases for males and females and control of tone 1 words and syllable type (CVC) with final syllable nasal, in nasalance speech sampling material and in cross-linguistic comparisons.
Background
Velopharyngeal insufficiency (VPI) persists in up to 20% of individuals with Cleft Palate ± Lip (CP ± L) even after primary repair of the palate. 1 There is a range of possible speech outcomes of VPI including hypernasality, nasal airflow errors, grimace, and passive cleft speech characteristics, for example, weak oral pressure consonants.2–5 Hypernasality or excessive nasal resonance is considered the “hallmark speech characteristic” of cleft palate speech and VPI 6 (pp111) and is typically heard on non-nasal sonorants such as vowels and glides.7,8 Accurate assessment of resonance is therefore obligatory in the decision-making process for surgical and/or physical intervention, and as an outcome measurement of intervention. Although perceptual evaluation of resonance remains at the core of speech methodology in CP ± L, its validity can be affected by low reliability and variation in raters’ judgments due to variation in measurement scales, 9 the type of speech sample sets used, 7 and listener experience and training. 10 There is also increasing evidence that ordinal-type scales which are commonly used in resonance ratings, may not be the most valid form of measurement. 11 The use of an instrumental measure of nasality is thus seen as a necessary clinical (and research) tool. Data from instrumental assessment can complement perceptual judgments by confirming and/or supplementing perceptual evaluations, particularly in borderline cases or in the context of a less experienced Speech and Language Therapist (SLT).9,10
A standard instrumental evaluation of nasality is the NasometerTM (Pentax Medical). The system consists of an external module and a handset or headset with a metal separator which separates the nasal from the oral microphones. The NasometerTM provides an objective and quantitative acoustic measure of nasality, termed as Nasalance, according to the formula, ratio of nasal to nasal plus oral acoustic energy multiplied by 100. 12 Good relationships have been identified between perceptual ratings of nasality and nasalance scores particularly when speech samples and phonetic environment are similar across the 2 measurement methods, further supporting the clinical utility of the NasometerTM. 9 The system is used in many cleft centers/services globally, in view of both its clinical and research utility, and has been extended to other clinical groups such as hearing impairment, 13 motor-speech disorders, 14 and postmaxillectomy. 15 In an online survey of SLTs and surgeons of the American Cleft Palate-Craniofacial Association and Division 5 of the American Speech-Language-Hearing Association, Kummer 16 reported that almost 30% of participants who responded to the survey, included the NasometerTM as part of their pre- and postoperative assessment. In Hong Kong, the NasometerTM is available in several hospitals within the Hospital Authority including the recently launched Hong Kong Children's Hospital, as well as at Universities with Speech Therapy (ST) training programs.
The clinical and research utility of nasalance, however, is dependent on the availability of appropriate and accurate norms. There are several factors that can affect nasalance scores, including language and/or dialect,17,18 sex,19,20 and age21–23 and phonetic content of the speech stimuli.24,25 In terms of language for instance, a nasalance mean of 25% is reported for Mandarin 24 versus a high 38% for Turkish, for a mixed oral-nasal speech sample set. 26 In terms of age, a recent systematic review by Pereira and colleagues 27 reported moderate to large effect sizes between adults and children, and adults/older adults and adolescents for oral texts/passages. Bettens and colleagues 28 found sex differences, where females had higher nasalance scores than males, but only for nasal texts/passages. The phonetic environment of speech sample sets must be carefully constructed and considered as high vowels, and nasal consonants are known to inflate nasalance scores. 15
The effect of lexical tone on nasalance, however, has not been well studied as NasometerTM studies have focused on Indo-European languages which do not have tone in their sound systems. Yip 29 (pp1) describes a language as a tone language if “the pitch of the word can change the meaning of the word” as with languages such as Mandarin, Cantonese and Vietnamese. According to Lee and Mok, 30 almost half (42%) of the 526 languages studied by Maddieson 31 are tone languages, particularly languages used in Africa, East Asia, and Southeast Asia, underscoring the importance of studying the effect of tone on nasalance. For instance, Cantonese, a lexical tone language, is estimated to be spoken by over 85 million people worldwide, although it is primarily used in Hong Kong, Macau and the southern Chinese provinces of Guangdong and Guangxi. 32 Cantonese is regarded as monomorphosyllabic, where each written character corresponds to a single syllable with a lexical tone. 33 Lexical tone in Cantonese relates to the use of pitch that changes the meaning of the word (or syllable). 29 There are 6 contrastive lexical tones in Cantonese: T1 high-level, T2 high-rising, T3 mid-level, T4 low-falling, T5 low-rising, and T6 low-level. 34 There are also 3 allotones or “entering tones”, T7 high-stopped, T8 mid-stopped, and T9 low-stopped, which occur only in syllables with syllable-final unreleased stops [-p, -t, -k]. 35 In Cantonese, the typical syllable can take 2 forms: an open syllable (vowel, V or consonant + vowel, CV) as in /tɔ1/ 多 “many,” or a closed syllable (VC, CVC) as in /ɐm3/ 暗 “dark,” and /pak3/ 伯 “uncle.” Only the plosive consonants /p, t, k/ and nasals can occur in word- or syllable-final position.
The primary acoustic cue of a lexical tone is the fundamental frequency (F0) contour with each lexical tone having its own F0 pattern, 36 where F0 of a speech signal typically relates to the perceptual correlate of pitch. Very early on, Fletcher and colleagues developed the TONAR system, which provided the objective measure of nasalance, incorporating both F0 and the vocal intensity of a speaker. 37 More recently, Blanton et al. alluded to the impact of palate repair on F0 and the resultant nasalance. 38
To date, there are only 3 studies that have investigated the effect of tone on nasalance.25,39 Nguyen and colleagues 25 investigated the effect of tone on nasalance in Vietnamese with a Southern dialect in 75 adolescents and adults ranging from 17 to 33 years of age. Speech stimuli included consonant-vowel (CV) syllables across 6 Vietnamese tones. The authors 25 reported a significant main effect of tone on nasalance in Vietnamese speakers, where stimuli with high tones (i.e., tone 1 and tone 5) and mid tones (i.e., tone 3 and tone 4) had higher nasalance scores than stimuli with low tones (i.e., tone 2 and tone 6). Sex as a factor did not influence nasalance. A limitation of the study by Nguyen et al 25 is the inclusion of only 3 sets of CV stimuli. An earlier study by Cheung 39 similarly investigated the effect of lexical tone on nasalance in Cantonese-speaking adults but in males only. The author used 4 sets of monosyllabic words: /wɐi/, /si/, /sɵy/, and /fu/. Cheung's study 39 found that the effect of tone was statistically significant only for the stimuli /wɐi/, and that nasalance for tone 3 (a mid-level tone) was significantly higher than for tone 4 (a falling tone). A limitation of the study by Cheung 39 is the small sample size of 12 participants, and the inclusion of only male participants. The small sample size could have reduced the power of the study, resulting in nonsignificant results. Additionally, a limited range of stimuli was used, and not all Cantonese vowels were included. More recently, She et al 40 reported a significant main effect of Cantonese lexical tone on nasalance, although this was based on only 10 female adults. The small sample size could have also resulted in a nonsignificant interaction effect between tone and syllable.
The evidence for the effect of lexical tone on nasalance thus remains limited and ill-defined, indicating the need for further research. This is particularly crucial with the increasing trend of cross-linguistic nasalance studies where the recommended speech sample sets are the 9-word string41,42 and potentially, syllables.43,44 Further research is needed to evaluate the effect of tone on nasalance and to explore if sex is an influencing factor.
The primary aim of the study is, therefore, to evaluate whether lexical tone (and syllable type) affects nasalance, and whether sex is an influencing factor, using Cantonese, a lexical tone language.
Materials and Methods
Participants
This study was approved by the Joint CUHK-NTEC Clinical Research Ethics Committee (2023.485). All participants provided written informed consent prior to enrolment in the study. This research was conducted ethically in accordance with the World Medical Association Declaration of Helsinki. A priori sample size calculations were undertaken based on She et al.'s pilot work on 10 female adults. 40 To obtain at least a moderate to large effect size between 2 independent groups with alpha set at 0.05, and power at 0.8 (assuming equal numbers in each group), the recommended sample size per subgroup ranges from 9 to 17. Forty healthy adults (20 males and 20 females) were recruited. Their overall age ranged from 18:1 to 59:11 (Males, M = 28.43, SD = 7.8; Females, M = 30.75, SD = 6.68). Participant recruitment was advertised through several modes: (i) convenience and referral sampling methods which included ST students from the MSc in Speech-Language Pathology, students from the Professional Diploma in Communication Sciences and Disorders program, as well as staff of the Division of Speech Therapy, the Chinese University of Hong Kong (CUHK), (ii) the University's digest mail which is sent to all registered students and staff of CUHK, and (iii) the Division's Facebook page. Inclusion criteria were as follows: (i) native Cantonese-speakers with Cantonese as home/main language, (ii) born and raised in Hong Kong, (iii) no known or reported speech sound disorder, (iv) no resonance and/or velopharyngeal function disorders, (v) no language/learning difficulties that can impact performance, (vi) no past or current hearing difficulties, and (vii) no ear, nose, and/or throat-related issues such as nasal obstruction, nasal allergies, history of tonsillectomy and/or adenoidectomy, or a current cold. As many of these factors are plausible confounders and can impact nasalance scores, a detailed screening procedure was undertaken prior to actual intake into the study.
This was undertaken by the first and fourth authors, under the supervision of either the third author (a qualified ST with clinical and research experience in cleft palate speech/ VPI) and/or the last author (an ST with more than 20 years in cleft palate speech/ VPI). The screening procedure was as follows: (i) questionnaire, (ii) pure-tone audiometry, (iii) oral and nasal cavity examination (including the Mirror Fogging test), and (iv) perceptual assessment of speech and velopharyngeal function. The questionnaire was divided into 4 sections containing items relevant to the predefined inclusion criteria and potential confounding variables that could have an impact on nasalance scores. Sample items are shown in Table 1. Pure tone audiometry was undertaken using the Audiometer (GSI 18 model); 1000, 2000, and 4000 Hz at 25 dB and 500 Hz at 30 dB on both ears were tested. Participants had to pass all levels to meet the inclusion criteria. The oral and nasal cavity examination included evaluation of presence/absence of tonsils and an evaluation of the soft palate for any missed submucous cleft of the palate. 45 The Mirror Fogging test, which has been shown to have high sensitivity (0.95), specificity (0.95), and positive predictive value (0.97), 46 was administered as part of the oral and nasal cavity examination to rule out inaudible nasal air emission during speech which could indicate velopharyngeal dysfunction. Screening of speech was undertaken using the Cantonese-Cleft Speech Assessment Tool (C-CSAT). 47 The tool was constructed and developed in accordance with international guidelines recommended by CLeft palate International SPeech Issues (www.clispi.com) and published frameworks.7,48 The tool has since been disseminated to over 200 SLTs in Hong Kong and Macau as well as over 100 graduating student STs across the 4 training programs in Hong Kong. 
Perceptual speech assessment using the C-CSAT encompasses evaluation of speech articulation (or consonant production), resonance, nasal airflow errors, grimace, and voice. 47 The screening procedures were undertaken by a second-year student from the MSc in Speech-Language Pathology program (a 2-year program), who had received training in Speech Sound Disorders and Craniofacial Anomalies, and an experienced ST. Of the 48 participants who enrolled in the study, 40 passed the screening procedures and met all the inclusion criteria. The remaining 8 who did not meet all the inclusion criteria (e.g., not born in Hong Kong, failed the pure tone test) were informed of the screening results, and further data collection was terminated.
Table 1. Sample Items in the Questionnaire (Screening Procedure).
Cantonese Test Stimuli
The microphone speech corpus, CUSENTTM (Continuous Cantonese Sentences, Version 1.0), was accessed to identify base syllables that occur in all 6 lexical tones. A base syllable refers to the tone-independent syllable. Part of Cantonese Spoken Language Data Resources, 49 CUSENTTM was designed to be phonetically rich, consisting of 5100 distinct sentences and 4000 unique characters. From this corpus, an initial 17 base syllables were identified. The criteria for including base syllables in this study were as follows: (i) the base syllables should have real words across all 6 tones, (ii) the real words should be high-frequency words in modern Cantonese, and (iii) the tested stimuli should be phonetically comprehensive. A final 12 base syllables (CV, CVV, CVC) with words across all tones were identified: CV, /sɛ/, /si/, /ji/, /fu/; CVV, /jiu/, /wɐi/, /sɵy/, /jɐu/; and CVC, /jɐn/, /fɐn/, /hɔn/, /sœŋ/. This resulted in a total of 72 test stimuli, which are shown in Appendix A. No CV syllables with low vowels were found to have real words across all tones; hence, low vowels were not included in the CV stimuli. There are also no real words across all tones with initial plosives (e.g., /p/) or affricates (e.g., /ts/). As such, the test stimuli included only fricatives (e.g., /s/) and approximants (e.g., /w/) as initial consonants. As syllables with lexical tones usually end with an open vowel or nasal consonant [-m, -n, -ŋ], Cantonese words that contain plosives in word-final position, which are categorized as allotones, 50 were excluded in this study. After the 72 test stimuli were identified, the order of the stimuli was randomized for presentation to each participant. Fifty sets of test stimuli were generated from a self-written program by the research associate of the Division with an engineering and linguistic academic background. The program was written in Microsoft Excel Visual Basic for Applications.
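The stimulus construction and randomization described above can be sketched as follows. This is a minimal illustration in Python, not the Excel program used in the study; the function names, the syllable-plus-tone labels, and the fixed seed are assumptions made only so that the example is reproducible.

```python
import random

# Base syllables as listed in the text, grouped by syllable type.
BASE_SYLLABLES = {
    "CV": ["sɛ", "si", "ji", "fu"],
    "CVV": ["jiu", "wɐi", "sɵy", "jɐu"],
    "CVC": ["jɐn", "fɐn", "hɔn", "sœŋ"],
}
TONES = range(1, 7)  # the 6 contrastive lexical tones, T1 to T6


def build_stimuli():
    """Cross every base syllable with every tone: 12 x 6 = 72 stimuli."""
    return [
        f"{syllable}{tone}"
        for syllables in BASE_SYLLABLES.values()
        for syllable in syllables
        for tone in TONES
    ]


def randomized_sets(n_sets, seed=0):
    """Return n_sets independently shuffled copies of the stimulus list.

    The seed is a hypothetical choice for reproducibility only.
    """
    rng = random.Random(seed)
    stimuli = build_stimuli()
    return [rng.sample(stimuli, k=len(stimuli)) for _ in range(n_sets)]


sets = randomized_sets(50)
print(len(sets), len(sets[0]))
```

Each generated set contains the same 72 items in a different order, which is the property the randomization is meant to guarantee.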
Procedures
The NasometerTM 6500 was used for all nasalance data collection. The model was used with the accompanying handlebars and separator plate with attached oral and nasal microphones and a password-protected research laptop (Lenovo Thinkpad L480). No calibration was required for the NasometerTM 6500 model according to manufacturer guidelines. 51 Participants were seated on a comfortable chair facing the laptop (and the NasometerTM Contour window), which was placed on a computer desk. Positioning of the metal separator plate with silicone tubing was undertaken following strict protocol and manufacturer guidelines (e.g., the angle of the metal separator plate had to be perpendicular, at a 90-degree angle, to the frontal plane of the participant's face). All participants held the handlebars independently throughout the nasalance data collection process. The metal separator plate, plastic tubing, and handlebars were disinfected prior to each participant's data collection using rubbing alcohol (isopropyl). For each participant, new plastic tubing was used.
Participants were instructed to repeat each of the 72 test stimuli 6 times, resulting in 432 productions per trial. A total of 3 trials were conducted, resulting in 1296 productions for the whole process. Each test stimulus was repeated at a rate of 1 repetition per second. Participants took 30 to 40 minutes to complete the recording. Since the maximum recording duration of the NasometerTM is 100 s, a recording file was saved in the default .NSP format after every 12 test stimuli. If a participant made any tone errors, the researcher would model the correct tone, and the participant had to repeat the block of 12 test stimuli. As test–retest reliability is typically reported to be stable in both children and adults, 34 no reliability study was undertaken.
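The production counts and the block size of 12 stimuli follow from simple arithmetic, sketched below. The block duration assumes exactly 1 repetition per second with no pauses between stimuli, which is an idealization rather than a reported measurement.

```python
N_STIMULI = 72          # 12 base syllables x 6 tones
REPETITIONS = 6         # repetitions of each stimulus within a trial
N_TRIALS = 3
BLOCK_SIZE = 12         # stimuli saved per recording file
SECONDS_PER_REP = 1.0   # stated rate: 1 repetition per second
MAX_RECORDING_S = 100   # stated maximum recording duration

per_trial = N_STIMULI * REPETITIONS            # productions in one trial
total = per_trial * N_TRIALS                   # productions per participant
block_duration_s = BLOCK_SIZE * REPETITIONS * SECONDS_PER_REP
files_per_trial = N_STIMULI // BLOCK_SIZE      # recording files per trial

print(per_trial, total, block_duration_s, files_per_trial)
print(block_duration_s <= MAX_RECORDING_S)
```

Under these assumptions a block of 12 stimuli takes about 72 s, which fits within the 100 s recording limit.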
Acoustic Analyses
Annotation of the acoustic data and extraction of mean nasalance scores were undertaken for stimuli and tone, and for each participant. Nasalance scores for any identified segment of data were obtained by using the “analyze” function which the software calculates according to the formula of ratio of nasal to nasal plus oral acoustic energy multiplied by 100. 12 As the NasometerTM provides an objective and well-defined outcome measure, 15 there is no risk of detection bias.
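The nasalance formula stated above can be written as a short function. This sketch only illustrates the ratio; it is not the NasometerTM software's implementation, and the function name and the energy inputs (any consistent unit) are assumptions for the example.

```python
def nasalance_percent(nasal_energy, oral_energy):
    """Nasalance = nasal / (nasal + oral) x 100.

    Both inputs are non-negative acoustic energy values in the same unit.
    """
    if nasal_energy < 0 or oral_energy < 0:
        raise ValueError("energies must be non-negative")
    total = nasal_energy + oral_energy
    if total == 0:
        raise ValueError("total acoustic energy must be greater than zero")
    return 100.0 * nasal_energy / total


# Equal nasal and oral energy gives a nasalance of 50%.
print(nasalance_percent(1.0, 1.0))
```

Because the score is a ratio, it is bounded between 0% (no nasal energy) and 100% (no oral energy).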
Statistical Analyses
All statistical tests were undertaken using SPSS (IBM® SPSS® Statistics 27). Descriptive statistics including the mean nasalance and standard deviation were extracted for each tone and syllable type for males and females separately and combined. The assumption of normality was explored visually using Q-Q plots and tested using the Shapiro-Wilk Test. Typically, if the assumption of normality is violated, corresponding nonparametric tests are undertaken. However, for the current study and objectives, the parametric test, analysis of variance (ANOVA), was used as each cell consisted of 240 datapoints (20 participants × 3 trials × 4 stimuli for each syllable type). Additionally, data were on an interval/ratio scale. 52 Homogeneity of variance was tested using Levene's Test, with a P value of less than .05 taken to indicate that the assumption was violated. As group sample sizes were equal, the 3-way mixed ANOVA was run regardless, as it is robust to heterogeneity of variance in such circumstances. For the within-subject factors, the assumption of sphericity was tested using Mauchly's Test of Sphericity. Where sphericity was violated (i.e., P < .05), the Greenhouse-Geisser correction was applied. A 3-way mixed ANOVA was performed to analyze the effect on nasalance of tone (6 levels, tones 1-6) and syllable type (3 levels, CV, CVV, CVC) as the within-subject factors and sex (male, female) as the between-subjects factor. Simple 2-way interactions were to be examined if the 3-way interaction was statistically significant, followed by simple main effects if the simple 2-way interactions were statistically significant. Pairwise comparisons were run using a Bonferroni adjustment to help interpret the simple simple main effects.
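The Greenhouse-Geisser correction mentioned above rescales both degrees of freedom of a within-subjects F test by an estimated epsilon. The sketch below illustrates this for the syllable type × tone interaction. The epsilon is not reported in the text; it is back-calculated here from the corrected value of 5.491 reported in the Results, and the between-subjects error df of 118 is taken from the reported sex main effect, so both are assumptions of this example.

```python
def gg_corrected_df(df_effect, df_error, epsilon):
    """Multiply both degrees of freedom by the Greenhouse-Geisser epsilon."""
    return epsilon * df_effect, epsilon * df_error


n_syllable_types, n_tones = 3, 6
df_effect = (n_syllable_types - 1) * (n_tones - 1)   # uncorrected: 2 x 5 = 10
df_between_error = 118                               # assumed from the reported F for sex
df_error = df_effect * df_between_error              # uncorrected: 1180

# Assumption: epsilon recovered from the reported corrected effect df.
epsilon = 5.491 / df_effect

corrected_effect, corrected_error = gg_corrected_df(df_effect, df_error, epsilon)
print(round(corrected_effect, 3), round(corrected_error, 3))
```

The corrected error df obtained this way agrees with the reported 647.953 up to the rounding of epsilon.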
Results
Mean nasalance scores for each syllable type and tone for males and females are shown in Table 2. Mean nasalance was consistently highest for CVC syllable type, ranging from a combined nasalance score of 44.3% to 57.37%, in contrast to CV (19.5%-23.47%) and CVV (18.13%-21.85%) across all 6 tones. For each syllable type, nasalance was highest for tone 1 at 23.47% for CV, 21.85% for CVV and 57.37% for CVC, and lowest for tone 4 at 19.5% for CV, 16.74% for CVV and 44.3% for CVC. This profile was similar for both males and females (see Table 2). Nasalance scores were consistently higher for females than for males for each syllable type and tone, with the largest range of differences occurring for CVC syllable type (2.1%-7.69%), although the patterns were fairly similar (Figure 1).

Figure 1. Mean Nasalance of Each Lexical Tone According to Syllable Type and Sex.
Table 2. Mean Nasalance and Mean Difference According to Syllable Type and Tone for Males, Females, and Combined.
The assumption of homogeneity of variances was violated, as assessed by Levene's test for equality of variances. Mauchly's test of sphericity indicated that the assumption of sphericity was not met, χ2(54) = 337.251, P ≤ .001. As such, the Greenhouse-Geisser correction was applied. A 3-way mixed ANOVA with sex as the between-subjects factor and syllable type (CV, CVV, CVC) and tone (1-6) as the within-subject factors revealed a statistically significant 3-way interaction between sex, syllable type and tone, F(5.491,647.953) = 6.759, P < .001, partial η2 = .054. Given the presence of a statistically significant 3-way interaction, simple 2-way interaction analyses were subsequently conducted. For completeness, the main effects and 2-way interaction effects are also reported prior to the presentation of the simple 2-way interaction results. Significant main effects were found for sex, F(1,118) = 12.078, P = .001, partial η2 = .093, syllable type, F(1.305,153.978) = 2266.961, P < .001, partial η2 = .951, and tone, F(2.740,323.286) = 90.612, P < .001, partial η2 = .434. Among the 2-way interactions, the effects of syllable type × sex and tone × sex were not significant, whereas the interaction of syllable type × tone reached significance, F(5.491,647.953) = 57.524, P < .001, partial η2 = .328. Next, the simple 2-way interactions stratified by sex were conducted to clarify the nature of the 3-way interaction. Mauchly's test of sphericity indicated that the assumption of sphericity was not met for males, χ2(54) = 191.411, P ≤ .001, or for females, χ2(54) = 230.439, P ≤ .001, so the Greenhouse-Geisser correction was applied. There was a statistically significant simple 2-way interaction between syllable type and tone for males, F(5.402,318.721) = 51.281, P < .001, η2 = .465, and for females, F(4.826,284.732) = 14.638, P < .001, partial η2 = .199. Subsequent simple main effects were run.
The assumption of sphericity was violated in all cases and the Greenhouse-Geisser correction was applied. There was a statistically significant simple main effect of tone for CV syllable type for males, F(3.114,183.747) = 8.977, P < .001, η2 = .132, and for females, F(2.032,119.899) = 7.687, P < .001, η2 = .115; for CVV syllable type for males, F(2.659,149.623) = 15.329, P < .001, η2 = .206, and for females, F(2.592,152.952) = 19.928, P < .001, η2 = .252; and for CVC syllable type for males, F(2.542,150.006) = 117.766, P < .001, η2 = .666, and for females, F(2.604,153.616) = 64.810, P < .001, η2 = .523.
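As a consistency check on the effect sizes reported above, partial η2 can be recovered from an F value and its degrees of freedom using the standard identity partial η2 = (F × df effect) / (F × df effect + df error). The sketch below applies this identity to the omnibus effects, taking the sex main effect as F(1, 118); the function name is an assumption for the example and this is not SPSS output.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from F and its degrees of freedom."""
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


# (label, F, df effect, df error) as reported for the omnibus 3-way mixed ANOVA.
reported = [
    ("sex", 12.078, 1, 118),
    ("syllable type", 2266.961, 1.305, 153.978),
    ("tone", 90.612, 2.740, 323.286),
    ("syllable type x tone", 57.524, 5.491, 647.953),
    ("sex x syllable type x tone", 6.759, 5.491, 647.953),
]

for label, f_value, df_effect, df_error in reported:
    print(label, round(partial_eta_squared(f_value, df_effect, df_error), 3))
```

The recomputed values round to the reported .093, .951, .434, .328, and .054, respectively.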
Simple pairwise comparisons were run between each pair of tones within each syllable type for males and females. Bonferroni adjustments were applied. Data are reported as the mean ± standard deviation unless otherwise stated (Table 3). For CV syllable type, for males, mean differences between tone 1 and tones 3, 4, 5, and 6, tone 2 and tone 4, and tone 4 and tone 5, were statistically significant, for example, CV, males, tone 1 and tone 4, 2.603 (95% CI, 0.689-4.517), P = .002. For females, there were statistically significant mean differences between tone 1 and tones 4 and 5, and tone 2 and tone 5, for example, CV, females, tone 1 and tone 4, 4.038 (95% CI, 0.520-7.555), P = .013. For CVV syllable type, for males, the mean differences between tone 1 and all other tones, tone 2 and tones 4 and 5, and tone 3 and tones 4 and 5 were statistically significant, for example, CVV, males, tone 1 and tone 4, 4.350 (95% CI, 1.791-6.909), P ≤ .001. For females, there were also significant mean differences between tone 1 and all other tones, tone 2 and tones 4 and 5, and tone 3 and tones 4 and 5, as well as between tone 5 and tone 6, for example, CVV, females, tone 1 and tone 4, 5.879 (95% CI, 2.960-8.798), P ≤ .001. For CVC syllable type, all mean differences between tones except for that between tone 2 and tone 6, were statistically significant for males, for example, CVC, males, tone 1 and tone 4, 15.787 (95% CI, 12.695-18.880), P ≤ .001. For females, however, there were statistically significant differences only between tone 1 and all other tones, tone 2 and tone 4, tone 3 and tones 4 and 5, tone 4 and tones 5 and 6, and tone 5 and tone 6, for example, CVC, females, 10.342 (95% CI, 7.513-13.170), P ≤ .001. Figure 2 shows the significant pairwise comparisons for each syllable type and for males and females.
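For orientation, the number of tone pairs compared within each syllable type and sex, and the effect of a Bonferroni adjustment, can be sketched as follows. The family size of 15 (all pairs of 6 tones within one syllable type and sex) is an assumption of this example; the text does not state how the comparison family was defined.

```python
from itertools import combinations

TONES = [1, 2, 3, 4, 5, 6]
pairs = list(combinations(TONES, 2))   # every unordered pair of tones

ALPHA = 0.05
adjusted_alpha = ALPHA / len(pairs)    # per-comparison threshold


def bonferroni_adjust(p_value, n_comparisons):
    """Multiply the raw P value by the family size, capped at 1."""
    return min(1.0, p_value * n_comparisons)


print(len(pairs), round(adjusted_alpha, 5))
print(bonferroni_adjust(0.001, len(pairs)))
```

Comparing an adjusted P value with .05 is equivalent to comparing the raw P value with the per-comparison threshold, so either form yields the same decision.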

Figure 2. Significant Pairwise Tone Comparisons According to Syllable Type and Sex. + = CV syllable type; * = CVV syllable type; o = CVC syllable type.
Table 3. Pairwise Comparisons and Mean Difference for Each Syllable Type, for Males and Females.
Md. = mean difference based on estimated marginal means.
The mean difference is significant at the .05 level.
Discussion
Nasalance is known to be affected by various factors such as language,17,53 sex,54,55 age, 26 and phonetic content of speech stimuli.15,25 To date, however, the effect of lexical tone on nasalance remains ill-defined. The current study investigated the effect of lexical tone and sex on nasalance in native Cantonese speakers. The test syllables were selected from a local word database which captures the phonetic comprehensiveness of Cantonese. Twenty males and 20 females were recruited, each of whom read aloud 1296 productions. All participants received a randomized set of printed stimuli for the study. The current study found a significant 3-way interaction between lexical tone, syllable type, and sex, a significant simple 2-way interaction between syllable type and tone for males and females, and a significant simple main effect of tone for each of the 3 syllable types. The results indicated that lexical tone influences nasalance in Cantonese and that this effect of tone is potentially modulated by syllable type, for both males and females. For all syllable types (CV, CVV, CVC), nasalance was consistently highest for tone 1 and lowest for tone 4 for both males and females. Females also showed consistently higher nasalance values for each syllable type and tone compared to males, albeit with similar patterns.
The primary characteristic of Cantonese lexical tone is the subtle and rapid change of fundamental frequency (F0), 56 where F0 refers to the number of times per second the vocal folds vibrate during sound production and gives rise to pitch alternation. 29 Mandulak and Zajac 57 observed that alterations of F0 affected nasalance in adult English speakers aged between 18 and 55 years. In their study, female speakers (N = 20) were required to reduce F0 to a target level for vowel productions, while male speakers (N = 20) were asked to increase F0. Higher nasalance during production of vowel /a/ was observed in male speakers, suggesting that males were more likely to increase oral acoustic impedance instead of nasal acoustic energy transfer. The findings of F0 alterations observed in English speakers may be similar to our study findings, although a much quicker rate of change of F0 was seen with lexical tones. This lexical tone effect may be explained by the relationship between F0 and the phenomenon of transpalatal acoustic transmission/nasalance.58,59 Both Gildersleeve-Neumann and Dalston 59 and Bundy and Zajac 58 offer explanations for why overall nasalance cannot be zero, which would also explain the phenomenon of transpalatal nasalance. Three factors have been proposed: (i) the limited acoustic separation of the separator plate enabling a flow of oral acoustic energy across to the nasal microphone (KayPentax), (ii) vibration of the anatomical structures of the velum that transfer acoustic energy to the nasal cavity, 16 and/or (iii) oral transfer of acoustic energy across the lips and/or tongue. 58
The phenomenon of transpalatal nasalance is further viewed as the detection of nasal acoustic energy when sufficient velopharyngeal closure is achieved. Without direct oral-nasal coupling, nasal resonance is created due to acoustic energy generated from the vibration of the soft palate. 58 Mandulak and Zajac 57 explained that the palatopharyngeus and/or palatoglossus muscles are actively involved in pitch elevation (i.e., high lexical tone production), while acting antagonistically to the levator veli palatini muscle responsible for elevating and partially tensing the soft palate. The increased stiffness of the soft palate from the muscle contraction of the palatopharyngeus and/or palatoglossus may contribute to a change in the natural resonant frequency of the soft palate, which ranges from 350 to 750 Hz. 58 More recently, Blanton and colleagues alluded to the impact of structural or anatomical changes made to the velum (post palate repair) on F0 and nasalance. 38 The resonant frequency range of the soft palate is similar to the bandpass filter of the NasometerTM, which captures acoustic energy between the 350 and 650 Hz region of the speech spectrum. 15 Bundy and Zajac 58 indicated that F0 has a strong influence on F1-derived nasalance, meaning that when a speaker's F0 reaches the lower limit (350 Hz) of the bandpass filter range, more acoustic energy will pass the nasal channel relative to the oral one. They further suggest that 80% of the nasal energy in nasalance scores was primarily determined by this transpalatal transfer.
In Cantonese, lexical tone 4 has been shown to have the lowest F0 among the 6 lexical tones, while tone 1 has the highest.56,60 An early study on young adolescent males and females (12;10-14;02) reported that the F0 of tone 4 lies between 170 and 230 Hz. 56 In a separate study focusing on tone merging in late adolescence, Mok and colleagues 60 found tone 4 to be characterized by the range 130 to 150 Hz, although their data consisted primarily of female participants. The study by Maggu and colleagues, 61 which included only male participants, found tone 4 to be characterized by the range 85 to 99 Hz. A more recent study by Wong and Chan, 62 who also focused on late adolescent males and females, reported a range of 173 to 269 Hz for females and a lower range of 75 to 162 Hz for males. These results imply that the F0 in tone 4 has little influence on F1-derived nasalance, and therefore, does not really contribute to transpalatal nasalance. On the other hand, the F0 of tone 1 has been reported as 240 to 270 Hz, 56 240 to 250 Hz, 60 or 135 to 146 Hz. 61 The range of tone 1 is therefore closer to the lower end of the bandpass filter of the NasometerTM, implying a larger influence on F1-derived nasalance and subsequent higher nasalance scores. This was clearly reflected in our study findings where nasalance for tone 1 words was always the highest, while nasalance for tone 4 words was the lowest. Similar findings on smaller sample sizes were reported by Cheung 39 and She et al 40 where nasalance for tone 4 words was significantly lower than for tone 3 words, and nasalance for tone 1 was significantly higher than for tones 2, 4 and 5. 44
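The comparison of tone 1 and tone 4 against the lower edge of the bandpass filter can be made concrete with the F0 ranges quoted above. The sketch below only measures the distance from the top of each reported F0 range to 350 Hz; it is a descriptive aid under that simplification and does not model harmonic energy or transpalatal transfer. The dictionary keys are labels for the cited studies, not data from this study.

```python
BAND_LOWER_HZ = 350  # lower limit of the NasometerTM bandpass filter, as stated

# F0 ranges (Hz) for tone 1 and tone 4 as quoted in the text for each cited study.
F0_RANGES_HZ = {
    "ref 56": {"tone 1": (240, 270), "tone 4": (170, 230)},
    "ref 60": {"tone 1": (240, 250), "tone 4": (130, 150)},
    "ref 61": {"tone 1": (135, 146), "tone 4": (85, 99)},
}


def gap_to_band(f0_range):
    """Distance in Hz from the top of an F0 range to the band's lower limit."""
    return BAND_LOWER_HZ - f0_range[1]


for study, ranges in F0_RANGES_HZ.items():
    print(study, gap_to_band(ranges["tone 1"]), gap_to_band(ranges["tone 4"]))
```

In each of the three cited data sets, tone 1 lies closer to the lower band limit than tone 4, which is the relationship the argument relies on.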
The type of syllable had a significant effect on nasalance for males and females. Nasalance was highest for syllable type CVC, with a mean of 52.53% across all 6 lexical tones, compared with means of 21.07% and 18.65% for CV and CVV, respectively. This is perhaps unsurprising, as the final consonant is a nasal in all 4 CVC stimuli used (/fɐn/, /jɐn/, /hɔn/, /sœŋ/), which contributes to elevated nasalance scores. Similar findings were reported by She and colleagues. 40 In their study, only 4 different stimuli were used: /fu/, /sɵy/, /fɐn/, and /jim/, where /y/ is a high front vowel in the Cantonese sound system. Nasalance values for the CVC stimuli ranged from 56.10% to 69.6% for /fɐn/ and from 54.2% to 68.5% for /jim/, in contrast with a range of 6.6% to 13.1% for the CV stimulus /fu/ and a range of 6.1% to 12.4% for the CVV stimulus /sɵy/. As in the current study, the final consonants of both CVC stimuli were nasals (/n/ and /m/). Speech stimuli containing nasal consonants are typically reported as having much higher nasalance values than oral sentences or passages, as shown by Pereira and colleagues 27 in their systematic review of the effect of age and gender on nasalance across the lifespan.
Tone pairwise comparisons indicated that the effect of lexical tone varied among syllable types. Almost all CVC tone pairwise comparisons were significant for both males and females (13 for males, 11 for females). In contrast, only 6 pairs for males and 3 pairs for females were significant for the CV syllable type, and 9 pairs for males and 10 pairs for females for the CVV syllable type, highlighting the significant interaction between syllable type and lexical tone on nasalance. In the current study, for both males and females, significant pairwise comparisons were found for tones 1 and 4 and for tones 1 and 5 in all 3 syllable types. Unfortunately, no comparisons can be made with earlier published studies. She and colleagues 40 found no significant interaction effect of syllable and tone, although this may be due to the small sample size used in their study, and Cheung 39 did not examine the effect of syllable type and lexical tone on nasalance.
A very consistent pattern was found in our results pertaining to the effect of sex on nasalance. Females consistently had higher nasalance scores than males for all 6 lexical tones and across all syllable types. Higher nasalance in females has been observed in other studies and languages.54,55,63 The recent review by Pereira and colleagues 27 identified that such differences tended to be observed in adolescents and adults and not in children. This is largely attributable to males having larger vertical vocal tract lengths and posterior cavity lengths in the pubertal and postpubertal ages. 64 Furthermore, the length and thickness of the velum increase during adolescence, particularly in males, which can impede nasal airflow and result in lower nasalance scores. 65 An added finding is that the mean difference in nasalance between males and females spanned a wider range of values (as high as 7.69%) for the CVC syllable type (containing a final nasal consonant) than for CV and CVV syllables. This may be explained by anticipatory nasal assimilation. Thompson and Hixon 66 found that females had higher nasalance scores than males when the stimuli contained a high concentration of nasal consonants, a difference they attributed to this effect. Additionally, Mandulak and Zajac 57 found that males, but not females, were more likely to increase oral acoustic impedance rather than nasal acoustic energy transfer, resulting in lower nasalance.
Clinical Implications
The significant effect of lexical tone on nasalance has clinical implications. The higher mean nasalance for tone 1, in particular, requires careful consideration of the proportion of tone 1 words in any speech sample. Too many tone 1 words may inflate nasalance scores, undermining the utility of the measure for clinical purposes, both as a supplement to diagnosis and as an outcome of intervention. The same principle applies to sample sets for perceptual ratings of hypernasality. Careful consideration must therefore be given when constructing speech sample sets for both perceptual ratings and NasometerTM evaluation. The potential effects are reversed for tone 4 words, with their low mean nasalance. Consideration of the proportion of tone 1 and tone 4 words may be seen to parallel the consideration given to the proportion of high vowels and nasal consonants in the construction of speech sample sets for resonance ratings.66,67
Careful consideration must also be given to syllable type, particularly syllables with a CVC phonotactic structure. This is a common structure in many languages, although the type of consonant permitted in the coda (syllable-final) position depends on the language. In Cantonese, only 6 consonants can occur in the coda position: the stops /p, t, k/ and the nasals /m, n, ŋ/. The use of a nasal consonant in the coda position can inflate nasalance values and should perhaps be limited or avoided.
Another implication relates to cross-linguistic comparisons of speech outcomes. Given that the cleft population is small, national and international collaboration and comparisons of speech and surgical outcomes are increasingly the norm. 41 Within the Scandcleft randomized controlled trial, 41 a single word list, known as the Restricted Word List, was developed. There are strict guidelines for the construction of this list, from which the 9-word string is extracted (https://clispi.com/). The first 9 words of the Restricted Word List should contain only high vowels and be placed at the start of the word list (https://clispi.com/). The 9-word string has been recommended for use in cross-linguistic hypernasality ratings. 40 When lexical tone languages are included in cross-linguistic cleft speech studies, consideration must therefore be given to the tones of the 9 words used. As an illustration, for Cantonese speakers, the currently constructed 9-word string extracted from the single word list used in the C-CSAT 47 contains 3 tone 1 words and 1 tone 4 word, which could potentially inflate ratings (Figure 1).
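One way to monitor tone composition when constructing a word list is to tally the tones of its items. The sketch below is an illustration only. The function name is hypothetical, and the example list encodes just the counts stated above for the 9-word string (3 tone 1 words and 1 tone 4 word); the remaining 5 items are left as None because their tones are not given here.

```python
from collections import Counter


def tone_proportions(tones):
    """Return the share of items carrying each tone label in a word list.

    `tones` holds one entry per word: an int from 1 to 6 for a known
    lexical tone, or None where the tone is not specified.
    """
    counts = Counter(tones)
    total = len(tones)
    return {tone: count / total for tone, count in counts.items()}


# Counts taken from the text: 3 tone 1 words and 1 tone 4 word out of 9.
# The other 5 entries are None because their tones are not stated here,
# and the ordering of the entries is arbitrary.
nine_word_string = [1, 1, 1, 4, None, None, None, None, None]

shares = tone_proportions(nine_word_string)
print(f"tone 1: {shares[1]:.1%}, tone 4: {shares[4]:.1%}")
```

On these counts, one third of the string carries tone 1. The same tally could be run on any candidate list before it is adopted, alongside the existing checks on high vowels and nasal consonants.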
A third implication relates to whether separate norms are needed for males and females. Study findings consistently show that females have higher nasalance values for all tones and all syllable types, with mean differences ranging from a small 1.90% to a large 7.69%. Using the same normative data to assess hypernasality ratings or nasalance may inflate scores for females. Given that the NasometerTM is used globally as a supplement to diagnosis and as an outcome measure of intervention, it is imperative that further research is undertaken to determine clearly whether separate normative databases for males and females are indeed necessary.
Limitations and Future Directions
The study investigated the effect of tone on nasalance in single words only, albeit with the inclusion of the various possible phonotactic structures in the Cantonese sound system. Repeating each stimulus across the 6 tones is not a normal phenomenon and does not reflect meaningful connected speech. Some participants found it difficult to produce the same tone 6 consecutive times. Self-calibration of tone production was observed in some participants after around 500 repetitions. Future studies are recommended to investigate the effect of lexical tone at the carrier phrase level (e.g., This is), which more closely resembles connected speech.
The participants in this study were limited to adults. There is strong evidence of changes in vocal tract anatomy and physiology across the lifespan.68,69 Future studies should therefore investigate the effect of tone on nasalance in children and perhaps in older adults as defined by the World Health Organization (i.e., ≥60 years). Pereira and colleagues 27 undertook a systematic review of the nasalance literature and, using effect sizes, documented a clinically significant effect of age on nasalance, particularly between adults and children and between adults and adolescents.
Although detailed screening procedures were undertaken to rule out potential confounding factors affecting nasalance values, the study did not assess velopharyngeal closure during speech using direct visualization methods (e.g., nasendoscopy). It is plausible that, for some individuals, instrumental investigations may reveal incomplete velopharyngeal closure in the absence of perceptual symptoms of velopharyngeal dysfunction. Although nasendoscopy is deemed minimally invasive by the Royal College of SLTs, 70 it may not be easy to obtain ethical approval to use such instrumentation, which may also require local anesthetic, with normal participants.
Conclusion
Nasalance is influenced by lexical tone, syllable type, and sex. This study provides important evidence for the effect of these 3 factors on nasalance in a lexical tone language, Cantonese. Study findings have strong clinical implications for the construction of speech sample sets for use with the NasometerTM and for perceptual ratings of resonance. In cross-linguistic studies that include lexical tone languages in particular, careful consideration should be given to the effect of tone on nasalance scores and hypernasality ratings.
Footnotes
Authors’ Note
Ethical Approval and Informed Consent Statements: This study was approved by the Joint Chinese University of Hong Kong-New Territories East Cluster Clinical Research Ethics Committee (approval no. 2023.485). Data Availability Statement: The datasets generated during and/or analyzed during the current study are not publicly available due to participants’ privacy but are available from the corresponding author on reasonable request.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
