Establishing Reliability and Validity of the FACE-Q Craniofacial Module for Pediatric Head and Neck Cancer

Abstract

Purpose:

We aimed to establish content validity and assess the psychometric properties of the FACE-Q Craniofacial Module, a patient-reported outcome measure, for use in pediatric and adolescent patients with head and neck cancer (HNC).

Methods:

To establish content validity (Part 1), between June 2017 and August 2019, cognitive interviews were conducted with survivors of pediatric HNC (n = 15), and input was obtained from clinical experts (n = 21). To examine item and scale performance (Part 2), Rasch Measurement Theory (RMT) analysis was performed using data from two international studies (n = 121).

Results:

Part 1: Qualitative data from 15 survivors and input from 21 experts provided evidence to support the use of the FACE-Q Craniofacial Module in pediatric HNC. Part 2: The field-test study sample included 121 survivors of pediatric HNC. RMT analysis provided evidence of reliability and validity for 10 FACE-Q scales. Data for each scale fit the RMT model. Scale reliability was high, with Person Separation Index and Cronbach's alpha values ≥0.82 for 9 scales. Mean scores on the Appearance, Psychological, and Social scales were higher for those who liked aspects of their face more. For participants with (vs. without) a facial difference, mean scores were lower for the Face, Jaws, Psychological, and Social scales.

Conclusion:

The FACE-Q Craniofacial Module evidenced reliability and validity for HNC survivors aged 8–29 years and can be used in research and clinical care to measure quality of life of pediatric survivors with HNC.

Introduction

Head and neck cancers (HNC), including Hodgkin lymphoma, rhabdomyosarcoma, thyroid carcinoma, and nasopharyngeal carcinoma, are estimated to account for between 0.25% and 15% of cancers diagnosed within the pediatric and adolescent population.¹ HNC and its treatment may substantially impair patients' health-related quality of life (HRQL) and leave patients with a visible facial difference and problems with facial function (e.g., ability to show facial expression or to eat and drink).^2,3 Survivors of HNC often require rehabilitative treatment and reconstructive surgery to recover speech, swallowing, and maxillofacial function and to restore facial appearance.⁴ Tools are needed to measure outcomes from the patient perspective, capturing patients' experience and how they feel the condition affects their HRQL and function.⁵ For a patient-reported outcome measure (PROM) to be useful, its validity and reliability must be ascertained within the target population.^6,7

The FACE-Q Craniofacial Module is a PROM developed for children and young adults aged 8–29 years with facial differences.⁸ This module was developed because existing PROMs used in this population were reported to lack content validity, missing content related to appearance and facial function.^9,10 The FACE-Q Craniofacial Module was internationally field-tested in a sample of 2233 children and young adults, including patients with HNC, from 12 countries.^11–13 Although the FACE-Q Craniofacial Module may be valuable for assessing outcomes after diagnosis and treatment of HNC, the qualitative phase of its development included only a few patients with HNC. The COnsensus-based Standards for the selection of health Measurement Instruments (COSMIN) criteria, which assess the quality of PROM development, require that at least seven individuals from the target population review the PROM to achieve the highest quality rating for a content validation study.¹⁴ The development study did not meet this threshold for HNC participants.^12,13

Furthermore, the analysis did not report separately on the validity and reliability of the FACE-Q Craniofacial Module among the HNC subgroup. Presenting psychometric properties for the HNC subgroup will help provide more direct evidence for the use of FACE-Q in this target population.¹⁵ Therefore, the aim of this article was to establish content validity and assess the psychometric properties of the FACE-Q Craniofacial Module, a PROM measuring outcomes in conditions associated with facial differences, for use in pediatric and adolescent patients with HNC. We hypothesized that scales from the FACE-Q Craniofacial Module would evidence sufficient content validity in pediatric and adolescent survivors of HNC.

Materials and Methods

This study had two parts. In Part 1, we examined the content validity of the FACE-Q Craniofacial Module scales in survivors of pediatric HNC and with clinical experts. In Part 2, we performed an exploratory psychometric analysis using data from an international field-test study sample. All aspects of the study were conducted according to COSMIN user's manual criteria.^14,15

This study was approved by the Hamilton Integrated Research Ethics Board No. 14-763 and by the Ethics Board at each participating site.

Part 1: Establishing content validation

Participants

Inclusion criteria

English-speaking, diagnosed with an HNC before 18 years of age, a visible and/or functional facial difference caused by HNC, and at least 8 years of age at time of study.

Exclusion criteria

Cognitive impairment limiting independent participation in an interview.

Recruitment

Recruitment took place between June 2017 and August 2019 at the following sites: McMaster Children's Hospital (Ontario), The Hospital for Sick Children (Ontario), and British Columbia's Children's Hospital (British Columbia). A research team member approached potential participants to explain the study and obtained informed consent from those interested. An interview was scheduled to take place either in person or by telephone (option available only for those at least 12 years of age), depending on participant preference. A $50 gift card was provided to participants after the interview to thank them for their time.

Interviews

Semistructured interviews were conducted by trained interviewers using a cognitive interview guide. The cognitive interview approach used was adapted from Willis.¹⁶ Participants were asked about the relevance, comprehensiveness, and comprehensibility of the instructions, response options, and items. The “think aloud” approach was used whereby participants verbalized their thoughts as they worked through each scale.^17–19 Probing questions were used to better understand problems with item interpretation.^19–23

Twenty-six FACE-Q Craniofacial Module scales/checklists were included for review, with only the Birthmark scale excluded from the original module (see Table 1). All participants (n = 15) were asked to review six broadly applicable scales measuring appearance (Face scale), adverse effects (Face Adverse Effects scale), and HRQL (Appearance Distress, Psychological, School, and Social scales). Additional scales (Table 1) were reviewed if deemed relevant to participants based on their facial difference(s). Interviews were audio recorded, transcribed verbatim, and coded line-by-line by two researchers (Y.W. and E.T.). The first six interviews were double coded. Codes were transferred into a Microsoft Excel (2016) spreadsheet for analysis and reviewed by the research team to determine content validity.

Table 1.

FACE-Q Craniofacial Module Scales

Scales	N items in each scale (Field-test version)	Participants who reviewed each scale (N)
Facial appearance
Face^a	11	15
Smile	15	6
Eyes	14	5
Forehead	15	5
Cheeks	10	4
Jaws	13	4
Lips	11	4
Teeth	19	3
Head	8	3
Nose	13	3
Chin	12	2
Ears	18	2
Nostrils	6	1
Facial function
Eyes	7	7
Face	12	5
Breathing	7	3
Eating and drinking	13	3
Speech	14	1
HRQL
Appearance distress^a	10	13
Psychological^a	10	13
Social^a	11	12
School^a	10	10
Speech distress	10	0
Adverse effects
Face^a	17	13
Eye	8	7
Ears	12	0

^a Core scales.

HRQL, health-related quality of life.

Expert input

In September 2017, experts provided feedback on the FACE-Q Craniofacial Module via a secure web-based Research Electronic Data Capture (REDCap) survey.^24,25 Invitations were sent via email, with one reminder sent after 7 days. Experts (n = 21) reviewed each scale and were asked to provide feedback on all aspects of each scale (instructions, response options, items), and to indicate items that should be added or removed. A comment section was provided after each scale and at the end of the survey for additional feedback. Data were exported from REDCap into Excel for analysis to identify suggestions for scale improvements.

Part 2: Psychometric properties

Participants

Participant data (n = 121) came from the following two sources:

1. The FACE-Q Craniofacial Module field-test study^12,13: Participants were aged 8–29 years with a visible and/or functional facial difference.^12,13 Participants were recruited in plastic surgery clinics in 12 countries between 2016 and 2019. Participants completed relevant FACE-Q scales and questions. A clinical form was used by recruiters to collect information about the type and severity of their facial condition. The form included a matrix that asked about the severity (i.e., no, yes-minor, yes-major) of impact of the condition on facial appearance (e.g., eyes, forehead, lips) and facial functions (e.g., eating, drinking, speech). Data were entered into a secure REDCap database hosted at McMaster University (Canada).^24,25

2. A study of HNC patients from France, The Netherlands, United Kingdom, and the United States diagnosed before 18 years of age and who were 8–29 years of age at the time of the study. Recruitment took place in oncology outpatient clinics.²⁶ Participants completed the following 10 FACE-Q Craniofacial scales: Face, Nose, Lips, Teeth, Jaws, Speech, Speech Distress, Psychological, Social, and School. Clinical data were obtained from hospital records.

Statistical analysis

An exploratory Rasch Measurement Theory (RMT) analysis was performed in RUMM2030 software (RUMM Laboratory Pty Ltd., Duncraig, Western Australia, 1998–2020) using the unrestricted Rasch model for polytomous data.²⁷ Fit of the data to the Rasch model was examined statistically and graphically. The following tests were performed to assess how well the scales and items performed psychometrically:

Threshold maps were examined to determine if the response categories (e.g., not at all, a little, quite a bit, very much) for each scale worked as intended.

Item fit was examined graphically (item characteristic curves) and statistically [log residuals (item–person interaction) and chi-square values (item–trait interaction)]. Ideal fit residuals should fall between ±2.5, with chi-square values nonsignificant after Bonferroni adjustment.²⁸ We examined individual item fit and overall fit of the data to the Rasch model.

We inspected local independence of items by examining the residual correlation matrix to identify any pairs of items with residuals that correlated ≥0.30. Locally dependent pairs of items were included in a subtest to identify their impact on Person Separation Index (PSI) values.

Finally, we examined scale reliability with PSI and Cronbach's alpha values. Reliability values ≥0.70 were considered adequate.
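The screening criteria listed above can be illustrated with a short Python sketch. This is not the study's analysis, which was run in RUMM2030; the function names, the toy data, and the base significance level of 0.05 are assumptions made for illustration. The sketch shows a Bonferroni-adjusted significance level, the ±2.5 fit-residual check, flagging of item pairs whose residuals correlate ≥0.30, and the standard formula for Cronbach's alpha.

```python
from itertools import combinations
from math import sqrt
from statistics import variance


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni adjustment."""
    return alpha / n_tests


def fit_residual_ok(residual, limit=2.5):
    """True when an item fit residual lies within +/- limit."""
    return -limit <= residual <= limit


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def dependent_pairs(residuals, cutoff=0.30):
    """Flag item pairs whose residuals correlate at or above the cutoff.

    residuals: one row per person, one column per item.
    Returns (item_i, item_j, correlation) tuples.
    """
    columns = list(zip(*residuals))
    flagged = []
    for i, j in combinations(range(len(columns)), 2):
        r = pearson(columns[i], columns[j])
        if r >= cutoff:
            flagged.append((i, j, round(r, 2)))
    return flagged


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of person-by-item scores."""
    k = len(scores[0])
    item_vars = [variance(col) for col in zip(*scores)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data (4 persons x 3 items), not study data.
toy_residuals = [
    [1.0, 0.9, -0.5],
    [-1.0, -1.1, 0.5],
    [0.5, 0.4, 0.6],
    [-0.5, -0.2, -0.6],
]
toy_scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]

print(bonferroni_alpha(0.05, 10))      # threshold for a 10-item scale
print(dependent_pairs(toy_residuals))  # only items 0 and 1 are flagged
print(cronbach_alpha(toy_scores))
```

The Person Separation Index is not reproduced here because it depends on the person location estimates and standard errors produced by the Rasch software.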

SPSS (IBM SPSS Statistics, Version 26; IBM Corp.) was used for further analysis. Each scale was scored from 0 to 100 using FACE-Q Craniofacial Module transformation tables.¹¹ Construct validity was assessed using predefined hypotheses of expected differences. First, we hypothesized that appearance scale scores would be incrementally higher for those who reported liking (not at all/a little, quite a bit, very much) the appearance of their face, and specific facial areas (i.e., jaws, lips, nose, teeth). Second, we predicted that appearance scale scores would be lower for those with a “major or minor” difference compared with those who had “no” difference. Differences were tested using independent t-tests or analysis of variance.
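As a rough illustration of the group comparison, the sketch below computes an independent-samples t statistic from summary statistics. It is a minimal sketch, not the SPSS procedure used in the study: the function names are hypothetical, the inputs are the rounded Face scale values from Appendix Table A1, and the source does not state whether equal variances were assumed, so both the pooled (Student) and Welch variants are shown. Converting t to a p-value requires the t distribution and is omitted.

```python
from math import sqrt


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Student's t statistic (equal variances assumed) and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t statistic (unequal variances) and approximate degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Face scale: (mean, standard deviation, N), rounded values from Appendix Table A1.
no_difference = (61, 19, 39)
with_difference = (51, 15, 51)

t_pooled, df_pooled = pooled_t(*no_difference, *with_difference)
t_welch, df_welch = welch_t(*no_difference, *with_difference)
print(f"pooled: t = {t_pooled:.2f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.2f}, df = {df_welch:.1f}")
```

Because the appendix values are rounded to whole numbers, the statistics recomputed this way are approximate and need not reproduce the reported p-values exactly.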

Results

Part 1: Establishing content validity

Sample characteristics

Fifteen survivors of HNC aged 8–30 years participated in a cognitive interview (Table 2). Ten participants had completed treatment more than 5 years before the interview. The sample included nine females and six males. Most participants had a history of rhabdomyosarcoma (n = 8). Treatments included chemotherapy (n = 11), surgery (n = 8), and radiation (n = 8). Participants provided feedback on 23 of the 26 FACE-Q Craniofacial Module scales. The number of survivors who provided feedback on the core scales was as follows: Face (n = 15), Appearance Distress (n = 13), Psychological (n = 13), School (n = 10), Social (n = 12), and Face Adverse Treatment Effects (n = 13). Table 1 shows the number of participants who reviewed each scale.

Table 2.

Participant Demographics and Clinical Characteristics of the Content Validation Sample

Patient participants
Variable	Sample N = 15
Age in years
8–12	2
13–17	10
18–30	3
Sex
Male	6
Female	9
Diagnosis
Rhabdomyosarcoma	8
Hodgkin Lymphoma	1
Brain tumor	3
Thyroid tumor	1
Soft tissue sarcoma	1
Unspecified	1
Treatment^a
Chemotherapy	11
Radiation	8
Surgery	8
Blood transfusion	1
Unspecified	3
Time since treatment completion
0–3 years ago	0
4–5 years ago	1
More than 5 years ago	10
Unsure	4

^a A single participant may have had one or more forms of treatment.

Fifty experts in the field of oncology were invited to provide feedback and 21 (42%) responded. Experts were from Canada (n = 17), United States (n = 3), and the Netherlands (n = 2). Most had a clinical focus in pediatric oncology (n = 17). Participants included oncologists (n = 9), nurse practitioners (n = 3), psychologists (n = 3), otolaryngologists (n = 2), HNC surgical trainee (n = 1), researchers (n = 2), and one was unspecified. The 21 experts provided feedback on all 26 scales.

Comprehensibility

Instructions and response options for the core scales were thought to be clear and “self-explanatory” (Age 30, Male) by most survivors and experts. All instructions were interpreted as intended. Most participants thought the response options were good with “clear distinct categories” (Age 22, Female). Most items in the core scales were interpreted by participants as intended. A total of six items in the core scales and six items from three appearance scales (Chin, Lips, Smile), and two function items from the Facial and Speech scales were identified as unclear or difficult to understand (see Table 3).

Table 3.

Items in Core Scales and Example Quotes Showing Comprehensibility

Face Appearance Instructions: HOW DOES YOUR FACE LOOK? Answer each question by circling one number. Please answer thinking of how your face looks NOW.
How much do you like how your face looks from the side (your profile)?	It depends which side [13 years, Female]. What does that mean? Which side? [11 years, Male] Depends on which side [30 years, Male]. Not sure whether to answer using preferred side or not preferred side, thought of preferred side reading [the] question [22 years, Female]. Do you need to ask about left and right side in profile question? [Canada, pediatric oncologist]
Face Adverse Effects Instructions: HOW DOES YOUR FACE FEEL? Answer each question by circling one number. Please answer thinking of the PAST WEEK. NOTE: Does one side of your face feel better than the other? If yes, please answer thinking of the side that feels worse.
My face feels firm when I touch it.	Does that mean it's stable? [11 years, Male] I don't know what they would consider a perfectly normal response to that out of the responses they give me to choose from. [30 years, Male] I'm not entirely sure what is meant by that, I guess. It's not relevant to me so I don't know what it means as much. [18 years, Female] What does this mean? [14 years, Female]
School Instructions: HOW IS YOUR SCHOOL LIFE? Answer each question by circling one number. Please answer thinking of the PAST WEEK. NOTE: if you were not in school this week, think about when you were last in school.
I am happy at school.	Is anyone really that happy at school? [18 years, Female]
Social Instructions: HOW IS YOUR SOCIAL LIFE? Answer each question by circling one number. Please answer thinking of the PAST WEEK.
People treat me the same as everyone else.	I think racial wise, maybe. Lots of people are racist, so they might act differently toward one person versus another person. So then I think they always treat me the same, the way they would treat someone else the same. [17 years, Male]
I feel confident when I go out (like to a party).	When I go to a party and I'm with friends, I feel confident when I go out, because I'm thinking of when I'm with my friends. [17 years, Male] It's more than just my friend group and there will be people there that don't really accept it. [16 years, Female]
It's okay when people look at my face.	Looking? Like how long do they look? [15 years, Female] Maybe if you change the question to stare then it would be easier to answer that question [15 years, Female].

Comprehensiveness

Eight concepts were suggested for new items, seven by patient participants and one by an expert. Patient participants suggested adding tongue movement and taste items to the Eating and Drinking scale, ear wax to the Ears Adverse Effects scale, and lip color to the Lips scale. Four experts identified swallowing/dysphagia as an important concept that was missing from the Eating and Drinking scale.

Relevance

The recall periods (now, past week) and response options were deemed appropriate by all patients and experts. Most participants found the Face (11 participants out of 15), Appearance Distress (10 participants out of 13), Psychological (10 participants out of 13), School (8 participants out of 10), and Social (9 participants out of 12) scales to measure concepts relevant to HNC. Participant and expert impressions of the FACE-Q Craniofacial Module were positive. In general comments, one participant recounted that the questionnaire “made [them] think a lot more about all the different struggles that people experience from the same thing” (Age 22, Female), and another reported “[liking] the questionnaire a lot because [she] got to express… how [she feels] about [herself] and someone was actually listening” (Age 14, Female).

Experts noted that the “drawings/illustrations are very helpful” (United States, social worker) and that scales are “…detailed enough to really assess for fine defects in eyelid, lacrimal, and visual function” (United States, pediatric otolaryngologist) (see Table 4).

Table 4.

Items in Core Scales and Example Quotes Showing Relevance

Scale	Relevant	Example quote for relevance
Face appearance	11/15	Measures something important, this is a good indicator of how I feel generally about the shape and look of my facial features and my head [16 years, Female].
Face adverse effects	6/13	Scale is trying to ask about how your treatments have affected your face and how your face feels and how you feel about your face [15 years, Female].
Appearance distress	10/13	Scale is trying to ask if I'm self-conscious about my face. How I feel. How do I feel about how other people feel about my face [14 years, Male].
Psychological	10/13	Scale measures self-worth and your kind of mental quality of life. And how you feel every day. So, I think those are all very important things to measure [16 years, Female].
School	8/10	It shows if I am really enjoying school life, because school from elementary to high school is a very large part of my life. It's 8 hours of the day for most of the year. If I'm not really enjoying school, then the rest of my life and my quality of life would end up suffering greatly [16 years, Female].
Social	9/12	Scale is trying to ask about how people feel around others and their social life [16 years, Female].

Part 2: Quantitative

Psychometric analyses included 121 survivors of pediatric HNC (Table 5). Most participants (66.9%) were 14 years of age or older at the time of recruitment, and the sample comprised 63 male and 58 female participants. Most of the sample came from the Netherlands (33.1%) or the United Kingdom (24.8%).

Table 5.

Participant Demographics and Clinical Characteristics of the 121 Survivors of Head and Neck Cancer from the Field-Test Sample

	N = 121	%
Country
Australia	1	0.8
Canada	20	16.5
The Netherlands	40	33.1
United States	6	5
United Kingdom	30	24.8
France	21	17.4
Brazil	3	2.5
Age in years
8–10	17	14
11–13	23	19
14–17	38	31.4
18–29	43	35.5
Gender
Male	63	52.1
Female	58	47.9
Facial difference
Yes (major/minor)	51	56.7
No	39	43.3

Table 6 shows the scale-level RMT results, including the sample size for each scale. Of the 105 items tested, 102 had ordered thresholds, 104 had fit residuals within the ±2.5 criterion, and all items had nonsignificant chi-square p-values after Bonferroni adjustment. Pairs of items in five scales had residual correlations >0.30. When subtests were performed, the PSI values dropped by a maximum of 0.03 for one scale (Lips scale). The proportion of the sample that scored within the range of each scale spanned from 73.1% for the Lips scale to 96.7% for the Face scale. Data from the sample fit the Rasch model for seven of the scales tested, with marginal misfit in the remaining three scales. Reliability was high, with PSI and Cronbach's alpha values ≥0.82 with and without extremes for nine scales; the exception was the Speech Distress scale, which had PSI values of 0.71 and Cronbach's alpha values ≥0.79 (Table 6).

Table 6.

Rasch Measurement Theory Scale Level Statistics

Scale	Full sample	Sample in Rasch analysis	% scored on scale	Chi-square	DF	p-value	PSI + extremes	PSI – extremes	Cronbach's alpha + extremes	Cronbach's alpha – extremes
Face	121	117	96.7	30.7	18	0.03	0.88	0.87	0.90	0.89
Jaws	88	71	80.7	7.1	14	0.93	0.91	0.88	0.96	0.91
Lips	104	76	73.1	31.1	18	0.03	0.88	0.88	0.95	0.89
Nose	99	77	77.8	22.0	24	0.58	0.93	0.92	0.97	0.94
Teeth	95	90	94.7	31.2	16	0.01	0.88	0.87	0.91	0.89
Psychological	118	106	89.8	17.9	20	0.60	0.90	0.90	0.93	0.91
School	74	64	86.5	17.4	20	0.63	0.82	0.82	0.89	0.86
Social	118	105	88.9	13.4	20	0.86	0.84	0.85	0.90	0.88
Speech distress	94	76	80.9	23.9	20	0.25	0.71	0.71	0.83	0.79
Speech	96	72	75.0	28.6	24	0.24	0.82	0.85	0.92	0.88

DF, degrees of freedom; PSI, Person Separation Index.
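The scale-level summary can be re-derived from Table 6 with a short Python sketch. The reliability thresholds (≥0.70 adequate, ≥0.82 as the level reported for nine scales) come from the text; the 0.05 cutoff used to flag marginal misfit on the scale-level chi-square test is an assumption, as the source does not state it, and the variable and function names are hypothetical.

```python
# (scale, chi-square p-value, PSI with extremes, PSI without extremes,
#  Cronbach's alpha with extremes, Cronbach's alpha without extremes),
# transcribed from Table 6.
TABLE_6 = [
    ("Face", 0.03, 0.88, 0.87, 0.90, 0.89),
    ("Jaws", 0.93, 0.91, 0.88, 0.96, 0.91),
    ("Lips", 0.03, 0.88, 0.88, 0.95, 0.89),
    ("Nose", 0.58, 0.93, 0.92, 0.97, 0.94),
    ("Teeth", 0.01, 0.88, 0.87, 0.91, 0.89),
    ("Psychological", 0.60, 0.90, 0.90, 0.93, 0.91),
    ("School", 0.63, 0.82, 0.82, 0.89, 0.86),
    ("Social", 0.86, 0.84, 0.85, 0.90, 0.88),
    ("Speech distress", 0.25, 0.71, 0.71, 0.83, 0.79),
    ("Speech", 0.24, 0.82, 0.85, 0.92, 0.88),
]

ADEQUATE = 0.70  # adequacy threshold stated in the Methods
HIGH = 0.82      # level reported for nine scales in the Results
P_CUTOFF = 0.05  # assumed cutoff for scale-level misfit (not stated in the source)


def summarize(rows):
    """Return scales with marginal misfit, adequate reliability, and high reliability."""
    misfit = [name for name, p, *_ in rows if p < P_CUTOFF]
    adequate = [name for name, _, *rel in rows if min(rel) >= ADEQUATE]
    high = [name for name, _, *rel in rows if min(rel) >= HIGH]
    return misfit, adequate, high


misfit, adequate, high = summarize(TABLE_6)
print("Marginal misfit:", misfit)
print("Adequate reliability:", len(adequate), "scales")
print("High reliability:", len(high), "scales")
```

Under these assumptions the sketch flags the Face, Lips, and Teeth scales for marginal misfit and identifies Speech Distress as the only scale below 0.82, which agrees with the counts given in the text.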

Descriptive statistics for scale scores in comparison to a cleft lip and palate population²⁹ are provided in Table 7. Most of the construct validation hypotheses were met. Appearance scale scores were incrementally higher for those who reported that they liked their face overall and specific parts of their face more (p ≤ 0.001; Appendix Table A1). Psychological and Social scale scores were also higher for those who reported that they liked their face overall and facial areas more (p ≤ 0.003). An exception to these findings was the Teeth scale (p ≥ 0.07; Appendix Table A1). As predicted, the mean scores for those with a facial difference were lower for the Face (p = 0.009), Jaws (p = 0.025), Psychological (p = 0.022), and Social (p = 0.023) scales, compared with participants without an observable facial difference. No differences were observed between groups for the remaining scales.

Table 7.

Comparison of Descriptive Statistics for FACE-Q Craniofacial Module Scale Scores

	Participants with HNC (n = 121)					Participants with cleft lip and/or palate (previously published data²⁹)
	N	Minimum	Maximum	Mean	Standard deviation	N	Mean	Standard deviation
Face	120	7	100	55	17.5	2402	63	19.6
Jaws	88	0	100	63	24.0	1476	68	26.8
Lips	104	10	100	72	21.8	2213	63	24.9
Nose	99	0	100	69	22.6	2298	59	23.1
Teeth	95	0	91	53	18.5	2312	55	23.7
Psychological	117	15	100	67	18.8	2255	74	18.9
Social	116	22	100	70	17.4	2263	73	17.3
Speech	95	42	100	76	16.9	1869	69	21.3
Speech distress	96	28	100	76	19.7	1890	69	21.3

Discussion

PROM validation is an ongoing process, and it is important to ensure that a PROM is both valid and reliable in target populations, especially when that population may not have been the primary focus of the PROM development process. Since the qualitative phase to develop the FACE-Q Craniofacial Module was not focused on survivors of HNC, our team conducted a supplementary qualitative study to determine if the scales had content validity for this patient population. Our study provides evidence that the FACE-Q Craniofacial Module measures concepts relevant to survivors of pediatric HNC. Furthermore, analysis of the data from 121 participants supported the validity and reliability of the FACE-Q Craniofacial Module for use in survivors of pediatric HNC.

Content validity is important from a clinical perspective because it ensures that the measure is capturing both the intended construct and outcomes important to the patient. Overall, participants found the scales to be both understandable and relevant, as well as covering all aspects that they considered important under the construct. Experts in HNCs also found the scales relevant and comprehensive. Comprehensibility of the scales is not a property for experts to assess in relationship to content validation. Our study exceeded COSMIN guidelines for sample sizes in qualitative content validation studies, with seven or more responses per item for most of the scales.¹⁴

From a psychometric perspective, this study has shown that the 10 FACE-Q Craniofacial Module scales evidenced reliability and validity for children and young adult survivors of HNC. The response categories for the 10 scales work as intended, and item fit statistics indicated a good fit of the data to the Rasch model. The scales had high reliability indicated by PSI values. Cronbach's alpha values provided evidence of the scales' internal consistency, exceeding COSMIN requirements.¹⁵ These findings are important because they show that the items in each scale worked together to measure the intended constructs. The acceptance of the predefined hypotheses showed that the scales were able to detect differences between groups, providing further support that the scales measured what was intended in this population.

The measurement of patient perception of HRQL is crucial to assessing clinical outcomes of rehabilitative treatments and reconstructive surgery, as the primary goal of these treatments is to restore appearance and facial function. A recent study found that adverse effects graded by physicians were weakly correlated to many patient-reported outcomes in survivors of HNC and advised the use of PROMs to help incorporate patient concerns into care plans.²⁶ At least 40 different instruments have been used to measure HRQL of patients with HNC.³⁰ While there is a large number of HRQL instruments used with pediatric HNC patients, many of these PROMs were not developed or validated in samples that include such patients.^9,30,31 Reviews of existing PROMs for patients with HNC and for pediatric patients with facial differences highlight the heterogeneity of measures that have been used, which has impeded comparisons of outcomes across studies.^9,31

Importantly, existing PROMs that are used in patients with facial conditions, including HNC, lack concepts related to appearance and facial function, which are considered important aspects of HRQL in this patient population.^9,30 The FACE-Q Craniofacial Module addresses these identified gaps in available PROMs, and can be used to evaluate outcomes for pediatric HNC.

A limitation of Part 1 of this study was that only participants with five distinct HNC diagnoses were included for content validation. While rarer HNC diagnoses in the pediatric population were not included (e.g., oral cancer), our sample included the most common pediatric HNCs.¹ Not all participants reviewed all core scales due to time constraints. However, the six core scales reviewed met the highest COSMIN rating, which requires ≥7 participants per item. In addition, the current study lacks feedback from other health care professionals involved in the care of pediatric HNC patients, such as speech language pathologists.

Part 2 of this study was performed on a limited sample size, which means that the psychometric findings should be considered exploratory.³² Furthermore, only 10 FACE-Q Craniofacial Module scales were examined. In addition, participants' specific HNC type in the field-test study is unknown. Future research with a large sample of patients is warranted to confirm findings.

Conclusion

The FACE-Q Craniofacial Module addresses an important gap in measurement of appearance and facial function from the patient perspective. This module evidenced content validity and acceptable psychometric properties for patients with HNC, providing support for its use within clinical practice or research for this population. Further information about the FACE-Q Craniofacial Module can be found at https://qportfolio.org/face-q/craniofacial/.

Footnotes

Authors' Contributions

Y.W.: Conceptualization, Methodology, Data Curation, Investigation, Formal Analysis, and Writing-Original draft preparation; C.R.: Data Curation, Formal Analysis, Investigation, and Writing-Original draft preparation; E.T.: Methodology, Investigation, Data Curation, and Writing-Reviewing and Editing; P.C.N.: Investigation and Writing-Reviewing and Editing; E.B.: Investigation and Writing-Reviewing and Editing; D.D.: Investigation and Writing-Reviewing and Editing; K.W.R.: Investigation and Writing-Reviewing and Editing; A.K.: Formal Analysis, Supervision, Funding Acquisition, Methodology, and Writing-Reviewing and Editing.

Disclaimer

The analyses, conclusions, opinions, and statements expressed herein are solely those of the authors and do not reflect those of the funding or data sources; no endorsement is intended or should be inferred.

Author Disclosure Statement

A.K. and K.W.R. are codevelopers of the FACE-Q Craniofacial Module described in this publication and share in any license revenues as royalties based on their institutions' inventor sharing policy for their use in for-profit study. The other authors have no conflict of interest to declare in relationship to this work.

Funding Information

The qualitative portion of this study was supported by the Pediatric Oncology Group of Ontario (POGO) Research Unit. The quantitative portion of this study was supported by the Canadian Institutes of Health Research (FRN #148779).

Appendix

Appendix Table A1.

Comparison of Scores Between Participants With and Without a Facial Difference

Facial difference	N	Mean	Standard deviation	Standard error mean	p
Face
No	39	61	19	3	0.009
Yes	51	51	15	2	0.009
Jaws
No	33	70	26	4	0.025
Yes	44	57	22	3	0.025
Lips
No	38	79	22	4	0.075
Yes	49	70	22	3	0.075
Nose
No	39	73	23	4	0.16
Yes	49	66	22	3	0.16
Teeth
No	39	54	21	3	0.743
Yes	48	52	17	2	0.743
Psychological
No	38	72	16	2	0.022
Yes	50	63	20	3	0.022
Social
No	38	76	17	3	0.023
Yes	49	67	15	2	0.023
Speech
No	37	79	17	3	0.214
Yes	51	75	17	2	0.214
Speech distress
No	37	77	18	3	0.909
Yes	51	77	21	3	0.909

References

Arboleda

, de Mendonça

, Lopez

, et al. Global frequency and distribution of head and neck cancer in pediatrics: A systematic review. Crit Rev Oncol Hematol, 2020; 148:102892; doi: 10.1016/j.critrevonc.2020.102892

Hamilton

, Mahdavi

, Martinez

, et al. A cross-sectional assessment of long-term effects in adolescent and young adult head and neck cancer survivors treated with radiotherapy. J Cancer Surviv, 2022; 16(5):1117–1126; doi: 10.1007/s11764-021-01103-w

Häu ßler

, Stromberger

, Olze

, et al. Head and neck rhabdomyosarcoma in children: A 20-year retrospective study at a tertiary referral center. J Cancer Res Clin Oncol, 2018; 144(2):371–379; doi: 10.1007/s00432-017-2544-x

Eades

, Chasen

, Bhargava

. Rehabilitation: Long-term physical and functional changes following treatment. Semin Oncol Nurs, 2009; 25(3):222–230; doi: 10.1016/j.soncn.2009.05.006

Krogsgaard

, Brodersen

, Christensen

, et al. What is a PROM and why do we need it?. Scand J Med Sci Sports, 2021; 31(5):967–971; doi: 10.1111/sms.13892

Prinsen

, Mokkink

, Bouter

, et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res, 2018; 27(5):1147–1157; doi: 10.1007/s11136-018-1798-3

7. Terwee, Prinsen, Chiarotto, et al. COSMIN methodology for evaluating the content validity of patient-reported outcome measures: A Delphi study. Qual Life Res, 2018; 27(5):1159–1170; doi: 10.1007/s11136-018-1829-0

8. Longmire, Wong Riff KWY, O'Hara, et al. Development of a new module of the FACE-Q for children and young adults with diverse conditions associated with visible and/or functional facial differences. Facial Plast Surg, 2017; 33(5):499–508; doi: 10.1055/s-0037-1606361

9. Wickert, Wong Riff, Mansour, et al. Content validity of patient-reported outcome instruments used with pediatric patients with facial differences: A systematic review. Cleft Palate Craniofac J, 2018; 55(7):989–998; doi: 10.1597/16-148

10. Tapia, Epstein, Tolmach, et al. Health-related quality-of-life instruments for pediatric patients with diverse facial deformities: A systematic literature review. Plast Reconstr Surg, 2016; 138(1):175–187; doi: 10.1097/PRS.0000000000002285

11. Q Portfolio. FACE-Q Craniofacial: A User's Guide for Researchers and Clinicians, 2022. Available from: https://qportfolio.org/wp-content/uploads/2022/04/FACE-Q-CRANIOFACIAL-USERS-GUIDE-1.pdf [Last accessed: July 25, 2023].

12. Klassen, Rae, Riff KWW, et al. FACE-Q Craniofacial Module: Part 1 validation of CLEFT-Q scales for use in children and young adults with facial conditions. J Plast Reconstr Aesthet Surg, 2021; 74(9):2319–2329; doi: 10.1016/j.bjps.2021.05.040

13. Klassen, Rae, Riff, et al. FACE-Q craniofacial module: Part 2 Psychometric properties of newly developed scales for children and young adults with facial conditions. J Plast Reconstr Aesthet Surg, 2021; 74(9):2330–2340; doi: 10.1016/j.bjps.2021.03.009

14. Terwee, Prinsen, Chiarotto, et al. COSMIN Methodology for Assessing the Content Validity of PROMs–User Manual. Amsterdam: VU University Medical Center. 2018. Available from: https://www.cosmin.nl/wp-content/uploads/COSMIN-methodology-for-content-validity-user-manual-v1.pdf [Last accessed: July 25, 2023].

15. Mokkink, Terwee, Patrick, et al. COSMIN Checklist Manual. Amsterdam: University Medical Center. 2012 Jan. Available from: https://faculty.ksu.edu.sa/sites/default/files/cosmin_checklist_manual_v9.pdf [Last accessed: July 25, 2023].

16. Willis. Cognitive interviewing in practice: Think-aloud, verbal probing, and other techniques. In: Cognitive Interviewing: A Tool for Improving Questionnaire Design [Internet]. Thousand Oaks, CA; 2005; pp. 42–65.

17. Collins. Pretesting survey instruments: An overview of cognitive methods. Qual Life Res, 2003; 12(3):229–238; doi: 10.1023/a:1023254226592

18. Van Someren, Barnard, Sandberg JAC. The Think Aloud Method: A Practical Approach to Modelling Cognitive Processes. London: Academic Press; 1994.

19. Willis GB. Cognitive Interviewing: A Tool for Improving Questionnaire Design. Sage Publications: Thousand Oaks, CA; 2004.

20. Garcia. Cognitive interviews to test and refine questionnaires. Public Health Nurs, 2011; 28(5):444–450; doi: 10.1111/j.1525-1446.2010.00938.x

21. Madans, Miller, Maitland, et al. Cognitive interviewing. In: Question Evaluation Methods: Contributing to the Science of Data Quality. (Madans J, Miller K, Maitland A, Willis G, eds.) John Wiley & Sons: Hoboken, NJ; 2011; pp. 51–75.

22. Willis GB. Analysis of the Cognitive Interview in Questionnaire Design. Oxford University Press: New York, NY; 2015.

23. Willis, Artino Jr. What do our respondents think we're asking? Using cognitive interviewing to improve medical education surveys. J Grad Med Educ, 2013; 5(3):353–356; doi: 10.4300/JGME-D-13-00154.1

24. Harris, Taylor, Minor, et al. The REDCap consortium: Building an international community of software platform partners. J Biomed Inform, 2019; 95:103208; doi: 10.1016/j.jbi.2019.103208

25. Harris, Taylor, Thielke, et al. Research electronic data capture (REDCap)—A metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform, 2009; 42(2):377–381; doi: 10.1016/j.jbi.2008.08.010

26. Morfouace, Hol, Schoot, et al. Patient-reported outcomes in childhood head and neck rhabdomyosarcoma survivors and their relation to physician-graded adverse events—A multicenter study using the FACE-Q Craniofacial module. Cancer Med, 2023; 12(4):4739–4750; doi: 10.1002/cam4.5252

27. Hobart, Cano. Improving the evaluation of therapeutic interventions in multiple sclerosis: The role of new psychometric methods. Health Technol Assess, 2009; 13(12):iii, ix–x, 1–177; doi: 10.3310/hta13120

28. Gagnier, Lai, Mokkink, et al. COSMIN reporting guideline for studies on measurement properties of patient-reported outcome measures. Qual Life Res, 2021; 30(8):2197–2218; doi: 10.1007/s11136-021-02822-4

29. Klassen, Riff, Longmire, et al. Psychometric findings and normative values for the CLEFT-Q based on 2434 children and young adult patients with cleft lip and/or palate from 12 countries. CMAJ, 2018; 190(15):E455–E462; doi: 10.1503/cmaj.170289

30. Anthony, Selkirk, Sung, et al. Considering quality of life for children with cancer: A systematic review of patient-reported outcome measures and the development of a conceptual model. Qual Life Res, 2014; 23(3):771–789; doi: 10.1007/s11136-013-0482-x

31. Ojo, Genden, Teng, et al. A systematic review of head and neck cancer quality of life assessment instruments. Oral Oncol, 2012; 48(10):923–937; doi: 10.1016/j.oraloncology.2012.03.025

32. Chen W-H, Lenderking, Jin, et al. Is Rasch model analysis applicable in small sample size pilot studies for assessing item characteristics? An example using PROMIS pain behavior item bank data. Qual Life Res, 2014; 23(2):485–493; doi: 10.1007/s11136-013-0487-5