Abstract
Introduction
Gender Dysphoria (GD), as defined by the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), describes a marked incongruence between one’s experienced or expressed gender and the gender one was assigned at birth. 1 The World Professional Association for Transgender Health (WPATH) estimates that GD affects approximately 1 in 30 000 male-assigned births and 1 in 100 000 female-assigned births. 2 However, the number of individuals seeking treatment for GD has increased in recent decades: the Amsterdam Cohort of Gender Dysphoria Study reported a 20-fold increase in the number of people assessed for GD at its clinic, from 34 patients in 1980 to 686 patients in 2015. 3
As the number of patients seeking treatment for GD has increased in recent years, understanding the efficacy of our current treatment modalities for GD is essential. Untreated GD is associated with significant psychological distress, and individuals who identify as Transgender and Gender Diverse (TGD) are more likely to experience mental health problems, most commonly anxiety and depression. 4 Despite this, the prognosis for GD is generally positive with treatment such as psychotherapy, hormone replacement therapy (HRT), and surgery.2,3,5 WPATH describes gender-affirming surgery (GAS), such as genital, breast, and facial reconstruction, as medically necessary, especially with respect to alleviating psychological distress for many individuals who identify as TGD. 2 Furthermore, Almazan and Keuroghlian 6 have described several mental health benefits of GAS, citing significantly lower odds of psychological distress, tobacco smoking, and suicidal ideation compared to TGD individuals with no history of GAS.
Facial feminization surgery (FFS), a subset of GAS, encompasses a group of surgical procedures for transgender females that aim to align patients’ facial features with their gender identity. 7 These surgeries may be performed on the forehead, orbits, nose, chin, jaw, and neck and include procedures such as supraorbital contouring, hairline lowering, brow lift, rhinoplasty, frontal sinus setback, mandibular angle reduction, genioplasty, malar fat grafting, and more. 7 While the type of surgery used to achieve a more feminine face may differ between patients, general approaches often involve reducing the prominence of the brow and increasing the nasofrontal angle in the upper face, rhinoplasty to increase the nasofrontal and nasolabial angles in the midface, and genioplasty and mandibular contouring to reduce the prominence of the chin and the squareness of the jaw in the lower face. 8
While overall patient satisfaction with FFS has been reported in the current literature, 9 little has been described about how GD levels are directly impacted by FFS alone. Furthermore, to our knowledge, there is no validated or standardized questionnaire used to assess FFS outcomes, which limits a thorough understanding of the impact of FFS on the treatment of GD. This gap underlines the importance of establishing consistent, validated means for evaluating GD in TGD patients who undergo FFS. In this systematic review, we evaluate the current literature on how GD levels and related psychosocial outcomes are measured and reported in the context of FFS. Our goal is to identify trends and assess the tools used to evaluate these outcomes. Additionally, we provide a summary of existing evidence on the potential efficacy of FFS in reducing GD and enhancing quality of life (QOL) and psychosocial well-being. We hope this analysis will support the development of a validated questionnaire for assessing the impact of FFS on TGD individuals.
Methods
A systematic review was performed using the Covidence Software. The following search was used to identify articles in Scopus and PubMed discussing the evaluation of GD following FFS: (“Facial Feminization Surgery”[MeSH] OR “Facial Feminization Surgery” OR “FFS” OR “Facial feminization” OR “Transgender facial surgery”) AND (“Gender Dysphoria”[MeSH] OR “Gender Dysphoria” OR “Gender incongruence” OR “Transgender identity” OR “Transsexualism”). Two independent evaluators reviewed studies for inclusion, and conflicts were resolved by a third evaluator.
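The Boolean structure of the search string can be made explicit with a short sketch: two concept blocks whose terms are joined by OR, with the blocks then joined by AND. This is an illustration only, not the authors' workflow. The names `FFS_TERMS`, `GD_TERMS`, `or_block`, and `build_query` are hypothetical, and the `[MeSH]` tag is kept as written in the search above; field tags may need adapting for each database.

```python
# Illustrative sketch only: helper and variable names are hypothetical.
# The terms are copied from the search string reported in the Methods.

FFS_TERMS = [
    '"Facial Feminization Surgery"[MeSH]',
    '"Facial Feminization Surgery"',
    '"FFS"',
    '"Facial feminization"',
    '"Transgender facial surgery"',
]

GD_TERMS = [
    '"Gender Dysphoria"[MeSH]',
    '"Gender Dysphoria"',
    '"Gender incongruence"',
    '"Transgender identity"',
    '"Transsexualism"',
]


def or_block(terms):
    """Join the synonyms for one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*concept_blocks):
    """Join concept blocks with AND so a record must match every concept."""
    return " AND ".join(or_block(block) for block in concept_blocks)


QUERY = build_query(FFS_TERMS, GD_TERMS)
print(QUERY)
```

Keeping the terms in lists makes it easier to record exactly which synonyms were searched and to rerun the same query when the review is updated.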
Eligibility Criteria
Articles were identified as eligible for inclusion using the Population, Intervention, Comparator, Outcome (“PICO”) framework, with the overarching question being, “How are levels of gender dysphoria affected by facial feminization surgery in male-to-female transgender patients?”
Studies were included if their populations comprised adult male-to-female transgender patients or patients diagnosed with GD. Included interventions were facial feminization surgeries involving procedures on the facial skeleton, including but not limited to frontal sinus setback, contouring, single or double genioplasty, and reduction of facial bones. Studies were analyzed if they discussed patient outcomes such as GD, psychological well-being, QOL, or satisfaction with surgery. Included study designs were observational studies, randomized controlled clinical trials, case series, and case studies.
Excluded studies were those that analyzed pediatric populations or cadavers, those that examined FFS without skeletal augmentation (eg, lip fillers, Botox), and those that did not evaluate patient satisfaction or other psychosocial factors postoperatively (Table 1).
Inclusion and Exclusion Criteria.
Data Collection
Articles were reviewed to determine how each study analyzed the results of FFS and its impact on GD. Because GD is multifactorial, multiple assessment modalities were collected, and the assessment of how FFS impacts GD was divided into the surveys, subjective assessments, and objective assessments used by each study. Surveys used to capture patient-reported outcomes were recorded and classified as fully validated, partially validated, or not validated. Subjective assessments included direct and indirect assessments of GD. Direct assessments of GD were those that specifically asked patients about GD levels (eg, “Did your surgery increase or decrease your gender dysphoria?”), whereas indirect assessments of GD were those that did not use the term “gender dysphoria” but instead asked about alignment with one’s gender as a result of surgery (eg, “Was your surgery important to your ability to live as a woman?”). Additional subjective metrics included psychosocial well-being, QOL, and satisfaction with surgery. Objective assessments included physical exams, use of independent raters, patient photographs, cephalometric measurements, and use of medical imaging. Finally, the results of these parameters, when reported, were analyzed to determine whether improvements were noted in GD, psychosocial well-being, QOL, and patient satisfaction with FFS.
Primary outcomes included (1) whether survey, subjective, and/or objective metrics assessing FFS results were collected by studies and (2) the outcomes of those metrics when collected.
Results
Identification of Articles
A total of 660 studies were identified in PubMed and Scopus for review. After removing 116 duplicate studies, 544 articles remained for title and abstract review. Following the review of titles and abstracts, 36 studies were selected for full-text review. Eighteen studies published between 2010 and 2024 were determined to be relevant for data extraction (Figure 1).

Preferred reporting items for systematic reviews and meta-analyses (PRISMA) flowchart, generated by Covidence Software.
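The screening counts reported above can be checked for internal consistency with a short sketch. The class and field names below are hypothetical. The numbers of records excluded at each stage are not stated in the text; they are derived here by subtraction from the reported counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrismaFlow:
    """Screening counts for a review (hypothetical container for illustration)."""

    identified: int          # records found across databases
    duplicates_removed: int  # duplicate records removed before screening
    full_text_reviewed: int  # records passing title and abstract review
    included: int            # studies retained for data extraction

    @property
    def screened(self) -> int:
        # Records remaining for title and abstract review.
        return self.identified - self.duplicates_removed

    @property
    def excluded_at_screening(self) -> int:
        # Derived by subtraction; not reported directly in the text.
        return self.screened - self.full_text_reviewed

    @property
    def excluded_at_full_text(self) -> int:
        # Derived by subtraction; not reported directly in the text.
        return self.full_text_reviewed - self.included


# Counts as reported in the Results.
flow = PrismaFlow(
    identified=660,
    duplicates_removed=116,
    full_text_reviewed=36,
    included=18,
)

# The reported number of articles entering title and abstract review is 544.
assert flow.screened == 544

print("Screened:", flow.screened)
print("Excluded at title/abstract:", flow.excluded_at_screening)
print("Excluded at full text:", flow.excluded_at_full_text)
```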
Data Extraction
A total of 1301 patients undergoing FFS were evaluated across studies, with an average age of 34.8 years (range: 21.0-47.5 years). In each study, the types of skeletal FFS procedures performed were evaluated, with a focus on frontal sinus setback, bone contouring, genioplasty, reduction of facial bones, and rhinoplasty. Among the FFS procedures performed across the studies, 10 studies reported frontal sinus setback, 15 studies reported bone contouring, 13 studies reported genioplasty, 7 studies reported reduction of facial bones, and 14 studies reported rhinoplasty (Table 2).
Summary of Studies.
Note. NR = not reported.
Articles were evaluated for their use of surveys in assessing pre- or postoperative outcomes of FFS. Seventeen studies reported administering patient surveys. Validated surveys were used in 8 studies, partially validated surveys in 4 studies, and nonvalidated surveys in 7 studies. Common surveys included FACE-Q, the Facial Feminization Surgery Outcomes Evaluation (FFSOE) via Ainsworth and Spiegel, and the World Health Organization Quality of Life Survey, Brief Version (WHOQOL-BREF). FFSOE was used in 4 studies, FACE-Q in 3 studies, and WHOQOL-BREF in 2 studies, as described in Table 3. GD was assessed directly in 2 studies and indirectly in 9 studies. Psychosocial well-being was examined in 11 studies, QOL was measured in 11 studies, and patient satisfaction was analyzed in 13 studies. Among objective measures, physical examinations were employed in 6 studies and independent raters (including physicians and laypeople) in 4 studies. Patient photographs were assessed in 9 studies, and cephalometric measurements were used in 5 studies. Medical imaging was used in 6 studies, with modalities including computed tomography (CT) scans, teleradiography, Vectra 3D Imaging, and X-rays (Table 3).
Summary of Outcomes Assessed.
Patient surveys: †FACE-Q = a patient-reported outcome instrument that measures how patients feel about their facial appearance, quality of life, and treatment, †WHOQOL-BREF = World Health Organization Quality of Life, Brief Version, ‡FFSOE = Facial Feminization Surgery Outcomes Evaluation [via Ainsworth and Spiegel], †ANA = Esthetic Numerical Analog (ANA) [via Kim et al], †TCS = Transgender Congruence Scale, †BIS = Body Image Scale, †HAD = Hospital Anxiety and Depression Scale, †SDS = Sheehan Disability Scale, †EQ-5D = EuroQol 5 dimension, a self-report survey that measures health-related quality of life, †PROMIS = patient-reported outcomes measurement information system, †SF36v2 = Short-Form 36 Health Survey version 2.
Subjective measures: GD = gender dysphoria, QOL = quality of life.
Objective measures: CT = computed tomography, 3D = three-dimensional, AI = artificial intelligence.
(†) denotes validated survey.
(‡) denotes survey has undergone partial validation.
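To compare how often each assessment type was used, the study counts reported for Table 3 can be expressed as a share of the 18 included studies. The sketch below is illustrative only: the dictionary keys and the `share` helper are hypothetical names, and the percentages are derived from the reported counts rather than taken from the source.

```python
# Illustrative sketch: counts are those reported in the Results;
# names and derived percentages are not from the original article.

TOTAL_STUDIES = 18

assessment_counts = {
    "patient surveys": 17,
    "direct GD assessment": 2,
    "indirect GD assessment": 9,
    "psychosocial well-being": 11,
    "quality of life": 11,
    "patient satisfaction": 13,
    "physical examination": 6,
    "independent raters": 4,
    "patient photographs": 9,
    "cephalometric measurements": 5,
    "medical imaging": 6,
}


def share(count, total=TOTAL_STUDIES):
    """Return the percentage of included studies, rounded to one decimal place."""
    return round(100 * count / total, 1)


# List assessment types from most to least frequently used.
for name, count in sorted(assessment_counts.items(), key=lambda item: -item[1]):
    print(f"{name:<28}{count:>3}/{TOTAL_STUDIES}  {share(count):>5.1f}%")
```

Because a single study can use several assessment types, these shares overlap and are not expected to sum to 100%.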
After analyzing how FFS outcomes were gathered, we analyzed the results of those outcomes (Tables 4 and 5). The most frequently reported result was patient satisfaction: patients were reported to be satisfied with their FFS in a total of 16 studies. Improved psychosocial well-being and improved QOL were each noted in 10 studies. Overall improvement in GD following FFS, as indicated by explicit mention of postoperative GD levels, was noted in 3 studies.
Summary of FFS Outcomes on Gender Dysphoria.
Note. NA = not assessed, FACE-Q = a patient-reported outcome instrument that measures how patients feel about their facial appearance, quality of life, and treatment, QOL = quality of life, ANA = esthetic judgment numerical scale (ANA) [via Kim et al], TCS = transgender congruence scale, SDS = Sheehan disability scale, GD = gender dysphoria.
Summary of How FFS Outcomes were Assessed and Results.
Discussion
Gender-affirming care provides treatment for patients diagnosed with gender dysphoria (GD). However, little is understood about how procedures such as facial feminization surgery (FFS) impact a patient’s level of GD. In this systematic review, we examined current reporting on the impact FFS has on GD and the metrics used to assess these outcomes.
Assessing the effectiveness of FFS can be challenging due to its subjective nature; therefore, we looked at several parameters used to evaluate the efficacy of FFS. Our study showed considerable variation in the parameters used to evaluate FFS, without a validated means of tracking FFS outcomes. Patient satisfaction and QOL appear to be measured most often when FFS is evaluated through a subjective lens. There also seems to be a focus on more indirect approaches to evaluating GD, with GD seldom directly measured. From an objective standpoint, photographs appear to be used most frequently to track FFS. It appears that FFS improves GD when measured directly, as shown by the studies that cited direct measurement of GD levels.14,16 Additionally, Bonapace-Potvin et al 17 show that GD levels improve after FFS even when GD is not assessed directly. Furthermore, our data suggest that FFS is associated with improved QOL and high patient satisfaction, factors that may indirectly indicate improvement in GD levels.
Surveys were employed in 17 out of 18 articles to assess FFS outcomes. While validated questionnaires such as FACE-Q and WHOQOL-BREF appeared in a few studies,9,14,24 these surveys do not specifically measure the effect of FFS on GD and instead focus on measures that might be indirectly related to GD, such as QOL and appearance. Notably, the Facial Feminization Surgery Outcomes Evaluation (FFSOE) survey via Ainsworth and Spiegel was used in 4 studies12,13,18,26; however, this instrument has not undergone a formal validation process in the United States. Despite this, its use to gain insight into trends in mental health following FFS shows promise for the development of a standardized, validated tool to assess FFS outcomes overall.
Interestingly, none of the studies used the Utrecht Gender Dysphoria Scale (UGDS), a validated 12-item screening measure specifically for GD. 27 Given the UGDS’s use in diagnosing GD, future studies could compare pre- and postoperative UGDS scores in the setting of FFS. Our data showed evidence of GD improvement whether it was measured directly or indirectly,14,16,17 which may indicate that the UGDS could be useful in examining FFS efficacy, although its questions are not specific to structures of the face. Furthermore, Verbruggen et al 28 previously described another questionnaire, the facial feminization surgery patient’s satisfaction questionnaire (QESFF1), designed to better capture FFS-specific outcomes, which showed reliability in assessing patient satisfaction with FFS. Yet this tool is still in the developmental phase of validation and was not employed by any studies in this review, underscoring the ongoing need for a widely accepted instrument tailored to FFS assessment, especially with regard to GD.
The lack of a standardized measure highlights the necessity for systematic pre- and postoperative GD assessments in FFS. Such evaluations are vital to determine the full efficacy of FFS and other gender-affirming interventions on GD. Although there is no standardized assessment, it appears that assessing factors related to mental health and overall emotional well-being, as seen in the UGDS, 27 with the addition of questions about one’s face specifically, may be helpful in the development of a metric for FFS. Additionally, our data provide evidence that indirect factors, such as QOL, psychosocial well-being, and patient satisfaction, may serve as valuable metrics, given their use in a majority of the articles studied here. Ultimately, establishing a reliable, validated tool would enable clinicians and researchers to objectively assess the psychological impact of FFS on patients, providing deeper insights into its role as a gender-affirming procedure.
This study has limitations. Because only 2 databases, PubMed and Scopus, were used to identify articles, it is possible that other eligible studies that may have a significant impact on our findings were not included in this analysis. Furthermore, as there is no validated assessment for evaluating GD in FFS, there may be other parameters outside of the subjective and objective metrics examined here that would impact our findings. Finally, our data extraction may have been subject to bias. It is possible that other assessments (eg, PHQ-9) were performed and used to evaluate FFS outcomes but were not explicitly stated, and therefore we could not include them in our analysis.
In summary, our study indicates that the impact of facial feminization surgery specifically on the level of a patient’s GD is poorly quantified. When evaluated, FFS does appear to improve dysphoria levels. We recommend developing a standardized metric for this assessment that is accurate and easy to administer. Additional study is needed to assess the impact of FFS on gender dysphoria as a part of the care patients receive for GD.
Footnotes
Acknowledgements
Not applicable.
Data Availability Statement
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Jonathan Black MD FACS is a consultant for Stryker Corp. Emily Yanoshak BS, Arthur Wu BS, Abhishek Kumar BS, Jennifer Goldman BA, and Hibo Wehelie BS have no conflicts of interest to disclose.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical Considerations
This systematic review did not require IRB approval.
Informed Consent
Not applicable.
Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
