Abstract
Intimate partner violence (IPV) is a significant public and mental health concern, yet much of its research and measurement has been developed through a cisgender, heterosexual lens. Sexual and gender minority (SGM) populations experience IPV at rates equal to or higher than their cisgender, heterosexual counterparts and experience unique forms of abuse, including identity-based and transphobia-driven violence. However, the lack of IPV measures validated for SGM populations raises concerns about the accuracy and inclusivity of existing tools, contributing to inconsistent prevalence estimates. This study presents a two-tiered review of self-report IPV measures for SGM populations. Tier 1 evaluated psychometric properties of scales using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN), while Tier 2 examined the broader application of IPV measures in empirical research. Eligible studies were original, English-language research measuring IPV in SGM populations that either developed or validated a scale (Tier 1) or reported a reliability statistic for the scale used (Tier 2). A systematic search identified 9 scale development/validation studies and 75 studies using IPV scales. Notably, SGM-specific measures remain underutilized, with most studies continuing to rely on heterocentric measures. Structural validity and internal consistency were adequate across scales, but evidence for content validity, cross-cultural validity, reliability, and hypothesis-testing was often limited or indeterminate, with no measure achieving full COSMIN recommendations. While several promising SGM-specific IPV tools exist, further refinement, validation, and development of a set of consensus-driven gold standard measures are needed to support accurate assessment, prevalence estimates, and effective interventions for IPV in SGM populations.
Keywords
Introduction
Over the last six decades, research investigating intimate partner violence (IPV) has become a priority in both public and mental health (Kubicek, 2016). As a subset of interpersonal violence, IPV describes any abusive behavior resulting in physical, sexual, or psychological harm, and often includes emotional abuse and controlling behaviors that occur within the setting of an intimate relationship (World Health Organization, 2013). Although the IPV literature is extensive, it has previously focused on cisgender women’s disproportionate experiences of victimization and on cisgender men’s perpetration in heterosexual relationships, a focus consistent with prevalence data indicating that women in these relationships are more often victimized (Etaugh, 2020; Follingstad & Rogers, 2013). Despite this historical focus on and definition of cisgender and heteronormative experiences of IPV, there has been a surge in research concerning IPV in sexual and gender minorities (SGMs), defined as those identifying within the 2SLGBTQI+ community, over the last few decades (Decker et al., 2018; Edwards et al., 2015; Hillman, 2020; Longobardi & Badenes-Ribera, 2017; Rodrigues et al., 2024; Whitfield et al., 2018). Although prevalence estimates vary widely (Etaugh, 2020), there is consistent evidence suggesting that SGM individuals experience IPV at similar, if not higher, rates than their cisgender, heterosexual counterparts (Dank et al., 2014; Edwards et al., 2015). Only recently has attention been given to psychometrically screening or measuring IPV occurrence within SGM populations (Dyar et al., 2021; Etaugh, 2020), which has in turn drawn attention to the misuse of IPV measures that were designed for and validated with cisgender and heterosexual samples (Calton et al., 2016; Dyar et al., 2021). Screening for IPV may therefore prove more challenging and complex for SGM individuals. For instance, existing screening tools and measures of IPV are rarely psychometrically validated with SGM populations, contributing to concerns about their reliability and validity (Dyar et al., 2019; Follingstad & Rogers, 2013).
IPV in SGM Populations
In adulthood, lifetime prevalence of IPV is currently estimated to be between 14% and 61% within lesbian, gay, and bisexual couples compared to 14% to 24% of heterosexual couples (Ard & Makadon, 2011; Parry & O'Neal, 2015). Additionally, transgender and gender diverse people are three times more likely to experience IPV victimization compared to other SGM individuals (Peitzmeier et al., 2020). However, due to heterogeneity in measurement practices, true prevalence rates are nearly impossible to capture across the SGM community, though they are still considered to be high (Kelleher et al., 2025).
The differences between heteronormative relationships and the range of SGM relationships are critical to consider when informing best practices for IPV identification and intervention, for a few reasons. First, SGM individuals are uniquely vulnerable to IPV victimization compared to their heterosexual counterparts due to minority stressors, the chronic stress that marginalized groups experience as a result of their identity, stemming from prejudice, discrimination, and social inequality (Longobardi & Badenes-Ribera, 2017; Meyer, 2003; Whitton et al., 2019). The experience of these stressors can increase risk of both victimization and perpetration of IPV, through mechanisms such as internalized stigma, heightened aggression, and impaired coping (Decker et al., 2018; Edwards & Sylaska, 2013; Rodrigues et al., 2024). This increase in vulnerabilities for both victimization and perpetration may subsequently increase the likelihood of bidirectional IPV within SGM relationships (Kirschbaum et al., 2023; Ronzón-Tirado et al., 2022), whether as a form of retaliation or self-defense or as a result of difficulties with coping or self-regulation. Recognizing the potential for bidirectional IPV underscores the importance of measurement tools that capture both victimization and perpetration within SGM relationships.
Second, several SGM-specific IPV constructs have been identified that do not appear in cisgender and heteronormative relationships. These include transphobia-driven IPV (i.e., a set of abuse tactics directly related to one’s transgender identity; Maclin et al., 2024) and identity abuse (IA; a set of abuse and controlling tactics within IPV that more broadly leverages both heterosexism and cissexism against SGM survivors; Woulfe & Goodman, 2021). Transphobia-driven IPV tactics include leveraging transphobia by intentionally using the wrong pronouns, controlling one’s transition, and belittling someone based on their transgender identity (Maclin et al., 2024; Peitzmeier et al., 2021). Some IA tactics involve “outing” or threatening to disclose a partner’s SGM status without their consent (Woulfe & Goodman, 2021), withholding access to gender-affirming medical treatment, purposefully misgendering, and restricting gender expression, which could result in further harassment and violence, or introduce employment and housing instability (Ard & Makadon, 2011; Dank et al., 2014; Scheer et al., 2019; Woulfe & Goodman, 2021). Recognizing that both minority stressors and SGM-specific forms of IPV can be weaponized by perpetrators against SGM individuals highlights the need for research that accurately captures these experiences and their consequences in SGM populations (Rodrigues et al., 2024).
Past research has identified a robust correlation between IPV victimization and subsequent mental health symptomatology across the SGM community (Laskey et al., 2019; Reuter et al., 2017; Stults et al., 2025). A systematic review by Rodrigues et al. (2024) identified significant relationships between IPV victimization in SGM populations and PTSD, depression, anxiety, suicidality, loneliness, and maladaptive coping styles. Moreover, IPV victimization was found to be associated with negative physical health outcomes through increases in risk behaviors (e.g., substance use, sexual risk behaviors) and sexually transmitted infections (Rodrigues et al., 2024). Although heterosexual survivors of IPV experience many of the same adverse health outcomes, for SGM individuals the psychological consequences of IPV compound pre-existing vulnerabilities to mental health disparities that stem from their minoritized status (Whitton et al., 2019). These pre-existing vulnerabilities include disproportionate exposure to minority stress, discrimination, and stigma, which contribute to higher baseline rates of depression, anxiety, suicidality, and PTSD among SGM populations (Marchi et al., 2023; Mongelli et al., 2019). Therefore, effective screening and identification of IPV for SGM survivors is crucial to appropriately intervene or mitigate these negative outcomes.
Importance of Valid IPV Measurement
It is important to acknowledge that the extent of our knowledge on IPV is only as good as our measurement practices (White et al., 2024). If IPV is measured poorly or inconsistently, our understanding of the phenomenon may be fundamentally flawed. It is therefore critical to evaluate and challenge existing IPV measures, as inaccurate measurement hinders our ability to effectively intervene, mitigate harm, and prevent future violence (Craig et al., 2008; Follingstad & Bush, 2014; Yakubovich et al., 2022). Furthermore, inaccurate measurement will perpetuate ongoing issues with underreporting (Hillman, 2020; Whitton et al., 2024). Most measures of IPV were developed within frameworks assuming cisgender men as perpetrators and cisgender women as victims (Edwards et al., 2015; Follingstad & Rogers, 2013). Moreover, these measures have only been minimally adapted (i.e., minor language changes) to try to fit research with SGM populations (Dyar et al., 2021). Without proper construct validation, it is unclear whether these measures accurately capture the construct of IPV within an SGM population, especially as SGM individuals experience IPV in distinct ways (Flake et al., 2017; Whitfield et al., 2018).
Unlike other well-defined constructs that maintain gold standard instruments, such as posttraumatic stress disorder, which is measured through the Clinician-Administered PTSD Scale for DSM-5 (CAPS-5; Hunt et al., 2018), IPV lacks a widely accepted definition and standardized measurement, due to the complex, multidimensional nature of abuse and sociocultural variability (Follingstad & Bush, 2014). Variability exists in definitions, ways to measure violence (i.e., discrete incidents vs. patterns), and constructs assessed (e.g., physical, sexual, psychological, emotional, controlling, financial, etc.), and no widely agreed upon “gold standard” instrument currently exists (Follingstad & Bush, 2014; Yount et al., 2022). For instance, one review article identified 87 unique measures of IPV within cisgender heterosexual research and recommended only 18 as having suitable psychometric properties (Alexander et al., 2022). Alexander et al. (2022) referred to the Revised Conflict Tactics Scale (CTS2; Straus et al., 1996) as a possible “gold standard” instrument, but psychometric properties for this scale remain mixed, and it has been critiqued as lacking additional context that would be useful in both mitigation and prevention planning (Jones et al., 2017). Although some studies use validated tools, a large proportion of studies continue to utilize binary or single-item, non-validated IPV measurement, limiting the ability to capture nuance and rich data (Chatterji et al., 2023). The inconsistent use of measures and underutilization of psychometric validation complicate comparisons of IPV prevalence rates and underscore the need for reliable measurement (Kelleher et al., 2025; Yakubovich et al., 2022).
Present Review
A handful of psychometric reviews evaluating screening tools for IPV have been completed (Alexander et al., 2022; Arkins et al., 2016; Li et al., 2024), but none of these have included or focused solely on measurement tools developed for and validated with SGM populations. A recent review by Kelleher et al. (2025) highlights IPV measurement tools used with SGM individuals. However, the Kelleher et al. (2025) study included scales that had not been psychometrically developed for, nor validated with, this target population, which may limit the validity of the measures (Boateng et al., 2018). Given the high prevalence of IPV victimization and perpetration in SGM communities and the relatively recent growth of this research area, identifying psychometrically robust screening and measurement tools has important implications for both future research and clinical practice. Accordingly, this review aimed to evaluate the methodological quality and use of IPV measurement tools developed for or validated with SGM populations. Specifically, we asked: (a) What are the methodological quality and psychometric properties of IPV measures developed for or validated with SGM populations? and (b) How have these measures been applied across empirical research involving SGM participants? To address these questions, we conducted a two-tiered psychometric review of self-report IPV measurement tools used with SGM populations. The first tier evaluated the development and validation of psychometric tools to analyze their key validity and reliability indicators. The second tier aimed to synthesize the application and overall use of IPV measurement tools across empirical SGM IPV research. In doing so, we evaluated both the methodological quality and real-world use of IPV scales and screening tools in SGM research, with the goal of supporting more inclusive, accurate, and equitable measurement practices.
Methods
Study Design
The present psychometric review followed the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) guidance for systematic reviews of measurement instruments, including evaluating the validity of Patient-Reported Outcome Measures (PROMs) (Mokkink et al., 2018; Prinsen et al., 2018; Terwee et al., 2018). Given that an IPV measure is not technically classified as a PROM, because victimization is a lived event or exposure rather than a subjective assessment of health status, functioning, or quality of life, only relevant COSMIN psychometric properties were considered. The review was registered on PROSPERO (CRD42024619770). Notably, at the time of registration, this was the first review focused on IPV measurement tools validated with SGM populations.
Search Strategy
A comprehensive systematic literature search was finalized in consultation with a research librarian and performed across four databases: PsycINFO, PubMed/MEDLINE, Embase, and CINAHL. Consistent Boolean key words were utilized for all databases: (“intimate partner violence” OR “IPV” OR “domestic violence” OR “gender-based violence”) AND (“queer” OR “lesbian” OR “gay” OR “transgender” OR “LGBTQ+” OR “non-binary” OR “sexual minority” OR “bisexual” OR “MSM” OR “SGM”) AND (“scale” OR “measurement” OR “checklist” OR “testing” OR “psychometrics” OR “screening” OR “development” OR “validation”). The initial search included studies published up until December 2024. A secondary search was conducted on July 31, 2025, to remain as up to date as possible.
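To make the structure of the search string explicit, the following minimal sketch assembles it from the three concept blocks listed above. The helper names (`or_group`, `build_query`) are hypothetical, and database-specific syntax such as field tags or truncation is not modeled.

```python
# Hypothetical helpers; the terms themselves are copied from the search strategy.
IPV_TERMS = ["intimate partner violence", "IPV", "domestic violence",
             "gender-based violence"]
POPULATION_TERMS = ["queer", "lesbian", "gay", "transgender", "LGBTQ+",
                    "non-binary", "sexual minority", "bisexual", "MSM", "SGM"]
MEASUREMENT_TERMS = ["scale", "measurement", "checklist", "testing",
                     "psychometrics", "screening", "development", "validation"]


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Join the parenthesized OR-groups with AND."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(IPV_TERMS, POPULATION_TERMS, MEASUREMENT_TERMS)
print(query)
```

Keeping the terms in three lists makes it straightforward to rerun the same query unchanged for an updated search.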
Eligibility Criteria
Eligibility criteria required studies that: (1) were written in English; (2) were original peer-reviewed research (i.e., no reviews, commentaries, conference abstracts, book chapters); (3) measured IPV within the 2SLGBTQI+/SGM populations; and either (4.1) for Tier 1, were scale development or validation studies within the population of interest, or (4.2) for Tier 2, used a psychometric scale and reported a reliability statistic (e.g., Cronbach’s α, KR-20). We excluded studies that could not be accessed, were not written in English, were not original or empirical research, did not focus on the 2SLGBTQI+/SGM population, did not use a psychometric scale to measure IPV, or did not report a reliability statistic for the scale used.
Screening and Study Selection
The main study author (AP) conducted the first phase of the review and imported the references into HubMeta, a web-based meta-analytic and review platform. We removed study duplicates prior to screening. In line with PRISMA-COSMIN guidelines (Elsman et al., 2024), two authors (SH and JV) independently screened titles and abstracts to determine eligibility for the full-text review, and a third reviewer (AP) resolved disagreements. Similarly, two authors (JV and VB) independently conducted the full-text review (kappa = .77), with a third reviewer (AP) resolving disagreements by consensus. Prior to the data extraction phase, studies that met the inclusion criteria were further categorized into two sub-groups: Tier 1, scale development and validation studies, and Tier 2, empirical studies that used a psychometric scale with a reported reliability statistic (refer to 4.1 and 4.2 of the inclusion criteria).
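For readers less familiar with the agreement statistic reported above, the sketch below shows how Cohen's kappa is computed from two reviewers' include/exclude decisions. The decisions are invented for illustration and are not the review's screening data; the function name `cohen_kappa` is likewise hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of studies.")
    n = len(rater_a)
    # Observed agreement: proportion of studies with identical decisions.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement under independence, from each rater's marginal counts.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / n ** 2
    if expected == 1:
        return 1.0  # Degenerate case: both raters used a single identical label.
    return (observed - expected) / (1 - expected)


# Illustrative (made-up) full-text decisions for ten studies.
reviewer_1 = ["include", "include", "exclude", "exclude", "include",
              "exclude", "include", "exclude", "include", "include"]
reviewer_2 = ["include", "include", "exclude", "exclude", "include",
              "exclude", "exclude", "exclude", "include", "exclude"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(round(kappa, 2))
```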
Data Extraction
Data were extracted and organized into two tiers. The first tier was a scale psychometric extraction per COSMIN guidelines (Mokkink et al., 2018; Prinsen et al., 2018; Terwee et al., 2018), which included scale development and validation studies (sub-group 4.1). The second tier was a descriptive synthesis, which included studies that used a psychometric scale with a reported reliability statistic (sub-group 4.2). For Tier 2, data including (a) author name and year of publication, (b) country, (c) IPV measure used, (d) sample description (i.e., sample size, SGM identity), and (e) reliability statistic (i.e., Cronbach’s α or KR-20) were extracted by one reviewer (VB) and verified by a second reviewer (JV).
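As a reminder of what the extracted reliability statistics represent, the following sketch computes Cronbach's α from an item-by-respondent score matrix using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). With dichotomous (0/1) items this quantity coincides with KR-20. The function name and the toy data are assumptions for illustration only.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or any(len(row) != n_items for row in responses):
        raise ValueError("Need at least two items and equal-length rows.")
    items = list(zip(*responses))  # transpose: one tuple of scores per item
    item_variance_sum = sum(pvariance(item) for item in items)
    total_variance = pvariance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("Total scores have zero variance; alpha is undefined.")
    return (n_items / (n_items - 1)) * (1 - item_variance_sum / total_variance)


# Toy data: four respondents answering three yes/no (1/0) items.
toy_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(toy_responses)
print(alpha)
```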
Assessment of Methodological Quality and Measurement Properties
Each scale development and validation study was rated according to the COSMIN updated criteria for strong measurement properties (Mokkink et al., 2018; Prinsen et al., 2018; Terwee et al., 2018). We evaluated the methodological quality of each study using the COSMIN Risk of Bias checklist (Mokkink et al., 2018), which assesses how well each measurement property was tested. As IPV is not considered a PROM in the traditional sense, this rating of methodological quality and measurement properties was based on evaluating the scale development, content validity, structural validity, internal consistency, cross-cultural validity, reliability, and construct validity (i.e., hypothesis testing; Mokkink et al., 2018; Prinsen et al., 2018; Terwee et al., 2018). Measurement properties such as responsiveness, measurement error, and interpretability were excluded, as they may be less applicable to IPV measures. Ratings were completed by two reviewers (AP and JV), with consensus reached on any discrepancy, and each property was assigned a final rating on a four-point scale: very good, adequate, doubtful, or inadequate (see Mokkink et al., 2018). For each property, the lowest rating across the checklist items for that property determined the overall methodological quality rating, following the “worst score counts” principle.
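The “worst score counts” principle described above can be sketched in a few lines. The function and constant names are hypothetical; the four rating labels are those of the COSMIN four-point scale.

```python
# Ratings ordered from worst to best on the four-point scale.
RATING_ORDER = ["inadequate", "doubtful", "adequate", "very good"]


def worst_score_counts(item_ratings):
    """Return the lowest rating among the checklist items for one property."""
    if not item_ratings:
        raise ValueError("At least one item rating is required.")
    return min(item_ratings, key=RATING_ORDER.index)


# Example: one doubtful item caps the overall rating for the property.
overall = worst_score_counts(["very good", "adequate", "doubtful", "very good"])
print(overall)
```

A single weak checklist item therefore determines the property's overall methodological quality, regardless of how many other items were rated highly.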
Next, we evaluated the results of each study’s psychometric testing using COSMIN criteria (Prinsen et al., 2018; Terwee et al., 2018). Each result was rated as sufficient (+), insufficient (−), or indeterminate (?), based on predefined thresholds (e.g., Cronbach’s α ≥.70 for internal consistency; see Supplemental Table S1). The measurement properties assessed similarly included content validity, structural validity, internal consistency, cross-cultural validity, reliability, and hypothesis testing, where applicable.
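As an illustration of how such a threshold translates into a result rating, the sketch below applies a simplified version of the internal consistency rule: a sufficient rating requires both supporting evidence of structural validity and α ≥ .70 for every subscale, and the result is indeterminate when structural validity has not been established. This is a simplification under stated assumptions, not the full COSMIN decision procedure; reviewer judgment and alternative coefficients (e.g., omega) are not modeled, and the function name is hypothetical.

```python
ALPHA_THRESHOLD = 0.70  # threshold cited in the text for internal consistency


def rate_internal_consistency(alphas, structural_validity_sufficient):
    """Return '+', '-', or '?' for a scale's internal consistency result."""
    if not alphas or not structural_validity_sufficient:
        # Without established structural validity, alpha cannot be interpreted.
        return "?"
    if all(alpha >= ALPHA_THRESHOLD for alpha in alphas):
        return "+"
    return "-"


print(rate_internal_consistency([0.71, 0.92], structural_validity_sufficient=True))
print(rate_internal_consistency([0.71, 0.92], structural_validity_sufficient=False))
```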
Finally, recommendations for each measure were formulated using COSMIN’s classification system. Measures were classified as: Category A (recommended for use; evidence for sufficient content validity and at least low-quality evidence for sufficient internal consistency); Category B (has potential but requires further validation before recommendation); or Category C (not recommended; there is high-quality evidence for an insufficient measurement property) (Prinsen et al., 2018).
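The three categories can be expressed as a small decision function. This is a minimal sketch with hypothetical names; checking Category C before Category A is an assumption made here so that high-quality negative evidence is never masked, since the text does not state which category takes precedence.

```python
def classify_measure(content_validity_sufficient,
                     internal_consistency_sufficient,
                     high_quality_insufficient_property):
    """Return the recommendation category ('A', 'B', or 'C') for one measure.

    internal_consistency_sufficient: at least low-quality evidence of
        sufficient internal consistency.
    high_quality_insufficient_property: high-quality evidence that any
        measurement property is insufficient.
    """
    if high_quality_insufficient_property:  # assumption: C is checked first
        return "C"
    if content_validity_sufficient and internal_consistency_sufficient:
        return "A"
    return "B"


# Example: good internal consistency but unestablished content validity.
print(classify_measure(False, True, False))
```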
Results
Study Selection
We identified a total of 716 studies from the initial literature search after deduplication. After removing ineligible articles at the title/abstract screening phase, we reviewed 331 studies in full. Of these, 2 were not in English, 5 were not original peer-reviewed research, 9 did not measure IPV in some way, 122 had no use of a psychometric scale (i.e., used a binary measure), 111 reported no reliability statistic (i.e., Cronbach’s α, KR-20), and 6 were inaccessible (Figure 1). The secondary search conducted on July 31, 2025, across all original databases yielded a total of eight new studies that were included. No new scale development or validation studies were identified from the secondary search. A total of six scale development studies and three scale validation studies (Tier 1) of psychometric measures specifically designed or validated with SGM populations were included in the present COSMIN review. In total, 75 studies that used any sort of psychometric scale to measure IPV within an SGM population (Tier 2) met eligibility criteria and were included in the review (see Supplemental Materials S2 and S3).

PRISMA flow diagram.
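The counts reported in the study selection can be reconciled with a short tally. The figures below are taken directly from the text; only the variable names are introduced here.

```python
# Full-text exclusions as reported in the study selection paragraph.
full_text_reviewed = 331
exclusions = {
    "not in English": 2,
    "not original peer-reviewed research": 5,
    "did not measure IPV": 9,
    "no psychometric scale": 122,
    "no reliability statistic": 111,
    "inaccessible": 6,
}

retained_initial = full_text_reviewed - sum(exclusions.values())
added_secondary_search = 8
total_included = retained_initial + added_secondary_search

tier1_studies = 9  # six development plus three validation studies
tier2_studies = total_included - tier1_studies

print(retained_initial, total_included, tier2_studies)
```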
Study Characteristics (Tier 1)
A total of nine studies met inclusion criteria for the Tier 1 review phase, comprising six scale development studies and three validation studies published between 2002 and 2024 that identified seven unique psychometric measures. All studies used cross-sectional survey designs and quantitative methods, with sample sizes ranging from 78 (McClennen et al., 2002) to 2,486 (Martínez-Bacaicoa et al., 2024) participants. All studies but one (Martínez-Bacaicoa et al., 2024) were conducted in the United States. Scales were developed for a range of SGM individuals, including cisgender gay and bisexual men (n = 4 scales), cisgender lesbian and bisexual women (n = 4 scales), transgender people (n = 4 scales), and nonbinary people (n = 3 scales).
Measures Overview
To support future research, Table 1 summarizes the constructs measured and the populations for which each IPV scale has been developed or validated. Identified psychometric scales for measuring IPV specifically within an SGM context included the Sexual and Gender Minority Adapted Conflict Tactics Scale 2 (SGM-CTS2; Dyar et al., 2021), the Intimate Partner Violence for Gay and Bisexual Men Scale (IPV-GBM Scale; Stephenson & Finneran, 2013), the Intimate Partner Violence for Gay and Bisexual Men Screening Tool (IPV-GBM Screening Tool; Stephenson et al., 2013), the Transgender-Specific Intimate Partner Violence Scale (T-IPV Scale; Peitzmeier et al., 2019), the Identity Abuse Scale (IA Scale; Woulfe & Goodman, 2021), the Technology-Facilitated Sexual Violence Scale (TFSV Scale; Martínez-Bacaicoa et al., 2024), and the Lesbian Partner Abuse Scale Revised (LE-PAS-R; McClennen et al., 2002). Validation studies included a study that validated the T-IPV Scale and developed new items for it (Peitzmeier et al., 2021), a validation of the IA Scale (Scheer et al., 2019), and a validation and short-form development study of the LE-PAS-R (McClennen et al., 2002).
Overview of IPV Measurement Tool Constructs Developed or Validated with Sexual and Gender Minority Populations.
Note. IPV = intimate partner violence; SGM = sexual and gender minority; IPV-GBM = Intimate Partner Violence for Gay and Bisexual Men Scale. Blank cells indicate that the construct was not explicitly measured or validated in the referenced scale; an “X” indicates that it was.
Controlling/Monitoring also includes elements of coercive control; SGM-specific denotes inclusion of items directly addressing identity-based or SGM-specific domains (e.g., HIV-related).
Unique Constructs of IPV Measured
SGM-Specific
Six out of seven scales identified measured an SGM-specific construct of IPV. These included identity abuse (Woulfe & Goodman, 2021), transgender-specific IPV (Peitzmeier et al., 2019), HIV-related IPV and abuse among gay and bisexual men (Stephenson & Finneran, 2013), technology-facilitated sexual violence—specifically gender and sexuality-based violence (Martínez-Bacaicoa et al., 2024), internalized homophobia (McClennen et al., 2002), and adaptations of a general IPV measure to reflect SGM experiences (Dyar et al., 2021).
Other IPV Constructs
A total of three scales captured additional constructs of IPV not identified by these other more general categories. These include injury (Dyar et al., 2021), technology-facilitated sexual violence (Martínez-Bacaicoa et al., 2024), and power imbalances and abusive behaviors (i.e., communication and social skills, substance abuse, intergenerational transmission of violence, faking illness, and status differentials) within lesbian relationships (McClennen et al., 2002).
Quality Assessment and Evaluation of Psychometric Scales
Both scale development and validation studies were included to generate an overall appraisal of each instrument where applicable (Table 2). In terms of PROM development, no scale received a rating above “Doubtful,” most commonly due to lack of SGM persons’ involvement in the item generation process. Where multiple studies were available for a given scale, findings were synthesized to determine an overall rating for each psychometric property (i.e., +, ?, −). Methodological quality ratings (e.g., very good, adequate, doubtful, inadequate) indicate how well each property was tested, whereas the symbols in parentheses (+, ?, –) represent the result of that test. Thus, a study may show very good methodology but an indeterminate (?) result if evidence was inconclusive, or doubtful methodology yet a sufficient result (+) if findings supported validity despite design limitations.
Methodological Quality and Measurement Property Results.
Note. IPV = intimate partner violence; SGM = sexual and gender minority; IPV-GBM = Intimate Partner Violence for Gay and Bisexual Men Scale. Methodological quality ratings (e.g., very good, adequate, doubtful, inadequate) indicate how well each property was tested, whereas the symbols in parentheses (+, ?, –) denote the result of that test (sufficient, indeterminate, insufficient). It is therefore possible for a study to demonstrate very good methodology but an indeterminate result if evidence was inconclusive, or doubtful methodology yet a sufficient result if findings supported validity despite design limitations.
Sexual and Gender Minority Adapted Conflict Tactics Scale 2
The SGM-CTS2 (Dyar et al., 2021) is a 100-item measure assessing both IPV victimization and perpetration in SGM populations. It retains the original five-factor structure of the CTS2 subscales and introduces two new subscales, one on coercive control and one on SGM-specific tactics. Structural validity was rated as very good based on confirmatory factor analyses (CFAs) of the non-adapted CTS2, with strong model fit for victimization (χ²[485] = 628.95, p < .001; RMSEA = .03; CFI = .98; TLI = .98) and perpetration (χ²[424] = 619.39, p < .001; RMSEA = .04; CFI = .94; TLI = .93), resulting in a COSMIN rating of (+). Internal consistency was acceptable for all subscales (α = .63–.88) except for sexual IPV, which showed a lower Cronbach’s alpha (α = .48) due to one item. However, omega coefficients supported adequate reliability (ω = .76–.97), resulting in a very good (+) rating. Content validity was rated as doubtful due to limited participant involvement in item generation (?). Hypothesis testing for construct validity was rated as adequate, supported by correlations in expected directions with relevant constructs (+). Test–retest reliability and cross-cultural validity were not reported.
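To show how fit indices such as those above can map onto a sufficient structural validity result, the sketch below applies cut-offs commonly cited in COSMIN guidance for CFA (CFI or TLI > .95, or RMSEA < .06, or SRMR < .08). These constants are an assumption here: the thresholds actually applied in this review are given in Supplemental Table S1, which is not reproduced in this section. The function name is hypothetical.

```python
def cfa_fit_sufficient(cfi=None, tli=None, rmsea=None, srmr=None):
    """Return True if any reported fit index meets its assumed cut-off."""
    checks = []
    if cfi is not None:
        checks.append(cfi > 0.95)
    if tli is not None:
        checks.append(tli > 0.95)
    if rmsea is not None:
        checks.append(rmsea < 0.06)
    if srmr is not None:
        checks.append(srmr < 0.08)
    return any(checks)  # False when no index is reported


# Fit indices reported for the victimization and perpetration models.
victimization_ok = cfa_fit_sufficient(cfi=0.98, tli=0.98, rmsea=0.03)
perpetration_ok = cfa_fit_sufficient(cfi=0.94, tli=0.93, rmsea=0.04)
print(victimization_ok, perpetration_ok)
```

Under these assumed cut-offs, the perpetration model qualifies through its RMSEA even though its CFI and TLI fall slightly below .95.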
Identity Abuse Scale
The IA Scale is a 7-item unidimensional measure of exposure to identity abuse (e.g., outing or threatening to disclose one’s SGM identity, denying or attacking a partner’s identity, using slurs or derogatory language, or isolating them from the SGM community) that has been examined in both a development study (Woulfe & Goodman, 2021) and a validation study (Scheer et al., 2019). In the development phase, content validity was rated inadequate (?), as item generation relied primarily on literature review and expert opinion with limited target population involvement. Structural validity was rated adequate (?), supported by EFA (KMO = .85; Bartlett’s χ² = 1,697.73, df = 21, p < .001), with factor loadings from .52 to .77 and acceptable communalities (M = .50). Internal consistency was very good (?), although based only on Cronbach’s alpha and structural validity was not fully established. Cross-cultural validity was rated doubtful (?), reflecting limited testing across diverse subgroups. Hypothesis testing for construct validity was rated adequate (?), with some correlations observed in expected directions.
In the validation study (Scheer et al., 2019), structural validity was rated very good (+), supported by CFA (χ²[14] = 110.24, p < .01; RMSEA = .081; CFI = .977). Internal consistency was very good (+) for past-year identity abuse (α = .90). Cross-cultural validity was very good (+), with EFAs showing a unidimensional factor structure across diverse subgroups, including LGBTQ people of color, White LGBTQ people, transgender and gender non-conforming individuals, and cisgender sexual minority women, all with acceptable to strong factor loadings. Hypothesis testing for construct validity was very good (+), with moderate-to-strong correlations with related constructs, including psychological abuse (r = .68–.72) and physical abuse (r = .67–.70), supporting convergent validity and confirming the IA construct is distinct. Reliability was not reported and rated indeterminate (?). Overall, the Identity Abuse Scale demonstrates strong psychometric evidence, particularly from the validation study, with very good (+) structural validity, internal consistency, cross-cultural validity, and construct validity, although some domains from the development study remain indeterminate (?), resulting in an overall rating of adequate to very good.
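The degrees of freedom reported for the 7-item IA Scale can be checked against the item count. The sketch assumes the textbook forms of both tests: Bartlett's test of sphericity has p(p − 1)/2 degrees of freedom for p items, and a one-factor CFA with no correlated residuals and the factor variance fixed for identification estimates 2p parameters from p(p + 1)/2 unique variances and covariances. Whether the original analyses used exactly these specifications is an assumption; the function names are hypothetical.

```python
def bartlett_df(n_items):
    """Degrees of freedom for Bartlett's test of sphericity."""
    return n_items * (n_items - 1) // 2


def one_factor_cfa_df(n_items):
    """Degrees of freedom for a one-factor CFA without correlated residuals."""
    unique_moments = n_items * (n_items + 1) // 2  # variances and covariances
    free_parameters = 2 * n_items                  # loadings + residual variances
    return unique_moments - free_parameters


print(bartlett_df(7), one_factor_cfa_df(7))
```

Both values match the reported statistics (df = 21 for Bartlett's test and χ²[14] for the CFA), which is consistent with a 7-item unidimensional model.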
Intimate Partner Violence for Gay and Bisexual Men Scale
The IPV-GBM Scale (Stephenson & Finneran, 2013) is a 23-item measure assessing five domains of IPV among gay and bisexual men (physical and sexual, monitoring behaviors, controlling behaviors, HIV-related IPV, and emotional IPV). While item generation included focus groups with gay and bisexual men about defining IPV, content validity was rated inadequate (?), reflecting limited consultation with professionals regarding item relevance. Structural validity was rated adequate (?), supported by EFA (KMO = .903; Bartlett’s test p < .001), which identified five factors with eigenvalues >1, though five items did not load on any factor (loadings < .50). Internal consistency was very good (α = .71–.92), but because structural validity was not fully established, COSMIN guidelines do not allow a formal rating for this domain (?). Cross-cultural validity was rated doubtful (?), based on subgroup EFAs by race without multiple-group CFA or DIF analyses, and hypothesis testing for construct validity was also rated doubtful (?), as no explicit hypotheses were defined and only implicit associations were examined.
The IPV-GBM Screening Tool (Stephenson et al., 2013) is a 6-item measure designed for rapid identification of IPV in gay and bisexual men. Content validity was rated doubtful (?), due to limited reporting of qualitative rigor despite target population input and expert review. Structural validity was very good (?), supported by EFA for the five-domain structure, though the absence of CFA limits certainty. Internal consistency was very good (α > .70), but rated indeterminate (?), given limited evidence for structural validity. Cross-cultural validity was very good (?), based on subgroup factor analyses by race without formal measurement invariance testing. Reliability was not applicable, and hypothesis testing for construct validity was doubtful (?), as no explicit hypotheses were defined.
Transgender-Specific Intimate Partner Violence Scale
The T-IPV Scale (Peitzmeier et al., 2019; 2021) is a brief measure designed to screen for controlling behaviors and psychological abuse directed at transgender and gender non-conforming individuals by their intimate partners. In the development study, content validity was rated doubtful (?), as items were primarily derived from a literature review, and structural validity was inadequate (?), with no EFA or CFA conducted. Internal consistency was very good (?), with a KR-20 of 0.56 for the 4-item scale, while cross-cultural validity was not assessed and reliability was inadequate (?). Hypothesis testing for construct validity was rated adequate (?), as the T-IPV construct was compared to other forms of IPV, though no explicit hypotheses were stated. In the subsequent validation study (Peitzmeier et al., 2021), the scale was expanded to 8 items, and structural validity was rated very good
Lesbian Partner Abuse Scale Revised
The LE-PAS-R (McClennen et al., 2002) is a 25-item measure developed from the original 135-item LE-PAS to assess power imbalances within lesbian couples that may result in partner abuse. This scale measures communication and social skills, substance abuse, intergenerational transmission of violence, faking illness, internalized homophobia, and status differentials. Content validity was rated inadequate (+), as item generation procedures were unclear and limited to quantitative methods, with only advisory council and faculty/clinician review for readability and content. Structural validity was rated indeterminate (?). EFA identified six factors accounting for 77.32% of variance using accepted criteria, but the factor structure was not confirmed with CFA, replicated in an independent sample, or supported by theoretical rationale. Internal consistency was rated doubtful (?), with the authors reporting high internal consistency (r = .94), but without evidence of unidimensionality. Hypothesis testing for construct validity was rated adequate (−), as the scale differentiated abused from non-abused participants (M = 91 vs. 39.38, t(76) = −11.03, p < .01, R² = 0.68), but no pre-specified hypotheses were described. Cross-cultural validity and test–retest reliability were not assessed.
Technology Facilitated Sexual Violence Scale
The TFSV Scale (Martínez-Bacaicoa et al., 2024) assesses perpetration and victimization across multiple dimensions of digital sexual violence, including online gender-based violence, online gender- and sexuality-based violence, digital sexual harassment, online sexual coercion, and nonconsensual pornography. Content validity was rated doubtful (?), as limited methodological details were provided regarding the qualitative interviews and analytical methods used to inform item generation. Structural validity was rated very good (+). CFA indicated adequate model fit for the three-factor victimization and perpetration models, with RMSEA values initially high (.097 and .108) but reduced to acceptable levels (.079 and .082) after correlating residuals between similarly worded items; CFI and SRMR indicated good fit, and the modifications were theoretically justifiable, though some minor misfit may remain. Internal consistency was rated very good (+), with Cronbach’s α consistently above .71 and omega values ranging from .68 to .74. Cross-cultural validity and reliability were not assessed. Hypothesis testing for construct validity was rated doubtful (+), as the factor structure largely supported a priori expectations, with some limitations in reporting.
COSMIN Recommendations
Because all but two measures were evaluated in only a single study, a formal GRADE-based synthesis across multiple studies was not feasible. Therefore, overall recommendations were assigned using the COSMIN classification system (Prinsen et al., 2018) based on the combined evidence available for each measure. For instruments with multiple studies such as the IA Scale (Scheer et al., 2019; Woulfe & Goodman, 2021) and the T-IPV Scale (Peitzmeier et al., 2019, 2021), findings from development and validation were synthesized. Although validation studies demonstrated strong evidence for certain measurement properties (Peitzmeier et al., 2021; Scheer et al., 2019), limitations in content or structural validity from development phases moderated their ratings. As a result, both scales, as well as all other measures, were assigned Category B, reflecting promising tools with potential usefulness that require further psychometric evaluation before full endorsement. Notably, no measures met criteria for Category A, which requires sufficient content validity and internal consistency, and none were rated Category C, as no high-quality evidence showed inadequate measurement properties.
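The classification logic described above can be illustrated with a minimal sketch. It encodes only the rule as stated in this section (Category A requires sufficient content validity and internal consistency; Category C requires high-quality evidence of an inadequate measurement property; everything else is Category B). The function name, argument names, and the "+" / "?" / "-" rating codes are illustrative assumptions, and the full criteria in Prinsen et al. (2018) include further detail that this simplification omits.

```python
def cosmin_category(content_validity, internal_consistency,
                    high_quality_insufficient=False):
    """Simplified, hypothetical encoding of the A/B/C rule described in the text.

    content_validity, internal_consistency: "+" (sufficient),
        "?" (indeterminate), or "-" (insufficient).
    high_quality_insufficient: True if high-quality evidence shows that
        any measurement property is inadequate.
    """
    # Category C: high-quality evidence of an inadequate measurement property.
    if high_quality_insufficient:
        return "C"
    # Category A: sufficient content validity and internal consistency.
    if content_validity == "+" and internal_consistency == "+":
        return "A"
    # Category B: promising, but further evaluation is required.
    return "B"


# Example: sufficient internal consistency but indeterminate content validity.
print(cosmin_category("?", "+"))
```

Under this simplified rule, a measure with indeterminate content validity falls into Category B regardless of how strong its internal consistency is, which mirrors the pattern reported for the measures in this review.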
Applications of IPV Measures in SGM Research (Tier 2)
The second section of this review identified 72 original studies that used at least one psychometric measure of IPV and reported a reliability statistic (i.e., Cronbach’s α, KR-20). Of these studies, most (n = 57) were based in the United States (see Table 3). Sample sizes ranged from 40 (Pepper & Sand, 2015) to 3,783 (Metheny et al., 2024) across all studies, and samples varied in the SGM identities represented. A total of 22 psychometric scales were identified and used in data collection. The Revised Conflict Tactics Scale (CTS2; Straus et al., 1996) was the most frequently used measure (n = 27 studies), despite not being developed or validated with an SGM population. The next two most common non-SGM IPV measures were the original CTS (n = 5; Straus, 1979) and the Psychological Maltreatment of Women Inventory (PMWI, n = 5; Tolman, 1999). Reported internal consistency for these measures ranged from acceptable to excellent: CTS2 (α = .63–.95), CTS (α = .71–.95), and PMWI (α = .71–.96).
Descriptive Synthesis of Studies Using IPV Psychometric Scales.
Note. IPV = intimate partner violence; SGM = sexual and gender minority; IPV-GBM = Intimate Partner Violence for Gay and Bisexual Men Scale. List of scales: ABI (Shepard & Campbell, 1992); CTS (Straus, 1979); CTS-2 (Straus et al., 1996); CTS2-S (Straus & Douglas, 2004); PMWI (Tolman, 1999); IA Scale (Woulfe & Goodman, 2021); T-IPV Scale (Peitzmeier et al., 2019); HITS (Sherin et al., 1998); IPV-GBM (Stephenson & Finneran, 2013); Danger Assessment Scale (Campbell et al., 2009); SGM-CTS2 (Dyar et al., 2019); VDR, SD-PAV, PDR, & SD-PAP (Foshee, 1996); CTS-CF-R (Straus et al., 1996); CADRI Scale (Wolfe et al., 2001); CARS (Watkins et al., 2018); CPI (Burke et al., 2011); Composite Abuse Scale (Revised)—Short Form (CASR-SF; Ford-Gilboe et al., 2016); MMEA Scale (Murphy & Hoover, 1999); NVAWS (Tjaden & Thoennes, 1998). For full reference list of studies and scales identified, please see Supplemental Materials (S1 and S2).
Of the identified IPV scales listed in Tier 1, the IPV-GBM (n = 11), SGM-CTS2 (n = 7), T-IPV (n = 2), and IA (n = 4) scales were used across a total of 25 studies. Based on the identified studies, a range of internal consistency estimates could be derived for each SGM-specific IPV scale. This included the T-IPV Scale (α = .82–.86; KR-20 = .82), SGM-CTS2 (α = .54–.98; lowest for sexual IPV), IPV-GBM Scale (α = .57–.95), and IA Scale (α = .84–.91). These scales were used in diverse SGM samples, but most research (n = 47 studies) continued to rely solely on instruments not specifically developed or validated for SGM populations.
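For readers less familiar with the two reliability statistics reported throughout Tier 2, the following minimal sketch shows how Cronbach’s α and KR-20 are conventionally computed from item-level responses. It uses only the Python standard library; the function names and the toy response matrix are illustrative assumptions and are not drawn from any study included in this review.

```python
from statistics import pvariance, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    items = list(zip(*rows))  # transpose to one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def kr20(rows):
    """KR-20 for dichotomous (0/1) items, e.g., yes/no IPV screening items."""
    k = len(rows[0])
    n = len(rows)
    pq_sum = 0.0
    for col in zip(*rows):
        p = sum(col) / n        # proportion endorsing the item
        pq_sum += p * (1 - p)   # item variance for a 0/1 item
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - pq_sum / total_var)


# Hypothetical responses: 4 respondents x 3 dichotomous items.
toy = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 0]]
print(round(kr20(toy), 3), round(cronbach_alpha(toy), 3))
```

For dichotomous items the two statistics coincide, which is why KR-20 is typically reported for yes/no screening items and α for Likert-type items. Both depend on item variance, so estimates can be unstable when items are rarely endorsed.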
Discussion
The present COSMIN review is one of the first comprehensive evaluations of psychometric tools for measuring IPV specifically developed or validated within SGM populations. Findings were divided into two tiers: the first evaluated psychometric tools specifically designed for SGM individuals, and the second examined how psychometric tools are applied across empirical IPV research. Seven unique SGM-specific IPV measures were identified in the present study, several of which demonstrated promising psychometric properties (e.g., the IA, T-IPV, IPV-GBM, and SGM-CTS2 scales). All measures were given a Category B recommendation, which calls for continued use, but requires further evaluation (Prinsen et al., 2018). Consistent with concerns about heteronormativity in IPV research (Calton et al., 2016; Dyar et al., 2021; Woulfe & Goodman, 2020) and the need for psychometrically valid tools for SGM populations, findings highlight both encouraging progress and ongoing challenges.
Across nine Tier 1 studies, the majority of scales targeted SGM-specific IPV constructs, including identity abuse (Woulfe & Goodman, 2021), transgender-specific IPV (Peitzmeier et al., 2019, 2021), HIV-related IPV (Stephenson & Finneran, 2013; Stephenson et al., 2013), and technology-facilitated sexual violence (Martínez-Bacaicoa et al., 2024), highlighting recognition of the unique experiences of IPV within these populations. However, methodological rigor and psychometric evidence varied considerably, often with many properties rated as indeterminate due to lack of available data, per COSMIN guidelines (Mokkink et al., 2018; Prinsen et al., 2018; Terwee et al., 2018). Consistent with critiques raised by Kelleher et al. (2025), a notable limitation across scale development studies was inadequate participant involvement in the development process. Content validity was also often insufficiently addressed, with limited evidence that tool development incorporated input from IPV-affected SGM populations. Structural validity and internal consistency were generally the strongest psychometric domains, with seven psychometric studies demonstrating adequate to very good ratings for both factor structure and reliability. Importantly, cross-cultural validity, test–retest reliability, and hypothesis-driven construct validity were often unreported or indeterminate, reflecting a critical gap in scale evaluation (see Table 4 for critical findings).
Summary of Critical Findings.
In Tier 2 studies examining the application of psychometric tools for measuring IPV, the review identified 72 empirical studies using such measures. Most studies relied on instruments originally developed for cisgender heterosexual populations, consistent with earlier critiques of the literature (Calton et al., 2016; Dyar et al., 2021). When SGM-specific scales were used, their application remained limited and inconsistent, with variability in subscale reporting and reliability estimation. Among the reported scales (i.e., IPV-GBM, T-IPV, IA, SGM-CTS2 scales), internal consistency ranged from questionable to very good (α = .54–.98), presenting some preliminary evidence for good psychometric properties. Few studies were published prior to the development of SGM-specific scales. However, most continued to rely on non-SGM-specific instruments, highlighting the ongoing underutilization of SGM-appropriate tools.
Our findings raise concerns and substantiate claims about the heterocentric psychometric approaches taken in the literature and their lack of suitability for research with SGM populations. Practically, reliance on measures not designed for SGM populations may result in under-identification or mischaracterization of IPV experiences, limiting the ability of researchers and clinicians to accurately assess risk, inform interventions, and tailor supports to the unique needs of SGM individuals. Taken together, these findings underscore a paradox in the field: despite the availability of several promising SGM-specific IPV instruments, they remain underutilized. This underuse therefore risks haphazard screening for IPV, the loss of SGM-specific IPV data, inaccurate prevalence estimates, and an increased risk of false negatives. Moreover, no instrument meets all criteria for full recommendation (i.e., Category A) under COSMIN standards in this review or others (e.g., Li et al., 2024).
Limitations and Future Directions
Although COSMIN-style psychometric reviews are powerful tools offering rigorous evaluation of measurement instruments, the present study is not without limitations. First, the inclusion criteria were limited to studies and scales published in English, potentially excluding relevant studies and scales in other languages and cultural contexts. Second, the review may be subject to selection bias, as included studies might not fully capture all relevant research (Drucker et al., 2016). To mitigate these risks, the review protocol was preregistered on PROSPERO, specifying eligibility criteria, search strategies, and analytic methods in advance to enhance transparency and an unbiased review process (Drucker et al., 2016). Third, although COSMIN guidelines were adapted to the context of IPV tools, some psychometric properties (e.g., responsiveness, interpretability) were excluded due to limited applicability. Fourth, this review limited studies to those that reported an internal consistency statistic, therefore not capturing all empirical research that utilized psychometric IPV scales. It is important to note that internal consistency is sometimes not reported in IPV research due to low variance on certain items, which is a common psychometric challenge in this field (Ryan, 2013). Finally, most included studies were conducted in the United States, which may limit cross-cultural generalizability of findings.
Future research should prioritize the development of IPV scales in close collaboration with diverse SGM communities, ensuring content validity and cultural relevance. Psychometric testing should extend beyond internal consistency to include longitudinal reliability, measurement invariance across subgroups, and cross-cultural adaptation. Given the heterogeneity of the SGM population, intersectional approaches that account for race, ethnicity, socioeconomic status, and other identities are critical. While measures such as the IPV-GBM scale offer strong content validity within specific subgroups, particularly gay and bisexual men, the SGM-CTS2 may represent the most comprehensive tool to date for capturing a wide range of IPV constructs (Dyar et al., 2021). It also retains comparability with the original CTS2, a widely used measure in cisgender heterosexual IPV research (Alexander et al., 2022). Further validation of the SGM-CTS2 and other promising measures, such as the T-IPV and IA scales, is needed across diverse SGM populations before these tools can be recommended as core measures.
Consensus is needed to establish a set of “gold standard” IPV measures for SGM research, ensuring both broad coverage and comparability across studies and bolstering best practices (Follingstad & Bush, 2014; Yount et al., 2022). This emphasis on a “set” of gold standard measures recognizes that no single measure could adequately capture the range of SGM identities, relationships, experiences of abuse, and sociocultural contexts. Future work should focus on developing a coordinated set of adaptable and well-validated measures that can function as an evidence-based gold-standard approach to measuring IPV in SGM communities. Future research should also prioritize large-scale studies with balanced samples of cisgender heterosexual and SGM participants using both the SGM-CTS2 and CTS2. This approach would enable more accurate between-group comparisons, reliable prevalence estimates, testing of measurement invariance, and identification of predictors of IPV across diverse populations. Collectively, these efforts will support the evidence base for psychometrically robust and culturally relevant IPV measurement in SGM research (see Table 5).
Summary of Implications for Practice, Policy, and Research
Note. IPV = intimate partner violence; SGM = sexual and gender minority; IPV-GBM = Intimate Partner Violence for Gay and Bisexual Men Scale; SGM-CTS2 = Sexual and Gender Minority Adapted Conflict Tactics Scale 2.
Conclusion
The current COSMIN review provides a comprehensive evaluation of psychometric tools developed or validated to measure IPV within SGM populations, identifying key methodological gaps and priorities for improving psychometric rigor in this field. While several promising instruments exist, overall methodological quality and psychometric evidence remain limited. Across studies, evidence for reliability and validity was generally preliminary, with few instruments meeting COSMIN standards for robust psychometric testing. Internal consistency was the strongest measurement property across measures, whereas many measures lacked evidence for reliability and cross-cultural validity and showed variable content and structural validity. Importantly, very few measurement development studies included active engagement from SGM individuals in item development, which calls into question the cultural relevance of existing tools. Across the broader SGM IPV literature, most studies continue to use measures developed and validated for cisgender or heterosexual populations, limiting accuracy and comparability of findings. The field of SGM-specific IPV measurement appears to be in its early stages. Future work that prioritizes the inclusion of diverse SGM communities, intersectional approaches, and large-scale studies will be essential to establish reliable, culturally relevant, and widely applicable IPV measures. Rather than a single gold-standard tool, a coordinated set of adaptable, context-sensitive measures may offer the most feasible and inclusive approach to capturing the diverse experiences of IPV across SGM populations. Collectively, these efforts will support accurate prevalence estimation, meaningful between-group comparisons, and ultimately, more effective identification and intervention strategies for IPV in SGM populations.
Supplemental Material
Supplemental material, sj-docx-1-tva-10.1177_15248380251412520 for Measuring Intimate Partner Violence in Sexual and Gender Minority Populations: A COSMIN Psychometric Review by Aaron Palachi, Vesna Beljo, Sarah L. Martin, Joey Vong, Shania S. Hossain and David M. Day in Trauma, Violence, & Abuse
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
Author Biographies
References
