Abstract
Sadism was initially described as the experience of sexual pleasure produced by acts of cruelty and bodily punishment. Sadism was conceptualized as if sadists were fundamentally different from nonsadists. Recent studies have suggested that sadism is distributed as a dimension rather than as a category. The aim of the current study was to assess the psychometric properties of the MTC Sexual Sadism Scale. Our analyses were conducted on a sample of 486 sexual offenders assessed at a correctional institution in Massachusetts. In summary, the results indicate that the MTC Sexual Sadism Scale possesses good psychometric properties for the dimensional assessment of severe sexual sadism with behavioral markers. Moreover, the scale captures a wide range of intensity of sadism among sexual offenders. These results are consistent with prior research and support the current consensus to move toward a dimensional interpretation of sadism. Implications both for clinical assessment and for research on the development of sadism are discussed.
Sexual Sadism
Sexual sadism derives its name from the Marquis de Sade, Donatien Alphonse Francois. The diagnosis bears his name because of his literary works, which are imbued with an eroticism of violence and cruelty. The first attempt to describe this sexual disorder was by the Austrian psychiatrist Richard Freiherr von Krafft-Ebing (1886/1998) in his book Psychopathia Sexualis. This work was intended to serve as a reference manual for pathologists. Krafft-Ebing (1886/1998) initially classified sadism among the perversions under the name of lust murderer. According to Krafft-Ebing (1886/1998), sexual sadism can be defined as the experience of pleasure as a result of cruelty and punishment directed toward humans or animals, or the desire to humiliate, strike, hurt, and even destroy others in order to experience sexual pleasure.
Sexual sadism has been included in the Diagnostic and Statistical Manual of Mental Disorders (DSM; American Psychiatric Association, 1951) since the mid-20th century. The DSM definition is the most widely used framework in North America for assessing sexual sadism (Krueger, 2010; Yates, Hucker, & Kingston, 2008). In the DSM-5 (American Psychiatric Association, 2013), sadism is defined as recurrent and intense sexual arousal from the physical or psychological suffering of another person, as manifested by fantasies, urges, or behaviors.
In Europe, the most common diagnostic criteria are those of the World Health Organization’s (WHO, 1992) International Classification of Diseases–10th Revision. The WHO defines sadomasochism as a preference for sexual activities that involve restraints, corporal punishment, or humiliation: “If the individual prefers to be the recipient of such stimulation this is called masochism; if the provider, sadism” (WHO, 1992, p. 172).
Dimensions of Sadism
Several authors have proposed their own definition of sadism. There is actually little to no consensus on the defining features of sexual sadism, the requisite number of diagnostic criteria, and the relevance of individual criteria (e.g., animal cruelty; for more details, see Proulx, Blais, & Beauregard, 2007). As mentioned by Marshall and Hucker (2006), “each researcher chose an idiosyncratic list of criteria which typically included some features from both DSM and International Classification of Diseases, but also included other features not mentioned in either of these texts” (p. 1). Therefore, studies report few consistencies in the criteria deemed to be essential to the reliable assessment of sexual sadism (Marshall & Kennedy, 2003; Marshall, Kennedy, & Yates, 2002; Marshall, Kennedy, Yates, & Serran, 2002). These inconsistencies greatly affect the validity and reliability of the measurement of sexual sadism (for more details, see Marshall & Kennedy, 2003).
To distinguish the core features of sexual sadism from the arbitrary choices of one or a few authors, a screening of the definitions used in the literature was conducted by the first author and was overseen by the coauthors of this study. To this end, we conducted an exhaustive review of the literature on sadism and extracted how each author defined sadism. The aim was to offer an overview of the literature and to analyse it as a whole rather than addressing several sources one after another. Afterward, the recurring features associated with sadism were regrouped into dimensions. This approach allowed us to keep a distance from the particular choices of any one author and helped overcome some of the current problems associated with the definition of sexual sadism (i.e., inconsistencies in the criteria).
This review of the literature revealed that sexual sadism features can be regrouped in at least six dimensions (see Table 1). Those dimensions are as follows: (a) sadistic sexual fantasies, pleasures, or urges; (b) cruelty, torture, and/or bodily punishment; (c) humiliation; (d) domination, control, and/or bondage; and (e) ritualism and/or offense planning. Finally, these behaviors or fantasies could implicate (f) humans and/or animals.
Dimensions Related to Sexual Sadism in the Literature.
Note. DSM = Diagnostic and Statistical Manual of Mental Disorders; APA = American Psychiatric Association; ICD = International Classification of Diseases.
The first dimension strongly associated with sexual sadism is the presence of sadistic sexual fantasies, pleasures, or urges. Since the publication of Psychopathia Sexualis, few authors omitted the presence of fantasy or sexual urges from their conception of sadism. Sexual desire can be defined as an interest in a person, or an object which leads to a search for a potential sexual intercourse (Bancroft, 2012). It includes motivational states, needs, drives, or impulses to engage in sexual activities (Regan & Atkins, 2006). On the other hand, sexual fantasies can be defined as a mental image or pattern of thought that can create or enhance sexual arousal (Bancroft, 2012). A strong libido and an excessive sexual desire are central to sexual sadism (Krafft-Ebing, 1886/1998; Proulx et al., 2007). Although some people may not wish to act out their sexual fantasies in real life, it appears that there is a strong relation between fantasies and behaviors among offenders (Longpré, Guay, & Knight, 2011).
The second dimension of sexual sadism is the presence of cruelty, torture, and/or bodily punishment. Since the works of Krafft-Ebing (1886/1998), the majority of researchers have considered these behaviors as central to sexual sadism. These behaviors must be of a physical nature (the psychological aspect of cruelty and torture is associated with humiliation). In the literature, behaviors such as beating, biting, burning, whipping, and mutilating are used to assess the presence of cruelty, torture, and/or bodily punishment in sadistic crimes. Finally, these behaviors are not acted out to control the victim, but rather with the aim of creating fear or to make him or her suffer (Dietz et al., 1990).
The third dimension of sexual sadism is humiliation. Since the works of Eulenberg (1911), humiliation has been considered a core feature of sexual sadism. Humiliation can be defined as an action attacking the pride and dignity of others to impose a sense of shame and disgust. The presence of psychological anguish, torment, denigration, and emotional suffering has been used in the research literature to assess the presence of humiliation in sadistic crimes. These behaviors exacerbate the sexual excitement of sadistic offenders and increase the feeling of power over the victim.
The fourth dimension of sexual sadism is the presence of domination, control, and bondage, whether in behaviors or fantasies. Some authors argue that the essence of sadism lies in the absolute power exercised by the offender (Karpman, 1954; Proulx et al., 2007). Dietz et al. (1990) reported the words of J. M. DeBardeleben, a sadistic offender, who mentioned the importance of domination and control in his offense: “The wish to inflict pain on others is not the essence of sadism. One essential impulse is to have complete mastery over another person, to make him a helpless object [ . . . ] to become her god” (p. 175).
Behaviors and fantasies, such as bondage, subjugation, and the creation of a feeling of distress and helplessness, are common features used in the literature to describe this dimension.
The fifth dimension of sexual sadism in the literature is the presence of a modus operandi. According to Nitschke, Mokros, Osterheider, and Marshall (2013), planning and ritualism combine sequences or circumstances that are important or even essential for sadistic offenders to achieve sexual satisfaction. Several sadistic offenders plan and elaborate their crimes in detail to feed their fantasies and then try to reproduce these scenarios during the assault. This dimension tends, however, to be more prevalent among sexual murderers, some of whom exhibit an extreme form of sexual sadism (James & Proulx, 2014).
Finally, many authors over time have considered that sadistic behaviors could be directed at both humans and animals (Brittain, 1970; Buckels, Jones, & Paulhus, 2013; Krafft-Ebing, 1886/1998; Ressler et al., 1988). Animal cruelty, or zoosadism, is part of MacDonald’s (1963) triad and is considered a precursor of serious psychological problems such as psychopathy and sexual sadism. According to Ressler et al. (1988), people who are cruel to animals and enjoy torturing them are at greater risk of doing the same with humans. Although this dimension is rarely present at a crime scene, the occurrence of animal cruelty in the history of the offender could be a good indicator of sadistic functioning (Buckels et al., 2013).
An Absence of Pathognomonic Symptoms
As aforementioned, one of the problems in the study of sexual sadism is the lack of consistency in the criteria used to define it (Marshall & Kennedy, 2003; Marshall, Kennedy, & Yates, 2002; Marshall, Kennedy, Yates, & Serran, 2002; Proulx et al., 2007). This lack of consistency is accompanied by the absence of any pathognomonic symptoms, that is, symptoms that are characteristic of a particular disease and help establish a definite diagnosis (Mosby, 2009). Although sadism has been conceptualized as if sadists were fundamentally different from nonsadists, several studies have found that the criteria purportedly related to sexual sadism are also found among nonsadistic samples (Joyal, Cossette, & Lapierre, 2015; Malamuth & Check, 1983; Ogas & Gaddam, 2011; Sagarin, Cutler, Cutler, Lawler-Sagarin, & Matuszewich, 2009). For example, studies have reported that 13% to 60% of men in the general population have domination fantasies (Crépault & Couture, 1980; Joyal et al., 2015), 39% to 50% have bondage fantasies (Arndt, Fochl, & Good, 1985; Ogas & Gaddam, 2011; Joyal et al., 2015), and 23% to 30% have rape fantasies (Joyal et al., 2015; Malamuth & Check, 1983).
A substantial number of criteria related to sexual sadism are also found among nonsadistic sexual offenders. For example, Marshall and Darke (1982) indicate that about 60% of rapists in their studies reported that humiliation and degradation of the victim was the primary purpose of the sexual assault, even though these offenses were not considered sadistic crimes. In fact, coercion and humiliation are regular features of nonsadistic rapes (Groth & Birnbaum, 1979; Marshall & Hucker, 2006). Furthermore, aggression, which is another criterion related to sexual sadism, is also present in intrafamilial (Williams & Finkelhor, 1990) and extrafamilial (Lang & Langevin, 1991) child sexual abuse. Recently, Fortin, Dupont, and Guay (2014) studied a sample of 40 child pornography offenders without known contact offense. They reported that close to 70% of their sample had pictures depicting sexual sadism.
The absence of pathognomonic symptoms weakens the assumption of a categorical diagnosis as proposed by the DSM (Marshall & Kennedy, 2003) and leads to the conclusion that sadism may be better represented if measured by dimensional instruments (Marshall & Hucker, 2006; Mokros, Schilling, Eher, & Nitschke, 2012; Nitschke et al., 2009). Recent studies using taxometric analyses have revealed that sadism is consistently distributed as a dimension (Knight, Sims-Knight, & Guay, 2013; Longpré, Guay, Knight, & Benbouriche, 2017; Mokros, Schilling, Weiss, Nitschke, & Eher, 2014). According to Knight (2014), there is no empirical evidence to support the hypothesis of a taxonic structure of sadism, whether in self-report data, archivally derived crime-scene data, or nonoffense behavior. The sum of the empirical evidence (i.e., similar taxometric results with different samples) clearly warrants a dimensional interpretation of sadism.
A dimensional conceptualization of sexual sadism offers several advantages over the current conceptualization (Nitschke et al., 2009). One of the most important advantages is that this type of conceptualization overcomes the problems associated with the absence of pathognomonic symptoms (Marshall & Hucker, 2006) and may increase interrater agreement (Mokros et al., 2014; Nitschke et al., 2009). Moreover, this new conceptualization is in accordance with the different degrees of severity that this diagnosis comprises. Therefore, efforts should focus on developing valid and reliable dimensional instruments of sexual sadism.
Current Sadism Scale
Although amply discussed from a theoretical point of view (Marshall & Hucker, 2006; Marshall & Kennedy, 2003; Marshall, Kennedy, & Yates, 2002; Marshall, Kennedy, Yates, & Serran, 2002), the idea of a dimensional measurement of sexual sadism has received little empirical scrutiny (e.g., Knight et al., 2013). Marshall and Hucker (2006) were the first to develop a dimensional scale of sexual sadism. To this end, they asked professionals to evaluate the relevance of 35 diagnostic criteria used in studies of sadism. The 17 criteria judged the most relevant for the evaluation of sadism were fashioned into a hypothetical scale, which they called the Sexual Sadism Scale (SSS).
More recently, Mokros, Nitschke, and colleagues (Mokros et al., 2012; Nitschke et al., 2009) examined the psychometric properties of the SSS and proposed a new scale composed of 11 items (see Table 2), which they called the Severe Sexual Sadism Scale (SESAS). The SESAS has been widely used in research in recent years. The original structure of the scale was cross-validated on a new sample (Mokros et al., 2012). The SESAS was also used to scrutinize the latent structure of sadism (Mokros et al., 2014) and to assess sadism among female sexual offenders (Pflugradt & Allen, 2013). Recent studies indicated that the SESAS does not correlate with physiological measurements of sadism (Longpré, Brouillette-Alarie, & Proulx, 2016), correlates with the DSM diagnosis of sadism (Eher et al., 2016; Longpré, Brouillette-Alarie, et al., 2016), and is not associated with a higher risk of recidivism (Brouillette-Alarie, Proulx, & Hanson, 2017; Eher et al., 2016).
Items in the Severe Sexual Sadism Scale.
Even though the SESAS exhibits good psychometric properties (for more details, see Nitschke et al., 2009), this scale is not without empirical problems (Longpré, Brouillette-Alarie, et al., 2016; Mokros et al., 2014). The first important problem is the presence of negative interitem correlations (i.e., sexual arousal, humiliation, and keeping trophies; Mokros et al., 2014). The second problem is that apart from Nitschke et al.’s (2009) initial study, no study has reported a maximum score of 11 (e.g., Cumbleton, Maillet, & Looman, 2012; Longpré, Brouillette-Alarie, et al., 2016; Pflugradt & Allen, 2013).
Aim of the Study
Given the significant consequences of sexual sadism, it is important to ascertain that tools that assess it meet rigorous empirical standards and reflect the true nature of its latent structure. Therefore, additional sadism scales and empirical studies with appropriate analyses are required to examine the viability of its dimensional assessment. The first objective of the present study is to assess the viability of a checklist scale, the MTC Sadism Scale (MTCSS), with classical test theory (CTT) and two-parameter item response theory (2PL IRT) analyses.
Furthermore, recent studies suggest that sadism scales should be based solely on behavioral indicators instead of trying to infer a synergism between deviant sexual arousal and various sadistic behaviors (Kingston, Seto, Firestone, & Bradford, 2010; Marshall & Kennedy, 2003; Nitschke, Mokros, Osterheider, & Marshall, 2013). As aforementioned, humiliation and sadistic sexual fantasies are considered core dimensions of sadism. However, their measurement can be problematic and their reliability can be questioned (Longpré, Brouillette-Alarie, et al., 2016; Marshall, Kennedy, & Yates, 2002; Mokros et al., 2014). Therefore, as a second objective, the viability of using behavioral markers in the assessment of sexual sadism with IRT analysis will be evaluated.
Method
Participants
The initial sample was composed of 518 adult male sexual offenders who had been assessed at the Massachusetts Treatment Center (MTC) for sexually dangerous persons between 1959 and 1984 and who had been determined to be sexually dangerous and civilly committed. An extensive database had been gathered on these offenders, coding numerous variables from their archival records, which included clinical interviews, diagnostic and psychometric assessments, information about the offenders’ criminal records and police records, court testimony, parole summaries, probation records, institutionalization records, and school and employment reports. For the vast majority of the MTC sample, postcommitment information (including treatment reports, behavioural reports, work reports, and summaries of program participation) was also available. For the purpose of this study, detailed ratings of up to five sexual crimes, including the index crime, were collected.
The database used in this study was provided by Dr. Raymond A. Knight for secondary analyses. The MTC database was used in several past studies, including the development of two typologies: one for rapists (MTC: R; Knight & Prentky, 1990) and one for child molesters (MTC: CM; Knight & Prentky, 1990). Knight and Prentky’s classifications are among the most empirically validated typologies and are widely used in both research and treatment of sexual offenders (e.g., Looman, Gauthier, & Boer, 2001; Proulx, 2001; S. L. Reid, Wilson, & Boer, 2010). The initial MTC database was coded and rated independently by two trained research assistants. Reliability coefficients in the initial database ranged from .80 to .98.
From the initial 518 participants, 486 participants were retained in the current study. A total of 32 participants were withdrawn on the basis of preliminary Rasch analyses (i.e., inadequate infit or outfit mean square) in a previous study (i.e., Longpré et al., 2017). The sample consisted of 219 rapists (i.e., victims older than the age of 16 years), 178 child molesters (victims younger than the age of 16 years), and 89 mixed offenders (victims who were both older and younger than 16 years old). At the time of the assessments, the offenders’ average age was 29 years (SD = 10.5). Most participants were Caucasian (88.2%), and, at the time of their arrest, were usually or steadily employed (67.2%), had not completed their secondary school (61.4%), and had never been married (52.5%).
Creating the MTC Sadism Scale
Using the components identified in the literature to measure sexual sadism (see Table 1), we created a first version of the MTCSS. From those components, we found behavioral and fantasy indicators present in the MTC database that indicated the potential presence of sexual sadism. This scale comprised eight components theoretically related to sadism that collapse into the six dimensions presented above. These components are as follows: (a) control and domination, (b) aggression, (c) humiliation, (d) sexual arousal, (e) cruelty without sexuality, (f) ritualism and/or offense planning, (g) torture, and (h) insertion of foreign objects into bodily orifices.
We selected 27 indicators in the MTC database to assess these components (see Table 3). They were selected on the basis of their theoretical relevance and their presence in the MTC database. The selection of the indicators was based on consensus ratings between the authors of this study. The 27-indicator version of the MTCSS was used in a recent study scrutinizing the latent structure of sadism with taxometric analyses (Longpré et al., 2017).
Distribution of Components and Indicators of the MTCSS (27-Indicators Version).
Coding the MTC Sadism Scale
All indicators had been coded as either absent (0) or present (1). The indicator had to be present in one of the five possible sexual crimes collected to be coded as present. Most MTCSS indicators had direct equivalents in the MTC database (e.g., Victim tied). However, in some instances, we had to use proxy variables from the database to code particular domains. For example, the presence of humiliation was assessed with proxies describing the presence of aggressive verbalization during the offense.
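The coding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary layout, and the use of None for a missing rating are hypothetical and do not reflect the actual structure of the MTC database.

```python
def code_indicator(crime_ratings):
    """Return 1 if the indicator is present in at least one rated crime, else 0.

    crime_ratings: list of up to five ratings (1 = present, 0 = absent,
    None = not rated). The None convention is an assumption for this sketch.
    """
    return 1 if any(rating == 1 for rating in crime_ratings) else 0


def total_score(offender_ratings):
    """Sum the dichotomous indicator codes for one offender.

    offender_ratings: dict mapping a (hypothetical) indicator name to its
    list of per-crime ratings.
    """
    return sum(code_indicator(r) for r in offender_ratings.values())


# Hypothetical offender with three indicators rated across several crimes.
example = {
    "victim_tied": [0, 0, 1],        # present in the third crime
    "kicking": [0, 0],               # never present
    "aggressive_verbalization": [1, None, 0],
}
print(total_score(example))
```

Under this rule an indicator observed in a single offense counts the same as one observed in all five, which matches the present/absent coding described in the text.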
Analyses
In an attempt to improve the psychometric properties of the MTCSS (i.e., its reliability and validity), we applied two analysis strategies: CTT and 2PL IRT analyses. Our sample size is considered sufficient to provide accurate parameter estimates (Jiang, Wang, & Weiss, 2016).
Classical Test Theory
CTT was created to improve the psychometric properties of psychological tests. CTT assumes that there is a true score that would be uncovered if there were no measurement errors (Kline, 2000). However, because there is no way to observe the true score, we measure what is called the observed score (i.e., the sum of true score and measurement errors). The analysis strategy focuses on the interrelation among items in correlation matrices (Kline, 2000). Scale reliability is influenced not only by the interitem correlations but also by several other characteristics, such as the number of items. Cronbach’s alpha, which varies from 0 to 1, is generally used as an internal consistency index that assesses the average relation among the items as an indicator of whether the scale captures a cohesive construct (i.e., Sexual Sadism). A commonly accepted guideline is that a Cronbach’s alpha of .70 is considered acceptable, whereas a Cronbach’s alpha of .90 is considered excellent (Kline, 2000). Because Cronbach’s alpha is strongly related to the number of items included in the assessment scale, it should be used with caution (Goforth, 2015). A first series of analyses was conducted to remove items that decrease internal consistency. Kuder–Richardson Formula–20 (KR-20), a substitute for Cronbach’s alpha, was used because the indicators were dichotomous. All procedures were conducted with SPSS version 21.
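For readers unfamiliar with KR-20, the following minimal Python sketch shows how the coefficient is computed for dichotomous items: KR-20 = [k / (k − 1)] × [1 − Σ p(1 − p) / variance of total scores], where k is the number of items and p is the proportion of respondents endorsing each item. The toy data are invented for illustration, and the sketch uses the population variance of the total scores; statistical packages such as SPSS may use sample variances throughout, which can give slightly different values.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson Formula 20 for a persons-by-items matrix of 0/1 codes."""
    n_persons = len(responses)
    n_items = len(responses[0])

    # Variance of the total scores (population form, an assumption of this sketch).
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)

    # Sum of item variances p * (1 - p) over all items.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_persons
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


# Invented toy data: four persons, three dichotomous indicators.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(toy))  # 0.75
```

Because the formula includes the factor k / (k − 1) and the total-score variance grows as items are added, longer scales tend to yield higher coefficients, which is the reason for the caution noted above.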
Item Response Theory
The second series of analyses conducted were 2PL IRT analyses. IRT, also known as latent trait theory, was developed to circumvent some problems associated with CTT. IRT models generally assume that the examined latent trait, represented by theta (θ), is unidimensional (de Ayala, 2009). Moreover, according to IRT models, a response to an item is influenced both by participant and item characteristics. The graph of the relation between the ability score of a person and the probability that this person will either endorse the item or will correctly answer it is called the item characteristic curve (ICC), which takes the form of an S-shape curve (Hambleton, Swaminathan, & Rogers, 1991; C. A. Reid, Kolakowsky-Hayner, Lewis, & Armstrong, 2007).
In the present study, a 2PL IRT model was used to assess the MTCSS. The 2PL IRT allows for items to vary not only in their locations or difficulty on the latent trait continuum but also in their capacity to differentiate between persons located at different points on the continuum (C. A. Reid et al., 2007). Because dichotomous indicators were used in the present study, we followed the normal ogive model instead of Samejima’s graded model (Forero & Maydeu-Olivares, 2009). The first parameter, known as the difficulty parameter or beta (b), is the location of the inflexion point on the ICC. The b parameter usually varies from −3 to 3, where items located below 0 are considered easy and items above 0 are considered difficult or serious (de Ayala, 2009). The second parameter, the discrimination parameter or alpha (α), is the degree to which the item has the power to discriminate between individuals who have or do not have the corresponding b level of a particular latent trait (C. A. Reid et al., 2007). The discrimination parameter is measured using the angle of the slope at the point of inflexion of the ICC. A discrimination parameter considered good usually ranges from 0.8 to 2.5 (de Ayala, 2009).
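To make the roles of the two parameters concrete, the sketch below evaluates an ICC under the two-parameter normal ogive form, P(θ) = Φ(α(θ − b)), where Φ is the standard normal cumulative distribution function. The function names are hypothetical. The example values (α = 0.59, b = 4.72) are those reported in the Results for vaginal insertion of object; whether Mplus reports its estimates in exactly this parameterization is an assumption of the sketch.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def icc_normal_ogive(theta, alpha, b):
    """Probability of endorsing an item at trait level theta.

    alpha: discrimination (slope at the inflexion point)
    b:     difficulty (location of the inflexion point)
    """
    return normal_cdf(alpha * (theta - b))


alpha, b = 0.59, 4.72  # values reported in the Results for one indicator

# At theta == b the curve passes through its inflexion point, P = .50.
print(icc_normal_ogive(b, alpha, b))

# At an average trait level (theta = 0) endorsement is very unlikely.
print(icc_normal_ogive(0.0, alpha, b))
```

A larger α makes the curve steeper around b, so the item separates persons just below b from persons just above b more sharply; a larger b shifts the whole curve toward the severe end of the continuum.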
The 2PL IRT allowed us to determine the psychometric properties of the scale and items and to assess the ability of the scale to discriminate between different levels of sadism. Moreover, items considered either too easy or without good discriminating power were eliminated. 2PL IRT analyses were conducted with Mplus version 6.12 (Muthén & Muthén, 1998-2010).
Results
Classical Test Theory
CTT analyses revealed that the first version of the MTCSS had good internal consistency. The mean KR-20 score for the 27 indicators was .72. The analysis indicated that this index could be increased slightly if three items were deleted: (a) impulsivity in the offenses raised KR-20 to .74, (b) humiliation
Unidimensionality
Essential to examining the structure of a scale using IRT is the establishment of its most important assumption, unidimensionality (Fan, 1998). The violation of model assumptions may lead to erroneous or unstable IRT model parameter estimates. There are several methods to assess the unidimensionality of a model, and exploratory factor analysis (EFA) with principal axis factoring and oblimin rotation is one of them (Bertrand & Blais, 2004; Hattie, 1985). The unidimensionality assumption postulates that item covariations arise predominantly from a single underlying dimension, and, therefore, the first eigenvalue should be considerably larger than the remaining eigenvalues (de Ayala, 2009).
To determine whether the MTCSS met the unidimensionality requirement, we first conducted an EFA on the remaining 26 indicators. An EFA with principal axis factoring and oblimin rotation yielded a seven-factor solution. Together, the seven factors accounted for 51.4% of the variance. Cattell’s scree plot revealed that the first factor explained 30.4% of the total variance, which exceeds the recommended limit of 20% for considering the model as unidimensional (Reckase, 1979). As mentioned by Hattie (1985), “the larger the amount of variance explained by the first component, the closer the set of items is to being unidimensional” (p. 146). Moreover, the first factor produced an eigenvalue of 4.07 and the second factor, an eigenvalue of 2.05, for a ratio of 1.99. These results indicate that the model can be considered as “sufficiently” unidimensional to perform IRT (Engelhard, 2013).
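The first-to-second eigenvalue ratio reported above is simple arithmetic, shown here as a short Python sketch. The function name is hypothetical, and only the two eigenvalues given in the text are used; the remaining eigenvalues are not reported and are not needed for the ratio.

```python
def first_to_second_ratio(eigenvalues):
    """Ratio of the largest eigenvalue to the second largest."""
    ordered = sorted(eigenvalues, reverse=True)
    return ordered[0] / ordered[1]


# Eigenvalues of the first two factors as reported in the text.
ratio = first_to_second_ratio([4.07, 2.05])
print(round(ratio, 2))  # 1.99
```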
A confirmatory factor analysis (CFA) was also conducted in Mplus using weighted root mean squares residual (WRMR) estimation. The fit indices suggested that a one-factor model provided an acceptable fit to the data, χ2(299) = 529.55, p < .001; root mean square error of approximation (RMSEA) = .04; comparative fit index (CFI) = .92; Tucker–Lewis index (TLI) = .92. Although there is no single evaluation rule that has achieved consensus in this determination, the generally accepted interpretation of the fit measures is that for the RMSEA, a value of .04 indicates a good model fit. As for the CFI and the TLI, values close to .95 generally indicate a good fit (Hu & Bentler, 1999). Therefore, our incremental fit indices (CFI and TLI) fell slightly under the recommended level of .95. All these results (internal consistency, item-total correlation, EFA, and CFA) converge in support of the assumption that the 26-indicator version of the MTCSS is sufficiently unidimensional.
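As a plausibility check, the reported RMSEA can be approximately reproduced from the chi-square statistic with the common point-estimate formula RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))). This is a sketch under assumptions: it takes N = 486 as the sample used in the CFA, and Mplus may compute the index somewhat differently for categorical indicators, so agreement to two decimals is all that should be expected.

```python
import math


def rmsea(chi_square, df, n):
    """Point estimate of the root mean square error of approximation."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Values reported in the text: chi-square(299) = 529.55, N = 486 (assumed).
estimate = rmsea(529.55, 299, 486)
print(round(estimate, 2))  # 0.04
```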
Model Fit
There are no absolute criteria for model-data fit in IRT (Templin, 2007). However, a variety of analyses can be conducted to guide judgment. For small numbers of items, the traditional chi-square test of model fit can be used. Our small and nonsignificant chi-square indicates that the model probably fits the data. Because chi-square can be influenced by sample size, we also considered the Akaike information criterion and the Bayesian information criterion. Both measures also indicate a good fit between the data and the model. Finally, we examined the standardized residuals for univariate and bivariate model fit. A general rule of thumb is that standardized residuals should fall within ±2 (Agresti, 2013, 2015). Standardized residuals beyond ±3 indicate an important difference between the observed and expected frequencies (i.e., an outlier indicator). In our data, most indicators (n = 17) fell within the threshold of ±2 and none reached the problematic threshold of ±3. Therefore, all analyses indicate that our model adequately fits the data.
Item Response Theory
We next conducted a series of 2PL IRT analyses on the 26 remaining indicators of the MTCSS. The results indicated that a 15-indicator version represented the best psychometric solution. Eleven items were deleted because they did not meet the prerequisites for acceptable item parameter estimates. Their discrimination and difficulty parameters were too far from the recommended range presented in the IRT section above (for more details, see de Ayala, 2009). Item parameter estimates are presented in Table 4.
Item Response Theory Item Parameter Estimates (15-Indicator Version).
The final version of the MTCSS showed an acceptable internal consistency (KR-20 = .78). Analyses revealed that the KR-20 could not be increased by the deletion of any item (see Table 5). Moreover, no indicators correlated negatively with the total score. The presence of cuts, bruises and abrasions (r = .61) and the presence of medical problems requiring physician (r = .61) were the indicators that presented the highest item-total correlation. Finally, no indicators correlated negatively with other indicators and interindicator correlations ranged between .11 and .44.
KR-20, Item-Total Correlations, and Frequencies (15-Indicator Version).
Note. KR-20 = Kuder–Richardson Formula–20.
ICCs revealed that most of the items were located in the upper spectrum of sadism (i.e., the points of inflexion were high). The difficulty parameters ranged from 0.05 (cuts, bruises, and abrasions) to 4.72 (vaginal insertion of object). These results indicated that the majority of the indicators included in the MTCSS assessed the severe end of the spectrum of sadism. A closer look at the analyses revealed that elements such as insertion of objects (vaginal insertion of object, b = 4.72; anal insertion of object, b = 3.59), cruelty without sexuality (cruelty to people, b = 3.65; cruelty to animal, b = 3.51), torture (sadistic assaults on victim’s genitals/breasts, b = 2.49), and severe aggression (kicking, b = 3.65) were on the upper end of the sexual violence spectrum and therefore were more “difficult” to attain and less frequent. Medical problems (cuts, bruises, and abrasions, b = 0.05; medical problems requiring physician, b = 0.94) occupied the lower end of the continuum and were “easier” to attain.
The discrimination parameter (i.e., the angle of the slope) revealed that the majority of the indicators fell within the adequacy range of 0.5 to 2.5 proposed by Reeve and Fayers (2005). One indicator, cruelty to people (α = 0.36), fell slightly below this range. The justification for retaining this item in the final version of the MTCSS is addressed in the discussion that follows. For the indicators within the proposed range, victim tied (α = 0.54) and vaginal insertion of object (α = 0.59) were the least discriminating indicators, and the second indicator of torture (expressive aggression: uncontrollable rage and anger leading to mutilation after the sexual assault, α = 2.32) was the most discriminating indicator.
Convergent Validity
The convergent validity of the MTCSS’s 15-indicator version was measured. Because we did not have access to a clinical diagnosis of sadism, we used the SESAS as an external and reliable measure. Of the 11 items that compose the SESAS, 10 were available in the MTC database. We did not have access to information to code the item offender keeps records (other than trophies) or trophies (e.g., hair, underwear, ID). Nonetheless, this item is not considered a core feature of sadism (Marshall & Hucker, 2006) and is more common among sexual murderers than general sexual offenders (Proulx et al., 2007).
The participants’ mean score on the MTCSS’s 15-indicator version was 1.94 (SD = 2.11, range = 0-10; skewness = 0.95; kurtosis = 0.10). Rapists and mixed offenders scored significantly higher than child molesters on the MTCSS, F(485) = 22.09; p < .001. Recently, Longpré, Guay, and Knight (2016) found that the MTCSS’s severe behaviors were more common among rapists and mixed offenders than child molesters, which is consistent with the literature. These results indicate that the MTCSS’s 15-indicator version is effective in discriminating between rapists and child molesters on both the total score and the severity of the behaviors. Finally, a recent study indicated that the MTCSS total score correlated with external measures of juvenile antisocial behaviors (r = .26, p < .001), adult antisocial lifestyle (r = .21, p < .001), and narcissism (r = .16, p < .001), which is also consistent with the literature (Longpré, Guay, & Knight, 2018).
The participants’ mean score on the SESAS was 2.74 (SD = 1.47, range = 0-8; skewness = 0.46; kurtosis = 0.04); 15.84% of participants (n = 77) obtained a score of at least 4, Nitschke et al.’s (2009) threshold for sexual sadism. The prevalence of sadism found in our sample with the SESAS is similar to what is reported in other studies (e.g., Marshall, Kennedy, & Yates, 2002). The Pearson’s r correlation between the MTCSS and the SESAS was positive and significant (r = .66, p < .001), indicating a moderate to strong convergence between the two dimensional measures of sadism.
Discussion
The creation of a dimensional sadism scale is consistent with the current paradigm shift away from DSM-like categorical models toward the assessment of a sexual sadism continuum. Recent empirical investigations have made it increasingly evident that a dimensional measure represents the future of research on sexual sadism (Knight et al., 2013; Longpré et al., 2017; Mokros et al., 2014). Although recent sadism scales, SSS (Marshall & Hucker, 2006) and SESAS (Mokros et al., 2012; Nitschke et al., 2009; Nitschke et al., 2013), are a definite improvement over the problematic DSM criteria, these scales have limitations that must be addressed prior to their use in either research or clinical settings. The aim of the present study was to develop a sexual sadism scale and assess its psychometric properties with CTT and IRT analyses. Moreover, IRT analysis was used to evaluate the viability of using crime-scene behavioral markers for the assessment of sexual sadism.
Analyses revealed that a 15-indicator version of the MTCSS has the best psychometric properties. Although the MTCSS maintains many of the advantages of the SESAS (e.g., good internal consistency, good discriminating power, good convergent validity), it also eliminates some of its problems (e.g., negative interitem correlation). IRT analysis revealed that the use of behavioral markers in the assessment of sexual sadism was appropriate. In general, the data on the MTCSS support the viability of a dimensional assessment of sexual sadism based on behavioral markers.
The Psychometric Performance of the MTCSS
A closer look at the difficulty parameter, or beta (b), revealed that the majority of the indicators included in the scale were considered “difficult,” which indicates that the MTCSS mostly assesses the severe end of the sadism continuum. Although no indicators fell below the threshold of zero, the distribution of our indicators on the spectrum of difficulty was consistent with the literature (e.g., Knight, 2014). Behaviors that were considered most difficult, that is, least frequent and at the upper end of the spectrum, were the presence of cruelty without sexuality, high levels of aggression, and elements of torture. Although sexual aggression is frequently characterized by elements of coercion, sexual violence, and humiliation (Marshall & Kennedy, 2003), even among high SESAS sadistic offenders severe sexual violence has been found to be quite rare (Nitschke et al., 2009). For example, Mokros et al. (2012) reported that only 3.8% of their sample had mutilated the nonsexual parts of their victims, 4.8% had mutilated the sexual parts of their victims, and 9.5% had inserted objects into victims’ orifices. The indicator that was considered the “easiest” or most frequent was the presence of medical problems after the aggression, which is consistent with the literature. For example, Nitschke et al. (2009) reported that the presence of gratuitous violence or wounding was the most frequent behavior in their sample, being present in two thirds of their sample.
Analyses revealed that 14 of the 15 indicators manifested good discriminating power. The discrimination parameters represent an indicator’s ability to differentiate among offenders with varied levels of sexual sadism. Whereas elements such as cruelty without sexuality (cruelty to people, cruelty to animal), tying the victim (victim tied), and insertion of objects (vaginal insertion of object) were the least discriminating indicators, behaviors such as burning the victim (burns) and torturing the victim (expressive aggression: uncontrollable rage and anger leading to mutilation after the sexual assault) yielded the best discrimination. To our knowledge, only two studies (i.e., Knight et al., 2013; Mokros & Stefanska, 2017) have conducted 2PL IRT analyses on sadism scales. Although not all the indicators in the present study were present in prior studies, the reported patterns are similar.
One indicator (cruelty to people) fell below Reeve and Fayers’ (2005) recommended range for discrimination parameters. The indicator cruelty to people was retained in the MTCSS because of its strong relation with sadism in prior research. People who are cruel to others have been found to be inclined to enjoy and engage in sadistic behaviors (Buckels et al., 2013). Moreover, this indicator manifested a good difficulty level (b = 3.65).
Assessing Sexual Sadism With Behavioral Markers
The assessment of sadism has been based on three sources of information: self-reported fantasies and behaviors, phallometry, and offence-related behaviors. Since Krafft-Ebing’s (1886/1998) early work, the vast majority of authors have considered fantasies or sexual urges as an integral part of sadism. Unfortunately, few offenders are willing to admit openly the presence of sadistic fantasies, especially in adversarial forensic evaluations. For example, Marshall and colleagues (Marshall, Kennedy, & Yates, 2002; Marshall, Kennedy, Yates, & Serran, 2002) found no significant differences between sadistic and nonsadistic offenders on self-reported sadistic fantasies. Although self-report gives the impression of providing access to more valid information, such disclosures come at a price, because they are difficult to obtain unless they are gathered in a research context with assurances of total confidentiality, as in Knight et al. (2013). Therefore, self-reported sadistic fantasies are rarely available for typical forensic evaluations, and professionals must infer the level of sadism from indications of sexual arousal to sadistic cues or from phallometry (Marshall & Kennedy, 2003). The inference of sadistic fantasies from behaviors has been found, however, to be a daunting task (e.g., Knight & Prentky, 1990), and studies using phallometry to assess sexual sadism have yielded inconsistent results (Harris, Lalumière, Seto, Rice, & Chaplin, 2012; Lalumière & Quinsey, 1994; Longpré, Guay, et al., 2016; Marshall, Hucker, Nitschke, & Mokros, 2015; Proulx, 2001; Seto, Lalumière, Harris, & Chivers, 2012). All these issues have led Nitschke et al. (2013) to conclude that the assessment of sadistic fantasies is the “Achilles heel of sexual sadism diagnosis” (p. 1441).
Because sadistic fantasies are at the very heart of this sexual disorder, Nitschke et al. (2013) proposed that self-revealed fantasies in the assessment of sadism should be considered only if their reliability can be corroborated (e.g., with a phallometric assessment). In the absence of corroboration, we should rely on evidence from the crime scene to infer the presence of sadistic sexual fantasies. Offense-related behaviors remain the most reliable data source, even when such information is incomplete. Such data are less vulnerable to distortion, and when full disclosure has been granted to participants, there is evidence that sadistic fantasies and behaviors are highly correlated (Longpré et al., 2011), suggesting that the presence of sadistic behaviors is a good marker of underlying sadistic fantasies. Consequently, crime-scene behaviors should be considered a proxy indicator of sadistic fantasies (Marshall & Kennedy, 2003; Nitschke et al., 2013). As pointed out by Marshall and Kennedy (2003):
We suggest that clinicians state sexual offending issues in terms of behaviors rather than attempting to infer an elusive synergism between sexual arousal and various acts or consequences [ . . . ]. Fantasies or arousal to brutality may very well contribute to an estimate of risk to reoffend, although we doubt it would add anything meaningful to already knowing the behavioral data. (p. 17)
Implications of the Dimensional Assessment of Sadism
In light of the results of the CTT and 2PL IRT analyses, we can conclude that using behavioral markers for the dimensional assessment of sexual sadism is a viable assessment option, at least from a reliability standpoint. Such a paradigm shift from categorical to dimensional assessment has several implications.
Considering sadism to be dimensional has important consequences both for assessment and for the strategies that should be used to study the construct. As opposed to a categorical structure, a dimension does not provide the nonarbitrary, clearly defined cutoff point of a taxonic boundary or natural class (J. Ruscio, Haslam, & Ruscio, 2006). Instead, the cutoff point must be determined empirically and must be established to optimize the specific objectives of the assessment (J. Ruscio et al., 2006). By “empirically,” we mean that analyses are required to determine the cutoff necessary to predict a predetermined criterion such as the presence of a particular level of sadistic fantasies or the likelihood of reoffending. Such cutoffs must be understood as subjective distinctions imposed to optimize practical decisions rather than as “naturally” occurring boundaries representing differences in kind, as is suggested in the DSM-5. In guiding scale creation, dimensionality requires a measure that possesses adequate discriminative power along the length of the continuum rather than one that focuses on discriminative power at one specific, purportedly nonarbitrary boundary (Widiger & Clark, 2000). Dimensional scales (SSS, SESAS, MTCSS) offer the advantage of allowing such cross-continuum discrimination.
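As an illustration of what determining a cutoff "empirically" could look like, the sketch below selects the score threshold that maximizes Youden's J (sensitivity + specificity − 1) against a dichotomous criterion. This is only one of several possible optimization criteria and is not a procedure used in the present study; the function name `youden_cutoff`, the scores, and the criterion values are all hypothetical.

```python
def youden_cutoff(scores, criterion):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A case is flagged as positive when its score is >= cutoff.
    criterion holds 1 (criterion present) or 0 (criterion absent).
    """
    positives = sum(criterion)
    negatives = len(criterion) - positives
    best = (None, -1.0)
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, c in zip(scores, criterion) if s >= cutoff and c == 1)
        tn = sum(1 for s, c in zip(scores, criterion) if s < cutoff and c == 0)
        j = tp / positives + tn / negatives - 1
        if j > best[1]:
            best = (cutoff, j)
    return best

# Made-up scale scores and a made-up criterion (e.g., a later outcome).
scores = [0, 1, 1, 2, 2, 3, 4, 5, 6, 8]
criterion = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
cutoff, j = youden_cutoff(scores, criterion)
print(cutoff, round(j, 2))  # 4 0.8
```

A different criterion, or a different weighting of false positives against false negatives, would generally yield a different cutoff, which is the sense in which such thresholds are practical choices rather than natural boundaries.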
The dimensionality of sadism also indicates that extreme-group, categorical strategies for studying the nature and use of the construct are suboptimal. Rather, dimensional research strategies based on latent traits should be employed. Using categorical assessment or extreme-group analyses to measure dimensional constructs has been shown to have a deleterious impact on both measurement error and statistical power (MacCallum, Zhang, Preacher, & Rucker, 2002; Preacher, Rucker, MacCallum, & Nicewander, 2005; A. M. Ruscio & Ruscio, 2002).
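The cost of dichotomizing a dimensional construct can be illustrated with a small simulation. The sketch below uses entirely simulated data (not data from this study): it generates a normally distributed trait and an outcome correlated with it at about .50, then compares the correlation obtained with the continuous trait to the one obtained after a median split. The variable names and the chosen effect size are assumptions made for the example.

```python
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

random.seed(0)
n = 5000
# Simulated latent trait and an outcome with a population correlation of .50.
trait = [random.gauss(0, 1) for _ in range(n)]
outcome = [0.5 * t + (1 - 0.5 ** 2) ** 0.5 * random.gauss(0, 1) for t in trait]

# Median split: recode the continuous trait as a high/low category.
median = sorted(trait)[n // 2]
split = [1 if t >= median else 0 for t in trait]

r_continuous = pearson_r(trait, outcome)
r_split = pearson_r(split, outcome)
print(round(r_continuous, 2), round(r_split, 2))
```

For a normally distributed variable, a median split is expected to shrink the correlation to roughly 80% of its continuous value, which translates into a loss of statistical power of the kind described by MacCallum et al. (2002).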
The Agonistic Continuum
Recently, Knight and colleagues (Knight, 2010, 2014; Knight et al., 2013; Sims-Knight & Guay, 2011) proposed the idea of an agonistic continuum ranging from no coercive fantasies or behavior, to nonsadistic sexual coercion (what is termed paraphilic coercive disorder), to severe sexual sadism. The term “agonistic,” from the Greek agonia, captures the idea of struggle, anguish, and agony present in paraphilic coercive disorder and sexual sadism (Knight et al., 2013). Whereas the DSM committee on paraphilia considered the possibility of creating two distinct entities, recent research has corroborated the hypothesis that both constructs are distributed as a single dimension instead of two separate categories (Knight, 2014; Knight et al., 2013). Although sadistic assaults are regularly marked by the presence of cruelty, torture, and mutilation, such acts of violence represent only a part of the large spectrum of sexual coercion (Knight, 2010, 2014; Knight et al., 2013; Sims-Knight & Guay, 2011).
If sexual sadism is a dimensional construct and is part of an agonistic continuum, instruments should be fashioned to assess the entire spectrum to capture this complex construct adequately. Unfortunately, most scales fail to capture the lower end of the spectrum (Longpré, Guay, et al., 2016), and the MTCSS suffers from the same problem (Knight, 2014). A closer look at the item/person map analyses provides a useful perspective on the MTCSS. Although the majority of items are located in the severe spectrum of sadism, most of our offenders are located in the lower end of the sadism spectrum. Moreover, although the current sadism scales might be adequate for forensic and severe offenders, they probably tend to underestimate the presence of sadism in correctional samples (Longpré, Guay, et al., 2016). It is essential that we develop assessment tools that cover the whole spectrum of the agonistic continuum.
Limitations
This study has its limitations. First, it is predominantly a scale fashioned from crime-scene data, which limits the breadth of information about sadism that can be gathered reliably. For instance, the lack of adequate information on humiliation, which is a core dimension of sadism (Marshall & Kennedy, 2003), constituted an important limitation. Crime-scene data did not yield data sufficient to create a measure of humiliation with acceptable psychometric properties. This problem reflects the more general problem of using crime-scene behaviors, even from records that have detailed descriptions of crimes. Although the indicators in the MTCSS are highly related to sexual sadism, they reflect observable behaviors (easily accessible in crime-scene descriptions), and consequently, they underrepresent fantasies and do not assess sadism in consensual situations. Nonetheless, such behavioral measures are essential, even though assessing the presence of sadistic fantasies and sadism in consensual sexuality is difficult from such data, because of the problems previously discussed in obtaining reliable information from offenders in adversarial contexts. Thus, although the MTCSS is limited in its assessment of sadistic fantasies and sadism in consensual sexual activity, it is less vulnerable to biased reporting, and it focuses on a parsimonious but fairly reliable assessment of sexually sadistic behavior.
Second, the information available in the archival files that were the major data source of the present study limited the variables that could be coded and considered. The data in the archival files were collected as part of clinical and forensic assessment in a time that was less sophisticated than the present and for a purpose that was not aimed directly at answering research questions. The information was not gathered in an organized fashion to answer specific questions about sadism. Despite these limitations, the MTC database provided sufficiently detailed information to allow us to evaluate the performance of a dimensional scale based on behavioral markers with CTT and IRT analyses. Although the MTCSS is not yet sufficient to be used clinically as an alternative to the DSM, it does offer several advantages ranging from good psychometric properties to parsimonious assessment of sadism. Additionally, the MTCSS maintains many of the advantages of the SESAS, the gold standard in the dimensional assessment of sadism, and eliminates some of its problems. With these limits in mind, we believe that more scales should be developed and more empirical studies undertaken to improve the practical assessment of sadism and provide alternative assessments to the unreliable and outdated conceptualization of sadism in the DSM.
Conclusion
In summary, our results indicate that the MTCSS has good psychometric properties. Several authors have pointed out that the DSM nosological classification has gone as far as it can go (Insel, 2013; Schmidt, Kotov, & Joiner, 2004). Moreover, as mentioned by Widiger and Samuel (2005), the complexity of psychological disorders is unlikely to be adequately represented and measured by diagnostic categories that attempt to create nonexistent discrete joints along continuous distributions. This view corresponds with the current movement that is drifting from nosological diagnoses toward dimensional assessments of sexual sadism.
Although our analyses indicated that a dimensional assessment of sexual sadism is a viable option, more research is required to establish a comprehensive and widely applicable metric for sadism. Moreover, measures have to be found that can generalize across multiple samples. Future research should now focus on the development of valid and reliable measures that assess the whole spectrum of sexual coercion.
Footnotes
Acknowledgements
The authors wish to thank Professor Andreas Mokros from the Department of Psychology at the University of Hagen for his methodological advice and for his help. Furthermore, the authors would like to thank Professors Eric Beauregard and Jean Proulx for their help.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
