Abstract
The six–dimensional HEXACO model of personality structure and its associated inventory have increasingly been used in personality research. But in spite of the evidence supporting this structure and demonstrating its advantages over five–dimensional models, some researchers continue to use and promote the latter. Although there has been little overt, organized argument against the adoption of the HEXACO model, we do hear sporadic offerings of reasons for retaining the five–dimensional systems, usually in informal conversations, in manuscript reviews, on social media platforms, and occasionally in published works. In this target article, we list all of the objections to the HEXACO model that we have heard of, and we then explain why each objection fails. © 2020 European Association of Personality Psychology
By the 1980s, many researchers had come to agree that personality characteristics could best be summarized by five dimensions—the Big Five personality factors. This consensus represented a major advance for our discipline, because for the first time personality researchers had an objective basis for choosing which characteristics to measure. The research that led to this consensus—the seminal lexically based studies of personality structure by Fiske, by Tupes and Christal, by Norman, by Digman, and especially by Goldberg (e.g. 1990; see review in Goldberg, 1993)—was and still is a huge breakthrough, because virtually nothing had previously been known about personality structure. And once the Big Five was established, it was the organizing framework for a great deal of important research on basic questions in personality psychology, much of it by Costa and McCrae (e.g. McCrae & Costa, 2003) or using the measures they developed (e.g. Costa & McCrae, 1992).
The early lexical researchers were remarkably successful: they generated a reasonably close approximation to the structure of personality characteristics, even though their data were limited to quite small variable sets (owing to computing power limitations) derived from only one language and culture (owing to the near absence of early work on this topic outside the USA). 1 But beginning about 30 years ago, researchers began to report results from lexical studies of personality structure in various other languages and with considerably larger variable sets. When we examined the results of those studies, we noticed that a set of six factors—not just five—was repeatedly recovered (e.g. Ashton, Lee, & Son, 2000; Ashton et al., 2004; Lee & Ashton, 2008; see also Saucier, 2009; De Raad et al., 2014). 2 In the resulting six–dimensional model—first proposed in the European Journal of Personality—three factors are essentially the same as five–dimensional Extraversion, Conscientiousness, and Openness to Experience, and three others—Honesty–Humility, Emotionality, and Agreeableness (versus Anger)—replace the five–dimensional Neuroticism and Agreeableness factors, with Honesty–Humility frequently (but not always) having the most non–Big Five variance. We believe that this structure would have been discovered much earlier, and by the same researchers who discovered the Big Five, if the datasets based on larger variable sets from several languages had been available to them. 3
We're citing the American English–language research that led to the Big Five, but we should note that some important psycholexical research had started in Europe in the late 1970s and 1980s, in particular by Brokken, by Hofstee, and by Angleitner and Ostendorf.
We use the term ‘factor’ in the commonly used sense of ‘dimension’, not necessarily implying a latent–trait, common–factor model. For most of the issues addressed in the present manuscript, it matters little whether one considers personality dimensions as common factors or as composite variables—to use McCrae's (2015) terminology, whether one views these dimensions as ‘intersections’ or as ‘unions’—or even whether one prefers a ‘network perspective’ (e.g. Cramer et al., 2012) on those dimensions.
In this paper, we focus on the dimensional structure of personality characteristics, as assessed using self–reports and observer reports. We do not consider here the dimensional structure of observable behaviours or experiences as sampled and aggregated across many time points. Obtaining such a structure could be quite useful in particular research studies, but we think it would be difficult to identify any single optimal structure from this domain, because of the problem of identifying a variable set representative of the universe of behaviours and experiences.
During the past 15 years, many researchers have adopted the HEXACO model and the main instrument associated with it—the HEXACO Personality Inventory—Revised (HEXACO–PI–R; e.g. Lee & Ashton, 2004, 2018). During this same interval, a lot of additional evidence has accumulated showing the advantages of the HEXACO framework over five–dimensional models (see reviews in Ashton & Lee, 2007; Ashton, Lee, & De Vries, 2014). And also during this same interval, there hasn't been (as far as we know) any systematic attempt to show that the HEXACO model is actually inaccurate or that it doesn't really have any advantages. But in spite of all that, many researchers continue to use and promote five–dimensional personality models and measures in research that is intended to examine the major personality dimensions. Why haven't they switched to the HEXACO framework? We've heard many reasons—sometimes in informal conversations, sometimes in manuscript reviews, sometimes on social media platforms, and sometimes even in published works. In this article, we state each of the objections that we're aware of, and we refute each of them in turn (Table 1).
Table 1. Summary of objections to adoption of the HEXACO model and responses to those objections
But before we get started, you should be aware that we're the same people who identified the HEXACO model and constructed the HEXACO inventory. We might therefore find it difficult to be objective in examining the various issues below, and even more so given that we receive royalty payments for non–academic use of that inventory (academic use is free of charge). So, having been warned of these conflicts of interest of ours, you can use your judgment in evaluating our arguments.
And also before we get started, we want to make it clear that we're arguing for the assessment of the HEXACO factors whenever a researcher wants to summarize the personality domain broadly. But when instead a researcher wants only to measure a trait of special interest to a particular research question, then we don't recommend assessing the HEXACO factors. Instead, the researcher should assess the trait of interest, using whatever scale seems most appropriate, regardless of whether it comes from a measure of the HEXACO factors or of the Big Five or of some other system, or from a stand–alone scale.
And now let's get started.
(1) ‘Some five–dimensional measures contain an Agreeableness factor that subsumes some Honesty–Humility traits, so there's no need to measure the HEXACO factors.’
Some proponents of the Big Five have downplayed the distinctiveness of the HEXACO model by suggesting that it simply splits the broad Big Five Agreeableness factor into two sub–factors (e.g. DeYoung, 2015; McCrae & Costa, 2008; Van Kampen, 2012; Watson, Stasik, Ellickson–Larew, & Stanton, 2015), one of which is Honesty–Humility.
Now, some measures of Big Five Agreeableness, such as those of the Big Five Inventory (BFI; John, Donahue, & Kentle, 1991) and the Big Five Inventory–2 (BFI–2; Soto & John, 2017), don't actually include any Honesty–Humility content and don't actually correlate much with Honesty–Humility (see, e.g. Ashton & Lee, 2019a; Ashton, Lee, & Visser, 2019). That's obviously a big problem, but what about five–dimensional measures that do capture a lot of Honesty–Humility variance? The most obvious example is the NEO Personality Inventory–Revised (NEO–PI–R; and its successor the NEO–PI–3; Costa & McCrae, 1992; McCrae & Costa, 2010). The Agreeableness dimension of that instrument subsumes some Honesty–Humility traits (e.g. McCrae & Costa, 2008) and does correlate substantially with HEXACO–PI–R Honesty–Humility (e.g. Watson et al., 2015).
You might think that this should solve the problem well enough. But it doesn't. As we've shown, the NEO–PI–R factor scales collectively omit almost as much HEXACO variance as the BFI(–2) scales do, relative to the amounts explained in those instruments by the HEXACO scales: the variance ‘gained’ in Honesty–Humility is nearly matched by variance ‘lost’ in HEXACO Agreeableness and Emotionality. As a result, the total amount of missing HEXACO variance in the five NEO–PI–R scales is nearly as much as it is in the other five–dimensional instruments. It's just that the missing variance of the NEO–PI–R is distributed across three HEXACO dimensions (Honesty–Humility, Emotionality, and Agreeableness) instead of being concentrated in Honesty–Humility as is the case for the BFI or BFI–2 (see Ashton & Lee, 2019a).
Let's explain this some more. In several recent studies, we've examined the variance accounted for in Big Five measures by HEXACO measures, and vice versa. In each study, we first obtained the ratio of the proportion of Big Five variance accounted for to the proportion of HEXACO variance accounted for, and we then multiplied that ratio by five, to find out how many ‘Big Five variables’ worth’ of variance were accounted for by the HEXACO scales. If the HEXACO scales were adequately accounted for by the Big Five, as might happen (for example) if Honesty–Humility were largely redundant with the latter dimensions, then the resulting value would be not much higher than 5.0 (see Ashton & Lee, 2019a, for details). If instead the HEXACO factors contained a full additional variable's worth of variance, then the resulting value would be 6.0, and it might even be more, if the Big Five variables were somewhat redundant with each other. Now, we wouldn't expect the value to be quite as high as 6.0, because there is in fact some modest correlation between HEXACO Agreeableness and Honesty–Humility.
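The arithmetic just described can be sketched in a few lines of Python. This is only an illustration: the function name `variables_worth` and the input proportions are hypothetical, chosen to reproduce the two benchmark values of 5.0 and 6.0, and are not taken from Ashton and Lee (2019a).

```python
def variables_worth(prop_big_five_explained, prop_hexaco_explained, n_big_five=5):
    """Return how many 'Big Five variables' worth' of variance the HEXACO scales capture.

    prop_big_five_explained: proportion of Big Five variance accounted for by the HEXACO scales
    prop_hexaco_explained:   proportion of HEXACO variance accounted for by the Big Five scales
    (Both inputs are assumed to be computed elsewhere; this sketch only does the final step.)
    """
    if not 0 < prop_hexaco_explained <= 1:
        raise ValueError("prop_hexaco_explained must be in (0, 1]")
    if not 0 <= prop_big_five_explained <= 1:
        raise ValueError("prop_big_five_explained must be in [0, 1]")
    ratio = prop_big_five_explained / prop_hexaco_explained
    return n_big_five * ratio


# Hypothetical proportions, not empirical results.
redundant_case = variables_worth(0.50, 0.50)   # sixth factor adds nothing: value near 5
full_extra_case = variables_worth(0.60, 0.50)  # a full extra variable's worth: value near 6
print(round(redundant_case, 2), round(full_extra_case, 2))
```

In this toy example, equal proportions in both directions give a value of 5, whereas a Big Five proportion that is one–fifth larger than the HEXACO proportion gives a value of 6.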
And yet, what we've found (Ashton & Lee, 2019a) is that in some samples, Big Five measures such as the BFI, the NEO Five–Factor Inventory (NEO–FFI; Costa & McCrae, 1992), and the International Personality Item Pool Big Five (Goldberg, 1999a) actually omit the equivalent of a full one–sixth of the total HEXACO variance—or to put it another way, the HEXACO scales actually capture the equivalent of a full six variables’ worth of Big Five variance. 4 In a different sample, the NEO–PI–R did a bit better, as the HEXACO instrument accounted for 5.69 variables’ worth, but in that same sample, the BFI also did a bit better, with the HEXACO instrument accounting for 5.74 variables’ worth. So even a Big Five measure that incorporates quite a bit of Honesty–Humility within its (Big Five) Agreeableness scale still does only modestly better overall than a Big Five measure that includes no Honesty–Humility content at all. 5
Of course, you could have similar results by adding to a set of Big Five scales some very narrow personality variable that is peripheral to the Big Five. Or you could even add a random variable. But our point in doing these analyses is just to show that the six broad HEXACO factors—which, you'll remember, are already established through lexical studies of personality structure as the major dimensions of personality—are not adequately captured by five–dimensional personality measures.
And having said this, we think that the NEO–PI–R facet scales cover an impressively wide range of content and could actually do a reasonably good job of approximating the six broad HEXACO scales. But the way to do this would be to abandon the five NEO–PI–R ‘domain’ scales outright in favour of six new scales computed as combinations of the facets that best approximate the HEXACO scales (see Ashton & Lee, 2019a). For the BFI(–2), by contrast, the lack of Honesty–Humility content means that one could simply add an Honesty–Humility scale to correct this deficiency, with other HEXACO factors being approximated by certain combinations of BFI(–2) facets. But this would still leave some deficiency in accounting for HEXACO Emotionality, and the set of six dimensions wouldn't have the same theoretical interpretability as the HEXACO dimensions would (see section 15).
And as we've already noted (Ashton & Lee, 2019a), the HEXACO variance not captured by the Big Five is about as much as you'd find in comparing a Big Five measure with a ‘Big Four’ measure that omits Neuroticism. So, keeping the Big Five instead of the HEXACO means losing about as much variance as you'd throw away by using this Neuroticism–less Big Four instead of the Big Five.
This missing HEXACO variance wouldn't be of concern if it were simply due to some artefact of measurement. But the HEXACO–PI–R scales contain just as much valid variance as Big Five measures do. For example, in samples of closely acquainted pairs, self/observer agreement is at least as high as for Big Five scales (e.g. De Vries, Realo, & Allik, 2016; Lee & Ashton, 2013). And keep in mind the reason why the missing variance matters: the HEXACO factors represent the major, broad dimensions of personality, as originally identified using the only variable sets that can be claimed to be representative of the personality domain.
This fact has several consequences. Because of the missing variance, the HEXACO scales substantially outpredict the Big Five (whichever Big Five) in relation to many personality characteristics and personality–relevant criterion variables—including materialism, delinquency, unethical decision making, status–driven risk taking, phobic tendency, short–term mating (or sociosexuality), and ‘realistic’ vocational interests (see review in Ashton, Lee, & De Vries, 2014). Keep in mind that these advantages in criterion validity occur even in cross–source data, typically with observer reports of personality used as predictors of self–report outcome variables.
And also because of the missing HEXACO variance, we now know that the Big Five had given an incomplete understanding of some important personality phenomena. For example, sex differences are generally larger in the HEXACO factor space (and sometimes in the Emotionality factor alone) than in the Big Five factor space (e.g. Ashton, Lee, & De Vries, 2014). Likewise, age differences (Ashton & Lee, 2016) as well as similarity and ‘assumed similarity’ between close acquaintances (Lee et al., 2009; Thielmann, Hilbig, & Zettler, 2020) are more readily understood in terms of the HEXACO factors, as both of these phenomena involve a lot of Honesty–Humility and hardly any (HEXACO) Agreeableness. And in addition, the HEXACO scales can accommodate the contrast between the tendency to avoid exploiting others (high Honesty–Humility) and the tendency to tolerate exploitation by others (high Agreeableness), a theoretical interpretation (see Ashton & Lee, 2001, 2007) that has been borne out in studies of people's directly observed behaviour in economic games and behavioural ethics tasks (Hilbig, Thielmann, Klein, & Henninger, 2016; Hilbig & Zettler, 2009; Hilbig & Zettler, 2015; Hilbig, Zettler, Leist, & Heydasch, 2013; Zettler, Hilbig, & Heydasch, 2013; Zhao, Ferguson, & Smillie, 2015; Zhao & Smillie, 2015). So, the large amount of missing HEXACO variance—missing from any five–dimensional personality measure—is a big deal.
(2) ‘Personality is hierarchically structured, so it doesn't matter so much whether we measure at one level of the hierarchy (five factors) or another (six factors).’
Before we start on this one, we have to distinguish between two entirely different kinds of hierarchy in personality structure. One kind of hierarchy consists of several levels of constructs. For example, at the highest level, you can have several broad factors, each of which is defined (at the next level down) by several narrower, facet–level traits, each of which is defined (at the lowest level) by many very narrow behavioural tendencies (often corresponding to single items). 6 We'll consider that kind of hierarchy in section 4 below.
We should note that the item–level constructs have recently been studied in their own right as ‘nuances’ (e.g. Mõttus et al., 2017); we can appreciate the potential usefulness of the nuance level but won't discuss it further here. In addition to the levels above, an even higher level of one or two very broad factors has also been proposed, but that level doesn't exist: see section 7 below.
A second kind of hierarchy consists of several factor solutions. For a given variable set, one can examine a solution with one factor, a solution with two factors, and so on, and map out the relations between the factors at different levels (e.g. Goldberg, 2006). The objection stated above concerns this second kind of hierarchy. According to this view, one can meaningfully examine solutions having different numbers of factors, and sometimes it can be useful to examine a solution with fewer, broader factors, or sometimes it can be useful to examine a solution with more, narrower factors, so there's no need to favour any one solution over the others. 7
Now, some researchers (e.g. Markon et al., 2005) have attached particular importance to the sequence by which various factors emerge as additional factors are extracted from their preferred collections of personality questionnaire scales. However, it's doubtful that those patterns have any real significance, because they depend on the particular set of questionnaire scales being examined, and they don't match the sequence obtained from the more representative personality variable sets of lexical studies of personality structure (cf. Ashton et al., 2015; De Raad et al., 2014; Saucier & Srivastava, 2015).
The reason why this argument fails is that the six–factor solution is simply far more useful than the others. On the one hand, if you decide to measure five or fewer factors, then you throw away a lot of variance, and that matters a lot—recall section 1 above. On the other hand, if you decide to measure seven or more personality factors, then there are two problems.
The first problem is that no candidate set of seven or more factors has yet been identified. We took a good look at the seven–factor solutions obtained in lexical studies of personality structure in various languages (e.g. Ashton & Lee, 2007; Ashton, Lee, Perugini, et al., 2004), and we found several quite different ones, with none of them recurring widely. (Remember, it's these studies whose variable sets can defensibly be claimed to be representative of the personality domain.) Basically, if we confine ourselves to the personality domain—not bringing in other important psychological individual differences such as mental ability, religiousness, psychotic tendencies, and sexual orientation—then it isn't even clear what a candidate seven–factor (or eight–factor, or nine–factor) solution would look like. This means that if you want seven (or eight, or nine) personality factors, you'll have to make some arbitrary decisions, because given what we know about the cross–language replicated results of lexical studies of personality structure, there's no principled way to select a set of seven (or eight, or nine) factors. 8
This isn't to say that you can't identify some meaningful or useful set of many relatively narrow personality dimensions from lexically based variable sets (see, e.g. Saucier & Ostendorf, 1999; Saucier & Iurino, in press). We suspect that lexical variable sets could be used to identify some very useful sets of somewhat (but not trivially) narrow factors, many of them defined by traits that have rather low communalities within six–factor solutions. Most factors of these large sets might even be widely replicable across languages, but we doubt that a given entire large set of k factors would be widely replicable within k–factor solutions of those languages.
The second problem is that even if you do arbitrarily make a set of seven or more personality factors, that set will only give you a little information beyond the original set of six: if you break up or rearrange the six to obtain seven or more, the resulting factors will show some sizeable intercorrelations, so the increase from six dimensions won't gain you much information. 9 We'll show you an example in the next section.
Well, again, we suppose you could keep those intercorrelations low if you decided to measure one or more narrow traits having relatively low loadings across the six dimensions. Those might be some interesting variables to study, but they'd still be narrow traits, not broad factors.
(3) ‘There's no need to replace the Big Five with the HEXACO factors, because you can just use the 10 “aspects” of the Big Five instead.’
DeYoung, Quilty, and Peterson (2007) have proposed a model and measure of personality structure in which each of the Big Five factors is divided into exactly two ‘aspects’. These ‘aspects’ were identified in those authors’ joint factor analyses of the 30 NEO–PI–R facets and the 45 scales of the International Personality Item Pool measure of the ‘Abridged Big Five Circumplex’ structure (Goldberg, 1999a). DeYoung et al. did five such analyses, each involving variables mainly associated with a given Big Five factor, and in each case extracted exactly two factors. The resulting set of 10 factors was the basis for the 10 ‘aspects’ of the Big Five Aspect Scales (BFAS; DeYoung et al., 2007).
We think that the BFAS measures themselves are psychometrically sound. But the fundamental problem with the proposed 10 ‘aspect’ constructs is that this set of 10 factors is not obtained from any variable sets that have any claim to be representative of the personality domain. (A big set of facets developed as markers of the Big Five space doesn't qualify.) The set of 10 aspects is never recovered in any lexical study of personality structure.
But let's put that aside and focus only on the question of how much information is provided by the 10 aspects. You'll recall from section 1 above that a lot of the variance in the six broad HEXACO scales is not captured by Big Five scales: typically, the HEXACO scales account for almost six Big Five scales’ worth of variance.
When we start by applying this kind of analysis with the five broad BFAS, the same thing happens: using the meta–analytic correlation matrix of Ludeke et al. (2019), we find that the six HEXACO variables account for 5.85 Big Five variables’ worth of variance (Lee & Ashton, 2019). This means that if you use the HEXACO scales instead of the Big Five, the trade–off between information and parsimony is pretty favourable: by adding one more variable, you get close to one more variable's worth of information. (In fact, as we mentioned above, the trade–off is about as good as it would be if you added Neuroticism to a ‘Big Four’ that excluded that variable—and no one would suggest getting rid of Neuroticism as being too redundant.)
But what happens next if you decide to use the 10 aspects of the BFAS instead of the six HEXACO factor scales? If we apply the same kind of analysis in the same dataset, we find that the 10 aspects account for about 6.44 HEXACO scales’ worth of variance (see Lee & Ashton, 2019). That's right: even though the number of variables went up by four (i.e. from six up to 10), the gain in information was about half of a variable's worth (i.e. from six up to six and a half). In other words, to obtain far less new information than the HEXACO scales provided over the Big Five, you had to add not one new variable but four new variables. As you can tell, this is a bad trade–off between information and parsimony.
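The trade–off described in the last two paragraphs can be checked with simple arithmetic. The sketch below uses only the two figures reported above (5.85 and 6.44). The helper name `gain_per_added_variable` is hypothetical, and because the two figures are expressed in different units (Big Five variables' worth versus HEXACO scales' worth), the comparison should be read as a rough one.

```python
def gain_per_added_variable(worth, n_baseline, n_new):
    """Average gain in 'variables' worth' of variance per variable added to the baseline set."""
    if n_new <= n_baseline:
        raise ValueError("n_new must exceed n_baseline")
    return (worth - n_baseline) / (n_new - n_baseline)


# Figures reported in the text (Lee & Ashton, 2019).
hexaco_over_big_five = gain_per_added_variable(5.85, n_baseline=5, n_new=6)
aspects_over_hexaco = gain_per_added_variable(6.44, n_baseline=6, n_new=10)

print(f"HEXACO over Big Five: {hexaco_over_big_five:.2f} per added variable")
print(f"10 aspects over HEXACO: {aspects_over_hexaco:.2f} per added variable")
```

On these figures, the single variable added by the HEXACO scales yields about 0.85 of a variable's worth, whereas each of the four variables added by the aspects yields only about 0.11 on average.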
But if you really want information, you'll do much better with facet–level scales than with aspects. When we used rationally selected sets of up to 10 HEXACO facet scales to predict the aspect scales, the squared multiple correlations ranged from .46 (Intellect) to .63 (Assertiveness), with a mean R² of .56 (Lee & Ashton, 2019). In predicting the 25 facet scales, however, the set of all 10 aspect scales showed many rather poor predictions, with a mean R² of .37 and with about half of the 25 facet scales having R² values in the .10s or .20s. 10
By the way, a large part of the problem—but not all of it—is that the BFAS variables don't account for much Honesty–Humility variance, even though the ‘Politeness’ aspect of Agreeableness does have a few items highly relevant to Honesty–Humility. DeYoung (2015, p. 36) has said that ‘the content of the Honesty/Humility factor (the sixth factor) can be encompassed by the Politeness aspect of Agreeableness’, but in spite of that, not so much Honesty–Humility variance is captured by the Politeness scale. Another part of the problem is that the various ‘aspects’ are fairly highly correlated with each other, not just within the same Big Five factor but across those factors: for example, in the meta–analytic data of Ludeke et al., six of the cross–factor aspect intercorrelations had absolute values exceeding .40, with another four exceeding .30.
(4) ‘Narrower, facet–level traits are more important than broader, factor–level traits, so it doesn't matter much whether we measure five broad dimensions or six.’
This argument is one that we can relate to, because we've often argued that for many outcome variables, a rationally selected set of facet–level scales can outpredict factor–level scales (e.g. Ashton, Paunonen, & Lee, 2014). (In fact, one of us actually got started in this line of work by examining that research question.) Depending on the outcome variable and the facet–level traits that seem most relevant conceptually, you might be better off not measuring any broad personality dimensions at all, and just using those particular facets—even if they're not part of the HEXACO–PI–R. And for some research questions, the facets provide a lot of interesting detail not captured by the broad factors, as we've recently seen in our own studies of age differences in personality (Ashton & Lee, 2016), of political orientation and personality (Lee, Ashton, Griep, & Edmonds, 2018), and of religiousness and personality (Ashton & Lee, 2019b). Going even further, items themselves—as representatives of what Mõttus, Kandler, Bleidorn, Riemann, and McCrae (2017) call ‘nuances’—will often provide even more detail and predictive accuracy.
But for many research questions, you want to understand an outcome variable in terms of the major dimensions of personality—how it ‘fits’ within the personality factor space. Not only do the facets lack the unique status of the factors—there are many plausible alternative sets of facets, and there is no single ‘correct’ set—but the factors also have the advantage of parsimony. In any case, personality research has generally been conducted with a focus at the factor level: we found 36 articles on PsycINFO with ‘meta–analysis’ and either ‘Big Five’ or ‘Five–Factor Model’ in the title, but only three with ‘meta–analysis’ and either ‘facets’ or ‘facet–level’ (and one of those three wasn't a facet–level meta–analysis). 11
You can see this in the names of structural models themselves: for example, there's a Five–Factor Model but no 30–Facet Model, and there's a HEXACO model but no 25–Facet Model. By the way, we don't claim the set of HEXACO–PI–R facets to be necessarily the best possible set, nor was it based on any systematic analyses of the lexical datasets; it's merely intended to be reasonably simple and fairly comprehensive. Costa and McCrae have likewise noted that the 30 NEO–PI–R facets are not necessarily an optimal set, but some proponents of the five–factor model have nevertheless dedicated much attention to respondents’ profiles on those 30 variables.
Likewise, when you try to understand personality phenomena, whether you measure five dimensions or six does matter a lot. For example, when research on assumed similarity in personality reports was undertaken using Big Five measures, there was a notable contrast between certain facets of NEO–PI–R Agreeableness, such that Straightforwardness and Modesty showed much more assumed similarity between spouses than Compliance did (McCrae & Costa, 2008). This observation reflects the fact that assumed similarity between social partners is quite strong for Honesty–Humility but close to zero for HEXACO Agreeableness and Emotionality. And when research on age–related differences in personality was undertaken using Big Five measures, again no one had any idea that those age–related differences would be so large for Honesty–Humility traits—or so close to zero for (HEXACO) Agreeableness traits. And once again, sex differences in personality typically are better summarized by the HEXACO factors than by the Big Five, largely because of the Emotionality factor.
So whatever personality phenomenon you're studying (see Baumert et al., 2017, for some suggestions), make sure you're measuring all six factors. To take one example, if you'd like to study genetic and environmental influences on personality variation, use the HEXACO factors, as some researchers have recently been doing (Kandler, Lewis, Butković, Vukasović Hlupić, & Bratko, 2019; Kandler, Richter, & Zapko–Willmes, 2019; Lewis & Bates, 2014). Or if you're interested in ‘network approaches’ to personality, you should start with the HEXACO factors, as has recently been performed by Costantini et al. (2015). Or if you like latent profile analysis, the same point applies (Daljeet, Bremner, Giammarco, Meyer, & Paunonen, 2017).
(5) ‘You don't need to measure the HEXACO factors, because Honesty–Humility is covered by the “Dark Triad” variables, so you can just add the Dark Triad to the Big Five.’
The ‘Dark Triad’ variables (Paulhus & Williams, 2002)—psychopathy, Machiavellianism, and narcissism—have been studied extremely widely in personality research, and a common practice has been to assess the Dark Triad characteristics along with some measure of the Big Five. On some level, this practice makes sense, because the Dark Triad variables capture a lot of variance in Honesty–Humility, most of which is missed by many Big Five instruments.
But there are some disadvantages to this approach (see Lee & Ashton, 2014, for details). One big disadvantage is that the Dark Triad correlates more strongly with any version of (low) Big Five Agreeableness than HEXACO Honesty–Humility correlates with HEXACO Agreeableness, so you don't obtain as much unique information—and don't forget that Big Five scales typically already correlate more strongly with each other than the HEXACO scales correlate with each other. Related to this disadvantage, there are many criterion variables that are better predicted by the HEXACO scales than by the Big Five/Dark Triad combination, even in cross–source data (e.g. Lee et al., 2013). And in addition, if you go for this sort of thing, there are some integrated theoretical interpretations for the HEXACO factors (e.g. Ashton & Lee, 2007; see also section 15) that don't generalize neatly to the Big Five–plus–Dark Triad.
With regard to the Dark Triad taken as a set, we tend to think that there are more general problems. There is no empirical or theoretical basis for choosing these three variables as opposed to some other set of three (or four, or five, etc.) characteristics that involve selfish versus unselfish tendencies. (A recent investigation by Moshagen, Hilbig, and Zettler, 2018, actually identified nine such characteristics.) Moreover, each of the Dark Triad is multifaceted, each having one or more facets that overlap with those of the other(s) and one or more that does not; as a result, the Dark Triad variables themselves are not optimally differentiated from each other.
Some recent studies have shown that a set of three scales measuring the Dark Triad variables can add some incremental validity beyond Honesty–Humility in predicting certain criterion outcomes (e.g. Pilch & Górnik–Durose, 2016; Wertag & Bratko, 2019). This is to be expected, given the common finding that sometimes a given criterion outcome will be related not only to the variance of broad personality factors but also to the unique variance of narrower personality traits. We should note, however, that in some cases these results will be attributable simply to the increase in reliability afforded by additional scales assessing similar constructs. As discussed by Schmidt, Hunter, and Caplan (1981) and by Westfall and Yarkoni (2016), the incremental validity provided by one scale over another can sometimes occur simply because the former helps to make up for the imperfect reliability of the latter. (Think about it this way: if you divide a single scale into its even–item and odd–item halves, each of those halves will likely add some predictive validity beyond the other). 12
This problem would also apply to the studies demonstrating incremental validity of Honesty–Humility over the Big Five (see Westfall & Yarkoni, 2016). But we think the best way of comparing predictions by the HEXACO factors with those by the Big Five factors is to compare multiple correlations achieved by the two models (as done, e.g. in Ashton & Lee, 2008; Lee et al., 2013), not to evaluate incremental effects of Honesty–Humility over the Big Five factors.
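The split–half point made above can be checked with a little arithmetic. The following is only a minimal sketch under stated assumptions: the trait–criterion correlation (.50) and the reliability of each half (.50) are illustrative values chosen for this example, not figures from any cited study, and the two halves are assumed to relate to the criterion only through the shared trait. It uses the standard formula for the squared multiple correlation with two predictors.

```python
import math


def r2_one_predictor(r_y1):
    """Squared correlation of the criterion with a single predictor."""
    return r_y1 ** 2


def r2_two_predictors(r_y1, r_y2, r_12):
    """Standard squared multiple correlation for two correlated predictors."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Illustrative assumptions (hypothetical values, not taken from the article):
TRUE_VALIDITY = 0.50      # correlation of the error-free trait with the criterion
HALF_RELIABILITY = 0.50   # share of each half's variance that is trait variance

# Each half = trait + independent error, so its observed validity is attenuated
# by the square root of its reliability, and the two halves correlate at the
# level of their (equal) reliability.
r_half_criterion = TRUE_VALIDITY * math.sqrt(HALF_RELIABILITY)
r_between_halves = HALF_RELIABILITY

r2_one_half = r2_one_predictor(r_half_criterion)
r2_both_halves = r2_two_predictors(r_half_criterion, r_half_criterion, r_between_halves)
increment = r2_both_halves - r2_one_half

print(f"R^2 with one half:    {r2_one_half:.3f}")
print(f"R^2 with both halves: {r2_both_halves:.3f}")
print(f"Increment:            {increment:.3f}")
```

Under these assumptions the second half adds to the prediction even though it measures nothing new, which is why an increment alone cannot distinguish unique trait variance from a simple gain in reliability.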
(6) ‘Obviously you can't measure Honesty–Humility through a self–report instrument, because dishonest people will claim to be honest.’
We rarely hear this argument from personality researchers, but we should address it anyhow: basically, when people provide self–reports of personality under ‘low–stakes’ conditions—in which they have no incentive to make a misleadingly positive impression—they don't simply all describe themselves as high in Honesty–Humility. Self–report scores on the HEXACO–PI–R Honesty–Humility scale in general population samples show roughly normal distributions, with many persons having scores below the theoretical scale midpoint. There's no ‘piling up’ of persons at the high end of the scale, as would be expected if dishonest people were responding in such a way as to appear honest.
As for the validity of self–reports of Honesty–Humility, they correlate fairly highly with observer reports of Honesty–Humility as provided independently by persons who know the respective target persons very well: when the subjective level of acquaintanceship reaches 10 on a 0–to–10 scale, the self/observer correlations reach .50 (Lee & Ashton, 2017; see also Zettler, Lang, Hülsheger, & Hilbig, 2016). More generally, the self/observer agreement for Honesty–Humility is at least similar to that for Big Five Agreeableness (see, e.g. De Vries, Realo, & Allik, 2016; Lee & Ashton, 2013). The self–reports also correlate substantially with directly observed behaviours in various scenarios that assess cheating, selfishness, and exploitation of others (e.g. Hershfield, Cohen, & Thompson, 2012; Hilbig & Zettler, 2015).
Now, self–reports of Honesty–Humility also happen to correlate fairly highly (and positively) with scores on the ‘Impression Management’ scale (from the Balanced Inventory of Desirable Responding; e.g. Paulhus, 1991)—a scale that is often interpreted as indicating how much a respondent is ‘faking good’. But Impression Management scale scores—especially when administered in ‘low–stakes’ settings—actually don't primarily measure individual differences in faking. Instead, in considerable part, they measure how well behaved a person really is, as indicated by the fact that self–reports on Impression Management scales correlate positively (i) with observer reports on those same scales (and also on Honesty–Humility and Conscientiousness) and (ii) with a directly observed measure of ethical behaviour (i.e. refraining from cheating even when cheating would not be detected)—see De Vries, Zettler, and Hilbig (2014), Zettler, Hilbig, Moshagen, and De Vries (2015), De Vries et al. (2018), and Müller and Moshagen (2019a, 2019b). 13
The ‘overclaiming bias’ (Paulhus, Harms, Bruce, & Lysy, 2003)—a tendency to overstate one's knowledge of various stimuli (e.g. events, concepts, and persons)—has also been suggested to be an indicator of a positive self–presentation bias. Empirical findings show that overclaiming bias is correlated modestly with (high) Openness to Experience but minimally with the other HEXACO factors, including Honesty–Humility (Dunlop et al., 2017; see also Grosz, Lösch, & Back, 2017, for overclaiming in relation to narcissism).
Mind you, we still don't recommend Impression Management scale items as measures of Honesty–Humility, because we expect that they are still influenced to at least some extent by differences between people in the tendency to describe themselves in more desirable or less desirable terms. The results of one recent study (Müller & Moshagen, 2019b) support this view, as an external index of self–presentation was more strongly related to the Impression Management scale than to Honesty–Humility self–reports, whereas cheating behaviour was more strongly related to the latter than to the former. But the main point is that the correlations of Honesty–Humility (and also of Conscientiousness) with the Impression Management scale (or any other variable purported to assess ‘faking good’) don't undermine the construct validity of the former.
And this reminds us: compared with the HEXACO–PI–R, Big Five measures actually have more self–report ‘method’ or ‘source’ variance, mainly owing to individual differences in the social (un)desirability of responding—not necessarily the same thing as deliberate faking, but still an irrelevant source of variance. When we examined self–reports and observer reports from the NEO–FFI and (60–item) HEXACO–PI–R (Lee & Ashton, 2013), we found that four of the NEO scales (all except Openness) loaded about .40 on a self–report response bias factor (Neuroticism loading negatively), whereas only three of the HEXACO scales loaded substantially on such a factor (.26 for Agreeableness, .31 for Conscientiousness, and .36 for Extraversion). So, in the case of a self–report outcome variable that loads (say) .50 on a self–report response bias factor, the obtained correlations for NEO–FFI self–report scales (other than Openness) would each be inflated by about .20 (i.e. .50 × .40), whereas the obtained correlations for HEXACO self–report scales would be inflated by about .13 to .18 (for the three scales noted above) or less (for the other three scales). What this means is that when you're measuring both personality and outcomes with self–report, and when your outcome variables are heavily influenced by self–report social desirability variance, your validity estimates will actually be inflated more for the Big Five measures than for the HEXACO–PI–R. 14
See Biderman, McAbee, Hendy, and Chen (2019) and Biderman, McAbee, Chen, and Hendy (2018) for results (based on analyses of self–reports only) that support this conclusion for the BFI(–2) as well as the NEO inventories. And, for some broadly Big–Five–like self–report measures used in assessing pathological personality, this inflation due to self–report method variance is much more severe (see Ashton et al., 2017).
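The inflation arithmetic in the preceding paragraph can be reproduced in a few lines. This is a minimal sketch that assumes, as the worked example in the text does, that the response bias factor is independent of the substantive factors, so that the bias–driven part of an observed correlation is the product of the two loadings on that factor. The loadings are the ones reported above; the function and variable names are made up for this illustration.

```python
def bias_inflation(outcome_loading, scale_loading):
    """Part of an observed correlation attributable to a shared bias factor."""
    return outcome_loading * scale_loading


# Values taken from the text above.
OUTCOME_LOADING = 0.50   # the "(say) .50" loading of the outcome variable
NEO_LOADING = 0.40       # approximate loading of four NEO-FFI scales
HEXACO_LOADINGS = {
    "Agreeableness": 0.26,
    "Conscientiousness": 0.31,
    "Extraversion": 0.36,
}

neo_inflation = bias_inflation(OUTCOME_LOADING, NEO_LOADING)
hexaco_inflation = {
    scale: bias_inflation(OUTCOME_LOADING, loading)
    for scale, loading in HEXACO_LOADINGS.items()
}

print(f"NEO-FFI scales (other than Openness): about {neo_inflation:.2f}")
for scale, value in hexaco_inflation.items():
    print(f"HEXACO {scale}: about {value:.3f}")
```

The three HEXACO products all fall below the NEO–FFI product, which is the comparison the paragraph draws.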
(7) ‘The HEXACO Honesty–Humility factor is really just a higher–order “stability” factor representing the common element of Big Five Agreeableness, Conscientiousness, and Emotional Stability.’
The claim here is that Honesty–Humility correlates with all three of these Big Five factors, which correlate positively with each other, and therefore Honesty–Humility is basically equivalent to a higher–order factor (often called ‘stability’ or ‘alpha’) that is defined by those three Big Five factors. For example, as one personality psychologist (Wiernik, 2017a) has put it, Honesty–Humility is ‘more like the higher order factor above Agree./Consc./Emot. Stab.—“Factor Alpha” or “Stability”—than a 6th factor on same level’. In a similar way, Catano, O'Keefe, Francis, and Owens (2018, p. 90) speculated as to whether Honesty–Humility ‘is a sixth personality factor … or simply a second–order factor based on three of the Big Five dimensions’.
Now, we've already found that there's no need to invoke higher–order personality factors to account for correlations between Big Five scales. Those correlations are better explained instead by ‘blended variable’ models incorporating secondary loadings for some facet–level variables (Ashton, Lee, Goldberg, & De Vries, 2009). 15
Irwing (2013, p. 239) claimed that the results of Ashton et al. (2009) were based on comparisons of ‘a theoretically pre–specified model with one directly fitted to the data’. On the contrary, and as stated explicitly in our 2009 article, the blended–variable models were derived in one sample and then cross–validated in two other samples.
But even as far as correlations between Honesty–Humility and the Big Five are concerned, the above argument fails: in many samples (see, e.g. Lee & Ashton, 2013; Ashton & Lee, 2019a; Ashton et al., 2019; Lee & Ashton, 2019; based on Ludeke et al., 2019), Honesty–Humility correlates close to zero with Neuroticism, and its quite low correlations with Conscientiousness are mainly due to the Fairness facet of Honesty–Humility (by contrast, Greed Avoidance and Modesty are usually uncorrelated with Conscientiousness). As for Honesty–Humility and Big Five Agreeableness, the correlations can be rather high, depending on which measure of the latter is considered. But as we mentioned in section 1, Honesty–Humility is only modestly associated with HEXACO Agreeableness.
The fact that Honesty–Humility is only weakly related to Conscientiousness has apparently been noticed by some proponents of the higher–order personality factors, who have also argued that this result indicates a shortcoming of the HEXACO Honesty–Humility scale. For example, Viswesvaran and Ones (2016, p. 65), in their review of research on the ‘integrity tests’ often used in predicting workplace counterproductivity, argued that ‘The fact that the honesty–humility scale lacks a conscientiousness element makes it a severely deficient measure of integrity’. Fortunately, the HEXACO–PI–R also happens to include a Conscientiousness scale (it's the ‘C’), which in combination with Honesty–Humility produces considerably better prediction of workplace counterproductivity than that yielded by the Big Five: in the meta–analysis by Pletzer, Bentvelzen, Oostrom, and De Vries (2019), the squared multiple correlations in self–report data were .32 for the six HEXACO scales versus .23 for the Big Five.
(8) ‘HEXACO Honesty–Humility is actually a dimension of values, not of personality, so there's no need for a personality inventory to measure it.’
Here is the argument, from Parks–Leduc, Feldman, and Bardi (2015, p. 24):
As the different views on the nature of the relationships between traits and values can cause confusion in the literature, researchers who study traits and values should state their underlying assumptions and provide a consistent theoretical conceptualization connecting their work with other trait–value studies that conceptualize the link in a similar way. As an example, the HEXACO personality inventory (K. Lee & Ashton, 2004) is a six–factor model that includes Honesty–Humility (not included in our study because it is not a Big Five inventory). The Honesty–Humility factor seems to largely tap values; the developers of the scale state that the common adjectives used to define the factor are ‘Sincere, honest, faithful/loyal, modest/unassuming, fair–minded’ (Ashton & Lee, 2007, p. 154). These descriptors overlap considerably with values items; ‘honest’ and ‘loyal’ are both benevolence items on the SVS, whereas the tradition scale includes the item ‘humble’ (Schwartz, 1992, p. 7), which seems quite similar to modest and unassuming. Fair–minded would seem to fit with universalism values, which are concerned with equality and fairness in society (Schwartz, 1992). If one is of the opinion that traits and values are both aspects of personality, then the HEXACO scale may be a legitimate method for measuring personality. However, if a researcher views personality as an aggregate of traits (and not values), then this scale may not be appropriate for measuring personality. We therefore encourage researchers to make explicit their assumptions and definitions.
We'll start by accepting, for the sake of argument, that any personality characteristic that is represented as an item in the Schwartz Values Survey must no longer be considered as a personality characteristic. This rule would eliminate some defining traits of Honesty–Humility, but it would likewise eliminate some defining traits of Big Five Agreeableness (politeness and helpfulness) and of Openness to Experience (creativity, curiosity, and broad–mindedness). So, if you'd like to stick with the Big Five on the grounds that Honesty–Humility is really a dimension of values, well, you'll have to drop those other two factors and call it the Big Three instead.
But of course, we don't really think that a characteristic included in the Schwartz Values Survey is automatically to be excluded from the personality domain. It merely happens that Schwartz considered some common characteristics that are uncontroversially viewed as personality traits to be among the kinds of things that people can consider as guiding principles in their lives. As we've discussed elsewhere, it's of considerable interest that the domain of ‘values’ overlaps heavily with two of the six HEXACO dimensions, namely Openness to Experience and Honesty–Humility (Lee et al., 2009). These are the same two HEXACO dimensions that show the strongest associations with political attitudes (e.g. Lee, Ashton, Ogunfowora, Bourdage, & Shin, 2010; Leone, Desimoni, & Chirumbolo, 2012) and the same two HEXACO dimensions that show the highest levels of similarity and assumed similarity in the self–reports and observer reports of closely acquainted persons (Lee et al., 2009), as well as the highest preferred similarity in a hypothetical ideal romantic partner (Liu, Ludeke, Haubrich, Gondan–Rochon, & Zettler, 2018). 16
See also Visser and Pozzebon (2013) for Honesty–Humility and intrinsic versus extrinsic life aspirations, and Ogunfowora (2014) for Honesty–Humility and job seekers’ preferences for ethical leaders.
And remember: The HEXACO model was absolutely not the result of any expansion of the personality domain. It was based on the same kinds of variable sets, taken from the same universe of characteristics, that the Big Five was derived from. Researchers (including us) are often interested in other broad psychological individual difference dimensions from outside the personality domain—consider religiousness, or psychotic tendencies, or sexual orientation, or mental ability—but the HEXACO factors came from the same personality domain that the Big Five came from.
(9) ‘But in some languages, lexical studies of personality structure don't recover the six–factor HEXACO space.’
It might well be that in many languages the six HEXACO factors won't be found: some research suggests that only two very broad factors will be recovered universally (Saucier et al., 2014). We've speculated that the more differentiated six–factor structure might be found only in languages that have had relatively large populations of speakers and relatively old written traditions (Ashton, Lee, & De Vries, 2014). 17 We don't claim cross–language universality for the HEXACO dimensions.
Then again, it isn't clear whether the less differentiated structures obtained from other languages are due to the personality lexicons of those languages or to influences such as their speakers’ unfamiliarity with personality rating forms. Even for questionnaires designed to measure the Big Five, their factor structures are often not well recovered in less economically developed countries, except when participants are self–selected for interest in completing a personality questionnaire (e.g. Laajaj et al., 2019), a result that would also apply to the HEXACO factors. Some researchers have interpreted such findings as indicating that personality structure itself is less differentiated within less socioecologically complex societies (e.g. Lukaszewski, Gurven, von Rueden, & Schmitt, 2017), but we think that the critical variable is the origin of the rater, not of the target.
What we claim instead is that the six–factor HEXACO space is the largest to show any widespread cross–language replicability (see detailed discussion in Ashton & Lee, 2007). As Saucier and Srivastava (2015, pp. 292–293) have put it (in relation to the emergence of one–factor to six–factor solutions), ‘It is not proposed that studies in every language will reveal this pattern of emergence. We suggest only that the central tendency will be to do so.’ Note also that the exact factor axis locations obtained in a given study may depart from the usual HEXACO locations—a result that reflects the fact that the personality space is not simple structured (e.g. Hofstee, De Raad, & Goldberg, 1992; Saucier, 1992)—even when the HEXACO space is itself recovered (see discussions in, e.g. Ashton & Lee, 2007; Lee & Ashton, 2008). 18
In these analyses of single adjectives, the proportion of total variance accounted for by the six dimensions is typically around 25%. This result reflects in part the fact that ratings on single adjectives contain considerable error variance, as even the most synonymous adjectives typically intercorrelate only around .75; in order to account for even 50% of the total variance, one would need to extract dozens of factors (e.g. about 40 in the English–language dataset of Lee & Ashton, 2008).
But how do we know how well the six–factor structure is replicated? To make sure that it wasn't all in our own minds, we also asked some undergraduate students (see Lee & Ashton, 2008): we gave each student a list of all six HEXACO factor descriptions along with the set of six lists of adjectives representing the six lexical factors of a given language, but without any hints as to which factor was which. We then asked each student to rate on a scale from 0 to 8 the similarity of each factor description to each list of adjectives—that is, each student made a set of 6 × 6 = 36 ratings. (Different students rated different languages, with at least 17 students per language.) What we found was that the students’ mean convergent and discriminant ratings matched our expectations pretty closely (Lee & Ashton, 2008, table 3). This convergent/discriminant pattern is also found in correlations between lexical personality factors and HEXACO–PI(–R) scales (e.g. Ashton et al., 2006; Ashton & Lee, 2010; Ashton, Lee, Marcus, & De Vries, 2007; Szarota, Ashton, & Lee, 2007; Wasti, Lee, Ashton, & Somer, 2008) 19 and between lexical personality factors and cross–language adjective marker scales (Ashton & Lee, 2010). We summarize these various convergent/discriminant results in Table S1.
Convergent correlations involving the HEXACO–PI(–R) Honesty–Humility scale and lexical Honesty–Humility factors tend to be somewhat lower than those for the other factors, probably because HEXACO–PI(–R) Honesty–Humility, unlike the other HEXACO scales, doesn't contain any adjectives embedded within its items. We think that adjective–based Honesty–Humility scales, without the behaviour–in–situation contextualization of HEXACO items, likely involve larger elements of variance due to acquiescence or elevation and to socially (un)desirable responding.
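To make the convergent/discriminant comparison from the student rating task concrete, here is a minimal sketch of how a 6 × 6 table of mean similarity ratings could be summarized. It assumes the rows (factor descriptions) and columns (adjective lists) have been put in the same factor order, so that the diagonal holds the convergent ratings. The toy matrix is a placeholder invented for this example and does not reproduce any values from Lee and Ashton (2008).

```python
def convergent_discriminant_means(ratings):
    """Return (mean of diagonal cells, mean of off-diagonal cells)."""
    n = len(ratings)
    convergent = [ratings[i][i] for i in range(n)]
    discriminant = [
        ratings[i][j] for i in range(n) for j in range(n) if i != j
    ]
    return (
        sum(convergent) / len(convergent),
        sum(discriminant) / len(discriminant),
    )


# Hypothetical placeholder on the 0-to-8 scale: every matching pair rated 7,
# every non-matching pair rated 2. Real data would vary from cell to cell.
N_FACTORS = 6
toy_ratings = [
    [7.0 if row == col else 2.0 for col in range(N_FACTORS)]
    for row in range(N_FACTORS)
]

conv_mean, disc_mean = convergent_discriminant_means(toy_ratings)
print(f"Mean convergent rating:   {conv_mean:.2f} (6 cells)")
print(f"Mean discriminant rating: {disc_mean:.2f} (30 cells)")
```

A well replicated structure would show a clearly higher mean on the 6 diagonal cells than on the 30 off–diagonal cells; the same summary applies to a matrix of correlations between lexical factors and questionnaire scales.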
And in case you'd like to appreciate the content of the six lexical personality factors, Table S2 lists the highest–loading terms from three sources: our own English–language lexical study of personality structure (Ashton, Lee, & Boies, 2015; Lee & Ashton, 2008), Saucier's (2009) summary of our earlier (Ashton, Lee, Perugini, et al., 2004) cross–language comparison, and the De Raad et al. (2014) cross–language simultaneous components analysis. 20, 21
The lexical factor that corresponds to Openness often includes many terms describing intellectual ability versus lack thereof, a result that reflects the inclusion of such terms in many though not all lexical studies of personality structure. We've excluded intellectual ability—though not intellectual orientation—from our conceptualization of HEXACO Openness, on the grounds that intellectual ability per se is conceptually distinct from personality, a view shared by some other personality researchers (e.g. Costa & McCrae, 1992, p. 15).
Saucier (2009) also reported lists of adjectives that defined an alternative but broadly similar set of six factors as obtained in lexical studies of personality structure based on somewhat broader variable sets. Those variable sets included some categories of terms whose status as personality characteristics is a matter of debate.
But what about solutions with more than six factors? Basically, there aren't any that have replicated across more than a couple of languages (see Ashton & Lee, 2007; Ashton, Lee, Perugini, et al., 2004). We've speculated before that perhaps the ‘sensitive’ and ‘fearful’ aspects of Emotionality could divide into two coherent factors, but that doesn't seem to happen with any regularity. We've also wondered whether the ‘intellectual’ and ‘creative’ aspects of Openness could divide, but this also doesn't happen much, and in any case it's confounded with the inclusion of terms describing intellectual ability—not just intellectual orientation—in some lexical studies. As far as cross–language replicability of personality factors is concerned, so far there appears to be a sharp divide between six and seven factors.
(10) ‘It doesn't matter that lexical studies of personality structure show a six–factor structure, because the personality lexicon over–represents terms involving socially oriented tendencies, which results in the HEXACO structure instead of the Big Five.’
Here is the argument as stated by DeYoung (2010, p. 1176):
Unfortunately, some biases will be present even in the pool of personality descriptors drawn from natural languages. Socially salient traits are likely to be over–represented, for example (because language is used primarily for social purposes), leading to the possibility of over–representation of factors related to socially oriented behavior. Thus, the mere fact that a six factor model is more replicable than the Big Five in various languages is not an inherently sufficient reason to prefer it to the Big Five.
One problem with this claim is that there is no independent source of information about what would be an appropriate representation of ‘socially oriented’ adjectives in the personality lexicon, so there is no direct way to evaluate this claim. (Fortunately, though, there is a way of finding out whether a given factor emerges because of some very heavy representation of highly similar terms, and we'll discuss that issue in the next section.)
Another problem with the argument that ‘socially oriented’ terms are over–represented in the personality lexicon is that it cannot explain why the HEXACO structure should emerge instead of various other six–factor structures. For example, if socially oriented terms are over–represented, then why doesn't Extraversion divide into two factors? Why doesn't Big Five Agreeableness divide into (let's say) a trustingness–versus–suspiciousness factor and a politeness–versus–rudeness factor? Why instead do we repeatedly see a six–factor solution with factors for Honesty–Humility and (HEXACO) Agreeableness—dimensions that just happen to correspond to the two forms of cooperative tendency? There is no a priori reason to expect such a division on the grounds that socially oriented terms are over–represented in personality lexicons.
Although we've tried to take seriously the above argument about socially oriented terms being over–represented, we find it hard to do so. For decades, lexical studies of personality structure—with their crucial advantage of providing variable sets representative of the domain of subjectively important personality characteristics—have been cited by researchers as the primary support for the Big Five structure. But now that the findings from those same studies indicate better support for another structural model, some researchers now argue that, well, those lexical studies aren't so unbiased after all, because if they really were unbiased, then they would support the Big Five structure.
(11) ‘The Honesty–Humility factor is just a “bloated specific” factor that emerges as a result of many highly redundant variables.’
Viswesvaran and Ones (2016, p. 65) claimed that ‘the H–H factor as assessed in the HEXACO is a bloated specific factor arising from individual differences in Agreeableness’. They didn't explain this claim thoroughly or test it empirically, but we've seen the bloated specificity point raised several times by anonymous journal reviewers and on social media (Wiernik, 2017b, 2019), so let's discuss it. First of all, recall that a ‘bloated specific’ refers to a narrow factor that is formed only when many nearly redundant variables are included in a factor analysis (Cattell & Tsujioka, 1964).
To find out whether Honesty–Humility is a ‘bloated specific’, let's start by checking out the terms that define the six lexical factors (Table S2), as drawn from our own English–language lexical study (Lee & Ashton, 2008), from Saucier's (2009) summary of our earlier (Ashton, Lee, Perugini, et al., 2004) cross–language comparison, or from the De Raad et al. (2014) cross–language components analysis. As you'll see in these cases, the content of that Honesty–Humility factor is quite diverse, with terms corresponding to the HEXACO–PI–R Honesty–Humility facets of Sincerity, Fairness, Greed Avoidance, and Modesty. There's no apparent tendency for the defining adjectives to be less heterogeneous than those of the other dimensions.
And this impression can be verified objectively. If this factor were a ‘bloated specific’, then it would be defined by terms that not only seem redundant but also are very highly intercorrelated, thereby producing very high loadings for the defining terms of this factor. But as seen in Lee and Ashton (2008), or basically any other six–factor solution, this is not the case: the highest–loading terms of the Honesty–Humility factor have loadings no higher than those of the highest–loading terms on other factors. 22
The lexical Honesty–Humility factor of Ashton, Lee, and Goldberg (2004) was less diverse in content than that of the Lee and Ashton (2008) English lexical study, being defined mainly by terms suggesting cunning and pretentiousness. Even in this case, though, the primary loadings of the defining terms of Honesty–Humility were no higher than those of the defining terms of other factors of the six–factor solution, some of which also showed fairly restricted variety of content.
Now, let's consider the HEXACO–PI–R scales. The average facet scale intercorrelation in the Honesty–Humility factor domain is not very different from those in the other personality factor domains. For example, in the self–report samples of our recent paper on the 100–item HEXACO–PI–R, the values for the six HEXACO factor scales (in the HEXACO order) were .45, .37, .45, .44, .34, and .38 (in an online sample) and .32, .37, .41, .40, .39, and .38 (in a student sample). 23 And correspondingly, the sizes of the primary loadings on the Honesty–Humility factor are no different from the sizes of the primary loadings on the other factors (see Lee & Ashton, 2018). Anyhow, what all this means is that Honesty–Humility is not a ‘bloated specific’.
It's not that the Honesty–Humility facets have lower reliabilities than the facets of the other factors. The average alpha reliabilities of the facets (in the online and student samples, respectively) were .80 and .73 for Honesty–Humility, .73 and .71 for Emotionality, .74 and .73 for Extraversion, .74 and .70 for Agreeableness, .71 and .71 for Conscientiousness, and .67 and .65 for Openness to Experience.
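As a quick check on the facet intercorrelation figures reported above, the sketch below compares the Honesty–Humility value with the mean of the other five factor domains in each sample. The numbers are the ones given in the text, in the stated HEXACO order; the helper name and the choice of a simple difference as the summary are assumptions made for this illustration.

```python
FACTORS = [
    "Honesty-Humility", "Emotionality", "Extraversion",
    "Agreeableness", "Conscientiousness", "Openness to Experience",
]

# Average within-domain facet intercorrelations as reported in the text.
MEAN_FACET_R = {
    "online": [0.45, 0.37, 0.45, 0.44, 0.34, 0.38],
    "student": [0.32, 0.37, 0.41, 0.40, 0.39, 0.38],
}


def h_minus_others(values):
    """Honesty-Humility value minus the mean of the other five domains."""
    h_value, others = values[0], values[1:]
    return h_value - sum(others) / len(others)


gaps = {sample: h_minus_others(vals) for sample, vals in MEAN_FACET_R.items()}

for sample, gap in gaps.items():
    print(f"{sample}: H minus mean of other domains = {gap:+.3f}")
```

The difference is slightly positive in the online sample and slightly negative in the student sample, so the facets of Honesty–Humility are not consistently more intercorrelated than those of the other domains, as a ‘bloated specific’ would require.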
(12) ‘Honesty–Humility is actually a unipolar dimension defined by its low end: it just assesses selfish, antisocial tendencies versus the lack thereof.’
As Soto (2019a) has put it, ‘H is a misnomer, because it's defined entirely by antisocial rather than prosocial content …. Not sure if the factor is bipolar in the same way as the Big Five, or if it's just one half of a somewhat messy agreeableness–antagonism dimension’. Let's consider a few different ways in which a factor can be ‘unipolar’ or ‘bipolar’.
First, Soto (2019b, tweet #8 in the thread) suggested that the lexical Honesty–Humility factor is defined primarily by ‘antisocial’ content and is therefore less ‘bipolar’ than the Big Five are. Now, the lexical Honesty–Humility factor does tend to have more high–loading adjectives at the low pole than at the high pole, but it still does show some high–loading terms at the high pole, with those terms (e.g. honest and sincere) not typically being negations of the low–pole terms (e.g. Table S2).
However, Soto's point mainly relates to the fact that the items used in the HEXACO–PI–R generally involve some contrast between what he calls antisocial tendencies (at the low pole) and lack thereof (at the high pole; see Soto, 2019b, tweet #8 in the thread). 24 Now, we find Soto's conception of ‘antisocial’ behaviour to be very broad, as the low pole of the Honesty–Humility scale includes not only outright fraud (in the Fairness facet) but also flattery, materialism, and sense of entitlement (in the Sincerity, Greed Avoidance, and Modesty facets). But in any case, this content is consistent with the generally understood meaning of the traits being measured, and also with our interpretation of Honesty–Humility as a tendency not to take advantage of others—to cooperate even when one could successfully exploit them. Anyhow, if item content relevant to the willingness versus reluctance to exploit others somehow makes Honesty–Humility ‘unipolar’, it isn't obvious which is the ‘real’ pole: Soto apparently considers non–exploitation to be the normal mode for human interactions (and maybe you do too, depending on the circles you travel in), but we suspect that exploitation is actually the default for life forms on Earth, with cooperation being the evolutionary novelty. 25
By the way, according to this criterion, the Open–Mindedness scale of the BFI–2 would also be unipolar, because its items all involve the presence or absence of some intellectual, artistic, or creative tendency (see table 6 in Soto & John, 2017). But there's no problem: by the argument explained in this section, BFI–2 Open–Mindedness is just as bipolar a scale as HEXACO–PI–R Honesty–Humility is.
Soto and colleagues (see Soto, 2019b) also tried unsuccessfully to construct their own Honesty–Humility scale, one that would satisfy their concept of bipolarity insofar as its positive–pole items (e.g. ‘Is always honest’ and ‘Always tells the truth’) do not mention any tendency that they consider antisocial. In commenting on this attempt, Soto suggested that the HEXACO Honesty–Humility scale is ‘unipolar’ and would be more appropriately named for its opposite pole. We should note that several other scales assessing broadly similar constructs, and with a similar emphasis on item content that Soto (2019b) classifies as ‘antisocial’, are also labelled in terms of the high Honesty–Humility pole: consider, for example, the Straightforwardness and Modesty scales by Costa and McCrae (1992) or the scales measuring the same constructs by Maples, Guan, Carter, and Miller (2014) and Johnson (2014) as well as the Honesty/Propriety scale by Thalmayer and Saucier (2014). For the same reasons that we discuss in this section, these scales are not unipolar, and their names are fine.
Another basis for calling a factor ‘unipolar’ is the distribution of people's self–report or observer report scores on that dimension. The suggestion that Honesty–Humility is unipolar—with the low pole being the real one—gives the impression that this dimension simply contrasts a small minority of highly deviant, antisocial persons with everyone else. Some psychological characteristics really do have unipolar distributions: on some measures of certain dissociative tendencies, such as depersonalization or amnesia, the modal score is the lowest possible score, and most people are very close to that minimum (e.g. in the dataset of Goldberg, 1999b). But that isn't how Honesty–Humility scores are distributed—this dimension doesn't simply distinguish a few evil people from the rest of us. Instead, there's almost as much room above the average as there is below. In a typical sample of students or online respondents, the mean score on the HEXACO–PI–R Honesty–Humility scale is often below 3.5 on a 1–to–5 scale, and that mean is typically closer to the scale midpoint than the Big Five Agreeableness mean is (see, e.g. Ashton et al., 2019; Lee & Ashton, 2013); and for HEXACO–PI–R Agreeableness, the scale mean is often right around the scale midpoint. That's right: in this distributional sense, Big Five Agreeableness is closer to ‘unipolar’ than Honesty–Humility is (and much closer to unipolar than HEXACO Agreeableness is).
Soto has also suggested that Honesty–Humility is ‘unipolar’ in the sense that it primarily predicts antisocial behaviours, rather than prosocial behaviours. Now, Honesty–Humility certainly does predict antisocial behaviours (e.g. Ashton & Lee, 2008; Book, Volk, & Hosker, 2012; Heck, Thielmann, Moshagen, & Hilbig, 2018; Bourdage, Wiltshire, & Lee, 2015; Hilbig & Zettler, 2015; Lee, Ashton, & De Vries, 2005; Lee, Gizzarone, & Ashton, 2003; Marcus, Lee, & Ashton, 2007; Pletzer et al., 2019; van Gelder & De Vries, 2012), and antisocial behaviours are pretty important. But Honesty–Humility also predicts the kinds of prosocial behaviours that most people don't do. For example, Honesty–Humility predicts giving in the dictator game (better than the Big Five do; see meta–analysis by Thielmann, Spadaro, & Balliet, 2020)—and in that game, most people give less than half of the money. 26 Honesty–Humility also predicts ‘social mindfulness’—that is, the tendency to act in a way that allows others to choose what they prefer—even though most people are not entirely socially mindful (Mischkowski, Thielmann, & Glöckner, 2018; van Doesum et al., 2019; van Doesum, van Lange, & van Lange, 2013). And as a more spectacular example, very few people have donated a kidney to a stranger, but once again, those who have done so tend to be high in Honesty–Humility, which is the best HEXACO factor predictor of this kind of prosociality (Maynard, 2019).
Footnote 26. Thielmann et al. (2020) also found that giving in the dictator game correlates with the prosocial trait of guilt proneness, which itself correlates strongly with Honesty–Humility (e.g. Cohen, Panter, & Turan, 2012; Cohen, Wolf, Panter, & Insko, 2011): when people high in Honesty–Humility do something bad, they tend to feel bad about it.
Putting all of the above aside, you might still consider Honesty–Humility to be a ‘misnomer’ if you simply find the low pole of the dimension to be more interesting than the high pole. (Come to think of it, we've also given more attention to the low pole than to the high pole—see our ‘field guide to low–H people’ in Lee & Ashton, 2012.) And consistent with this view, Diebels, Leary, and Chon (2018) recently suggested—on the basis of their detailed review of research on the correlates of Honesty–Humility—that this dimension should be labelled, in reference to its low pole, as ‘Selfishness’. We don't have any strong disagreement with that suggestion, although for the sake of consistency, it should also come with a focus on the low pole of the other cooperation–related dimension—HEXACO Agreeableness—which might be called ‘Anger’. Maybe if we had gone with ‘Selfishness’ and ‘Anger’ right from the start, the SEXACO model of personality structure would have had a bit more appeal.
(13) ‘The validity of the HEXACO Honesty–Humility scale is simply due to its money–related item content, which inflates its correlations with money–related outcome variables.’
Conveniently for us, this concern has recently been neatly addressed by Thielmann and Hilbig (2018). They directly compared the validity of money–related and money–unrelated items of the HEXACO–PI–R Honesty–Humility scale in predicting money allocations in the Dictator Game. They found almost no difference between the two sets of items, which means that the link between Honesty–Humility and Dictator Game giving isn't simply due to shared variance in specifically monetary motivations.
A similar situation can be inferred for the prediction of cheating behaviour by Honesty–Humility. In the meta–analysis by Heck et al. (2018), associations with cheating behaviour—cheating for monetary gain—were about the same for all four facets of Honesty–Humility, even though two of those facets contain money–related items (Greed Avoidance and Fairness) and two of them don't (Sincerity and Modesty).
Of course, some outcome variables will be best predicted by Honesty–Humility facets or items that do emphasize money and material goods. But the point is that the predictive validity of Honesty–Humility is not in general attributable to item content involving money and material goods. It's also worth noting that Honesty–Humility is sometimes the personality dimension most predictive of criterion outcomes from domains that are not at all represented in HEXACO Honesty–Humility items. Perhaps the best example involves sex: the Honesty–Humility scale contains no sex–related items, but it is negatively related to outcomes such as short–term mating orientation or willingness to engage in sexual harassment ‘quid pro quos’ (e.g. Ashton & Lee, 2008; Bourdage, Lee, Ashton, & Perry, 2007) or number of sex partners (Gaughan, 2009; Provenzano, Dane, Farrell, Marini, & Volk, 2015).
We should add that the inclusion within HEXACO–PI–R Honesty–Humility of money–related items—and of a Greed Avoidance facet more generally—is consistent with the content of the lexical Honesty–Humility factor as obtained in various languages. For example, in the cross–language simultaneous components analysis of De Raad et al. (2014; see table S2 of the present article), the low pole of Honesty–Humility included terms translated as greedy, avaricious, avid, rapacious, and covetous (see also appendix A–1 of Lee & Ashton, 2008).
(14) ‘The six–factor structure found in lexical studies of personality structure is not found in factor analyses of questionnaire variables.’
Let's stipulate for a moment that you simply cannot recover the HEXACO structure from any personality questionnaire variable sets other than the HEXACO–PI–R. Well, this just doesn't matter. Any given questionnaire variable set cannot be taken as representative of the personality domain, because it is likely to over–represent or under–represent certain regions of the personality space according to the preferences of the researchers who selected the questionnaire scales or items.
The whole point of lexical studies of personality structure is to analyse a variable set that is not biased by researchers’ opinions about the importance of various personality traits. When researchers identify the set of familiar personality–descriptive adjectives from a language, they obtain a list of the personality traits that a community of language speakers has used for generations to describe each other's personalities. It essentially represents the set of personality traits that people have found to be important.
So, if your questionnaire variable set can't recover the six–factor structure obtained in lexical studies, that's a problem with the representativeness of your variable set, not with that six–factor structure. This point applies to any questionnaire variable set, but it should be especially obvious when the variables were actually developed as markers of a given structural model. So, for example, when researchers factor analyse various facet scales (or items) intended to measure the Big Five or Five–Factor Model dimensions (e.g. DeYoung et al., 2007), they will of course recover a five–dimensional structure, and they will very likely not recover a six–dimensional HEXACO–like structure. That won't tell us anything about the structure of personality characteristics, because the variable set is not representative of the personality domain.
And even if you mix in some HEXACO markers with your Big Five markers, there's still no guarantee that you'll recover the HEXACO factor space. We sometimes find that joint analyses of facets from the HEXACO–PI–R and five–factor measures do recover the HEXACO factors (e.g. Ashton et al., 2019), but this doesn't have to happen. In fact, whenever you include some other variables along with your HEXACO markers, you might not obtain all six HEXACO dimensions in your six–factor solution: if your variable set includes a bunch of highly intercorrelated variables assessing some other construct, well, you might obtain a factor representing that construct. An analogous point also applies when you examine only some segments of the personality domain, such as those related to Big Five Agreeableness (e.g. Crowe, Lynam, & Miller, 2018).
But putting aside all of the above, we can examine our original stipulation. Can any non–HEXACO questionnaire variable sets recover the HEXACO factors? Yes indeed they can.
For example, when the scales of Jackson's PRF and JPI inventories are jointly factor–analysed (see Ashton, Jackson, Helmes, & Paunonen, 1998), the five–factor solution resembles the Big Five, albeit with two factors that are oriented in the positions of HEXACO Agreeableness and Emotionality, not Big Five Agreeableness and Neuroticism. And when a sixth factor is extracted, the additional factor is most strongly defined by the JPI Social Adroitness scale (characterized by a subtly manipulative interpersonal style)—which correlates negatively with Honesty–Humility.
As another example, when we factor–analysed the 30 NEO–PI–R facets along with the 25 PID–5 facets (see tables 4 and 5 of Ashton, Lee, De Vries, Hendrickse, & Born, 2012), we obtained seven factors: six were close variants of the HEXACO dimensions, with only a difference in the rotational positions of Extraversion and Emotionality, and the seventh was defined only by PID–5 scales, most strongly by those assessing psychotic tendencies, which are not generally considered to represent normal personality variation.
The same NEO–PI–R/PID–5 variable set has since been administered in several other studies: De Fruyt et al. (2013), Griffin and Samuel (2014), and Wright and Simms (2014). In each case, when seven factors are extracted and rotated to the solution of the Ashton et al. article above (see Ashton & Lee, in press), there is an obviously close correspondence between the two sets of factors: congruence coefficients are very high for the six factors of the HEXACO space (.88 or higher for Openness and .90 or higher for all other factors) and fairly high for the ‘psychotic’ factor (.84 or higher), which in some cases has considerable loadings for all PID–5 scales and seems to represent in part a dimension of PID–5 response styles. 27
Footnote 27. The PID–5 scales show reasonably high self/observer agreement, but they are also massively saturated with self–report response style variance, chiefly representing some blend of acquiescence or elevation and social (un)desirability (Ashton et al., 2017). The typical facet–level scale of the PID–5 has about 20% of its variance attributable to those response styles, which greatly inflates the alpha reliabilities and intercorrelations of those scales, as well as their correlations with other self–report variables having similar susceptibility to response styles.
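For readers who want to see how the factor correspondences reported above are quantified, here is a minimal sketch of a congruence coefficient. The text does not name the formula, so treating it as Tucker's coefficient of congruence (the usual index for comparing factor loadings across samples) is an assumption. The loadings below are made-up numbers for illustration only, not values from any of the cited studies, and the function name is hypothetical.

```python
import math


def tucker_congruence(x, y):
    """Tucker's coefficient of congruence between two loading vectors.

    Unlike a Pearson correlation, the loadings are not mean-centred:
    the coefficient is the cosine of the angle between the two vectors.
    """
    if len(x) != len(y):
        raise ValueError("loading vectors must cover the same variables")
    cross = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return cross / norm


# Hypothetical loadings of five facet scales on one factor in two samples
# (illustrative values only; not taken from any study cited in the text).
sample_a = [0.70, 0.65, 0.60, 0.10, -0.05]
sample_b = [0.60, 0.70, 0.50, 0.20, 0.05]

phi = tucker_congruence(sample_a, sample_b)
print(round(phi, 3))
```

In practice the coefficient would be computed factor by factor, after one solution has been rotated toward the other, which is the step the text describes as rotating to the solution of the earlier article.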
Another example involves a joint factor analysis of the 30 NEO–PI–R facets, the 15 SNAP scales, and several markers of dissociation, as examined in the participant sample of Markon, Krueger, and Watson (2005, study 2) and of Watson, Clark, and Chmielewski (2008). Ashton and Lee (in press) showed that this dataset also produces a seven–factor solution whose dimensions closely resemble those of the HEXACO structure, but with the alternative rotational position of Extraversion and Emotionality, along with a dissociation factor that is conceptually and empirically similar to psychoticism (e.g. Ashton et al., 2012; Watson, 2001). 28
Footnote 28. In this alternative rotational position, one dimension represents an introverted form of Emotionality (defined by depressiveness and anxiety), and another represents an unemotional form of low Extraversion (defined by detachment and interpersonal coldness). We prefer the HEXACO factor axis locations for their theoretical interpretability, but clinically oriented researchers might prefer these alternative variants insofar as each dimension has one pole that is clearly clinically relevant.
This latter result is of particular interest in relation to DeYoung's (2015, p. 36) claim that ‘Questionnaire rather than lexical studies do not support the six–factor structure’, given that DeYoung cited the results of Markon et al. (2005). But the factor analyses reported by Markon et al. included not only the NEO–PI–R and SNAP facet–level scales but also the broad factor–level scales of the BFI and EPQ, which obviously will influence the obtained factor solutions and in such a way as to favour Big Five–oriented (and Eysenck–oriented) structures. (You can imagine what would have happened if this analysis had included the six HEXACO factor scales, instead of the BFI factor scales.) However, when the analysis is confined to the facet–level scales, the resulting six–factor solution closely resembles the HEXACO factor space; and when several dissociative scales are included, they define their own, seventh factor (just as in the NEO–PI–R plus PID–5 variable set described above).
We emphasize that we're not reporting these results as evidence in favour of the HEXACO model—and we've made this point many times before when we've reported joint factor analyses (e.g. Ashton et al., 2012; Ashton et al., 2019; Ashton, De Vries, & Lee, 2017; Ashton & Lee, 2005). We're reporting them just to show you that there can exist some non–HEXACO questionnaire variable sets that do recover the HEXACO factor space.
(15) ‘The Big Five factor axis locations have a theoretical basis that's lacking for the HEXACO factors.’
You can make a theory of the Big Five, but no matter which version of the Big Five you consider, you'll miss a lot of the variance in the HEXACO factors (recall section 1). 29 If you want to understand all of the major personality dimensions from whatever theoretical perspective, you have to work with the HEXACO framework. Or then again, maybe you could still work with the Big Five factor axis locations, but then you'd need to find a theoretical basis for the residual HEXACO variance not accommodated by the Big Five, and that would get way too complicated. So you might as well get started with the HEXACO factors themselves.
Footnote 29. But we should note: the Five–Factor Theory of Costa and McCrae (e.g. 2008) concerns personality variation more generally, and not the Big Five factor space per se. If you'd like our opinion, we think that the Five–Factor Theory addresses some important issues and is probably more or less right, despite our lack of enthusiasm for its name.
Before we discuss some attempts at theoretical interpretation of the Big Five, we'll first discuss the theoretical basis of the HEXACO factors (e.g. Ashton & Lee, 2007; see also De Vries, Tybur, Pollet, & van Vugt, 2016), and where it came from. Back in the 1990s, before there was any hint of the HEXACO model, one of us began trying to understand the functional or adaptive trade–offs behind the Big Five factors. The resulting suggestion (Ashton, Paunonen, Helmes, & Jackson, 1998) was that kin–altruistic tendencies corresponded to traits that combine Big Five Agreeableness and high Neuroticism (e.g. sensitivity), whereas some cooperative or reciprocal–altruistic tendencies—those governing the tendency to tolerate exploitation by others—corresponded to traits that combine Big Five Agreeableness and low Neuroticism (e.g. patience).
But then we realized that there was something confusing about this: why was there no dimension for the other kind of cooperative or reciprocal–altruistic tendencies, the ones involved in (not) exploiting others? Well, when we started looking at lexical studies of personality structure across various languages (e.g. Ashton et al., 2000; Ashton & Lee, 2001; Ashton, Lee, Perugini, et al., 2004; Hahn, Lee, & Ashton, 1999), everything began to make sense. Not only were there two factors that basically matched our suggested rotational variants of Big Five Agreeableness and Neuroticism, but there was also (that's right) an Honesty–Humility factor, which is readily interpretable as a dimension of treating others fairly versus exploiting them. So now we had factors for both kinds of cooperative or reciprocal–altruistic tendency and also for a kin–altruistic tendency (see Ashton & Lee, 2007, for a fuller explanation of the latter).
And once we understood those three factors, we were better able to notice a conceptual parallel shared by the other three HEXACO factors: Extraversion, Conscientiousness, and Openness each involve a trade–off between more and less ‘engagement’ within a given area of endeavour: social, task–related, and idea–related, respectively (Ashton & Lee, 2007).
DeYoung (2010) has suggested that the Big Five structure is preferable because it has one dimension for negative affect traits (Neuroticism) and one dimension for altruism–related traits (i.e. Big Five Agreeableness). Now, as for ‘negative emotion’ being grouped all in the same Big Five factor, well, some kinds of negative emotion traits are actually just about uncorrelated with one another, when you measure them properly. For example, ‘trait anger’ (as measured by low HEXACO Patience, from the Agreeableness factor) shows only very weak correlations with ‘trait fear’ (as measured by HEXACO Fearfulness, from the Emotionality factor), so if you want to explain both of these traits with a ‘negative emotion’ dimension, then you'll still need at least one other dimension to explain why those traits are nearly uncorrelated with each other. (Actually, a similar point arises even in the case of HEXACO Agreeableness and Honesty–Humility: if you include the facets of both of these dimensions in a single broad factor domain, then that factor domain will contain some pairs of facets that are nearly uncorrelated.) 30
Footnote 30. For the four–item facets of the 100–item HEXACO instrument, the correlation between Fearfulness and Patience (i.e. low Anger) was −.02 in Goldberg's Oregon sample, −.09 in our college student sample, and −.10 in our online sample. The correlations of Patience with Sincerity (from Honesty–Humility) were .05, .10, and .11 in the same samples.
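The arithmetic behind the point about trait anger and trait fear can be sketched as follows. Under a single common factor with uncorrelated residuals, the correlation implied between two traits is the product of their loadings on that factor. The loading of .50 used below is a hypothetical value chosen purely for illustration; only the three observed Fearfulness/Patience correlations come from the text (their sign is flipped here because trait anger is low Patience).

```python
# Observed Fearfulness-Patience correlations reported in the text.
fear_patience = {"Oregon": -0.02, "college": -0.09, "online": -0.10}

# Hypothetical loadings of trait fear and trait anger on a single
# "negative emotion" factor (assumed values, not estimates from data).
LOADING_FEAR = 0.50
LOADING_ANGER = 0.50


def implied_correlation(loading_x, loading_y):
    """Correlation implied by one common factor with uncorrelated residuals."""
    return loading_x * loading_y


implied = implied_correlation(LOADING_FEAR, LOADING_ANGER)

# Trait anger is low Patience, so the fear/anger correlation is the
# sign-flipped Fearfulness-Patience value. The residual is what the
# single factor fails to reproduce.
residuals = {
    sample: round(-r - implied, 2) for sample, r in fear_patience.items()
}
print(implied, residuals)
```

With these assumed loadings, the one-factor model implies a fear/anger correlation well above the observed values, so every residual is negative. That leftover negative association is what at least one further dimension would have to absorb, which is the argument made in the text.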
And as for the Big Five having only one dimension for altruism–related traits, a crucial limitation of the Big Five framework is that it can't fully accommodate, let alone explain, the contrasts between kin–altruistic tendencies and the two forms of cooperative or reciprocal altruistic tendencies. Consider that the defining traits of Honesty–Humility and of Agreeableness correspond conceptually to the tendencies to exploit others and to react against exploitation by others (see review in Ashton & Lee, 2007), respectively. This theoretical distinction is borne out empirically in economic games research: Honesty–Humility predicts giving in the dictator game, whereas (HEXACO) Agreeableness predicts acceptance of unfair offers in the ultimatum game; Big Five Agreeableness is a weaker predictor of each (e.g. Hilbig et al., 2013; Thielmann, Hilbig, & Niedtfeld, 2014; Thielmann, Spadaro, & Balliet, 2020; see also Hilbig et al., 2016). And the kin altruism interpretation of HEXACO Emotionality fits neatly both with the content of its defining facets and with the sex differences observed for those facets—features that can't be accounted for within the Big Five framework (see Ashton & Lee, 2007).
And the division of altruism–related content in the HEXACO framework doesn't mean that you can't have an overall altruism axis: it just runs through the middle of the H+/A+/E+ (versus H−/A−/E−) region. So, for example, if you're a fan of the ‘interpersonal circle’ that corresponds to the Big Five Agreeableness/Extraversion plane (e.g. Trapnell & Wiggins, 1990), you can find it in the overall altruism/Extraversion plane of the HEXACO factor space. The HEXACO–PI–R actually has an Altruism facet, but it isn't assigned to any one factor scale of the HEXACO–PI–R, because it's supposed to divide its loadings across the H, A, and E factors—and that's what it does (Lee & Ashton, 2018). 31 And likewise, the Neuroticism axis also fits well within the HEXACO factor space: it's approximated closely as a combination of Emotionality, low Extraversion, and low Agreeableness (and maybe low Conscientiousness, depending on the measure of Neuroticism).
Footnote 31. And this doesn't mean that you have to lose the Altruism facet from the six broad scores: you can just compute six component scores from the full set of 25 facets (see p. 2 of http://hexaco.org/downloads/ScoringKeys_100.pdf).
(16) ‘But lots of reviewers and editors and the Old Boys and Girls Club are still using the Big Five ….’
Hey, you've got us with this one! We can help you with the science, but not with the politics.
Supporting Information
Supporting Information, per2242-sup-0001 for Objections to the HEXACO Model of Personality Structure—And Why Those Objections Fail by MICHAEL C. ASHTON and KIBEOM LEE, in European Journal of Personality
Table S1 Evidence for the cross-language replicability of the six-factor solutions
Table S2 Highest-loading and/or most commonly loading adjectives on six factors in lexical studies of personality structure.
Additional supporting information may be found online in the Supporting Information section at the end of the article.
