Abstract
Based on a large individual differences study, Böckler, Tusche, and Singer aimed to develop and psychometrically evaluate measurement procedures that capture individual differences in multiple facets of human prosociality. Böckler et al. claimed that they identified four reliable and method-independent subcomponents of human prosociality: altruistically motivated prosocial behavior, norm-motivated prosocial behavior, strategically motivated prosocial behavior, and self-reported prosocial behavior. We show that this claim is not supported by the data. The abnormalities of the factor solution are visible in reported standardized loadings much larger than unity and negative residual variances at the indicator level. Additionally, the strong dispersion in factor loadings reported in the article hinders factor interpretation. We reanalyze the correlation matrices and propose a model with one overarching prosociality factor and a specific factor for game-theoretical conflicts. This simpler model is a more sustainable representation of prosocial behavior.
Böckler et al. (2016) are to be commended for their effort to develop and psychometrically evaluate measurement procedures that capture individual differences in multiple facets of human prosociality. The approach they take to measure prosociality is both broad in its scope and multivariate in nature. Both attributes are essential for making a lasting contribution to personality psychology. However, the data analyses are flawed so that the subsequent interpretation of the results is misguided. Unfortunately, the results reported by Böckler et al. (2016) do not provide a sound basis for interpretation and discussion because they essentially reflect errors in the application of confirmatory factor analysis.
In this commentary, we first point out overarching desiderata for latent variables in behavioral sciences and then address conceptual and methodological issues for each of the four factors proposed by Böckler et al. (2016). We consider the conceptual and psychometric problems regarding the factors strategically motivated prosocial behavior and norm-motivated prosocial behavior as so severe that these two factors cannot be upheld with the present data. The solution reported is improper as manifested in standardized loadings much larger than unity and negative residual variances. There are several reasons for improper solutions, including sampling fluctuations (especially in small samples), local empirical underidentification of latent variables within larger models, poor indicator reliability, severe outliers, inadequate treatment of missing data, and, more generally, the distributional properties of the data (e.g., Boomsma, 1985; Gerbing & Anderson, 1987). It is likely that empirical underidentification is the cause of the improper solution reported by Böckler et al. (2016). Thus, we eliminate underidentified factors and reanalyze the indicators of altruistically motivated and self-reported (SR) prosocial behavior and propose a multitrait–multimethod (MTMM) model (Eid, Lischetzke, Nußbeck, & Trierweiler, 2003) of prosociality as a viable alternative to the model proposed by Böckler et al. (2016).
Desiderata for Establishing Latent Variables
Cooperation, sharing, and similar prosocial and altruistic behaviors are increasingly studied in the life sciences. Game-theoretical approaches are frequently used to derive measures that arguably reflect fundamental dimensions of human prosociality. Measures such as the dictator game (DG), ultimatum game (UG), prisoner’s dilemma, chicken game, and many more game-like paradigms were popularized in behavioral economics (Bornstein, Budescu, & Zamir, 1997; Güth, Schmittberger, & Schwarze, 1982; Kahneman, Knetsch, & Thaler, 1986; Sally, 1995). Indicators like social value orientation, donations, and social discounting (SDisc) all attempt to capture a person’s preference about how to divide resources between the self and another person. Therefore, game-theoretical conflict (GTC) is a suitable label for all these measures. Constructs such as altruism, agreeableness, trust, or cooperation have also been measured via SR ratings in personality psychology for many decades (Zhao & Smillie, 2015). Therefore, when measuring prosociality, at least two measurement methods should be distinguished: GTC paradigms and SR ratings. It is still unclear which prosocial traits need to be distinguished. The study by Böckler et al. (2016) suggests four such facets of prosociality, but other authors postulate one, two, three, or seven such traits (Hilbig, Zettler, Leist, & Heydasch, 2013; Hubbard, Harbaugh, Srivastava, Degras, & Mayr, 2016; O’Reilly & Chatman, 1986; Penner, Fritzsche, Craiger, & Freifeld, 1995).
Given the two operational approaches to prosociality, a straightforward evaluation of procedures to measure prosociality could be an MTMM approach. Figure 1 illustrates a series of plausible models that we considered in our reanalysis of the published correlation matrix by Böckler et al. (2016). This series of models is restricted to just two facets of prosociality. Panel A shows a model that postulates a single overarching trait. The models in Panels B and C are MTMM models. They additionally specify and tentatively label specific factors accounting for variance that is due to method specificities of the measures: GTC in Panel B versus SR prosociality in Panel C. Panel D shows a correlated-factor model that corresponds to the model from Böckler et al. (2016)—except that two of the four factors are removed. Please note that the indicators in Panels B and C that have no loading on a specific factor share a measurement method too. In Panel B, those variables share SR methodology, and in Panel C, these indicators all capture GTCs. The measurement procedure of prosociality that is factorially not represented in the models shown in Panels B and C serves as a reference method (Eid et al., 2003). The models in Figure 1 could be extended gradually so that more than one trait and more than one method factor accounting for the shared variance due to similar measurement approaches can be specified. However, given the concerns regarding the factors of strategically and norm-motivated prosocial behavior we outline below, only two of the initial factors deserve further evaluation in the present data. We therefore restrict our consideration of the factorial space to the models shown in Figure 1.

Figure 1. Completely standardized factor loadings for the measurement models based on the correlation matrices and summary statistics for Samples 1 and 2 provided by Böckler et al. (2016). SR = self-report; GTC = game-theoretical conflicts. Indicators in the figure and their corresponding names by Böckler et al. (2016): PS = Prosociality Scale; MI = Machiavelli index; IRI = interpersonal reactivity index; ZPG help = Zurich Prosocial Game (overall helping); Don = donations; SVO = social value orientation (prosocial); SDisc = social discounting (log k); TG-RG = trust game; DG = dictator game. The first value on each path is the parameter estimate for Sample 1, and the second value is the estimate for Sample 2; nonsignificant parameter estimates are printed in italics.
In Figure 1, we summarize the most suitable factor models that are plausible candidates in an MTMM modeling context, the usual suspects in psychometric studies, so to speak. The correlated factor model (Panel D in Figure 1) realized by Böckler et al. (2016) does not allow for an overarching prosociality factor. An overarching factor can be considered to be implicitly present in the correlation between both factors. The higher the factor correlation, the stronger such an implicit general factor. The lack of an explicit overarching factor is problematic if all factors are interpreted as facets of an overarching human prosociality trait. If an overarching prosociality factor is specified, some positive manifold of prosociality indicators, or first-order prosociality factors, is required. The models specified by Böckler et al. (2016) for Samples 1 and 2 do allow for positive manifold among the factors, but only altruistically motivated prosocial behavior and SR prosocial behavior show a meaningful relation with each other; the other five factor correlations are all zero. For the sake of parsimony, simplicity, and transparency, it is important to avoid jingle and jangle issues when labeling latent variables (Kelley, 1927): Factors labeled similarly or identically should be highly or perfectly related with each other, and factors labeled dissimilarly or differently should be related weakly or not at all. Latent variables sharing the label “prosociality” should therefore show some positive manifold.
Panels A, B, and C in Figure 1 all specify an overarching general factor of prosociality. These models can be compared with each other inferentially (A with B, C, and D) or descriptively (B with C and D, C with D). The MTMM models in Panels B and C specify one general factor and an orthogonal specific factor. The specific factors can be interpreted substantively (i.e., they can be considered trait factors), but they can also be understood as method factors (i.e., they are interpreted as nuisance variance caused by the measurement approach).
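To make the structure of the MTMM models in Panels B and C concrete, the following minimal sketch decomposes the variance of a standardized indicator that loads on a general factor and, optionally, on an orthogonal specific factor. Because the two factors are uncorrelated, their contributions are simply the squared loadings. The function name and the loadings of .50 and .40 are hypothetical and chosen for illustration only; they are not estimates from Figure 1.

```python
def decompose_variance(general_loading, specific_loading=0.0):
    """Split the unit variance of a standardized indicator into three parts.

    Assumes a general factor and a specific factor that are orthogonal,
    as in the MTMM models of Panels B and C.
    """
    general = general_loading ** 2
    specific = specific_loading ** 2
    return {
        "general": general,
        "specific": specific,
        "residual": 1.0 - general - specific,
    }


# Hypothetical loadings, for illustration only (not taken from Figure 1).
method_indicator = decompose_variance(0.50, 0.40)  # loads on both factors
reference_indicator = decompose_variance(0.50)     # reference method: general factor only

print(method_indicator)
print(reference_indicator)
```

Under these assumed loadings, an indicator of the modeled method shares variance with the general factor and with its specific factor, whereas an indicator of the reference method contributes to the general factor only.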
The Factors of Prosociality Proposed by Böckler et al.
Establishing novel personality factors is no easy business. The facets of prosociality proposed by Böckler et al. (2016) are suggested to represent crucial facets of prosocial behavior and should therefore meet established standards for novel trait factors. There are four requirements such factors should meet. First, they should be theoretically and conceptually sound given prior research results in the field of human prosociality. Second, they should be clearly identified by strong loadings of their proposed indicators and the indicators in turn should share essential attributes and should not share construct irrelevant attributes. Third, novel factors should not simply be a function of one or more established personality factors. Fourth, they should predict relevant real-world variables. Let us now go through the four factors proposed by Böckler et al. (2016) with a focus on the first, second, and third standard.
Altruistically Motivated Prosocial Behavior
The first factor, altruistically motivated prosocial behavior, causes the fewest problems within the model proposed by Böckler et al. (2016). According to the loadings, SDisc is a marker variable for this factor. As far as factor interpretation is concerned, SR measures of altruism should count toward the scope of this factor too, such as the honesty–humility factor of the HEXACO model of personality, namely, the tendency toward active cooperation in terms of nonexploitation (Hilbig et al., 2013), but this factor does not include any SR measures. It therefore does not transcend measurement approaches.
Strategically Motivated Prosocial Behavior
Figure 2 in Böckler et al. (2016) depicts standardized factor loadings. These factor loadings indicate that changing the value on the latent variable strategically motivated prosocial behavior by one standard deviation will change the manifest variable by about nine standard deviation units. Such abnormal parameter estimates are indicative of so-called Heywood cases that appear when residuals of observed variables in the model (here strategic giving [UG-DG]) have negative variances. Such cases are prevalent if the sample size is insufficient, if variables are overfactored, or if their loadings are weak and inconsistent because the indicators used to measure a given factor do not homogeneously covary (MacCallum et al., 1999; McDonald, 1985). In other words, such cases arise when one specifies more factors than actually exist. In the model by Böckler et al. (2016), not only is the standardized factor loading of one variable very far out of bounds but also the other two loadings are overly small and in all likelihood not distinguishable from zero. Therefore, the factor is exclusively determined by a single variable, one with an improper factor loading. Thus, given the present evidence, there is no empirical reason why one should retain this factor and the corresponding variables in the model.
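The link between an out-of-bounds standardized loading and a negative residual variance can be illustrated with a short sketch. It assumes a completely standardized solution in which the indicator loads on exactly one factor, so that the squared loading and the residual variance sum to one. The function names are hypothetical; the value 9.0 mirrors the loading of about nine standard deviation units described above, and the other two values are ordinary reference points.

```python
def residual_variance(std_loading: float) -> float:
    """Residual variance of an indicator in a completely standardized solution.

    Assumes the indicator loads on exactly one factor, so that its
    model-implied variance is loading**2 + residual = 1.
    """
    return 1.0 - std_loading ** 2


def is_heywood_case(std_loading: float) -> bool:
    """A negative residual variance marks an improper (Heywood) solution."""
    return residual_variance(std_loading) < 0.0


# 0.60 is an ordinary loading, 1.00 is the boundary, and 9.0 mirrors the
# loading of about nine standard deviation units discussed in the text.
for loading in (0.60, 1.00, 9.0):
    print(loading, residual_variance(loading), is_heywood_case(loading))
```

Under this assumption, a standardized loading of nine implies a residual variance of 1 − 81 = −80, which is not an admissible variance.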
An additional problem with this factor is the use of computationally dependent variables. Böckler et al. (2016) use a difference score (Cronbach & Furby, 1970) to parameterize strategic giving (UG-DG), which includes a variable, DG, that is arguably indicative of the factor termed altruistically motivated prosocial behavior. The authors acknowledge the statistical dependency between the two indicators in their model. Nevertheless, the computational dependency affects factor interpretations and could also be partly responsible for the anomalies of the factor solutions.
Norm-Motivated Prosocial Behavior
With respect to the norm-motivated prosocial behavior factor, statistical and psychometric problems are not as severe as in the case of the strategic factor. Still, there is a standardized loading larger than unity and therefore a negative residual variance for the variable second-party punishment. Additionally, the loading of the variable ZPG (reciprocity effect) is essentially zero: The latent variable accounts for less than 5% of the variance in ZPG. Thus, this loading should be removed. Given the magnitude of loadings, the reciprocity effect should not be taken into account when interpreting the norm-motivated prosocial behavior factor, which in fact captures nothing but second- and third-party punishment-related variance.
Besides the measurement problems related to the norm-motivated prosocial behavior factor, it is by no means established that the decision to punish unfair offers (such as measured in the indicators second- and third-party punishment) unequivocally reflects prosocial behavior. The theoretical model of strong reciprocity indeed claims that negative reciprocity (e.g., the tendency to punish unfair behavior) reflects prosocial behavior since the punishing individual is sacrificing resources in order to punish unfair behavior (Fehr, Fischbacher, & Gächter, 2002). However, recent evidence informs us that the underlying motivation to punish unfair behavior is driven by the personality trait of assertiveness and reflects a form of egoistic behavior (Yamagishi et al., 2012; see also Kaltwasser, Hildebrandt, Wilhelm, & Sommer, 2016). According to Yamagishi et al., assertive participants (acting as responder) are driven by the egoistic motivation to avoid the imposition of an inferior status. Therefore, punishment of unfair behavior does not necessarily reflect prosocial motivation but could instead be seen as a mechanism of status defense. Taken together, the motivation to punish unfair behavior or to act strategically in a game is not necessarily prosocial.
The factors norm-motivated and strategically motivated prosocial behavior are also unrelated to criteria such as socioeconomic status, affective dispositions, and cognitive skills reported in Table 5 by Böckler et al. (2016). Therefore, we conclude that there is no evidence supporting convergent validity for the norm-motivated and strategically motivated factors.
Self-Reported Prosocial Behavior
Concerning the SR prosocial behavior factor, the loading of the Machiavelli index (MI) is somewhat small. A more severe issue is that this factor cannot reflect established factorial distinctions among SR measures. For example, Penner, Fritzsche, Craiger, and Freifeld (1995) distinguish seven SR factors of prosociality, most of which are not covered by the nomothetic span of the SR prosocial behavior factor as specified by Böckler et al. (2016).
Reanalyzing the Data From Böckler et al.
Böckler et al. (2016) provide two correlation matrices, along with the means and standard deviations of the 14 indicators they considered toward establishing a model of individual differences in human prosociality. The two matrices originate from two different samples that have been separately analyzed via exploratory (Sample 1) and confirmatory (Sample 2) factor analyses. We reanalyzed these two matrices by separately modeling them using confirmatory factor models specified in Mplus 7 (Muthén & Muthén, 2014). As argued above, we dropped the factors norm-motivated prosocial behavior and strategically motivated prosocial behavior along with their five indicators. This decision was further corroborated by the fact that the four-factor model reported by Böckler et al. (2016) did not converge when we fitted it to the two correlation matrices. Thus, we considered the relationships of nine variables (three SRs and six variables derived from game-theoretical paradigms) to fit the model series shown in Figure 1. Please note that these analyses do not aim to provide a final solution for the structure of human prosociality. Our aim with this reanalysis is to study the structure of the nine retained indicators from Böckler et al. (2016) and to exemplify an MTMM approach that should be extended in future endeavors.
In all models, we used the following nine indicators (names by Böckler, Tusche, & Singer, 2016, provided in their Figure 2): Prosociality Scale (PS), MI, IRI, ZPG (overall helping), donations, SVO (prosocial), SDisc (log k), trust game (TG-RG), and DG. In the first model, we specified a single overarching factor of prosociality. This model did not reach acceptable fit in either sample: Sample 1: χ2(27) = 82.96, p < .001, comparative fit index (CFI) = .654, root mean square error of approximation (RMSEA) = .105, standardized root mean square residual (SRMR) = .079, Akaike information criterion (AIC) = 9,736, Bayesian information criterion (BIC) = 9,823; Sample 2: χ2(27) = 56.18, p < .001, CFI = .721, RMSEA = .067, SRMR = .076, AIC = 7,310, BIC = 7,390. Modification indices for the one-factor model fitted to both correlation matrices suggested that two residual covariances were required: the PS with IRI (.41 and .37) and the MI with SDisc (.23 and .28); in each pair, the first value is the correlated error in Sample 1 and the second value is the correlated error in Sample 2 (see Panel A of Figure 1). Including these two residual covariances in the general factor model yielded substantially improved model fit: Sample 1: χ2(25) = 43.78, p = .011, CFI = .884, RMSEA = .063, SRMR = .059, AIC = 9,701, BIC = 9,795; Sample 2: χ2(25) = 28.16, p = .30, CFI = .97, RMSEA = .030, SRMR = .058, AIC = 7,286, BIC = 7,371. All factor loadings were statistically different from zero at α = .05, with the exception of the indicators trust game and IRI. Furthermore, all other factor loadings depicted in Panel A of Figure 1 were quite heterogeneous in their magnitude, ranging between .20 and .68.
Second, we estimated the model specified in Panel B of Figure 1. This model assumed GTC paradigms to be specific due to their measurement approach as compared with the SRs, which were only allowed to load on the general prosociality factor. In order to inferentially compare this model with the one shown in Panel A, we retained the two residual covariances introduced in the one-factor model. In Sample 1, the specific factor accounting for all game-theoretical indicators led to a statistically significant improvement of the model fit: Δχ2(6) = 19.89; χ2(19) = 23.89, p = .20, CFI = .97, RMSEA = .037, SRMR = .040, AIC = 9,693, BIC = 9,806. There was also an improvement in the model quality when fitting the MTMM model to the second covariance matrix; however, the residual covariance between the MI and SDisc had to be dropped in this case. In Sample 2, the MTMM model resulted in the following fit: χ2(20) = 12.37, p = .90, CFI = 1, RMSEA = .000, SRMR = .033, AIC = 7,280, BIC = 7,380. Inspection of the loadings of this model in Samples 1 and 2 shows substantial heterogeneity (as provided in Panel B of Figure 1).
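The inferential comparison for Sample 1 can be retraced from the reported fit statistics. The sketch below is a minimal illustration that assumes the standard chi-square difference test for nested models: the one-factor model with two residual covariances, χ2(25) = 43.78, is compared with the Panel B model, χ2(19) = 23.89. The helper functions are hypothetical and use only the Python standard library; the upper-tail probability is obtained from a series expansion of the incomplete gamma function rather than from a statistics package.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (standard library only)."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, x/2).
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= half_x / (a + n)
        total += term
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi2_difference(chi2_restricted, df_restricted, chi2_general, df_general):
    """Chi-square difference test for two nested models."""
    d_chi2 = chi2_restricted - chi2_general
    d_df = df_restricted - df_general
    return d_chi2, d_df, chi2_sf(d_chi2, d_df)


# Sample 1 values as reported in the text.
d_chi2, d_df, p = chi2_difference(43.78, 25, 23.89, 19)
print(f"delta chi2({d_df}) = {d_chi2:.2f}, p = {p:.4f}")
```

With these inputs the difference is Δχ2(6) = 19.89, and the associated tail probability falls below .01, in line with the statistically significant improvement reported for Sample 1.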
Third, we considered the game-theoretical paradigms as a reference method and estimated a specific factor based on the *SR measures, as depicted in Panel C of Figure 1. In Sample 1, this model fitted the data descriptively somewhat worse than the model in which game-theoretical paradigms were modeled as a specific factor. Sample 1: χ2(23) = 38.82, p = .02, CFI = .902, RMSEA = .061, SRMR = .053, AIC = 9,700, BIC = 9,800; Sample 2: χ2(23) = 23.74, p = .41, CFI = .991, RMSEA = .015, SRMR = .050, AIC = 7,285, BIC = 7,377.
Finally, we specified two correlated first-order factors, one for the game-theoretical indicators and one for the SR measures (Panel D of Figure 1). The fit of this model in both samples was comparable with the fit achieved in the previous MTMM models. Sample 1: χ2(24) = 40.88, p = .02, CFI = .895, RMSEA = .061, SRMR = .055, AIC = 9,700, BIC = 9,797; Sample 2: χ2(24) = 23.74, p = .47, CFI = 1, RMSEA = .000, SRMR = .050, AIC = 7,283, BIC = 7,372. The correlation between the two factors was different from zero but not very large (r = .50 in Sample 1 and r = .51 in Sample 2). Please note that although the specific factors in Panels B and C and the two factors in Panel D have the same label, they are distinguished by additional signs (*, +, and #). These signs indicate that those factors do not have the exact same meaning; their interpretation is partly contingent upon the overall model.
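For the descriptive comparisons, the information criteria reported above can be collected in one place. The following sketch simply tabulates the AIC and BIC values given in the text and identifies the model with the lowest value per sample. The dictionary layout, the short model labels, and the helper function are hypothetical conveniences; no quantity is re-estimated.

```python
# (AIC, BIC) pairs as reported in the text for each model and sample.
fit = {
    "Sample 1": {
        "A (one factor)": (9736, 9823),
        "A (plus two residual covariances)": (9701, 9795),
        "B (specific GTC factor)": (9693, 9806),
        "C (specific SR factor)": (9700, 9800),
        "D (correlated factors)": (9700, 9797),
    },
    "Sample 2": {
        "A (one factor)": (7310, 7390),
        "A (plus two residual covariances)": (7286, 7371),
        "B (specific GTC factor)": (7280, 7380),
        "C (specific SR factor)": (7285, 7377),
        "D (correlated factors)": (7283, 7372),
    },
}


def best_by(models, index):
    """Return the label of the model with the lowest criterion (0 = AIC, 1 = BIC)."""
    return min(models, key=lambda name: models[name][index])


for sample, models in fit.items():
    print(sample, "| lowest AIC:", best_by(models, 0), "| lowest BIC:", best_by(models, 1))
```

In both samples the Panel B model has the lowest AIC. The BIC, which penalizes additional parameters more heavily, is lowest for the one-factor model with two residual covariances, so such comparisons should be read as descriptive only.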
Discussion
About 40 years ago, H.-J. Eysenck and J. Guilford had a debate over the nature of the extraversion factor (Eysenck, 1977; Eysenck & Eysenck, 1969; Guilford, 1975, 1977)—a debate that inspired the title of this commentary. Their discussion focused on how representative personality factors can be derived. In his response to Guilford’s comment, Eysenck (1977, p. 405) states: It is suggested that psychometric considerations play an important part and that factor analysis in particular can be of great value in this connection. (…). Factors emerging from such analyses must be replicable and reliable, and they must fulfill certain basic psychometric requirements.
Nevertheless, closer inspection of the four-factor solution of prosocial behavior unequivocally shows that the factors norm-motivated prosocial behavior and strategically motivated prosocial behavior cannot be part of the recommended solution for an overarching prosociality model. Given the opaque solution provided by Böckler et al. (2016), the best conclusion is to remove these factors from further consideration at least until better measures are available. The solution published by Böckler et al. (2016) is improper and not admissible.
We removed the indicators leading to improper modeling solutions and then reanalyzed the remaining variables by estimating four competing models. The general factor model specified in Panel A had insufficient fit. A single general factor of prosociality cannot sufficiently account for individual differences observed in the nine indicators analyzed here. In the next model (Panel B), we specified a specific factor that captures variance specific to game-theoretical approaches to measuring prosociality. Taken together, this model had decent fit and adequately accounts for the observed individual differences. Please note that the specific factor is best seen as a method factor because the most salient attribute these indicators have in common as compared to the other indicators in the model is the shared GTC format. To the degree to which this factor can be shown to have convergent and discriminant validity, the interpretation as a nuisance (method) factor is falsified and a more substantive interpretation is warranted. In Panel C, we reversed what is deemed to serve as a reference method for measuring prosociality. This model allows studying the communalities of indicators that share the attribute of emanating from SR approaches to measuring prosociality. The fit of this model was somewhat poorer than what we found for the previous model. The correlated factor model in Panel D showed a fit comparable to that of the previous model. Unlike the MTMM models, in this correlated factor model, the two latent variables are not deemed to partly reflect method-specific communalities, although what used to be the altruistically motivated prosocial behavior factor in Böckler et al. (2016) is now labeled prosociality measured by GTC paradigms.
Taken together, the model shown in Panel B outperforms the other models, and it is sound as an explanatory model of prosocial behavior for the nine indicators analyzed here. Whether or not these factors show convergent and divergent relations with criteria is an open question. Obviously, the loading structures indicate that further work to improve psychometric properties of several indicators is strongly required. The necessity to introduce data-driven modifications into the models also deserves further serious attention. Additionally, replications and extensions of the Böckler et al. (2016) study should include measures of benevolence (Hubbard et al., 2016) and the willingness to do charity work (Meier & Stutzer, 2008), both of which are core aspects of prosociality.
We want to close this commentary by emphasizing that a strict psychological and psychometric evaluation is indispensable when it comes to discussing new constructs and their evaluation. If a psychological trait is not conceptually sound, if no adequate measurement model can be established, if the trait is a function of established traits, and if the trait does not predict relevant variables, then the trait does not represent progress over earlier work and, for the sake of parsimony and clarity, should be dismissed. The psychological and psychometric research necessary to meet these crucial requirements within the overarching framework on prosocial behavior is difficult. Although Böckler et al. (2016) did not reach their goal, their study is an important step in the right direction.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
