Abstract
Results from two experiments suggest that stereotype-threat effects are special cases of a more general process involving the need to maintain or enhance status. We hypothesized that situations capable of confirming a performance stereotype might represent either a threat to status or an opportunity for enhancement of status, depending on the nature of the stereotype. The positive relationship between baseline testosterone and status sensitivity led us to hypothesize that high testosterone levels in males and females would amplify existing performance expectations when gender-based math-performance stereotypes were activated. In Study 1, high-testosterone females performed poorly on a math test when a negative performance stereotype was primed. In Study 2, high-testosterone males excelled on a math test when a positive performance stereotype was primed. The moderating effect of testosterone on performance suggests that a stereotype-relevant situation is capable of conferring either a loss or a gain of status on targets of the stereotype.
Most performance-based stereotypes are explicit or implicit statements of comparison between two or more groups (e.g., women possess poor math ability compared with men, pit bulls are good fighters compared with other breeds). Viewing a stereotype in this manner suggests a connection between situations capable of confirming a performance stereotype and situations involving the gain or loss of status. For example, in studies of stereotype threat, when a woman finds herself about to take a math test, her lower status in the math domain has the potential for confirmation (e.g., Brown & Josephs, 1999; Shih, Pittinsky, & Ambady, 1999; Spencer, Steele, & Quinn, 1999).
In these situations, the confirmation of a negative (or positive) stereotype may be closely tied to a loss (or gain) of status among targets of the stereotype. If the target of a stereotype experiences the stereotype as a statement about status, factors associated with status seeking and status protection might moderate the effects that the stereotype has on performance. It follows from this argument that individual differences in baseline testosterone might moderate the relationship between stereotype threat and performance. The two studies presented here tested this logic by examining the consequences of both positive and negative stereotypes about math ability.
TESTOSTERONE AND STATUS IN MEN AND WOMEN
Across a wide variety of animal species, behaviors intended to achieve, maintain, and enhance status (i.e., dominant and power-seeking behaviors) are observed primarily among high-testosterone (high-T) individuals (e.g., Kraus, Heistermann, & Kappeler, 1999; Ruiz-de-la-Torre & Manteca, 1999). Not surprisingly, the relationship between testosterone and status seeking is not as strong in humans as it is in nonhuman species (Book, Starzyk, & Quinsey, 2000), probably because of the much greater complexity associated with human societies. Nonetheless, measuring testosterone in humans at a single point in time predicts dominance as well as status-related behaviors and occupations across a variety of situations (see, e.g., Dabbs, 1998; Dabbs, La Rue, & Williams, 1990; Mazur & Booth, 1998; Scaramella & Brown, 1978; van Honk et al., 1999).
Most of the research in this area has been conducted on male samples, owing in part to the assumption that females, because of their significantly lower testosterone levels, should not show a relationship between testosterone and status. However, the existing evidence supports a positive relationship rather strongly. Baseline testosterone levels in women have been positively related to occupational status (Purifoy & Koopmans, 1979), personality measures of dominance (Grant & France, 2001; Udry & Talbert, 1988), and an absence of smiling, considered a nonverbal indicator of dominance (Cashdan, 1995). Furthermore, Grant and France (2001) argued that a review of the literature allows one to conclude that the various “personality characteristics found to have a positive association with testosterone in women are themselves indicative of the underlying core personality trait, dominance” (p. 42). Kemper (1990) drew a similar conclusion, arguing that women have as high a need for power and dominance as do men.¹
TESTOSTERONE AND INTELLECTUAL PERFORMANCE
A large literature documenting the relationship between testosterone and human intellectual performance has indicated that testosterone exerts its effects neuroanatomically, by influencing the organization of the developing brain (e.g., Gouchie & Kimura, 1991; Williams, Barnett, & Meck, 1990). Curiously, despite more than a century of study on the relationship between testosterone and status, no attempts have been made to link testosterone's behavioral effects to intellectual performance.
Although the prototypical status struggle depicts two males squaring off in an aggressive encounter, concerns over status are often reflected in more genteel behaviors and decisions. For example, a high-T faculty member might be particularly concerned with his or her standing within the department, and may behave in ways intended to enhance or maintain this standing (e.g., increasing publication rate) without resorting to the types of dominant behaviors observed in other animal species (e.g., biting, scratching). Any time individuals are highly invested in a domain, a subset of these individuals (e.g., those high in testosterone) should care about their relative social standing in that domain.
Carrying this reasoning over to human intellectual performance, we suggest that, among individuals who are highly invested within an academic domain (e.g., mathematics), a subset of these individuals (e.g., high-T individuals) will care about their self-perceived standing in that domain. Performance should be influenced by status demands only when two conditions are met: Investment in the domain is high, and concern with status or dominance exists. This led us to hypothesize that, among individuals who are invested in an academic domain, performance capable of jeopardizing or maintaining status may be moderated by individual differences in baseline testosterone. When baseline testosterone is low, performance should not be influenced by the status consequences of the situation. Only when baseline testosterone is high should a status manipulation influence performance.
OVERVIEW OF THE PRESENT STUDIES
In the two studies reported here, we examined participants who had previously indicated that mathematics and math ability were very important to them. We predicted that behaviors that result from encountering stereotypes about math performance would become amplified in high-T individuals. As we reported previously (Brown & Josephs, 1999), men are faced with a positive stereotype about their math ability, and expect to do well on math tests. Thus, the possibility of confirming a positive stereotype should make high-T men more likely to view a math test as a way to maintain or enhance their high status in math, inspiring them to strive for excellence. This logic was tested in Study 2. Conversely, women face a negative stereotype about their math abilities, and expect to do poorly on math tests (Brown & Josephs, 1999). The possibility of confirming a negative stereotype should cause high-T women to view a math test as a potential threat to their status, resulting in poor performance among high-T women. This logic was tested in Study 1.
STUDY 1
In Study 1, we determined baseline testosterone levels in male and female participants, and administered a math test. The status manipulation used in this experiment was based on previous studies of stereotype threat, and was designed to activate the stereotype that females possess weak math abilities, relative to males. The manipulation was not expected to affect males' performance because males tend to judge the stereotype of weak math ability as pertaining only to females, and thus do not view the stereotype to be self-relevant (Brown & Josephs, 1999).
Thus, we expected an interaction among sex, the stereotype-threat prime, and testosterone levels such that high-T females for whom stereotype threat was primed would perform more poorly than high-T females who were not primed. The performance of low-T females was predicted to be unaffected by the stereotype-threat manipulation. Because the stereotype was not relevant to males, and because their status as possessing good quantitative skills should not have been threatened, the performance of high-T males was not expected to decline in the stereotype-threat condition (see Aronson et al., 1999, and Brown & Josephs, 1999, for discussions of male-relevant math stereotypes).
Method
Participants
Participants were 76 females and 75 males enrolled in introductory psychology at the University of Texas. They participated in partial fulfillment of the course's research requirement. Only participants who scored above the midpoint of the Math Identification Questionnaire (Brown & Josephs, 2000) during pretesting were eligible for inclusion in the study. On this questionnaire, respondents rate statements (e.g., “My math abilities are very important to me.”) using a 9-point, Likert-type scale. In pilot testing (Brown & Josephs, 1999), we found that individuals scoring below the midpoint of this scale are relatively immune to the performance-impairing effect of stereotype threat, presumably because they are minimally invested in math (see also Aronson et al., 1999; Spencer et al., 1999).
Materials and procedure
Saliva collection
Testosterone was measured through enzyme immunoassays of salivary samples conducted by Salimetrics (State College, Pennsylvania). For a full description of the saliva collection procedures and the enzyme immunoassay procedure, see Granger, Schwartz, Booth, and Arentz (1999). Experimental sessions were conducted between 12 p.m. and 4 p.m. to control for a diurnal decline in testosterone (e.g., Granger et al., 1999).
Stereotype prime
Participants were run individually. Following saliva collection, participants were told that they would be taking a test of mathematical reasoning abilities. They then completed either a questionnaire designed to prime stereotype threat or a control questionnaire featuring questions about coming to college. The seven-item stereotype-prime questionnaire included items such as “I think that some people feel I have less math ability because of my gender” and “In math classes, I often feel that others look down on me because of my gender.” The seven-item control questionnaire included items such as “School can be very rewarding” and “I have a very clear idea of what my major will be.” Copies of both questionnaires are available by request.
Testing session
After completing the questionnaire, participants were given 20 min to solve 20 questions drawn from the quantitative section of the Graduate Record Exam (GRE-Q). Participants were informed that an incorrect answer would result in a point deduction, and advised not to guess blindly. Following the test, participants were given a questionnaire that included an item asking how nervous they felt at the moment (possible answers ranged from 1, not at all, to 5, very nervous). Participants were also asked to recall their math score on the Scholastic Assessment Test (SAT) and the number of math courses they had taken in high school.
Results and Discussion
For all analyses presented here, a median split was performed on testosterone levels within sex. High-T females (M = 46.10 pg/ml, SD = 40.23) were high relative to low-T females (M = 24.66 pg/ml, SD = 22.78). High-T males (M = 127.67 pg/ml, SD = 70.57) were high relative to low-T males (M = 79.60 pg/ml, SD = 39.11).
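As an illustration of a median split performed within sex, the following minimal Python sketch classifies each participant relative to the median of his or her own sex. The function name, the data values, and the rule that values exactly at the median count as low are all assumptions made for the example; the article does not report how ties were handled.

```python
from collections import defaultdict
from statistics import median


def median_split_within_group(records):
    """Label each participant 'high' or 'low' relative to the median of
    their own group.

    records: list of (participant_id, group, value) tuples.
    Assumption: values exactly at the group median are labeled 'low'.
    """
    values_by_group = defaultdict(list)
    for _pid, group, value in records:
        values_by_group[group].append(value)

    # One cutoff per group, so females are compared only with females
    # and males only with males.
    cutoffs = {g: median(vals) for g, vals in values_by_group.items()}

    return {
        pid: ("high" if value > cutoffs[group] else "low")
        for pid, group, value in records
    }


# Hypothetical salivary testosterone values in pg/ml (not the study data).
sample = [
    ("f1", "female", 20.0), ("f2", "female", 30.0),
    ("f3", "female", 45.0), ("f4", "female", 60.0),
    ("m1", "male", 70.0), ("m2", "male", 90.0),
    ("m3", "male", 120.0), ("m4", "male", 150.0),
]
labels = median_split_within_group(sample)
```

In this made-up sample, participant f4 (60 pg/ml) is classified as high-T and participant m1 (70 pg/ml) as low-T, which shows why the split has to be computed separately for each sex.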
Manipulation checks
Supporting the notion that our female participants acknowledged the self-relevance of a sex-linked stereotype, a significant sex difference emerged on the stereotype-prime questionnaire, with females indicating significantly greater agreement with the questionnaire items than males did, t(77) = 3.4, p < .05. Another check of the manipulation was provided by our nervousness measure. If the manipulation was effective in priming stereotype threat, then females should have been more nervous than males. We found a statistically significant effect of the manipulation on nervousness, with females reporting greater nervousness than males, F(1, 148) = 2.52, p < .05. We also looked at the effect of the manipulation on the nervousness of the group that we predicted would be most upset by it, the high-T females. We found that, as expected, the average reported nervousness was higher in this group (M = 3.44) than in any other group (means in the other groups ranged from 2.16 to 2.92), but none of these differences were statistically significant (though many were marginally significant).
GRE-Q performance
GRE-Q performance was significantly related to math SAT scores and number of high school math classes (rs = .58 and .26, respectively, ps < .05). Thus, we were able to use these relationships to reduce error variance by examining our primary dependent measure after controlling for the combined effects of these math proficiency variables. Reported math SAT scores were not related to testosterone, r(149) = .09, n.s., suggesting that if some participants fabricated or erred in reporting their SAT scores, these problems were not systematically related to the primary individual difference variable.
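To make the idea of covariate adjustment concrete, the sketch below computes adjusted group means in the textbook way: each group's raw mean is shifted by the pooled within-group regression slope times that group's distance from the grand mean on the covariate. It is a simplified illustration with a single covariate and invented numbers; the analyses reported here used two covariates, and the function name is hypothetical.

```python
from statistics import mean


def adjusted_group_means(groups):
    """Covariate-adjusted means for a one-covariate ANCOVA.

    groups: {group_name: [(covariate, outcome), ...]}
    Returns {group_name: adjusted outcome mean}.
    """
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)

    sum_xy = 0.0   # pooled within-group cross-products
    sum_xx = 0.0   # pooled within-group covariate sum of squares
    group_stats = {}
    for name, pairs in groups.items():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        group_stats[name] = (mx, my)
        sum_xy += sum((x - mx) * (y - my) for x, y in pairs)
        sum_xx += sum((x - mx) ** 2 for x, _ in pairs)

    slope = sum_xy / sum_xx  # pooled within-group regression slope

    # Move each group's mean to where it would sit if the group had the
    # grand-mean value on the covariate.
    return {
        name: my - slope * (mx - grand_x)
        for name, (mx, my) in group_stats.items()
    }


# Invented (math SAT, test score) pairs, not the study data.
example = {
    "prime": [(500, 10), (600, 12), (700, 14)],
    "control": [(600, 13), (700, 15), (800, 17)],
}
adjusted = adjusted_group_means(example)
```

In this invented example the raw means are 12 and 15, but the control group also has higher SAT scores; after adjustment the means are 13 and 14, so part of the raw difference is attributed to the covariate.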
We predicted that stereotype threat would impair performance among female participants high in testosterone, relative to high-T females in the control condition. We also predicted a lack of stereotype-prime effects among low-T females, and predicted no performance differences among male participants. As shown in Table 1, these hypotheses were supported. A 2 (testosterone level) × 2 (stereotype threat: prime, control) × 2 (sex) analysis of covariance (ANCOVA) revealed a statistically significant three-way interaction, F(1, 142) = 4.94, p < .05.
Table 1. Study 1: Mean math performance scores by condition and testosterone (T) level
Note. Scores were adjusted using participants' reported SAT (Scholastic Assessment Test) math scores and number of math classes taken in high school.
To interpret this interaction, we conducted a 2 (testosterone level) × 2 (stereotype threat: prime, control) ANCOVA within sex. Among female participants, we found no main effects but a statistically significant interaction, F(1, 71) = 4.85, p < .05. A planned comparison indicated that, as predicted, high-T females in the prime condition underperformed relative to high-T females in the control condition, t(37) = 2.47, p < .05, d = .81. In addition, as predicted, the stereotype-threat prime had no performance-altering effect among low-T females (t < 1). We next contrasted high-T females in the prime condition to all other females, and found that this group underperformed relative to all other female participants, t(73) = 1.70, p < .05 (one-tailed), d = .45. Finally, we conducted the critical planned comparison using means that were not adjusted for the covariates, finding a statistically significant difference between high-T females in the prime and control conditions, t(37) = 2.11, p < .05, d = .69. This suggests that the primary finding was not due to statistical problems associated with the ANCOVA.
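The article does not state how the effect sizes were computed. One common conversion for a comparison of two roughly equal-sized groups is d ≈ 2t / √df, and the sketch below applies it to the two high-T female comparisons as a plausibility check. The function name is hypothetical, and the formula is an assumption rather than the authors' documented method; it does not apply to the contrast of one cell against all other females, where the group sizes are unequal.

```python
import math


def cohens_d_from_t(t_value, df):
    """Approximate Cohen's d from an independent-samples t statistic,
    assuming two groups of roughly equal size: d = 2t / sqrt(df)."""
    return 2.0 * t_value / math.sqrt(df)


# High-T females, prime vs. control, covariate-adjusted: t(37) = 2.47
d_adjusted = round(cohens_d_from_t(2.47, 37), 2)

# Same comparison on unadjusted means: t(37) = 2.11
d_unadjusted = round(cohens_d_from_t(2.11, 37), 2)
```

Under this assumption the two values come out at 0.81 and 0.69, in line with the effect sizes reported for these comparisons.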
Overall, men (M = 15.1, SD = 3.0) outperformed women (M = 13.3, SD = 3.6) in the current study, F(1, 148) = 11.67, p < .05. But, as hypothesized, the stereotype-threat prime did not have a significant effect on male participants' performance scores, nor did testosterone levels interact with the stereotype prime for the male participants (all ps > .12; see Table 1 for these means).
Summary
We argued that confirmation of a negative stereotype might be closely tied to a loss of status. If a stereotype-threat situation is capable of conferring a loss of status or a confirmation of low status, then those individuals who are hypothesized to be concerned with such matters should be susceptible to the performance-impairing effects of the stereotype threat. Our results support this idea. We found that when stereotype threat was primed in females, math performance suffered, but only among females who were high in baseline testosterone. Only when both preexisting levels of testosterone were high and stereotype threat was primed did performance suffer, suggesting a moderating relationship between the two variables.
STUDY 2
Our previous study (Brown & Josephs, 1999) demonstrated that, as a result of negative stereotypes about their math abilities, females report that they are concerned about performing poorly in math, and do not expect to do well. Males, however, do not share these negative, math-related performance concerns. Instead, because of positive stereotypes about their math abilities, males expect to excel in mathematics (Brown & Josephs, 1999). In general, when anticipating having to take a math exam, females reported being threatened, whereas males viewed the upcoming exam not as a threat but as a challenge.
It is this distinction between threat and challenge, between negative and positive expectations, that we believe determines the direction of moderation associated with baseline testosterone. Sapolsky (1998) referred to testosterone as a “permissive” hormone, one that magnifies existing behaviors and tendencies but does not create new ones. When one expects to fail, and this result has consequences for status, testosterone may magnify the behaviors and tendencies that make failure more likely. Conversely, expecting to succeed in a situation that has status consequences should lead to an increase, at least among high-T individuals, in behaviors that contribute to success.
Study 2 was designed to test the positive side of this status-performance relationship by placing half of our male participants into a situation that offered them the opportunity to confirm a positive stereotype, and enhance their status in mathematics. As in previous research (Brown & Josephs, 1999), our participants were told that they would be taking a math test that either distinguished people of low ability from everyone else or distinguished people of high ability from everyone else. In essence, we led participants to believe that their score would be compared with a predetermined cutoff. In the former case, this cutoff was quite low, and in the latter case, substantially higher.
We predicted that when high-T male participants were presented with an opportunity to excel, thus enhancing their math status, they would rise to the challenge and perform at a high level. In contrast, we expected that when told that the test was scored in a way that would only distinguish those individuals possessing low ability from everyone else, high-T males would be unconcerned with their performance because of the absence of a status-enhancing opportunity. On the basis of previous literature and the findings from Study 1, we hypothesized that low-T males would be relatively unaffected by the opportunity to enhance status, and would perform similarly in the high-ability and low-ability conditions.
Method
Participants
Participants were 51 males enrolled in introductory psychology at the University of Texas. They participated in partial fulfillment of the course's research requirement. As in Study 1, all participants were preselected as being highly identified with mathematics.
Materials and procedure
Saliva collection
Saliva collection and analysis procedures were identical to those in Study 1.
Test instructions
Participants were run individually. After saliva collection, they were told that they would be taking a test of mathematical reasoning abilities. As in our previous study (Brown & Josephs, 1999), participants were given one of two sets of instructions for the math test. In the exceptional-ability (status-enhancement potential) condition, participants were told that the test would identify only individuals who were “exceptional” in math ability compared with everyone else. In the weak-ability (no status-enhancement potential) condition, participants were told that the test would identify only individuals “weak” in math ability compared with everyone else.
Manipulation check
Following the test instructions, participants were given a manipulation check asking them to indicate the nature of the test that they were about to take, by choosing one of four options: (a) “This test will determine whether I am exceptionally weak in my math abilities”; (b) “This test will determine whether I am exceptionally strong in my math abilities”; (c) “This test measures my non-verbal spatial abilities”; and (d) “This test measures my logical abilities, combined with my ability to mentally rotate 3-D pictures.”
Testing session
The procedure for the testing session was identical to the procedure in Study 1. The performance measure was again the GRE-Q.
SAT scores
Before the start of the experimental session, participants were given a separate consent form that allowed us access to their math SAT scores. Participants were assured that they did not have to give this consent to participate in the experiment. SAT scores for those who consented (n = 49, 96% of the sample) were obtained at the end of the semester.
Results and Discussion
For all analyses presented here, a median split was performed on testosterone levels. High-T males (M = 171.95 pg/ml, SD = 29.41) were high relative to low-T males (M = 101.73 pg/ml, SD = 19.45). Scores on the Math Identification Questionnaire did not differ between conditions (overall M = 6.75, SD = 0.94).
Manipulation check
The manipulation check confirmed the success of the manipulation. All participants correctly indicated the testing condition to which they had been assigned. There were no main effects or interactions for the nervousness measure in Study 2 (all Fs < 1).
GRE-Q performance
As in Study 1, GRE-Q scores were adjusted for the combined effects of math SAT scores and the number of high school math classes completed.
We predicted that the exceptional-ability condition would provide an opportunity for our male participants to confirm a positive stereotype about their math abilities and enhance their status, and that this would lead to high-T males outperforming low-T males in this condition. As shown in Table 2, this hypothesis was supported. A 2 (testosterone level) × 2 (test instruction) ANCOVA on adjusted GRE-Q scores revealed no main effects (Fs < 1), but a significant interaction, F(1, 44) = 5.47, p < .05. Planned comparisons revealed that, as predicted, high-T males performed better than low-T males in the exceptional-ability condition, t(24) = 2.07, p = .05, d = .81, but not in the weak-ability condition, t(21) = 1.36, p = .19, d = .57. A planned comparison also revealed that, as predicted, high-T males performed better in the exceptional-ability condition than in the weak-ability condition, t(22) = 2.25, p < .05, d = .92. We also conducted a 2 (testosterone level) × 2 (test instruction) analysis of variance (ANOVA) on nonadjusted scores, and found the same pattern of means, F(1, 45) = 5.57, p < .05.² Consistent with our predictions, high-T males outperformed low-T males, but only when the situation provided the opportunity to enhance status in the math domain.³
Table 2. Study 2: Mean math performance scores by condition and testosterone level
Note. Scores were adjusted using participants' SAT (Scholastic Assessment Test) math scores and number of math classes taken in high school.
Number of questions skipped
We hypothesized that high-T males who were given the opportunity to demonstrate their high status in the math domain (and confirm a positive stereotype) would “rise to the challenge,” and perform well on the test. If this group was highly motivated to excel, they should have been more likely than other participants to attempt every question in order to score as high as possible on the test.
In support of this prediction, a 2 (testosterone level) × 2 (test instruction) ANOVA on the number of questions skipped revealed no main effects (Fs < 1) but a significant interaction, F(1, 47) = 4.83, p < .05. Planned comparisons revealed that high-T males skipped significantly fewer problems than low-T males in the exceptional-ability condition, t(25) = 2.07, p < .05, d = .80, but not in the weak-ability condition, t(22) = 1.06, p = .30, d = .43.
Summary
We argued that confirmation of a positive stereotype might be tied to an enhancement in status. If a positive stereotype offers the opportunity to enhance status, then those individuals especially concerned with such matters should perform particularly well in the presence of the stereotype. Our results were consistent with this idea. We found that when males had the chance to confirm a positive stereotype, performance was increased, but only among males high in baseline testosterone, whose behavior tends to imply concern with status. High-T males outperformed low-T males, but only when the situation provided the opportunity to enhance status in the math domain, suggesting a moderating relationship between the two variables.
GENERAL DISCUSSION
Together, these studies suggest that stereotype-based performance effects occur as a result of a fundamental desire to maintain or enhance one's social status. Only those participants hypothesized to experience concern in the face of a possible change in status demonstrated stereotype-based performance effects. Specifically, both males and females high in testosterone appear more responsive to reminders of their stereotypical status in the math domain, compared with males and females low in testosterone. However, the nature of this response is very different for males and females. On the one hand, females have long faced negative stereotypes about their math abilities, and a reminder of these stereotypes poses a potential threat to status. When primed with a negative stereotype, only high-T females showed a decrease in math performance. Males, on the other hand, face positive stereotypes about their math abilities, and a reminder of these stereotypes presents an opportunity to enhance status. Thus, high-T males outperformed low-T males, but only when primed with a positive stereotype.
One alternative interpretation of these findings is that the moderating effects of baseline testosterone can be explained by baseline differences in physiological arousal. That is, the stereotype-threat prime may have pushed highly aroused individuals over the top of the performance-arousal curve. However, several points argue against this conclusion. First, no testosterone-based performance differences were observed in either of the control conditions. Second, in neither study did high-T participants report being more nervous than low-T participants across experimental conditions. Finally, a recent study (Josephs, Guinn, Harper, & Askari, 2001) found that testosterone showed no correlation with cortisol. Furthermore, raising cortisol levels via ingestion of licorice actually resulted in a slight decrease in testosterone levels. If baseline testosterone is related to chronic arousal, a positive relationship with cortisol levels should be observed.
We have argued throughout this article that, through its hierarchical ordering of two or more groups, a stereotype is essentially a statement about dominance or status. Thus, we have suggested that priming a stereotype leads persons high in testosterone to perceive the situation as having status implications, and subsequently to act on those perceptions. Although the results from these two studies are consistent with this explanation, what remains missing is a direct and face-valid manipulation of status. Recently, Josephs and his colleagues (Guinn, Newman, & Josephs, 2002; Josephs & Guinn, 2002) have sought to remedy this by examining the consequences of winning or losing on performance. This research has shown that when high-T participants perceive themselves to be dominating against a competitor, their subsequent performance relative to control participants increases. When high-T participants perceive themselves to be dominated, their subsequent performance declines. Low-T participants show minimal response to these dominance manipulations. We find these results encouraging in that a direct manipulation of status has yielded findings that are consistent with our stereotype-as-status explanation.
Acknowledgements
We thank Yael Avivi, Brett Bays, Jennifer Gonzales, Shirley Kong, Ashley Owen, and Reese Pepper for their assistance. For comments on an earlier draft, we wish to thank Joshua Aronson, James Dabbs, Allan Mazur, Margaret Shih, Steve Spencer, and Claude Steele. This research was funded by a National Science Foundation grant (NSF199601580-002) to the first author.
Footnotes
1. It should be pointed out that changes in testosterone resulting from the consequences of a dominance battle (i.e., winning or losing in a competitive situation) may not be as strong in women as in men, but this issue is not of primary concern in this article.
2. To provide an accurate comparison with the ANCOVA, we conducted this ANOVA using only the participants for whom we had obtained math SAT scores (n = 49).
3. For a discussion of the discrepancy between these results and Study 1 of Brown and Josephs (1999), see Newman, Josephs, and Brown (2002).
