Abstract
Three preregistered studies (N = 533) investigated the relationship between intellectual humility (IH) and cognitive and emotional empathy. Study 1 (n = 212) revealed a positive association between IH and empathic accuracy (EA), especially toward the outgroup. Study 2 (n = 112) replicated the significant association between IH and EA. Study 3 (n = 209) employed a manipulation to enhance IH to demonstrate causality. We found evidence for an indirect effect, wherein the manipulation increased state IH, which was associated with greater EA. A mini meta-analysis revealed that, on average, individuals with higher levels of IH exhibit increased EA, showing a greater understanding of others’ emotional states. Moreover, IH predicts empathic resilience—buffering against personal distress while maintaining or increasing empathic concern for others. These findings highlight the positive influence of IH on empathy, emphasizing its potential for fostering deeper connections and better understanding in social interactions.
Empathy, our ability to understand and share the feelings of others, is hard work. Ample evidence highlights that although we may possess the natural ability to empathize, we do not always do so (Cameron et al., 2019; Keysers & Gazzola, 2014). Because the act of empathizing may be costly, we do not empathize with everyone but rather choose whom to empathize with (Bloom, 2017b; Depow et al., 2021; Weisz & Zaki, 2018). Moreover, we tend to empathize with others who resemble us (Ferguson et al., 2020), favoring people from our ingroup (Bloom, 2017b) rather than an outgroup, especially in contexts of tension and conflict (Bloom, 2017a; Scarry, 1996). In light of the recent escalation in polarization and conflict around the world, the concept of intellectual humility (IH)—recognizing that one’s views and beliefs may be fallible or incorrect (Leary et al., 2017)—emerges as a potential tool for fostering empathy among individuals with differing viewpoints and backgrounds. Specifically, the social-oil hypothesis posits that “humility helps reduce relational wear and tear in situations in which conflict is highly likely,” and that humility could “buffer a relationship from deterioration” and “promote relational-repair behaviors” (Van Tongeren et al., 2019, p. 2). We draw on the social-oil hypothesis and test IH as a predictor of cognitive and emotional facets of empathy in three preregistered studies in interpersonal situations of complex intergroup dynamics.
On Intellectual Humility and Empathy
IH regulates one’s presentation of ideas and beliefs and responses to those of others (Van Tongeren et al., 2019). A growing body of research is dedicated to understanding the significance of IH within the realm of intergroup relations, attitudes, and conflicts. It has been linked to attitude change (Rodriguez et al., 2019), forgiveness in conflicts (Zhang et al., 2015), tolerance of different religions and political views (Hook et al., 2017; Krumrei-Mancuso & Newman, 2021; Porter & Schumann, 2018), affective polarization (Knöchelmann & Cohrs, 2025), positive attitudes toward outgroups (Bowes et al., 2022; Brienza et al., 2021), and investigative behaviors (Koetke et al., 2023).
Despite these findings, the current understanding of the effects of IH on interpersonal factors is incomplete, especially in regard to interpersonal factors within intergroup relations (Knöchelmann & Cohrs, 2025; McElroy-Heltzel & Battaly, 2023). Research demonstrates the connection between IH and closeness and positivity toward a partner following an interpersonal conflict (Peetz & Grossmann, 2021), gratitude, altruism, benevolence, universalism, and self-reported dispositional-empathy dimensions such as perspective taking (PT) and empathic concern (EC; Krumrei-Mancuso, 2017, 2018). We extend these findings by investigating how IH influences cognitive and emotional components of empathy in interpersonal situations of complex intergroup dynamics. Specifically, we examine whether IH predicts empathic accuracy (EA)—the extent to which one objectively recognizes another’s emotional states (Ickes, 1997; Zaki et al., 2009)—and whether IH promotes emotional empathy by maintaining or increasing EC (e.g., sympathy, compassion) while acting as a buffer against relationship deterioration by managing personal distress (PD; e.g., concern, uneasiness). We refer to the latter as “empathic resilience.” We test these questions in the context of how people react to individuals from their ingroup and their outgroup.
Empathy has various definitions in the literature (Batson, 2009). One widely accepted conceptualization distinguishes between cognitive empathy, or mentalizing, which involves recognizing and understanding others’ emotions and feelings, and emotional empathy, or experience sharing, which entails sharing others’ emotions while maintaining a distinction between self and other (Zaki & Ochsner, 2012). Below, we explore these components to develop our hypotheses.
Cognitive Empathy
Cognitive empathy is “the ability to consciously put oneself into the mind of another and understand what that person is thinking or feeling” (Decety, 2015, p. 1). EA refers to a group of paradigms that capture cognitive empathic abilities by measuring the degree of correspondence between an observer’s inferences about a social target’s emotions and the target’s actual reported emotions (Levenson & Ruef, 1992; Schmid Mast & Ickes, 2007; Zaki & Ochsner, 2011). EA is linked to various positive outcomes (Riediger & Blanke, 2020), including a child’s self-concept (Crosby, 2001), fewer adjustment problems (Gleason et al., 2009), higher marital satisfaction (Maneta et al., 2015), greater support within couples (Verhofstadt et al., 2008), and communication satisfaction (Blanke et al., 2016). Low levels of EA are associated with more negative views of the outgroup (Hasson et al., 2019). Therefore, finding ways to promote EA is crucial.
Research suggests that EA inferences can be derived from various sources of information, including the perceiver (e.g., mental representations, stereotypes), the target (e.g., verbal communication), and the relationship between the target and perceiver (Zaki et al., 2008). Also, using mental representations such as perceiver stereotypes can improve EA (Lewis et al., 2012), as can paying attention to what the target is saying (Hodges & Kezer, 2021) or asking them for feedback (Eyal et al., 2018; Israelashvili & Perry, 2021). Ma-Kellams and Lerner (2016) found that individuals who engage in systematic thinking tend to exhibit higher levels of EA than those who rely on intuitive thinking. Systematic thinkers carefully analyze the information provided by the target, rather than relying on intuitive information from preexisting schemas and self-representations. Notably, systematic thinking has been linked to IH (Samuelson & Church, 2015). Also, intellectually humble individuals are theorized to look beyond schemas and heuristics like confirmation bias (i.e., searching for information that confirms one’s preconceptions; Porter, Elnakouri, et al., 2022) and are thus expected to be more accurate in understanding others’ emotions. Therefore, we hypothesize that IH predicts EA.
Because individuals high in IH are more likely to seek out and learn new information beyond their natural biases, even when it challenges their preexisting beliefs, we further hypothesize that they would be more empathically accurate, particularly with individuals from an outgroup. That is, the relationship between IH and EA will be moderated by group belonging, such that the effect will be strongest toward the outgroup.
Emotional Empathy
Emotional empathy describes the overall emotional reactions of the empathizer (Davis, 1983). These include empathic concern (EC)—“other-oriented” feelings of sympathy and care—often leading to approach behaviors as well as personal distress (PD). The latter is a self-focused emotional response to the discomfort experienced by others, which, when felt at high levels, may lead to avoidance and withdrawal (Jordan et al., 2016). The social-oil hypothesis of IH proposes a potential interplay between EC and PD. According to this hypothesis, “a key function of IH is to prevent relational wear-and-tear, like oil prevents an engine from overheating” (McElroy et al., 2014, p. 21).
Humility involves the internal regulation of the self in relation to both internal and external forces. It requires a balanced recognition of one’s strengths and weaknesses, maintaining self-worth that is neither inflated nor diminished. Humility is not self-deprecation but rather an authentic acknowledgment of personal capabilities and limitations, alongside a genuine appreciation for the contributions and perspectives of others (Owens et al., 2013; Tangney, 2000; Van Tongeren et al., 2019). This perspective also includes recognizing one’s smallness within the broader context of existence, which enables humble individuals to regulate and alleviate existential anxiety (Kesebir, 2014). Such humility-driven self-regulation further supports managing behaviors and responses in relationships, promoting the development and maintenance of enduring interpersonal connections (Van Tongeren et al., 2019). Similarly, IH involves regulating beliefs and thoughts in relation to oneself and others (Porter, Elnakouri, et al., 2022). Individuals high in IH can distance themselves from upsetting situations, maintaining an objective perspective and not becoming overwhelmed by distress (Kross & Grossmann, 2012).
Since PD is a self-focused emotional response to others’ discomfort (Jordan et al., 2016), IH allows individuals to regulate these responses. As a result, those high in IH are less likely to become self-focused in such situations and are more emotionally available to provide support. This suggests that they may be more likely to respond with EC rather than PD compared to individuals low in IH. Thus, we predict an interaction between IH and emotional responses. Specifically, individuals higher in IH are expected to exhibit similar or greater EC following others’ emotional stories compared to those lower in IH. While PD may also increase for individuals high in IH, their increase in EC is anticipated to outweigh any rise in PD. In other words, even if PD does not remain similar or decrease, the key distinction lies in the greater relative increase in EC, which ultimately enhances their capacity to respond supportively. That is, IH predicts what we refer to as empathic resilience. In essence, empathic resilience can emerge from various combinations of IH, EC, and PD. Specifically, IH might predict higher EC and lower PD, no change in EC but a decrease in PD, or even a slight increase in PD accompanied by a larger increase in EC. See Figure 1 for a depiction of this prediction.

Figure 1. Depiction of Empathic Resilience.
Overview of Studies
To test our hypotheses, we used an ecological measure—the EA task—in which Jewish Israeli participants were shown videos of autobiographical emotional stories of both Jewish and Palestinian Israelis (Jospe et al., 2020; Kassem et al., 2024). Importantly, Jewish Israelis represent the majority of Israel’s population, comprising approximately 80% of the total, whereas Palestinians constitute a disadvantaged minority (Daoud et al., 2018; Lewin-Epstein & Semyonov, 2019). By comparing the participants’ evaluation of the storytellers’ emotional states to those previously rated by the storytellers, we extracted measures of EA. We asked participants about their own emotional responses following each video and extracted measures of EC and PD. To measure IH, we used standard self-report scales. Specifically, we conducted three preregistered laboratory studies. Studies 1 and 2 employed a within-subject design, and Study 3 used a between-subjects design with an IH manipulation. In all studies, participants were offered course credit or monetary compensation. We report all manipulations, measures, and exclusions in these studies and in the Supplemental Material. The data, codes, and Supplemental Material for all studies are available at https://osf.io/fzcgx/?view_only=df5880e021414933a1d5c8e54ef5e79e.
Study 1
Sample and Procedure
This study was preregistered at https://aspredicted.org/927c-tr5k.pdf. It was designed as a semi-exploratory study aiming to recruit at least 100 females and 100 males who identified as Jews and were native Hebrew speakers, to have sufficient statistical power (90%) to detect a medium effect size (Faul et al., 2007). Ultimately, we were able to recruit 232 participants, with 50.6% self-reporting as females.
Initially, we recruited students through the university experiment system or flyers distributed by research assistants to take part in the study, which was conducted in a laboratory on campus from November 2021 to December 2021. However, the campus was closed approximately 2 months later due to a surge in COVID-19 cases. Consequently, we moved the study online and recruited participants through the university experiment system and Facebook groups dedicated to people, mostly students, looking to participate in studies. This time, the study was run via Zoom. Despite this shift, we aimed to maintain the advantages of the lab setting by having a research assistant deliver identical instructions via Zoom. To standardize our exclusion criteria across the three studies, we excluded 12 participants who failed two or more attention checks, 2 who showed low motivation to participate (e.g., opened another window to read the news while filling out the survey), and 6 who reported a language other than Hebrew as their native language. This resulted in a final sample of 212 participants. Of these, 114 were students who participated in person in the lab (61.4% women; mean age = 24.05 years, SD = 3.47, range = 18–45 years), and 98 participated on Zoom (87.7% students; 37.8% women; mean age = 25.7 years, SD = 3.94, range = 18–38 years). Because we used a repeated-measures design (long format = 8 videos × 212 participants = 1,696 observations), we also removed 17 observations with no data on the continuous measure of EA, 25 outlier observations exceeding 3 SDs on the dependent variables (as preregistered), 15 trials for which participants reported that the video got stuck, and 3 trials in which participants reported knowing the target, yielding a final sample of 1,636 observations.
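The sample-size bookkeeping above can be checked with a short sketch. It uses only the counts reported in this section; the variable names and dictionary labels are ours and purely illustrative.

```python
# Study 1 sample bookkeeping, using only counts reported in the text.
recruited = 232

# Participant-level exclusions (labels are shorthand, not the authors' wording).
participant_exclusions = {
    "failed two or more attention checks": 12,
    "low motivation to participate": 2,
    "non-Hebrew native language": 6,
}
final_participants = recruited - sum(participant_exclusions.values())

# Long format: one row per participant per video.
videos_per_participant = 8
long_format_rows = videos_per_participant * final_participants

# Observation-level exclusions.
observation_exclusions = {
    "no data on continuous EA": 17,
    "outliers beyond 3 SDs": 25,
    "video got stuck": 15,
    "participant knew the target": 3,
}
final_observations = long_format_rows - sum(observation_exclusions.values())

print(final_participants, long_format_rows, final_observations)
```

The three printed values should match the 212 participants, 1,696 long-format rows, and 1,636 final observations reported above.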
Participants watched a total of eight randomized videos from the Israeli Empathic Accuracy Stimuli Set (Jospe et al., 2020, 2022), followed by additional measures described below. We obtained approval for all studies from the Ethics Committee at our institution.
Measures
Unless specified otherwise, in all studies, following recommendations by Aguinis et al. (2009) for more granularity in measurement scales, we presented the measures with a Likert-type scale ranging from 0 = do not agree at all to 10 = strongly agree. Except for the EA task that was originally developed in Hebrew, all scales were translated to Hebrew using the back-translation procedure (Brislin, 1970). Specifically, one of the coauthors translated these items into Hebrew, and a bilingual translator (an English major and a graduate of an Ivy League school) back-translated them to English. Minor discrepancies were deemed immaterial by a second coauthor.
Empathic Accuracy
We used the Israeli Empathic Accuracy Stimuli Set and paradigm (Jospe et al., 2020; Kassem et al., 2024). Participants were presented with a total of eight videos—four featuring female Palestinian Israelis and four featuring female Jewish Israelis. These videos showcased individuals recounting autobiographical emotional stories in Hebrew, and their order of presentation was randomized.
Participants were first asked to infer the emotional state of the targets at each moment of the videos, using a continuous rating slider from very negative to very positive, which resulted in a continuous rating. EA was determined by correlating these participant ratings with the self-reported ratings provided by the targets at a previous stage (for more details, see Jospe et al., 2020). We refer to this as the continuous EA measure.
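As a rough illustration of the continuous EA measure, the sketch below computes a Pearson correlation between a perceiver's moment-by-moment ratings and the target's own ratings for one video. It assumes the two series are already aligned on the same time points; the function name and the example values are hypothetical, and the actual preprocessing is described in Jospe et al. (2020), not here.

```python
from math import sqrt


def continuous_ea(perceiver_ratings, target_ratings):
    """Pearson correlation between two equally long, time-aligned rating series."""
    if len(perceiver_ratings) != len(target_ratings):
        raise ValueError("rating series must have the same length")
    n = len(perceiver_ratings)
    if n < 2:
        raise ValueError("need at least two time points")
    mean_p = sum(perceiver_ratings) / n
    mean_t = sum(target_ratings) / n
    cov = sum((p - mean_p) * (t - mean_t)
              for p, t in zip(perceiver_ratings, target_ratings))
    var_p = sum((p - mean_p) ** 2 for p in perceiver_ratings)
    var_t = sum((t - mean_t) ** 2 for t in target_ratings)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_p * var_t)


# Hypothetical slider values sampled at five moments of one video.
target = [2.0, 3.0, 5.0, 4.0, 1.0]
perceiver = [1.0, 3.0, 4.0, 4.0, 2.0]
ea_score = continuous_ea(perceiver, target)
```

A score near 1 means the perceiver tracked the rises and falls of the target's reported affect closely; a score near 0 means little correspondence.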
After each video, participants were asked to rate eight specific emotions of the target on a scale of 1 to 9: embarrassment, anger, sadness, happiness, disgust, pride, fear, and excitement. We measured the discrepancy between the target’s reported emotion and the participant’s rating to calculate EA. These ratings were then reversed, with a maximum accuracy score of 8 points and a minimum accuracy score of 0 points for each emotion. The total accuracy score for all eight emotions ranged from 0 to 64. We transformed this scale to a range of 0 to 100 for better comprehension. We refer to this as the specific-emotions EA measure.
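The scoring steps for the specific-emotions measure can be sketched as follows. This is a minimal illustration that assumes the discrepancy is the absolute difference between the two ratings and that the 0 to 64 total is rescaled linearly to 0 to 100; neither detail is spelled out in the text, and the function name and example ratings are hypothetical.

```python
EMOTIONS = (
    "embarrassment", "anger", "sadness", "happiness",
    "disgust", "pride", "fear", "excitement",
)


def specific_emotions_ea(target, participant):
    """Score one video: per-emotion accuracy 0-8, summed (0-64), rescaled to 0-100."""
    total = 0
    for emotion in EMOTIONS:
        t, p = target[emotion], participant[emotion]
        if not (1 <= t <= 9 and 1 <= p <= 9):
            raise ValueError(f"ratings for {emotion} must be on the 1-9 scale")
        discrepancy = abs(t - p)   # assumed: absolute difference, 0 to 8
        total += 8 - discrepancy   # reversed so that higher means more accurate
    return total * 100 / 64        # assumed: linear rescaling to 0-100


# Hypothetical ratings: the participant is off by 4 points on sadness only.
target_ratings = {e: 5 for e in EMOTIONS}
participant_ratings = dict(target_ratings, sadness=9)
score = specific_emotions_ea(target_ratings, participant_ratings)
print(score)  # 93.75
```

Under these assumptions, a participant who matches the target on every emotion scores 100, and one who is maximally wrong on every emotion scores 0.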
Situational PD and EC
After each video, we implemented a measure developed by Israelashvili et al. (2020) to assess participants’ emotional responses to the stories they just watched. Participants were presented with the question “How do you feel in response to the story you just heard?” followed by four items measuring PD (e.g., “upset” and “distressed”) and four items measuring EC (e.g., “sympathetic” and “compassionate”) in a randomized order. Reliabilities were α = .88 and α = .86, respectively.
Intellectual Humility
The IH Scale
We used the six-item scale developed by Leary et al. (2017), α = .82. Sample items include “I question my own opinions, positions, and viewpoints because they could be wrong” and “I reconsider my opinions when presented with new evidence.” We refer to this scale as the short IH measure.
The Comprehensive IH Scale
We used the 22-item scale developed by Krumrei-Mancuso and Rouse (2016). It has four subscales, including independence of intellect and ego, α = .89 (reverse-coded; e.g., “When someone disagrees with ideas that are important to me, it feels as though I’m being attacked”), openness to revising one’s viewpoints, α = .83 (e.g., “I am open to revising my important beliefs in the face of new information”), respect for others’ viewpoints, α = .85 (e.g., “I can respect others, even if I disagree with them in important ways”), and lack of intellectual overconfidence, α = .70 (reverse-coded; e.g., “My ideas are usually better than other people’s ideas”). The overall scale reliability was α = .83. We refer to this scale as the long IH measure.
Control Measures
Interpersonal Reactivity Index
To measure trait empathy, we used a measure developed by Davis (1983). The Interpersonal Reactivity Index includes four subscales: PT, α = .74 (e.g., “I try to look at everybody’s side of a disagreement before I make a decision”), fantasy, α = .77 (e.g., “I daydream and fantasize, with some regularity, about things that might happen to me”), EC, α = .71 (e.g., “Sometimes I don’t feel very sorry for other people when they are having problems”), and PD, α = .73 (e.g., “I sometimes feel helpless when I am in the middle of a very emotional situation”). Reliability of the overall scale was α = .80. We presented the measures with a Likert-type scale ranging from 1 = does not describe me at all to 5 = describes me very well.
Ten-Item Personality Inventory
To measure the Big Five, we used a measure developed by Gosling et al. (2003). Each item includes a pair of expressions representing one specific personality domain. Two items assess each personality domain. For example, sample items included “open to new experiences, complex” to measure openness to experience, and “dependable, self-disciplined” to measure conscientiousness. Participants rated their answers on a Likert-type scale ranging from 1 = do not agree at all to 7 = strongly agree. Reliability was poor: extraversion: α = .67; agreeableness: α = .31; conscientiousness: α = .72; emotional stability: α = .65; and openness to experience: α = .42.
We also collected gender, age, political orientation, religiosity, openness toward Palestinians, equality support, and learning (the last three are detailed in the Supplemental Material).
Results
In the current study and subsequent studies, we performed all statistical analyses using R software (R Core Team, 2020, Version 4.3.1). Table 1 shows the means, standard deviations (SDs), and between-subjects correlations of the main study variables.
Table 1. Means, Standard Deviations, and Correlations of Study 1 Variables.
Note. N = 212.
**p < .01. *p < .05. †Marginally significant.
We conducted an independent-samples t test to assess whether the change in location had any impact on participants’ IH ratings. The results indicated no significant difference in ratings between the two locations (lab vs. online Zoom) for either the short IH measure, p = .895, or the long IH measure, p = .213. We also analyzed each sample separately and observed similar patterns consistent with our hypotheses, as detailed in the Supplemental Material (see Tables S6–S9 and Figures S2–S5). The two IH measures were substantially correlated but did not converge, r = .52, 95% confidence interval, CI [.42, .62], p < .001. The two EA measures were not correlated, r = −.04, 95% CI [−.18, .09], p = .540, suggesting they assess different aspects of accuracy. The correlations between the EA and IH measures were in the expected direction but mostly not significant. The continuous measure of EA was not correlated with the short IH measure, r = .01, 95% CI [−.12, .14], p = .891, or with the long IH measure, r = .01, 95% CI [−.13, .14], p = .903. The specific-emotions measure of EA correlated r = .08, 95% CI [−.05, .22], p = .224, with the short IH measure and r = .17, 95% CI [.04, .30], p = .012, with the long IH measure, hinting at partial support for our hypothesis regarding the association between IH and EA.
To test the convergent and discriminant validity of the two IH measures, we ran a factor analysis of the short IH with the four subdomains of the long IH using Promax rotation with the package nFactors (Raiche et al., 2020). Using the n_factors function in the package psycho (Makowski et al., 2019), we found that the optimal number of factors was two: the short IH and the two long IH subdomains “openness to revising one’s viewpoints” and “respect for others’ viewpoints” loaded on one factor, whereas the other two long IH subdomains (i.e., “independence of intellect and ego” and “lack of intellectual overconfidence”) loaded on a separate factor (see Table S1 in the Supplemental Material for factor loadings).
Given the strong correlations between the short IH and the two mentioned long IH subdomains (i.e., rs = .79, p < .001, and .57, p < .001, respectively) and their loading on the same factor, we aggregated them for a more parsimonious analysis. In addition, the short IH and the two long IH subdomains capture the more relational aspect of IH (Porter, Baldwin, et al., 2022), which is relevant to our theoretical questions about IH and empathy. Notably, in the original study that validated the long IH measure, the first two subdomains showed higher factor loadings on the IH factor (loadings = .794, .988), whereas the other two subdomains, which loaded separately here, exhibited weaker loadings (loadings = .487, .372; Krumrei-Mancuso & Rouse, 2016, p. 215). This suggests that the first two subdomains align more closely with the essence of IH. This aggregation was not preregistered; however, in the Supplemental Material (Tables S2 and S3), we report the same analyses with the short IH and long IH scales in their original configuration. In addition, we report a mini meta-analysis comparing all IH measures in their original configuration across all studies to provide a generalizable summary of the results.
To assess the contribution of IH and group belonging to EA as the dependent variable (DV), we ran a mixed-effects model in which IH, group belonging of the target, and the interaction between them were the fixed effects, and video ID and participant ID were the random effects. To test each predictor’s contribution to the model, we used a four-step hierarchical approach. The first step was the null model, in which the only factors entered were video ID and participant ID. IH was added in the second model, group belonging of the target in the third, and the interaction between them in the fourth (see Table 2 for the model comparisons). Across studies, we interpreted the parameters of the most complex model that significantly improved goodness of fit. In some cases, we also interpreted the simpler models when our hypotheses warranted it.
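One way to write out the fourth-step model is sketched below. The notation is ours and illustrative: i indexes participants, j indexes videos, and G_j denotes the group belonging of the target in video j. The sketch assumes random intercepts only and normally distributed random effects, which the text does not state explicitly.

```latex
% Step 4 (most complex) model; notation is illustrative, not taken from the paper.
\begin{align*}
\mathrm{EA}_{ij} &= \beta_0 + \beta_1\,\mathrm{IH}_i + \beta_2\,G_j
  + \beta_3\,(\mathrm{IH}_i \times G_j) + u_i + v_j + \varepsilon_{ij},\\
u_i &\sim \mathcal{N}(0,\sigma_u^2),\quad
v_j \sim \mathcal{N}(0,\sigma_v^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0,\sigma^2).
\end{align*}
% Step 1 (null): \beta_1 = \beta_2 = \beta_3 = 0, random intercepts u_i and v_j only
% Step 2: adds \beta_1 (IH); Step 3: adds \beta_2 (group); Step 4: adds \beta_3 (interaction)
```

Each step adds one fixed-effect term to the previous model, so adjacent models are nested and can be compared on goodness of fit.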
Table 2. Study 1, Model Comparison Assessing the Contribution of IH and Group Belonging to EA Measures.
Note. N = 212.
***p < .001. **p < .01. *p < .05. †Marginally significant.
We first tested the models with IH predicting the specific-emotions measure of EA. In line with our prediction, the best-fitting model in the four-step hierarchical comparison included IH, group belonging, and the interaction between them. We found a significant main effect of IH on EA, β = .08, p = .001, and a significant interaction between IH and group belonging, β = −.12, p < .001 (see Figure 2), suggesting that higher levels of IH predict EA toward the outgroup but not the ingroup. A simple slope analysis using the reghelper package (Hughes et al., 2022) indicated a significant positive association between IH and state EA for the outgroup, t = 3.28, p = .001, and a marginal negative association for the ingroup, t = 0.48, p = .058. We did not replicate this finding with the continuous measure of EA as the DV and even found a marginally significant interaction suggesting that IH predicts EA toward the ingroup and not the outgroup (see Tables S4 and S5 in the Supplemental Material).

Figure 2. Study 1, Interaction Between IH and Target Group Belonging, Predicting Empathic Accuracy.
Consistent with our hypothesis and preregistration, we expected a positive correlation between IH and state EC, along with a negative correlation between IH and state PD. The short IH measure showed a marginally significant positive correlation with state EC, r = .13, 95% CI [−.00, .26], p = .059, whereas its correlation with PD was not significant, r = .10, 95% CI [−.03, .23], p = .145. The long IH measure was negatively correlated with state EC, r = −.15, 95% CI [−.28, −.01], p = .029, and even more strongly negatively correlated with PD, r = −.25, 95% CI [−.37, −.12], p < .001. This pattern suggests that IH relates differently to EC and PD.
To explore these effects, we conducted an additional non-preregistered analysis. We again employed a mixed model and incorporated IH and the type of affective measure (EC or PD) as fixed effects, while participant ID and video factors were treated as random effects (see Table 3 for model comparison). The best-fitting model was once again the most complex one that included both IH and the interaction term.
Table 3. Study 1, Model Comparison Assessing the Contribution of IH and Type of Measure to Affective Measure Value.
Note. N = 212.
***p < .001. **p < .01. †Marginally significant.
In this model, consistent with our predictions, we observed a significant interaction between IH and type of affective measure, β = −.05, 95% CI [−.21, −.03], p = .009. In addition, a simple slope analysis indicated a significant positive association between IH and state EC, t = 2.42, p = .016, and a nonsignificant association with PD, t = 1.11, p = .268 (see Figure 3). That is, individuals with higher levels of IH tend to experience greater EC, while the impact on PD is comparatively weaker. The increased gap between EC and PD represents what we refer to as empathic resilience, supporting our hypothesis.

Figure 3. Study 1, Interaction Between IH and Type of Affective Measure (Concern/Distress), Predicting Affective Measure Value.
To further probe our question, we repeated the same analysis and model comparison with the IH measure residualized from constructs that may serve as alternative explanations to the observed effect. The interaction between IH, PD, and EC held when controlling for the measures we preregistered for exploratory purposes. Specifically, we separately residualized openness to experience, emotional stability, trait EC, and trait PD from IH and ran the four-step model comparison with each. In all cases, the model with the best fit was the most complex one that included the interaction between IH and type of measure. In this model, both the main effect of IH and the interaction between IH and type of measure were significant (see Table 4). Importantly, the significant interactions consistently revealed the same pattern: IH enhances empathic resilience by widening the gap between PD and EC.
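The residualization step can be illustrated with a minimal sketch: regress IH on a single covariate by ordinary least squares and keep the residuals as the "IH net of that covariate" predictor. This assumes a simple linear regression with one covariate at a time, which matches the "separately residualized" wording but is otherwise our reading; the function name and the data values are hypothetical.

```python
def residualize(y, x):
    """Return residuals of y after simple OLS regression on x (with intercept)."""
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("covariate has no variance")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical participant-level scores (not data from the study).
ih_scores = [6.0, 7.5, 5.0, 8.0, 6.5]
openness = [4.0, 6.0, 3.5, 6.5, 5.0]

# IH with the linear contribution of openness to experience removed.
ih_residualized = residualize(ih_scores, openness)
```

By construction, the residualized scores have a mean of zero and are uncorrelated with the covariate, so any remaining association with EC or PD cannot be attributed to a linear effect of that covariate.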
Table 4. Study 1, Multilevel Models Regressing Affective Measure Value (State Empathic Concern/Personal Distress) on Type of Measure (Concern/Distress) and IH, Where Type of Measure and IH Are the Fixed Effects, and Participant ID and Video ID Are the Random Effects.
Note. N = 212. OTE = openness to experience; ES = emotional stability; trait EC and PD = trait empathic concern and personal distress.
Discussion
In summary, in support of our hypotheses, Study 1 provided preliminary evidence of an association between IH and EA, particularly toward an outgroup. In addition, as predicted, IH was found to moderate the affective responses to others. Specifically, higher levels of IH were related to increased EC, whereas the impact on PD was weaker. These effects remained significant even after accounting for trait PD and EC, openness to experience, and emotional stability. In Study 2, we aimed to replicate these findings using a different research design and to explore a potential mechanism underlying the relationship between IH and EA.
Study 2
Sample and Procedure
This study was preregistered at https://aspredicted.org/7fj5-s5r6.pdf. Given that in Study 1 the interaction between IH, group belonging, and EA was significant after approximately 100 participants, before moving from the lab to the Zoom setting, we predicted that we would need at least 100 participants to reach significance. We recruited 134 students to participate in our study in person in the lab. We again recruited the students through the university’s online experiment system and by distributing flyers around campus. All participants identified as Jews and were native Hebrew speakers; 66.4% were female, their average age was 24.54 years, SD = 2.84, and their age range was 19 to 42 years. Participants arrived at the lab to take part in the study for course credit or monetary compensation. We excluded eight participants who later did not identify as Jewish, three who reported a language other than Hebrew as their native language, six who had previously participated in Study 1, and four who failed two or more attention checks. Consequently, the sample consisted of 113 participants. Because we used a repeated-measures design (long format = 6 videos × 113 participants = 678 observations), we excluded 64 observations with missing data on the continuous EA measure and 9 outlier observations that exceeded 3 SDs on the dependent variables, in accordance with the preregistration. We also excluded 9 observations for which participants reported that a specific video malfunctioned, yielding a final sample of 596 observations nested within 112 participants. (Following the exclusion of the above-mentioned trials, one additional participant was removed because the entry was a test run, yielding a final sample of 112 participants.) A sample size of 112 participants is sufficient to detect a medium to strong effect size (d = 0.62, 90% power, α = .05, two-tailed). The procedure for this study was similar to that of Study 1, with several modifications made to shorten the duration.
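The stated sensitivity figure can be roughly sanity-checked with a normal approximation. The sketch below assumes a two-tailed comparison of two independent groups of 56 participants each; that test and group split are our assumptions for illustration only, because the text does not say which test the sensitivity analysis was based on. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(d, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-tailed, two-independent-groups test.

    Ignores the negligible probability of rejecting in the wrong tail.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    noncentrality = d * sqrt(n_per_group / 2)
    return NormalDist().cdf(noncentrality - z_crit)


# Assumed split of the 112 participants into two equal groups.
power = approx_power_two_groups(d=0.62, n_per_group=56)
```

Under these assumptions the approximation lands close to the reported 90% power; an exact calculation based on the noncentral t distribution would give a slightly lower value.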
We removed two videos (one Jewish and one Palestinian) and selected the six most effective videos that demonstrated both variance between participants and predictive validity between IH and empathy, assessed IH using only the short measure, and introduced a task to measure motivation for empathy as a potential mediator. The order of the six videos was randomized. To prevent potential priming effects, half of the participants completed the motivation-for-empathy task at the beginning of the questionnaire, and the other half completed it at the end.
Measures
Empathic Accuracy
We used the same two measures as in Study 1.
Situational PD and EC
We used the same measures as in Study 1, α = .87 and α = .86, respectively.
IH Scale
We used the same short IH measure as in Study 1, α = .82.
Motivation for Empathy
Motivation for empathy was assessed using a task developed by Cameron et al. (2019) in which our participants were presented with a choice between two decks of cards: one deck representing “feel self” and the other representing “feel other.” After making a choice, participants saw an image and evaluated how it made them or another person feel. This process was repeated 40 times. The total number of times participants opted to “feel other” was used as an indicator of their level of motivation for empathy.
Results
Table 5 presents correlations and SDs of the study’s variables.
Means, Standard Deviations, and Correlations of Study 2 Variables.
Note. N = 112.
**p < .01.
The two measures of EA were not correlated, as in Study 1, indicating they are measuring different aspects of EA, r = .09, 95% CI [−.10, .27], p = .355. IH was correlated with the continuous measure of EA, r = .28, 95% CI [.10, .44], p = .003, but not with the specific-emotions measure, r = −.08, 95% CI [−.26, .11], p = .394, partially supporting our prediction about the association between IH and EA. Similar to Study 1, we ran a four-step model comparison to test our hypotheses. When running this analysis with IH and group belonging to predict the continuous measure of EA, we found that only the model predicting EA with IH alone showed significantly better fit than the more complex models (see Table 6 for detailed model comparison).
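The confidence intervals around these correlations are consistent with the standard Fisher z-transformation. The sketch below is an illustration under that assumption; the text does not state which method was used, and `pearson_ci` is a name introduced here.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Approximate CI for a Pearson correlation via Fisher's z-transformation."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    z = atanh(r)              # transform r to the z scale
    se = 1 / sqrt(n - 3)      # standard error on the z scale
    return tanh(z - z_crit * se), tanh(z + z_crit * se)


# IH and the continuous EA measure in Study 2: r = .28, N = 112.
lo, hi = pearson_ci(0.28, 112)
print(f"[{lo:.2f}, {hi:.2f}]")  # [0.10, 0.44]
```

Because the interval is built on the z scale and transformed back, it is slightly asymmetric around r, which matches the reported bounds.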
Study 2, Model Comparison Assessing the Contribution of IH and Group Belonging to EA Measures.
Note. N = 112.
**p < .01.
Indeed, in this model, IH predicted EA significantly, β = .15, 95% CI [.05, .24], p = .002. In other words, IH predicts EA, but the association for outgroup targets is not significantly stronger than that for ingroup targets. When we ran the same model comparison predicting the specific-emotions measure of EA, none of the models differed significantly, suggesting that none of the other predictor variables significantly predicted the specific-emotions measure of EA.
In the present study, we did not preregister the hypothesis regarding IH predicting state EC or distress. Nevertheless, an exploratory analysis replicated the results of Study 1 whereby IH predicts empathic resilience. Specifically, we followed the same analytic approach as in Study 1, which showed that the best-fitting model was the most complex one (see Table 7), and we found a significant interaction between IH and type of measure (EC/distress), β = −.08, 95% CI [−.14, −.01], p = .020, such that the gap between the values of the two measures increased with IH (see Figure 4).
Study 2, Model Comparison Assessing the Contribution of IH and Type of Measure to Affective Measure Value.
Note. N = 112.
***p < .001. *p < .05.

Study 2, Interaction Between IH and Type of Affective Measure (Concern/Distress), Predicting Affective Measure Value.
A simple slope analysis showed that the significant interaction between IH and type of measure predicting empathic resilience stems from a positive but not significant association between IH and EC, t = 1.14, p = .256, and no association between IH and PD, t = −0.26, p = .793. This outcome replicates the finding from Study 1, wherein higher levels of IH are associated with greater EC, with the impact on PD less pronounced.
One of the preregistered predictions of Study 2 was that motivation for empathy mediates the association between IH and EA. This analysis did not yield any significant results regarding this prediction (see the Supplemental Material).
Discussion
In Study 2, we found a significant correlation between IH and the continuous measure of EA, supporting our prediction regarding the relationship between IH and cognitive empathy. No correlation was found between IH and the specific-emotions measure of EA. Importantly, we replicated our previous findings regarding the interaction between IH and empathic resilience. The hypothesized mediation between IH and EA through motivation for empathy was not supported. In the next study, we sought to take the hypotheses from Studies 1 and 2 further, testing the causality between IH and EA, along with the differential effects of IH on PD and EC. We therefore employed a between-subjects design manipulating IH to test causality.
Study 3
Pilot Study
To ensure a robust effect in Study 3, we conducted a pilot study pretesting the manipulation on a convenience sample of native English speakers. We recruited 90 participants through Prolific (Palan & Schitter, 2018) and removed five who failed the attention check, obtaining a final sample of 85 individuals. Participants were randomly assigned to either an IH condition (43 participants) or an intellectual-certainty condition (42 participants). We combined two published manipulations of IH to strengthen our manipulation (see the full description in the Supplemental Material). First, we employed a procedure developed by Porter et al. (2020; Study 5). We asked participants to read a text about IH (or intellectual certainty, as a control condition) as a desired skill at work. Following this was a procedure by Koetke et al. (2023; Study 3) in which participants were asked questions that made them feel either intellectually humble or neutral. Finally, they completed a state humility measure as a manipulation check.
For the state IH measure, we adapted 13 items from three trait IH scales to assess state-level IH (i.e., Krumrei-Mancuso & Rouse, 2016; Leary et al., 2017; Reis et al., 2018), α = .90. See the full scale in the Appendix.
Results
As anticipated, the experimental condition significantly and strongly increased state humility compared to the intellectual-certainty condition, t(83) = 3.55, p < .001, d = .770. These findings boosted confidence in the effectiveness of the manipulation employed in Study 3.
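The reported effect size can be recovered from the t statistic and the two group sizes using the usual conversion for independent groups, d = t·sqrt(1/n1 + 1/n2). The sketch below assumes that conversion; the function name is ours.

```python
from math import sqrt


def cohens_d_from_t(t, n1, n2):
    """Cohen's d for two independent groups, derived from the t statistic."""
    return t * sqrt(1 / n1 + 1 / n2)


# Pilot manipulation check: t(83) = 3.55 with 43 and 42 participants per condition.
d_pilot = cohens_d_from_t(3.55, 43, 42)
print(f"{d_pilot:.2f}")  # 0.77
```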
Main Sample and Procedure
This study was preregistered at https://aspredicted.org/wqmd-srf6.pdf. An a priori power analysis in G*Power (Faul et al., 2007) suggested that we would need at least 172 participants to detect a medium effect of IH on EA (d = 0.50, 90% power, α = .05, two-tailed). Therefore, we aimed to recruit at least 200 participants who identified as Jewish, native Hebrew speakers. We initially recruited a total of 231 students (63% female; average age = 23.74 years, SD = 2.67, range = 18 to 36 years), who came to the lab for course credit or monetary compensation. We excluded 6 who identified as non-Jewish, 11 who did not speak Hebrew as their native language, and 5 who displayed a lack of seriousness by answering two or more attention checks incorrectly, as preregistered, yielding a final sample of 209.
Because we used a repeated-measures design (long format = 4 videos × 209 participants = 836 observations), we also removed 1 observation with missing data on the continuous EA measure and 20 observations of outliers exceeding 3 SDs on the dependent variables (as preregistered), along with 3 observations of participants who reported a technical problem with a video, yielding a final sample of 812 observations. Participants were randomly assigned to either the IH (104) or intellectual-certainty (105) condition, as described above for the pretest. Following the manipulation, participants completed study measures and watched four randomized videos (described above) featuring two Jewish Israeli women and two Palestinian Israeli women. Similar to Study 2, these videos were selected based on the variance they demonstrated and their predictive validity between IH and empathy.
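As a rough cross-check of the a priori power analysis, the required sample size for a two-group comparison can be approximated with the normal-approximation formula n per group = 2·((z₁₋α/₂ + z_power)/d)². This is a sketch under that approximation, not a reproduction of G*Power, which uses the noncentral t distribution and typically asks for about one more participant per group; the function name is ours.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, power=0.90, alpha=0.05):
    """Approximate per-group n for a two-tailed, two-sample comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


approx_total = 2 * n_per_group(0.50)
print(approx_total)  # 170, slightly below the 172 reported from G*Power
```

The same function applied to d = 0.62 returns 55 per group (110 in total), in line with the sensitivity statement for the 112 participants of Study 2.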
Measures
Empathic Accuracy
We employed the same two measures for assessing EA as in Studies 1 and 2: for specific emotions and for continuous EA.
Situational PD and EC
We used the same measures as in Studies 1 and 2; both alphas were .82.
State IH Scale
We used the same 13 items of the state IH measure described in the pretest and added one item from Krumrei-Mancuso and Rouse’s (2016) study: “I am willing to listen to others even if I disagree with them,” α = .79.
Results
The manipulation effectively increased state IH, which was rated significantly higher in the experimental condition compared to the control condition, d = 0.47, t(207) = 3.40, p < .001. Table 8 shows correlations and SDs of the study’s variables.
Means, Standard Deviations, and Correlations With Confidence Intervals of Study 3 Variables.
Note. N = 209.
**p < .01. *p < .05.
The manipulation did not have a significant effect on either measure of EA at the individual level. However, we found a significant positive correlation between state IH and the continuous measure of EA, r = .15, 95% CI [.01, .28], p = .034, providing indirect support for our hypothesis. Furthermore, because a predictor can indirectly affect an outcome even in the absence of a direct effect (Hayes, 2017), we sought to test whether state IH mediated the relationship between the experimental condition and the continuous EA measure. We ran an exploratory mediation analysis using the “mediation” package in R for multilevel data (Tingley et al., 2014) with 5,000 bootstrapped samples. The experimental condition was the predictor, state IH was the mediator, and the continuous EA measure was the outcome. Indeed, there was a significant indirect effect of the experimental condition through state IH, β = .02, 95% CI [.00, .04], p = .034, whereas the direct effect was not significant, β = .00, 95% CI [−.07, .08], p = .976, suggesting that the effect of the experimental condition on EA was mediated by state IH (see Figure 5 and Table 9).
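The logic of an indirect effect with a bootstrapped confidence interval can be illustrated with a deliberately simplified sketch. It is not the authors' analysis: the study used the multilevel “mediation” package in R with 5,000 resamples, whereas the code below is a single-level product-of-coefficients (a × b) estimate on simulated data, with made-up effect sizes, fewer resamples, and function names introduced here.

```python
import random
from statistics import mean


def slope(x, y):
    """OLS slope of y regressed on a single predictor x (path a)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def partial_slope(m, x, y):
    """OLS coefficient of m when y is regressed on both m and x (path b)."""
    mm, mx, my = mean(m), mean(x), mean(y)
    smm = sum((a - mm) ** 2 for a in m)
    sxx = sum((b - mx) ** 2 for b in x)
    smx = sum((a - mm) * (b - mx) for a, b in zip(m, x))
    smy = sum((a - mm) * (c - my) for a, c in zip(m, y))
    sxy = sum((b - mx) * (c - my) for b, c in zip(x, y))
    return (smy * sxx - sxy * smx) / (smm * sxx - smx ** 2)


def indirect_effect(x, m, y):
    """Product-of-coefficients estimate of the indirect effect a * b."""
    return slope(x, m) * partial_slope(m, x, y)


def bootstrap_ci(x, m, y, n_boot=1000, seed=1):
    """Percentile bootstrap 95% CI for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Simulated (hypothetical) data: condition raises state IH, and state IH
# raises accuracy; condition has no direct path to accuracy.
rng = random.Random(0)
n = 300
condition = [i % 2 for i in range(n)]
state_ih = [1.0 * c + rng.gauss(0, 1) for c in condition]
accuracy = [0.5 * m + rng.gauss(0, 1) for m in state_ih]

ab = indirect_effect(condition, state_ih, accuracy)
lo, hi = bootstrap_ci(condition, state_ih, accuracy)
print(f"indirect effect = {ab:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

In this simulation the interval excludes zero even though the outcome depends on the condition only through the mediator, which is the pattern the paragraph above describes.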

Study 3, Mediation Model of the Effect of the Experimental Manipulation on Empathic Accuracy Through State IH.
Study 3, Regression Coefficients for Test of Indirect Effect From the IH Condition, Through State Intellectual Humility, on Empathic Accuracy.
Next, as preregistered, we examined the association between the experimental condition and group belonging (Arab/Jew) in predicting both measures of EA. To do so, we predicted EA with the experimental condition, group belonging of the target, and the interaction between them as fixed effects, and video ID and participant ID as the random effects. None of these models differed significantly from one another (see Table 10). Thus, we concluded that the manipulation and group belonging did not directly affect EA. That is, only when accounting for state IH did the association between IH and EA emerge. These findings replicated the main association between IH and EA, but not the interaction between IH and group belonging.
Study 3, Model Comparison Assessing the Contribution of the Experimental Condition and Group Belonging to EA Measures.
Note. N = 209.
We sought to test whether IH predicted EC and PD. Hence, we again conducted a four-step hierarchical model to examine the effect of IH and type of measure (EC/PD) on affective measure value. We carried out separate analyses for the experimental condition and state IH as predictors. When running the model comparison with the experimental condition, the best model was the third, predicting affective measure value with type of measure (distress/concern) and condition. However, only the type of measure’s coefficient was significant, β = −.97, 95% CI [−1.03, −.91], p < .001, whereas the experimental condition’s coefficient was not, β = −.04, 95% CI [−.18, .10], p = .596, suggesting that the type of measure but not the experimental condition predicts affective measure values in this model.
When conducting the same analysis with state IH instead of the experimental condition, the third model emerged as the best fitting, and the fourth, which was the most complex, differed marginally significantly from the third (see Table 11). This fourth model tested our hypothesis, predicting affective measure value from the type of measure, state IH, and their interaction. The interaction was marginally significant, partially supporting our hypothesis, β = −.06, 95% CI [−.12, .00], p = .059, and replicating the findings of Studies 1 and 2. A simple slope analysis indicated that the marginal interaction stemmed from a negative nonsignificant slope for PD, t = −0.89, p = .375, and a positive nonsignificant association between state IH and EC, t = 0.57, p = .600. Moreover, Figure 6 shows again that state IH marginally predicts empathic resilience: The higher participants’ reported levels of state IH, the larger the gap between EC and PD.
Study 3, Model Comparison Assessing the Contribution of IH and Type of Measure to Affective Measure Value.
Note. N = 209.
***p < .001. †Marginally significant.

Study 3, Interaction Between State IH and Type of Affective Measure (Concern/Distress), Predicting Affective Measure Value.
This interaction differs from previous studies in the independent effects between IH and PD on one hand, and IH and EC on the other. However, considering the overall trend of the interaction between IH and both measures, there is a consistent pattern: Higher levels of IH are associated with maintained or increased EC, with a weaker impact on PD. In the current study, EC remains stable while PD decreases, showing a similar effect on empathic resilience.
To understand the complementary role of the experimental condition on empathic resilience, we conducted an exploratory moderated-mediation analysis between the experimental condition, state IH, and type of measure when predicting affective measure value. To do so, we residualized distress from concern to capture the gap between them (i.e., empathic resilience) and ran a mediation analysis with 5,000 bootstrapped samples (see Figure 7 and Table 12). We found a marginal indirect effect, β = .08, 95% CI [−.00, .19], p = .062, hinting that the IH manipulation indeed affected empathic resilience through state IH. However, if these effects are real, the sample was not sufficiently powered to reach significance. Furthermore, the mediation analysis reveals a suppression effect, highlighting the need for further exploration and clarification.

Study 3, Mediation Model of the Effect of the Experimental Manipulation on Empathic Resilience through State IH.
Study 3, Regression Coefficients for Test of Indirect Effect From the IH Condition, Through State Intellectual Humility, on Empathic Resilience.
Discussion
Study 3 aimed to examine the causal relationship between IH and EA and to demonstrate that IH acts as a buffer against PD, while maintaining EC. We found a significant effect of the manipulation on state IH. The direct effect on EA was not significant, but it was in the expected direction. Notably, an exploratory analysis found an indirect effect between the experimental condition and the continuous EA measure through state IH, suggesting that the manipulation indirectly affects EA through state IH. The absence of a significant direct effect of the manipulation on EA may be due to the manipulation’s insufficient strength, or perhaps the stimuli were effective only at enhancing state IH. Future studies should try to corroborate the effect by, for example, asking participants to reflect on intellectually humble behavior (vs. intellectually certain behavior). Alternatively, perhaps the manipulation of presenting questions while expecting people to give a wrong answer made participants with lower levels of IH angry, thus interfering with their ability to focus on someone else and accurately recognize their emotions.
As in Study 2, we did not replicate the interaction between IH and group belonging when predicting EA. But consistent with the findings of Studies 1 and 2, we found a marginal interaction between state IH and the type of measure in predicting affective measure value, reflecting increased empathic resilience. We also found a marginal indirect effect when testing the effect of the manipulation on empathic resilience through state IH, suggesting that the IH experimental condition increased empathic resilience through state IH. This mediation analysis revealed a suppression effect, suggesting that our predicted effects apply only to those who reported higher levels of state IH, and that some participants were negatively affected by it.
Mini Meta-Analysis
To assess the average relationship between IH and EA across studies and measures, we ran a three-level meta-analysis that accounts for dependency in effect sizes (Van den Noortgate et al., 2013), using the rma.mv function of the package metafor (Viechtbauer, 2010). We found a significant association,

Forest Plot of the Average Meta-Analytic Relationship Between IH and Empathic Accuracy Across All Studies and Measures.
To assess the overall significance of the interaction between IH, type of measure, and affective measure value (i.e., the association between IH and empathic resilience), we conducted a mini meta-analysis of all three studies with each study’s respective beta coefficient.
The average estimate was

Forest Plot of the Average Meta-Analytic Beta Between IH and Type of Measure and Affective Measure Value (i.e., Empathic Resilience).
General Discussion
Our central hypothesis predicted a positive association between IH and EA. We found mixed evidence for this association, with different measures of IH correlating with different measures of EA in each study. However, when employing an internal meta-analysis, we found a weak positive and significant relationship supporting our prediction across all measures and studies. When it comes to objectively understanding other people (especially across deeply divided intergroup identities), even a weak effect is an important one. Furthermore, in Study 3 we found an indirect effect of the manipulation of IH on EA through state IH, lending some support to the causal link between IH and EA: To the degree that state IH is elicited, it is associated with greater EA. Our hypothesis predicting that the association between IH and EA would be moderated by group belonging of the target was supported only in Study 1; therefore, it appears that IH predicts EA generally for both ingroup and outgroup targets, and a differentiation between these ingroup and outgroup targets should be more thoroughly examined.
Building on the social-oil hypothesis, our findings across studies combined with a mini meta-analysis demonstrate that IH increases what we term “empathic resilience”: It serves as a buffer for affective responses toward others. Specifically, individuals higher in IH tend to maintain or even increase their EC while experiencing a reduced impact on PD. Importantly, we found that IH uniquely contributes to empathic resilience even when we controlled for constructs closely related to IH and to empathy outcomes, such as openness to experience and trait empathy. Although we did not find an interaction between the experimental manipulation and EC or PD, we found a marginal indirect effect such that state IH mediated the moderation between the experimental condition and empathic resilience, suggesting that the experimental condition maintains EC and reduces PD only for those reporting an increase in state IH.
Implications
The present article contributes to the literature in several ways. We comprehensively tested the association between IH and empathy—including its cognitive, affective, and motivational aspects—beyond classical self-reports, utilizing rich and naturalistic stimuli of both ingroup (majority) and outgroup (disadvantaged minority) members. Although the observed effect sizes were small, our approach represents a promising step toward a nuanced comprehension of how IH is differentially associated with various components of empathy.
We extend the existing research on EA by proposing IH as an antecedent. Previous studies identified several factors influencing EA, such as feedback (Israelashvili & Perry, 2021), systematic mode of thought versus intuitive (Ma-Kellams & Lerner, 2016), and age (Riediger & Blanke, 2020). Because EA has been shown to yield positive relational outcomes, introducing IH as a novel antecedent provides an avenue for creating interventions that enhance EA by fostering IH.
In addition, we demonstrate that IH, identified as a cognitive attribute, not only influences cognition, as previously shown (Bowes et al., 2022), but also exerts an impact on an individual’s affective responses toward others. Specifically, we establish that IH contributes to the development of empathic resilience, that is, displaying care for others without compromising one’s own well-being. This phenomenon aligns with the social-oil hypothesis.
Limitations and Future Research
We acknowledge several limitations in our study. First, our hypotheses were tested using recorded stimuli featuring individuals who were unfamiliar to the participants. The dynamics might vary when there is a preexisting acquaintance between individuals. For instance, empathic resilience may display different patterns in a romantic relationship or at the workplace or between friends. Hence, future research should examine our hypotheses across diverse acquaintance levels and in live interactions.
Second, our prediction regarding motivation for empathy as the primary mechanism linking IH and EA was not substantiated. This could be due to an inadequate sample size to detect the mediation effect or limitations in our motivation measurement approach; or perhaps it is not motivation but something else that IH facilitates. Future studies should retest this mechanism or consider alternative pathways.
Third, our hypothesis concerning group belonging as a moderator in the relationship between IH and EA was supported only in Study 1 and not in subsequent replications. This might imply that the effect of IH does not strengthen in interactions with outgroup members; rather, it seems to mitigate differences between individuals from diverse backgrounds. In essence, higher levels of IH appear to correlate with increased EA across all groups. This inference aligns with past research indicating that IH acts as a buffer against negative behaviors toward outgroups (Bowes et al., 2020, 2022; Sgambati & Ayduk, 2023). Yet, it is possible that our samples were not sufficiently powered to distinguish between these possibilities. Future studies could test the interaction hypothesis with larger samples and more target stimuli, as well as by manipulating ingroup and outgroup membership with simpler intergroup contexts such as different sports teams and polarized political groups.
Fourth, the mediation analyses in Study 3 suggest that the direct effect of our experimental manipulation on EA is negligible, and even negative on empathic resilience. Support for our hypothesis only emerges when accounting for the indirect effect mediated by state IH. These findings underscore the need to develop additional methods to manipulate IH in a lab setting (and later in the field) to enable a clearer understanding of its causal effect on various interpersonal outcomes.
Fifth, despite the presence of mostly significant interactions between IH and empathic resilience, our samples may have been underpowered to robustly detect interaction effects. We have found that conducting studies with our stimuli in a controlled, in-person lab setting is essential, as collecting quick online samples for such tasks often yields unreliable and invalid data. Consequently, we relied on the available university student pool and designed our studies with careful consideration of the tradeoff between sample availability and statistical power. Our study planning was guided by a power analysis focused on detecting main effects. Nevertheless, we provide a meta-analysis of the interaction effects across all studies and samples. Future research should aim to replicate and extend these findings using larger samples.
Finally, it is possible that the relationship between IH and EA is bidirectional. That is, individuals with a greater capacity for understanding others might also exhibit higher levels of IH. Prior research shows that being relational by listening increases humility (Lehmann et al., 2023). Similarly, greater empathy could lead to higher levels of IH. However, Study 3, where we manipulated IH, obtained evidence suggesting that individuals with elevated state IH also demonstrated greater EA. Nevertheless, further investigation into the alternative direction of this relationship is warranted.
Conclusion
Our results indicate that overall, IH is associated with EA and predicts empathic resilience: sustained or increased EC and reduced PD. We further show that IH increases EA and marginally increases empathic resilience. Our research marks an initial step in unveiling the role IH may play as a route to cultivating empathy and enriching interactions among individuals from diverse backgrounds. Further research is essential to extend these findings to varied relationships and domains.
Supplemental Material
Supplemental material, sj-docx-1-psp-10.1177_01461672241313427 for Intellectual Humility Predicts Empathic Accuracy and Empathic Resilience by Michal Lehmann, Shir Genzer, Nur Kassem, Daryl R. Van Tongeren and Anat Perry in Personality and Social Psychology Bulletin
Footnotes
Appendix
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The preparation of this article was supported by a grant from The John Templeton Foundation (#62265) supporting AP and ML, a grant from the Institute for Israeli Thought (IIT) awarded to ML, and a grant from the Mind and Life Institute awarded to AP.
Supplemental Material
Supplemental material is available online with this article.
Notes
References
