Abstract
Considering the mixed nature of reports of flexibility difficulties in autism, we hypothesized that a task that more closely resembles the challenges faced in real life would help to assess these difficulties. Autistic and typically developing adults performed an online Emotional Shifting Task, involving non-explicit unpredictable shifts of complex socio-emotional stimuli, and the Task Switching Task, involving explicit predictable shifts of simple character stimuli. Switch cost (i.e. the difference in performance between Shift and Non Shift conditions) was larger in the autistic group than in the comparison group for the Emotional Shifting Task but not for the Task Switching Task. Females responded faster than males in the Emotional Shifting Task. On the Task Switching Task, typically developing males responded faster than typically developing females, whereas there was a female advantage in the autistic group. Our findings suggest that factors such as predictability, explicitness of the shift rule, stimulus type as well as sex could play a critical role in flexibility difficulties in autism.
Lay abstract
Flexibility difficulties in autism might be particularly common in complex situations, when shifts (i.e. the switch of attentional resources or strategy according to the situation) are unpredictable, implicit (i.e. not guided by explicit rules) and the stimuli are complex. We analyzed the data of 101 autistic and 145 non-autistic adults, without intellectual deficiency, on two flexibility tasks performed online. The first task involved unpredictable and non-explicit shifts of complex socio-emotional stimuli, whereas the second task involved predictable and explicit shifts of character stimuli. Considering the discrepancies between laboratory results and the real-life flexibility-related challenges faced by autistic individuals, we need to determine which factor could be of particular importance in flexibility difficulties. We point out that the switch cost (i.e. the difference between shift and non-shift condition) was larger for autistic than for non-autistic participants on the complex flexibility task with unpredictable and non-explicit shifts of socio-emotional stimuli, whereas this was not the case when shifts were predictable, explicit and involved less complex stimuli. We also highlight sex differences, suggesting that autistic females have better social skills than autistic males and that they also have a specific cognitive profile, which could contribute to social camouflaging. The findings of this work help us understand which factors could influence flexibility difficulties in autism and are important for designing future studies. They also add to the literature on sex differences in autism which underpin better social skills, executive function, and camouflaging in autistic females.
Background
Autism Spectrum Disorders are characterized by socio-communicative difficulties associated with specific interests, repetitive behaviors, as well as sensory specificities (American Psychiatric Association [APA], 2013). Several symptoms, such as repetitive behaviors and the difficulties autistic people have changing their plans and routines, underlie cognitive flexibility difficulties (Faja & Nelson Darling, 2019; Mostert-Kerckhoffs et al., 2015). Cognitive flexibility is the ability to shift (i.e. switch attentional resources) from one task to another, or to change strategy or perspective according to the situation (Miyake et al., 2000) and is an essential adaptive skill. The difference in performance between Shift and Non-Shift conditions is referred to as the switch cost. Real-life cognitive flexibility difficulties can be evaluated using the shift subscale of the Behavioral Rating Inventory of Executive Function questionnaires (Gioia et al., 2000; Roth et al., 2005 for the adult version). These questionnaires make it possible to discriminate between autistic children and adults and typically developing (TD) individuals, particularly on the shift subscale (Davids et al., 2016; Geurts et al., 2020; Granader et al., 2014; Wallace et al., 2016; White et al., 2017). This latter subscale investigates task switching, how people face changes and difficulties, as well as how people manage to change their perspective, for instance, in problem solving. However, laboratory results on flexibility in autism are inconsistent (for overviews, see Geurts et al., 2009; Leung & Zakzanis, 2014).
Reduced ecological validity (i.e. the extent to which the outcome of an experiment can generalize to day-to-day life—see Kihlstrom, 2021; Orne, 1962), and high variability of the employed tests might partly explain the mixed results (De Vries & Geurts, 2012; Eylen et al., 2015; Geurts et al., 2009). First, tasks vary according to the stimuli used, which can be more or less complex (e.g. character stimuli, such as numbers and letters, are less complex than socio-emotional stimuli). Second, shifts can be guided either explicitly or implicitly (i.e. explicit rule/cue indicating the shift vs rule inferred by the participant). Third, shifts can be predictable (i.e. the participant knows when the shift will appear) or not. Finally, tasks might vary depending on the solicitation of other cognitive functions (e.g. memory load).
Usually, everyday life situations requiring flexibility are related to the processing of unpredictable changes involving complex stimuli (see Han et al., 2012). To achieve ecological validity, De Vries and Geurts (2012) designed a flexibility task which included unpredictable shifts and used human faces as stimuli. Children (8–12 years old) were presented with faces and had to report either their gender or emotion depending on a cue displayed above the face. Unexpectedly, autistic children did not exhibit greater difficulties than TD children. According to the authors, this result might be explained by the fact that the emotional content was too simple and that the shift was explicitly guided by means of the cue. We can hypothesize that flexibility difficulties in autism would mainly appear during complex tasks, particularly when shifts are unpredictable but also when they are not explicitly guided (Van Eylen et al., 2011). Thus, difficulties would be observed, in particular, when the relevant cues have to be autonomously explored in order to infer the shift (see Van de Cruys et al., 2014). This hypothesis is at least partly supported by empirical data from the Wisconsin Card Sorting Test (WCST), which is based on the use of unpredictable shifts and implicit rules and in which the performance of autistic participants is usually impaired (for a meta-analysis, see Landry & Al-Taie, 2016).
Recently, a predictive coding framework has attracted interest as a possible account of autism symptoms (Gomot & Wicker, 2012; Pellicano & Burr, 2012; Van Boxtel & Lu, 2013; Van de Cruys et al., 2014), and this could help explain discrepancies in the performance of autistic individuals on flexibility tasks. From the predictive coding view, the brain is constantly generating predictions regarding incoming sensory input based on past sensory input and, more generally, past experiences. Prediction error is the difference between the actual bottom-up sensory input and top-down driven predictions (i.e. priors) (Friston, 2005). An imbalance in the weight attributed to sensory input and predictions in autism (Brock, 2012; Pellicano & Burr, 2012) could lead to high precision of prediction errors (Van de Cruys et al., 2014), meaning that predictions would not be flexibly adjusted according to the context (Sapey-Triomphe et al., 2021; Van de Cruys et al., 2014). In line with this hypothesis, Sapey-Triomphe et al. (2021) found that autistic adults, unlike TD participants, constructed priors but did not flexibly modulate them according to the context during a low-level discrimination task. These findings might generalize to different tasks or situations requiring flexibility and could partly explain flexibility difficulties in autism and why these difficulties appear in unpredictable situations. They could also help us understand the social difficulties experienced by autistic individuals, as socio-emotional situations are complex and context-bound, and thus unpredictable and driven by non-explicit rules.
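The role of precision in weighting prediction errors can be illustrated with the textbook Gaussian case, in which a prior is updated by a single observation. This is only a minimal sketch of the general idea, not a model fitted or proposed in this study; the function name and all numerical values are hypothetical.

```python
def update_belief(prior_mean, prior_precision, observation, sensory_precision):
    """Gaussian precision-weighted update of a prior by one observation.

    Returns (posterior_mean, gain), where gain is the weight given to the
    prediction error (observation - prior_mean).
    """
    prediction_error = observation - prior_mean
    gain = sensory_precision / (prior_precision + sensory_precision)
    return prior_mean + gain * prediction_error, gain

# Hypothetical values: the same prediction error, weighted with low versus
# high precision on the sensory input relative to the prior.
low_mean, low_gain = update_belief(0.0, 4.0, 1.0, 1.0)
high_mean, high_gain = update_belief(0.0, 1.0, 1.0, 4.0)
```

In this simplified picture, flexible adjustment would correspond to changing the gain according to the context, whereas a gain that stays high whatever the context is one possible reading of the inflexibility described by Van de Cruys et al. (2014).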
Our aim was to conduct an online study to investigate whether a flexibility task including unpredictable and non-explicit shifts of socio-emotional stimuli would highlight flexibility difficulties in autism compared to TD. To this end, we examined the difference between autistic and TD participants on a complex flexibility task involving unpredictable shifts of socio-emotional stimuli and guided by non-explicit rules. Participants were instructed to evaluate the valence of socio-emotional scenes without and with their context (shown immediately afterwards). The latter context could sometimes change the valence of the scene compared to the first evaluation (Shift condition). We hypothesized that the autistic participants would exhibit lower correct response rates (CRs), greater switch costs (i.e. the difference, in accuracy and response time, between Shift and Non-Shift conditions) and longer response times (RTs) than the TD participants. In the task, participants were asked to categorize stimuli (including emotional faces) as positive or negative. As happy faces are recognized more accurately and faster than negative ones (for a review, see Calvo & Nummenmaa, 2016), even by autistic participants (for review and meta-analysis, see Harms et al., 2010; Uljarevic & Hamilton, 2013), we expected to observe higher CR and shorter RT for positive than for negative stimuli as well as a higher switch cost in the case of positive compared to negative shifts.
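The switch cost referred to in these hypotheses can be made concrete with a short sketch. The trial records, field layout and values below are hypothetical and serve only to illustrate the definition; the sign convention (positive values mean worse performance in the Shift condition) is an assumption made here for readability.

```python
from statistics import mean

# Hypothetical trials: (condition, correct, rt_ms). Values are illustrative only.
trials = [
    ("non_shift", True, 900),
    ("non_shift", True, 1000),
    ("non_shift", True, 1100),
    ("non_shift", False, 1200),
    ("shift", True, 1200),
    ("shift", True, 1400),
    ("shift", False, 1500),
    ("shift", False, 1300),
]

def switch_cost(trials):
    """Return (accuracy_cost, rt_cost_ms) for one participant.

    Assumed sign convention: positive values mean worse performance
    in the Shift condition. RT is averaged over correct trials only.
    """
    acc, rt = {}, {}
    for cond in ("shift", "non_shift"):
        subset = [t for t in trials if t[0] == cond]
        acc[cond] = mean(1.0 if t[1] else 0.0 for t in subset)
        rt[cond] = mean(t[2] for t in subset if t[1])
    return acc["non_shift"] - acc["shift"], rt["shift"] - rt["non_shift"]

accuracy_cost, rt_cost = switch_cost(trials)
```

With these illustrative trials, accuracy drops from 75% to 50% and mean correct RT rises from 1000 ms to 1300 ms, so both costs are positive.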
In addition, participants performed a second task involving predictable shifts of simple character stimuli. They were guided by explicit rules which they had to remember. As in the task conducted by Rogers and Monsell (1995), participants were asked to classify either the number in a pair of characters (one letter and one number, appearing clockwise in one of the boxes on a four-box grid) as even/odd (in the lower boxes) or the letter as consonant/vowel (in the upper boxes). Rogers and Monsell (1995) observed longer RT when participants had to change from the letter to the number task or from the number to the letter task compared to when they did not have to change. In other words, there was a switch cost, and we expected to observe the same in our task. We also hypothesized that the switch cost would not be larger for autistic than TD participants. Nevertheless, the performance of the autistic participants should be lower due to the working memory load required by the task (because the participants had to remember the rule for the letter task and for the number task), and to the fact that autistic individuals usually have working memory difficulties (for a meta-analysis, see Habib et al., 2019). We also expected a correlation between RT on the two tasks in TD (particularly, for the Shift conditions) but not in the autistic group, as we hypothesized that autistic participants would have specific difficulties in the task involving unpredictable shifts of socio-emotional stimuli.
We conducted exploratory analyses to study the effect of sex and its interaction with group and shift in the two tasks. Indeed, recent findings have highlighted sex and gender differences in autism. It is probable that these have been underestimated due to a gender bias in autism diagnosis and research (Hull et al., 2016; Hull & Mandy, 2017). Autistic females might hide their symptoms better (Dean et al., 2017; Lai et al., 2017), resulting in them receiving less attention. More specifically, for both neurobiological and socio-cultural reasons (Cage & Troxell-Whitman, 2019; Lawrence et al., 2020; Schaer et al., 2015), they might show better social skills and pay more attention to faces than autistic males (Harrop et al., 2018, 2019), and might also exhibit higher social motivation and greater sensitivity to social reward (Cook et al., 2017; Lawrence et al., 2020; Sedgewick et al., 2016; Song et al., 2021). However, some studies have found similar difficulties in emotion recognition in autistic males and females (Baron-Cohen et al., 2015) and sometimes even higher levels of difficulty in females in complex socio-emotional situations (Vanmarcke et al., 2016). Importantly, research has also highlighted the presence of greater flexibility abilities in autistic females (Bölte et al., 2011; Lehnhardt et al., 2016), which could also contribute to their better social adaptation. Thus, it was important to test the effect of sex in this study in order to investigate if autistic females would perform better than autistic males (1) in the socio-emotional flexibility task, and (2) in the flexibility task with character stimuli. This approach allowed us to investigate sex differences in complex socio-emotional settings, to explore if superior socio-emotional adaptation in autistic females might be related to better flexibility skills and if higher flexibility skills might be related to specificities in predictions.
Finally, we conducted an exploratory analysis of correlation between task performances (CR and RT) and age, education, autistic traits, negative and positive affects in order to check whether these variables affect our results. Rationale, all detailed hypotheses, exclusion criteria and planned analyses were pre-registered on Open Science Framework: https://osf.io/avfcs.
Method
Participants
A total of 109 autistic adults (46 females, 56 males and 7 transgender/non-binary individuals) and 200 TD adults (129 females, 69 males and 2 transgender/non-binary individuals), who reported normal or corrected-to-normal vision and no treatment impairing their cognitive functions, completed the entire online study. Transgender/non-binary individuals (6.5% of our autistic group) were included in the analysis (with their natal sex; discussed in the limitations) as gender diversity is common in autism (Cooper et al., 2018). Analyses without transgender individuals did not change our main conclusions (see Supplementary Materials). All participants were between 18 and 45 years old (mean = 32 ± 7).
Autistic participants were recruited with the help of regional expertise centers dedicated to autism diagnosis, psychiatrists and psychologists who forwarded the online study to their diagnosed patients only. Some were recruited via associations/social networks. In this latter case, participants needed to contact us in order to participate. Once they had contacted us, we asked them for details on the professionals who had made the diagnosis, the type of diagnosis received, and the type of tests used for the diagnosis. We did this in order to determine whether they had received a formal diagnosis by professional experts based on the criteria of the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR or 5th ed.; DSM-5; APA, 2000, 2013) or those of the International Classification of Disease–10th Revision (ICD-10; WHO, 1992). If they provided evidence of a formal diagnosis, we sent them the link to the study. In the study, participants were required to report their diagnosis (79 Asperger’s Syndrome, 4 High Functioning Autism and 26 Autism Spectrum Disorder), to state who provided the diagnosis (multiple choices; 69 were diagnosed in expertise centers, the others were diagnosed in hospitals and/or by private psychiatrists) and their age at diagnosis (mean = 27 ± 9). Some had received one or more other diagnoses, either neurodevelopmental (e.g. attention deficit hyperactivity disorder (ADHD) and dyslexia; N = 25) and/or psychiatric (e.g. anxiety disorder and depression; N = 28); 68 participants reported having no known comorbidities.
TD adults were recruited via advertisements, mailing lists, and personal networks. To be included in the study, they could not be diagnosed or self-identified as autistic or have any autistic relative. Among participants who completed the entire study, 5 reported a neurodevelopmental disorder and 13 reported a psychiatric diagnosis. Analyses excluding TD participants with such a diagnosis did not change our main conclusions and even strengthened them (see Supplementary Materials).
Participants could not be included in the study if they had a disorder (e.g. Parkinson’s disease, intellectual disability, and major depression) and/or were taking medication that could have affected task performance (e.g. treatment that could have modified their attention levels). These exclusion criteria were presented to the participants in writing at the beginning of the study (before informed consent was given) and additional questions were asked during the study (e.g. the study ended if the participant reported feeling that their treatment was impairing their cognitive functions). For reasons related to the European General Data Protection Regulation (GDPR) regarding online data collection pointed out by the ethical committee, we did not collect any details regarding treatment and diagnosis (other than the autism diagnosis). The list of exclusion criteria also included the following: being a protected adult (e.g. under guardianship), being in a situation of social fragility (e.g. prison or hospitalization), having uncorrected visual impairment, having consumed drugs in the hours preceding the task (e.g. alcohol and cannabis), not having access to a computer to perform the study.
All procedures performed in this study involving human participants were conducted in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki) and the study was approved by the local ethics committee (CER-Grenoble Alpes, COMUE University Grenoble Alpes, IRB00010290).
Data exclusion
Figure 1 represents the flow chart for participant exclusion after data collection. The TD group had substantially more female participants, and these had a higher education level, something which was not anticipated. In order to match our groups better, we randomly removed 30 TD female participants with high education level before any analysis (see Uzefovsky et al., 2016).

Figure 1. Flow chart of participant exclusion. The number of excluded and remaining females (F) and males (M) in each group (autistic and typically developing (TD) group) is specified at each step.
Exclusion criteria were stated before the experiment (https://osf.io/avfcs). We excluded TD participants with AQ score >32 (N = 17). Note that we did not plan to exclude autistic participants with AQ <32 as AQ is not a diagnostic tool but a screening tool only. In a large sample of participants (476 patients referred for an autism diagnosis), Ashwood et al. (2016) showed that the AQ questionnaire, despite its good sensitivity, has low negative predictive value (around two-thirds of those who scored below the cut-off in their study were finally diagnosed with autism). This could explain why some of our participants had AQ scores below the cut-off despite their autism diagnosis. We planned to exclude participants who had <75% CR on valence evaluation without context in our main task. This would have led to the exclusion of 51 participants (around 1/6). We therefore reduced the threshold to <60% CR, resulting in the exclusion of six autistic and seven TD participants. We then checked for outliers (i.e. we checked for participants whose accuracy on valence evaluation with context in our main task was below Quartile 1 − 1.5 × interquartile range or above Quartile 3 + 1.5 × interquartile range in both Shift and Non-Shift conditions): there were two autistic participants who both had 0% correct in Shift and Non-Shift conditions, and one TD participant who had 18% correct in the Non-Shift and 33% correct in the Shift condition. As these data likely result from misunderstanding/misreporting (Osborne & Overbay, 2004), we excluded these three participants, even though this was not pre-registered. Finally, we removed trials with RT below 300 ms (less than 0.1% of the trials). In addition, one trial was above the maximum duration and was also removed even though this was not pre-registered. We did not remove participants with positive affects <18 and negative affects >29 on the Positive and Negative Affect Schedule (PANAS; Watson et al., 1988) (two autistic participants). Indeed, the participants were recruited and tested from 13 April 2020 to 27 May 2020, that is, during the COVID-19 pandemic, and this might have led to increased negative affects measured on the PANAS. Instead, we ran supplementary correlation analyses between scores on tasks and PANAS scores. Importantly, it is unlikely that the COVID context affected our main comparisons and the validity of our interpretations, as any potential COVID-associated effects would have affected both the autistic group and comparison group.
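The outlier rule and the RT trimming described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis script used in the study: the function names are hypothetical, the accuracy values are invented, and the quartiles use Python's "inclusive" method, whereas other quartile conventions would give slightly different fences.

```python
from statistics import quantiles

def iqr_bounds(values):
    """Return (lower, upper) fences: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR.

    Assumption: quartiles use the 'inclusive' method; other conventions
    give slightly different fences.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def flag_outliers(shift_acc, non_shift_acc):
    """Flag participants who fall outside the fences in BOTH conditions."""
    lo_s, hi_s = iqr_bounds(shift_acc)
    lo_n, hi_n = iqr_bounds(non_shift_acc)
    return [
        (s < lo_s or s > hi_s) and (n < lo_n or n > hi_n)
        for s, n in zip(shift_acc, non_shift_acc)
    ]

def trim_fast_trials(rts_ms, min_rt=300):
    """Drop trials with RT below the 300 ms threshold."""
    return [rt for rt in rts_ms if rt >= min_rt]

# Hypothetical accuracies (proportion correct) for seven participants.
shift_acc = [0.70, 0.75, 0.80, 0.80, 0.85, 0.90, 0.00]
non_shift_acc = [0.90, 0.95, 0.90, 1.00, 0.95, 0.90, 0.00]
flags = flag_outliers(shift_acc, non_shift_acc)
kept_rts = trim_fast_trials([250, 310, 1200, 299, 300])
```

In this invented sample, only the last participant (0% correct in both conditions) is flagged, and the two trials faster than 300 ms are dropped.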
Demographics of the final sample are summarized in Table 1. Groups did not differ on age or education but, as predicted, they differed significantly on AQ scores. In addition, autistic participants had significantly less positive and greater negative affects than TD participants and they reported more other diagnoses. Characteristics by sex in each group are reported in the Supplementary Materials. There was no noteworthy difference between males and females in each group.
Table 1. Mean value, standard deviation and range for age, education and scores on questionnaires as well as percentage of participants with a diagnosis other than autism for each group, and group comparison.
PsyNeuroDiag: psychiatric or neurological diagnosis; NeuroDevDiag: neurodevelopmental diagnosis; AQ: Autism Quotient; NegA: negative affects; PosA: positive affects; ASD: autism spectrum disorder; TD: typically developing.
Education refers to the number of years of study. Twelve years corresponds to achievement of secondary education.
Material and procedure
The experiment was performed online on the PsyToolkit platform (Stoet, 2010, 2017) which makes it possible to collect reliable accuracy and RT data (Kim et al., 2019). It was anonymous (i.e. no names, IPs, zipcodes, email addresses, etc. or any sensitive data not required for the study, such as ethnicity, were collected), in compliance with the GDPR. Using an online experiment may enhance ecological validity, particularly for autistic participants. Indeed, it allows them to perform the experimental procedures in their trusted environment, reducing stress related to social interactions and unknown environments, while ensuring protection of privacy (Benford & Standen, 2009; Gillespie-Lynch et al., 2014; Haas et al., 2016). Informed consent was obtained online from all participants (see Varnhagen et al., 2005). The experiment included two tasks (order counterbalanced) and two questionnaires.
The Emotional Shifting Task
During this 5-min task, participants were asked to evaluate the valence of 22 pairs of successive pictures: a cropped picture of a socio-emotional scene (without context) with either a positive or negative valence followed by the complete non-cropped image (with context) which had a congruent valence (Non-Shift condition) or opposite valence (Shift condition) due to the added context (Biro et al., 2021). Shift and Non-Shift conditions were randomly mixed (i.e. unpredictable). Each picture was presented for 3000 ms, during which participants had to answer by pressing the “B” or “V” button on the keyboard (one for positive and the other for negative; button-stimulus assignment counterbalanced) with the index finger of the dominant hand. Each picture was preceded by a fixation dot presented for 1000 ms (Figure 2). The 22 experimental trials were preceded by six training trials. Pictures selected for the experimental trials came from the International Affective Picture System (IAPS; Lang et al., 1997) for the Non-Shift condition and from the Internet for the Shift condition. In two pilot studies, stimuli were rated by TD participants (N = 46 for the first study; N = 91 for the second study) for both valence and arousal (separately for the picture with context and without context) on a 7-point Likert-type scale, in a way similar to the IAPS (for details, see Biro et al., 2021). Images which differed significantly on the associated valence values with and without context were selected by the authors for the final task.

Figure 2. Top: Emotional Shifting Task, with an example of one trial in the Shift condition (negative valence without context and positive valence with context). Bottom: Task Switching Task, with an example of a sequence of four correct answers followed by one incorrect answer.
The Task Switching Task
During this 5-min task, a square divided into four cells appeared on the screen (Rogers & Monsell, 1995). A letter and a number appeared side by side in one cell (Figure 2). When the stimulus appeared in the top cells, participants were asked to respond based on the letter displayed and to press V if the letter was a consonant (G, K, M or R) or B if it was a vowel (A, E, I or U). When the stimulus appeared in one of the bottom cells, participants were asked to respond based on the number displayed and to press V if the number was odd (3, 5, 7 or 9) and B if it was even (2, 4, 6 or 8). Stimuli were presented sequentially in one cell after another, always in clockwise order (i.e. predictable). The stimulus was displayed until the participant responded, and for a maximum duration of 5000 ms. If the response was incorrect or missing, a message appeared beside the stimulus for 3000 ms and repeated the instructions. The participant first practiced on the letter task only (40 trials), then on the number task only (40 trials), before performing the mixed tasks (60 trials).
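The response rules and the predictable alternation of the two tasks can be summarized in a short sketch. The key mapping and stimulus sets follow the description above; the cell names, the starting cell and the function names are assumptions introduced for illustration only.

```python
CONSONANTS = set("GKMR")
VOWELS = set("AEIU")
# Clockwise cell order; the starting cell is an assumption.
CELLS = ["top_left", "top_right", "bottom_right", "bottom_left"]

def task_for_cell(cell):
    """Top cells cue the letter task, bottom cells the number task."""
    return "letter" if cell.startswith("top") else "number"

def correct_key(cell, letter, digit):
    """Return the expected key ('V' or 'B') for one letter-number pair."""
    if task_for_cell(cell) == "letter":
        # Letters outside the consonant set are vowels in this stimulus set.
        return "V" if letter in CONSONANTS else "B"
    return "V" if digit % 2 == 1 else "B"

def label_trials(n_trials):
    """Label each trial Shift or Non-Shift from the clockwise sequence."""
    labels = []
    previous_task = None
    for i in range(n_trials):
        task = task_for_cell(CELLS[i % 4])
        if previous_task is None:
            labels.append("first")  # no preceding trial to compare with
        elif task != previous_task:
            labels.append("shift")
        else:
            labels.append("non_shift")
        previous_task = task
    return labels

labels = label_trials(6)
```

Because the stimulus moves clockwise through two letter cells and then two number cells, a shift occurs on every second trial, which is what makes the shifts predictable in this task.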
The Autism-Spectrum Quotient
The Autism-Spectrum Quotient (AQ) is a 50-item self-reported questionnaire intended to evaluate autistic traits in participants (Baron-Cohen et al., 2001). Participants rated their agreement or disagreement with statements on a 4-point Likert-type scale. Each item scored 1 or 0, a score of 32 and above being associated with high autistic traits. Cronbach’s alphas for the autistic group and the TD group in the current sample were 0.86 and 0.88 respectively, showing good internal consistency.
The PANAS
The PANAS is a 5-min questionnaire intended to detect anxious and depressive states (Watson et al., 1988). Participants rated on a 5-point scale the extent to which they were experiencing 20 particular emotions (10 for Positive Affects and 10 for Negative Affects) at the time of the questionnaire. Scores above the 5th percentile (Positive Affects <18 and Negative Affects >29) are correlated with depressive and/or anxious states (Crawford & Henry, 2004). Cronbach’s alphas for the autistic group for PA and NA were 0.82 and 0.88, respectively, and for the TD group 0.83 and 0.89, showing good internal consistency.
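For readers unfamiliar with the internal consistency index reported for both questionnaires, Cronbach's alpha can be computed from a participants-by-items score matrix as sketched below. The function name and the score matrix are hypothetical; the values are not data from this study.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a participants-by-items score matrix.

    item_scores: list of rows, one row of item scores per participant.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(item_scores[0])
    items = list(zip(*item_scores))  # one tuple of scores per item
    sum_item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Hypothetical scores: four participants, three items.
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 4],
]
alpha = cronbach_alpha(scores)
```

The same variance estimator is used for the items and for the total score, so the choice between population and sample variance does not change the result.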
Community involvement
One autistic adult was involved in designing the protocol, analyzing and interpreting the data and writing the manuscript.
Results
We used Generalized Linear Mixed Models for our analyses and all post hoc tests were performed with Dunnett’s T3 adjustment. The detailed analytic strategy and all tables of results are provided in Supplementary Materials.
Emotional Shifting Task
An accuracy analysis (i.e. the analysis of correct responses) was performed on trials with correct answers on valence evaluation without context. We then analyzed RT of correct trials on pictures with context. There were fewer CR (β = −1.43, p < 0.001) and longer RT (β = 184, p < 0.001) in the Shift than in the Non-Shift condition, thus revealing the expected switch cost. Autistic participants were less accurate (β = −0.62, p = 0.001) and slower (β = 126, p < 0.001) than TD. The Group × Shift interaction was significant on CR (β = −0.74, p = 0.018) and on RT (β = 202, p < 0.001). Post hoc comparisons (Figure 3(a) and (d)) revealed no significant difference on either CR or RT between autistic individuals and TD in the Non-Shift condition. However, in the Shift condition, TD participants were more accurate (β = 0.99, p < 0.001) and faster (β = −228, p < 0.001) than autistic participants. The Non-Shift condition led to more CR and shorter RT in autism (CR: β = 1.80, p < 0.001; RT: β = −285, p < 0.001) and in TD (CR: β = 1.07, p = 0.023; RT: β = −83, p < 0.001), indicating a switch cost in both groups.

Figure 3. Mean accuracy with standard error (SE; left) and box plots with RT distributions (right) for valence evaluation with context on the EST according to Group × Shift Condition (a and d), Emotion × Shift Condition (b and e) and Group × Sex (c and f). Significance levels for simple effects of interest are also displayed.
There were fewer CR for negative scenes than for positive scenes (β = −0.84, p = 0.015) and RT were longer (β = 248, p < 0.001). Post hoc comparisons (Figure 3(b) and (e)) revealed a switch cost on both CR and RT only when participants had to disengage from positive stimuli (CR: β = 9.77, p < 0.001; RT: β = −432, p < 0.001).
Females were marginally more accurate (β = 0.28, p = 0.055) and significantly faster (β = −72, p < 0.001) than males. Post hoc tests (Figure 3(c) and (f)) showed no significant sex difference on CR in either autistic or TD individuals. However, females were faster than males in autism (β = 84, p < 0.001) and in TD (β = 60, p = 0.002). TD females were marginally more accurate (β = 0.57, p = 0.057) and faster (β = −55, p < 0.001) than autistic females. TD males were more accurate (β = 0.68, p = 0.018) and faster (β = −139, p < 0.001) than autistic males. There was no significant difference between autistic females and TD males on CR or RT.
Task Switching Task
Accuracy was lower (β = −0.67, p < 0.001) and RT were longer (β = 387, p < 0.001) in the Shift than in the Non-Shift condition, revealing the expected switch cost. Groups did not significantly differ on accuracy, but autistic participants were slower than TD (β = 158, p < 0.001). The Group × Shift interaction (Figure 4(a) and (c)) was significant on CR (β = 0.30, p = 0.045) but not on RT. Post hoc tests revealed higher CR and shorter RT in the Non-Shift than in the Shift condition in autism (CR: β = 1.69, p < 0.001; RT: β = −391, p < 0.001) and in TD (CR: β = 2.28, p < 0.001; RT: β = −383, p < 0.001). There was no significant difference in CR between autistic and TD participants in either the Non-Shift or the Shift condition. However, TD were faster than autistic participants in the Shift (β = −162, p < 0.001) and in the Non-Shift conditions (β = −153, p < 0.001). Note that the apparent contradiction between the interaction effect on CR and the post hoc tests can be explained by the log odds scale of our output, which is non-linear and can lead to “removable interaction” (Loftus, 1978; Wagenmakers et al., 2012).

Figure 4. Mean accuracy with standard error (SE; left) and box plots with RT distributions (right) for the TST according to Group × Shift Condition (a and c) and Group × Sex (b and d). Significance levels for simple effects of interest are also displayed.
Females were more accurate than males (β = 0.48, p = 0.008) but post hoc tests (Figure 4(b) and (d)) revealed no significant difference, except between autistic males and TD females (β = 0.48, p = 0.02). On RT, there was no main effect of Sex, but there was a significant Group × Sex cross-over interaction (β = −146, p < 0.001). TD males answered faster than autistic males (β = −231, p < 0.001) and TD females (β = −69, p < 0.001). Autistic females answered more slowly than TD females (β = −85, p < 0.001) but faster than autistic males (β = 77, p < 0.001).
Participants who reported a diagnosis of neurodevelopmental disorder other than autism were less accurate than those who did not (β = −0.84, p = 0.003).
Correlation analyses
Correlation between task performances
In the TD group, CR in the Emotional Shifting Task (EST) was positively correlated with CR in the Task Switching Task (TST) (r = 0.30, p = 0.021), but the correlation was not significant in the ASD group. The correlations between RT in the EST and RT in the TST were not significant.
Correlation with PANAS, AQ, age, and education
Neither positive affect, negative affect, AQ, nor education was significantly correlated with performance on the tasks. However, higher AQ scores were associated with more negative affect in the TD group (r = 0.32, p = 0.007). Analyses also showed that older age was associated with lower TST performance in TD (correlation with CR: r = −0.31, p = 0.011; correlation with RT: r = 0.33, p = 0.005) but not in autism. We did not find any correlation with age on the EST. The correlation matrices can be found in the Supplementary Materials.
Discussion
Considering the variability in reports of flexibility difficulties in autism, our goal was to investigate whether a complex flexibility task including unpredictable and non-explicit shifts of socio-emotional stimuli would be more effective in revealing differences between autistic individuals and TD individuals than a simpler flexibility task. To this end, we compared these two groups on both the EST (Biro et al., 2021) and the TST (Rogers & Monsell, 1995). We also investigated sex differences.
Flexibility difficulties in autism and predictive coding
In the EST, the switch cost was greater in autistic than TD participants in terms of not only CR but also RT. As suggested by Van de Cruys et al. (2014), autism seems to be characterized by high precision of prediction error resulting from an imbalance in the weight attributed to sensory input and predictions (i.e. priors). This characteristic would reduce the ability to flexibly adjust the relative precision of the prior and the sensory input in a context-dependent way, as has been suggested by the recent findings of Sapey-Triomphe et al. (2021). Thus, difficulties would arise, in particular, in contexts in which multiple cues are in competition (Van de Cruys et al., 2014). Emotion recognition with context requires the autonomous exploration of multiple cues, which are processed automatically during the early stages of face processing in TD individuals (Barrett et al., 2011; Righart & De Gelder, 2008). In the EST, the truncated picture seen before the scene provides a cue with a predictive value (i.e. generating expectations). In the Non-Shift condition, the cues in the scene are congruent with each other and with expectations. There is no mismatch between prediction and sensory input, and the prior does not need to be adjusted, leading to similar performances in autistic and TD participants. In the Shift condition, there was an unpredictable and non-explicit change in the predictive value of the cue provided by the first evaluation due to the context and thus, a competition between multiple cues. Autistic participants would struggle to update their priors, leading to longer stimulus processing durations and a deficit in disengaging, thus impairing their performances. This difficulty in autonomously updating the predictive value of cues as a function of context could lead to flexibility difficulties such as those observed in the EST and explain why such difficulties are not observed in the TST. 
Indeed, there is no cue competition in the TST as the predictive value of the cue indicating the shift is explicit and predictable. Our results are in line with the idea that the act of switching, per se, is not a problem for autistic individuals but that difficulties might arise when shifts are unpredictable and implicit (De Vries & Geurts, 2012; Van de Cruys et al., 2014; Van Eylen et al., 2011). Hence, flexibility difficulties in autism might result from predictive coding specificities and would be context-dependent (Van de Cruys et al., 2014). This could partly explain the discrepancy between laboratory tasks and real life (Van de Cruys et al., 2014; Van Eylen et al., 2011). Importantly, other studies have demonstrated that autistic individuals are able to adapt to unpredictable changes when there is no competition between cues (i.e. explicitly indicated; e.g. Barnard et al., 2008; De Vries & Geurts, 2012; Hill & Bird, 2006). In addition, autistic individuals are able to extract statistical regularity in the environment without any explicit rule (Brown et al., 2010; Manning et al., 2017; Nemeth et al., 2010), indicating that implicit changes can be handled as long as they are statistically predictable. Thus, the combination of unpredictability and non-explicitness could be particularly important for revealing flexibility difficulties in autism (Van de Cruys et al., 2014; Van Eylen et al., 2011). Accuracy on the TST predicts accuracy on EST in TD but not in autism. This finding is consistent with the hypothesis that flexibility difficulties in autism might be mediated by factors such as predictability and explicitness of the shifts. At a neural level, our findings are corroborated by Thillay et al. (2016), who found a larger contingent negative variation in autism compared to TD in uncertain contexts, suggesting that autistic adults cannot flexibly modulate cortical activity as their level of certainty changes.
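The role of relative precision in this account can be illustrated with a minimal Python sketch of a textbook precision-weighted update of a Gaussian belief. This is an illustration under stated assumptions, not the model tested in this study: the valence scale, the precision values, and the function name are hypothetical, and the sketch does not say which precision setting, if any, characterizes autism.

```python
def update_belief(prior_mean: float, prior_precision: float,
                  observation: float, sensory_precision: float) -> float:
    """Precision-weighted update of a Gaussian belief (textbook form).

    The prediction error is weighted by the precision of the sensory input
    relative to the total precision (prior plus sensory input).
    """
    prediction_error = observation - prior_mean
    gain = sensory_precision / (prior_precision + sensory_precision)
    return prior_mean + gain * prediction_error


# Hypothetical valence scale: -1 = negative, +1 = positive.
expectation = 1.0  # the first cue suggests a positive valence

# Non-Shift: the scene agrees with the expectation, so there is no prediction
# error and the belief is unchanged whatever the precisions are.
non_shift = update_belief(expectation, prior_precision=1.0,
                          observation=1.0, sensory_precision=1.0)

# Shift: the scene contradicts the expectation. The size of the update now
# depends on the relative precision of the prior and the sensory input.
shift_setting_a = update_belief(expectation, prior_precision=1.0,
                                observation=-1.0, sensory_precision=1.0)
shift_setting_b = update_belief(expectation, prior_precision=9.0,
                                observation=-1.0, sensory_precision=1.0)

print(non_shift, shift_setting_a, shift_setting_b)
```

In this sketch, the weighting of prior and sensory input only matters when there is a mismatch between them, which parallels the observation that group differences emerged in the Shift but not in the Non-Shift condition of the EST.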
Importantly, our two tasks also differ in the type of stimuli used. Socio-emotional stimuli are challenging for autistic participants and might also contribute to differences in task performances (see Latinus et al., 2019). In addition, pictures with context involve global processing, whereas autistic individuals have a more locally oriented perception (Mottron et al., 2006). As the autistic group experienced no difficulties in the Non-Shift condition of the EST, the results tend to suggest that it is difficult to update priors when in a state of uncertainty. Indeed, if autistic participants had simply used the information without context to give their responses, they would probably have done so for both conditions (Shift and Non-Shift) and we would not have observed the effect of the interaction between group and shift on RT. In addition, accuracy in the Shift condition would have been much more severely impaired, given that context is essential for valence evaluation. Nevertheless, it should be noted that incongruent images might be more ambiguous and their valence is probably more difficult to identify for autistic participants. While we cannot rule out the possibility that the lower performances of autistic participants are influenced by ambiguity, this possibility is not inconsistent with flexibility difficulties related to predictive coding specificities, as the purpose of predictive coding in the brain would be precisely to resolve perceptual ambiguity (Summerfield et al., 2006; Weilnhammer et al., 2017). In addition, the study of Latinus et al. (2019), which also investigated flexibility in response to socio-emotional stimuli, revealed specificities in autism. First, the authors identified a reduced willingness to switch to the emotional sorting rule in an emotional Wisconsin Card Sorting Test.
Second, they observed increased activity in the bilateral Inferior Parietal Sulci in autistic compared to TD participants when the participants had to switch between events, thus highlighting the need for a higher level of certainty before settling into a stable processing stage in the Wisconsin Card Sorting Test. These results showing flexibility specificities, in particular with regard to the processing of emotional stimuli, might also support our findings. Hence, flexibility difficulties related to predictive processes seem to be a good candidate for explaining our results, even if other factors might be involved. Future studies should include tasks with varying levels of predictability but excluding socio-emotional stimuli in order to better investigate flexibility difficulties in autism and the role of unpredictability in them, irrespective of the social content.
The global performances of the autistic participants were lower than those of the TD participants. On the EST, this effect was driven by the Shift condition, as there was no group difference in the Non-Shift condition. On the TST, although autistic participants performed as well as TD, they were always slower regardless of the shift condition, probably because of the working memory load imposed by the task. Working memory is frequently impaired in autism (Habib et al., 2019), and this might affect reaction time.
Easier shifting from negative to positive valence
In addition to showing the expected advantage of positive emotion on performance, the results also revealed that the switch cost was greater when disengaging from positive than from negative emotions. This is consistent with the idea that RTs are faster when recognizing positive facial expressions compared to negative ones (for a review, see Calvo & Nummenmaa, 2016 in TD and Harms et al., 2010; Uljarevic & Hamilton, 2013 in autism). The happiness advantage can be explained by the salience of perceptual features such as an open mouth (Calvo & Nummenmaa, 2016) but also in terms of the tendency to have a positive bias toward individuals because humans usually express a normatively positive mood (Diener & Diener, 1996; Leppänen et al., 2003; Leppänen & Hietanen, 2003).
General and specific sex differences in autism
Although exploratory, our results also provide an interesting account of sex differences in autism and TD. In the EST, shorter RT for females than males in both groups shows that some sex differences in autism mirror sex differences in the general population (see Parish-Morris et al., 2017). A female advantage in emotional face recognition is frequently reported, but varies depending on the sample size (Kret & De Gelder, 2012) and task characteristics, with more realistic tasks leading to more robust sex differences (Wingenbach et al., 2018). This advantage is larger in childhood and has clear socio-cultural influences (for a meta-analysis, see McClure, 2000). Indeed, in stereotypical gender roles, which are appropriated early (Martin & Ruble, 2010), females are expected to be empathetic and to take care of others more than males are, thus requiring them to pay more attention to other people’s emotions. Other factors such as sex hormones, sex chromosomes or brain structure might also influence the female advantage in social cognition (Honk et al., 2013; Kret & De Gelder, 2012; McClure, 2000; Whittle et al., 2011).
Interestingly, autistic females exhibited an intermediate profile between autistic males and TD females on the EST. This profile was similar to that of TD males. Despite being different from their TD peers, leading to integration difficulties, they might sometimes have less noticeable specificities than autistic males, due possibly to their better social camouflaging abilities (Dean et al., 2017; Hull et al., 2020; Lai et al., 2017, 2019; Schuck et al., 2019). Results suggest that autistic females might better understand and adapt to socio-emotional contexts and changes than autistic males. This adds to the growing literature reporting sex differences regarding the social sphere in autism (Chawarska et al., 2016; Harrop et al., 2019, 2018; Lawrence et al., 2020; Sedgewick et al., 2016; Song et al., 2021).
On the TST, females were more accurate than males but sex differences failed to reach significance in any of the groups, underscoring the critical role of the sample size. TD males had faster RT than TD females. This result might be explained either by generally faster RT in males (Der, 2006; see also Roivainen, 2011) or by differences in predictive processes. However, this remains to be tested. The fact that autistic females exhibited faster RT than autistic males in both tasks suggests that they achieve faster processing (Lehnhardt et al., 2016) or have better flexibility abilities than males, and this might also contribute to camouflaging (Bölte et al., 2011; Lehnhardt et al., 2016).
Factors influencing task performance
Autistic participants had lower Positive Affect and higher Negative Affect than TD, reflecting the high incidence of anxiety and depression in autism (e.g. Roy et al., 2015). They also had higher AQ scores (in line with Baron-Cohen et al., 2001). However, neither affect nor AQ was correlated with performance in autism. Interestingly, higher AQ scores in TD were related to more negative affect (associated with anxiety and depression traits), in accordance with Ashwood et al.’s (2016) hypothesis that AQ scores might be sensitive to anxiety. Older age was associated with lower performance on the TST in TD, possibly as a result of the deterioration of executive functions with aging (Wecker et al., 2005), but not in ASD. Aging trajectories might be atypical in ASD (e.g. Geurts & Vissers, 2012; Lever & Geurts, 2016; Wecker et al., 2005). However, this still has to be investigated.
Limitations
Online studies have limitations such as the lack of control over external factors (e.g. noisy environment) as well as selection or reporting bias (Chang & Vowles, 2013; Janssens & Kraft, 2012). To minimize these limitations, participants were asked to perform the tasks in a well-rested state in a quiet environment and to report any disturbance during the task. We also adopted a cautious approach to participant recruitment, although the lack of in-person diagnostic assessment is a weakness. It should be noted that the participants had no intellectual deficiency, were mainly late-diagnosed, and were comfortable using the Internet, so the results cannot be generalized to the whole spectrum. The question of inter-individual variability in executive function difficulties in autism (Lehnhardt et al., 2016; Van Eylen et al., 2011) should also be addressed in future studies. Another limitation is the absence of intelligence quotient (IQ) measurement. However, significant differences in IQ should have affected accuracy in the TST (see Memari et al., 2013; Russo et al., 2007) and the Non-Shift condition of the EST (see Harms et al., 2010), which was not the case. Hence, it is unlikely that the results could be due to IQ differences.
The decision to include transgender/non-binary individuals in their natal sex group is a matter for debate (Nguyen et al., 2019). Future studies should consider groups on the basis of their gender identity, even though it will be challenging to recruit a sufficiently large sample to perform analyses. Moreover, analyses on sex differences were exploratory and future studies should investigate more specific hypotheses.
Importantly, our two tasks varied on several parameters (i.e. predictability, explicitness, and socio-emotional components) and future studies should investigate more precisely which of these parameters plays the most significant role in flexibility difficulties in autism or, indeed, attempt to determine whether a combination of these parameters is needed in order to observe flexibility difficulties in autism. Similar tasks varying on just one of these parameters would be required.
Finally, despite the use of socio-emotional scenes, which improve ecological validity compared to character stimuli, the protocol, per se, is not similar to day-to-day life. First, naturalistic environments always contain a context (i.e. they are not cropped). However, presenting a cropped image before presenting the global scene might feed into autistic perception (more locally oriented on initial viewing), with the limitation that, in daily life, autistic persons’ first fixation on social scenes might not always be oriented toward the face (for a review, see Guillon et al., 2014). Second, valence evaluation in real life is more implicit and does not involve response selection. Third, the fundamental process that we are reporting in this study is based on extreme emotional situations. Nevertheless, it could apply to a wide variety of contexts in day-to-day life. Fourth, day-to-day life involves more complex situations which are not fixed but are constantly evolving and many stimuli have to be processed at the same time. In sum, a paradigm like the EST permits a more ecological assessment of flexibility than tasks with character stimuli only. Despite this, it is still far removed from the “real world.” However, it can be argued that ecological validity is relatively reduced in any simulated environment. Thus, the EST made it possible to test specific hypotheses even if increasing complexity inflated the number of factors that could influence the task (e.g. predictability, explicitness, ambiguity in the images, and socio-emotional situation). Future studies are needed to determine which parameters could be the most closely involved in flexibility difficulties in autism. Protocols with higher levels of ecological validity (e.g. films and real situations with actors) will also be required in order to assess whether the findings can be generalized to other day-to-day situations, as already suggested by Geurts et al. (2009).
Conclusion
In this study, autistic and TD participants performed a new EST, with non-explicit and unpredictable shifts related to the processing of complex socio-emotional stimuli. Whereas laboratory tasks often fail to highlight flexibility difficulties in autism, even though these are often observed in everyday life, our results indicated a larger switch cost in autism compared to TD. Interestingly, this effect was not observed in a cognitive flexibility task with explicit, predictable shifts of character stimuli. These findings could indicate that predictive coding specificities play a critical role in flexibility difficulties in autism, even though further experiments are needed to overcome the limitations of the study. Furthermore, we showed typical sex differences in emotion recognition with context in autism, with a female advantage. We also showed specific sex differences on the cognitive flexibility task in autism. These findings are consistent with the literature indicating better social skills and a specific cognitive profile in autistic females without intellectual deficiency, and this could contribute to social camouflaging.
Supplemental Material
Supplemental material, sj-pdf-1-aut-10.1177_13623613211062776 for Flexibility in autism during unpredictable shifts of socio-emotional stimuli: Investigation of group and sex differences by Adeline Lacroix, Frédéric Dutheil, Alexander Logemann, Renata Cserjesi, Carole Peyrin, Brigi Biro, Marie Gomot and Martial Mermillod in Autism
Supplemental material, sj-pdf-2-aut-10.1177_13623613211062776 for Flexibility in autism during unpredictable shifts of socio-emotional stimuli: Investigation of group and sex differences by Adeline Lacroix, Frédéric Dutheil, Alexander Logemann, Renata Cserjesi, Carole Peyrin, Brigi Biro, Marie Gomot and Martial Mermillod in Autism
Footnotes
Acknowledgements
The authors thank all participants for their participation in this study. The authors also thank the GNCRA and all CRA, expertise centers and clinicians who helped with recruitment. The authors are very grateful to Morgane Bordet, who helped with the implementation of the experiment on PsyToolkit in French and with the recruitment of control participants. The authors are also grateful to the Reviewers for their insightful and highly constructive comments, which were very helpful in improving the manuscript.
Author contributions
A.L. designed the study, conceived, and designed the analysis, collected the data, performed the analysis, wrote and revised the paper. M.G. and M.M. were major contributors to data interpretation, writing, and revision of the manuscript. C.P. contributed significantly to ethical requirements and revised the manuscript. B.B., A.L. and R.C. conceived the EST and adapted it to PsyToolkit. A.L. and R.C. were also major contributors to the writing and revision of the manuscript. F.D. substantively revised the manuscript. All authors approved the final version of the manuscript for submission.
Availability of data and materials
This study was pre-registered on https://osf.io/avfcs. All data and materials have been made publicly available on the following osf repository:
.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the French Ministry of Higher Education, Research and Innovation (France) to Adeline Lacroix. This work was also supported by MIAI@Grenoble Alpes (ANR-19-P3IA-0003).
Supplemental material
Supplemental material for this article is available online.
