Abstract
Objective
We aimed to characterize the impact of startle and surprise, both independently and in combination, on subjective feelings, behavior (task performance and gaze behavior), and several physiological parameters.
Background
Startle and surprise are known to affect pilots’ cognitive performance, with potential impact on safety. They can occur either together or independently, yet no studies have experimentally distinguished their specific effects.
Method
Participants (n = 45) were each assigned to one of the three conditions while performing the MATB-II task. In the startle condition, participants were subjected to an expected loud sound. In the surprise condition, an unexpected reverse video effect was applied to the experimental interface. In the combination condition, participants were exposed to both stimuli simultaneously.
Results
Surprise was associated with an increase in skin conductance without affecting performance. In contrast, startle was marked by a decline in performance on the communication sub-task, increased skin conductance and heart rate, and a narrowing of attention. When startle and surprise were combined, the results mirrored those of startle alone but included a stronger feeling of startle and surprise, and a more prolonged heart rate increase.
Conclusion
Startle and surprise combined yielded more numerous significant effects on subjective, behavioral, and physiological measures than startle and surprise independently.
Application
Identifying the specific impacts of startle and surprise could pave the way for their automatic detection using artificial intelligence. Safety could be enhanced through the design of specific countermeasures to help the crew cope with such states.
Introduction
On February 4th, 2020, a French Bee A350 was approaching Paris-Orly airport. The copilot was flying the plane, and the captain was monitoring. At an altitude of 1350 feet, a windshear warning surprised the crew, despite good weather. The copilot initiated a go-around procedure but then stopped responding for several minutes. Seeing no reaction from the copilot, the captain took control of the aircraft. However, an unannounced action by the copilot triggered a low energy alert, risking a stall. The captain responded appropriately and regained control of the aircraft. The startle and surprise caused by the unexpected windshear warning were identified as a contributing factor to the copilot’s cognitive incapacitation that led to this incident (BEA, 2021).
Beyond this event, startle and surprise have contributed to many accidents and incidents in aviation (Rivera et al., 2014), with the consequences well documented (Diarra et al., 2023; Martin et al., 2012; Woods & Sarter, 2000). These effects range from delayed actions and inappropriate inputs to total cognitive incapacitation, potentially resulting in catastrophic outcomes (KNKT, 2014). Despite training, the unpredictability and emotional intensity of certain events can challenge even the most experienced crews. Research indicates that only 30% of airline pilots perform consistently well in unexpected stall conditions compared to anticipated ones (Landman et al., 2017a). This difficulty in adhering to procedures during unexpected events is echoed by Casner et al. (2013), Grant et al. (2018), and Schroeder et al. (2014). Visual attention is also affected, with pilots focusing narrowly on surprising information, as shown by Dehais et al. (2015). Similarly, Kinney and O’Hare (2020) found that unexpected events can reduce visual scanning and affect physiological parameters, such as increasing heart rate and pupil dilation. Landman et al. (2017b) proposed a conceptual model to explain reactions in startling and surprising situations, highlighting their respective effects without a clear relation between them. Hancock et al. (2022) noted that startle and surprise might occur together or separately but did not distinguish their distinct consequences.
Numerous studies outside the aviation domain have been conducted on startle and surprise. The following paragraphs describe the associated literature.
Startle
Startle is a defensive reflex triggered by an intense, sudden stimulus (Koch, 1999) that can be either tactile (Hoffman et al., 1964), vestibular (Bisdorff et al., 1995), visual (Bradley et al., 1990), or acoustic (Yeomans & Frankland, 1995). A startle event triggers a rapid reaction, as quick as 53 ms for behaviors like eye closure (Ekman et al., 1985). Ekman et al. (1985) also noted that activities like neck muscle contractions and trunk movements occur within 200 ms. Physiological changes include increased heart rate peaking 4 s after the stimulus (Gautier & Cook, 1997), higher blood pressure (Holand et al., 1999), and increased skin conductance (Vrana, 1995). Motor tasks are significantly impacted poststartle. Tracking tasks are impaired for the first 2 s, with performance normalizing after up to 10 s (May & Rice, 1971; Thackray & Touchstone, 1970; Vlasak, 1969). Cognitive tasks suffer similarly, with performance on visual matching tasks deteriorating for up to 31 s after a startling stimulus (Woodhead, 1958). Startle also decreases the performance of tasks requiring sustained mental effort, such as continual mental subtraction (Vlasak, 1969) and simulated air traffic control tasks (Thackray, 1983). Interestingly, the startle reaction can be enhanced or attenuated by the emotional state of the individual (Bradley & Lang, 2000; Lang et al., 1990; Vrana, 1995).
Surprise
Surprise is an emotion arising from a discrepancy between expectations and reality (Horstmann, 2006). It can result from an unexpected change or absence of expected change and is not necessarily accompanied by startle. A surprising event is sometimes marked by a universal facial expression (Ekman et al., 1969; Ekman & Friesen, 1971; Hiatt et al., 1979). Physiological responses to surprise include increased skin conductance (Bradley, 2009), heart rate deceleration (Reisenzein et al., 2012), and pupil dilation (Maher & Furedy, 1979). Surprise provokes a shift of attention to the surprising stimulus, facilitating the evaluation of the discrepancy, but interrupting ongoing activities (Tomkins, 1962). Horstmann (2006) found that this interruption lasts almost 1 s for 78% of participants. Making sense of a surprising situation requires updating or elaborating a new mental model (Klein et al., 2007), a process that can delay reactions. Meyer et al. (1991) reported increased reaction times following visual surprises, with differences of up to 700 ms between experimental and control groups.
Objectives and Hypotheses
We believe that the aviation field could greatly benefit from further research on the impacts of startle and surprise. To date, no study has clearly examined the separate and combined effects of these phenomena. Beyond gaining a better understanding of their consequences, such research could be a crucial step towards the automatic detection of startle and surprise in pilots. This, in turn, would pave the way for the deployment of specific countermeasures in the cockpit. We therefore aimed to better understand the independent and combined effects of startle and surprise on the participants’ subjective feeling, behavioral response (reaction time, accuracy, facial expression, and gaze behavior), and physiological response (heart rate and skin conductance). We formulated three hypotheses:
H1) Surprise will increase reaction time without affecting accuracy and will increase at least skin conductance. We expected reaction time to increase because of the reframing cost, and the surprise effect to be too weak to affect accuracy.
H2) Startle will affect task performance (reaction time and accuracy), gaze behavior, and all physiological measures (heart rate and skin conductance) more intensely than surprise.
H3) The combination of startle and surprise will have a stronger impact than startle or surprise alone, with effects on participants’ self-assessment of startle and surprise, task performance (reaction time and accuracy), gaze behavior, skin conductance, and heart rate.
Method
Participants
Forty-five participants (11 women) aged from 21 to 45 years old (M = 28.1, SD = 6.5) took part in this study. They all had a minimum educational level equivalent to a High School Diploma. No previous flight experience was required to perform the experimental task. This research complied with the American Psychological Association Code of Ethics and the research ethics committee of the University of Toulouse approved the experimental protocol (n°2023-684). Informed written consent was obtained from each participant.
Procedure
The experiment lasted about 1 h and followed a between-subject design with three conditions (startle, surprise, and combination of startle and surprise). It began with a brief introduction, misleading participants to believe the study was about mental load to ensure they would later experience surprise. Participants completed two stress and anxiety questionnaires: The Perceived Stress Scale (PSS-14) and the State-Trait Anxiety Inventory (STAI-Y). They were then equipped with physiological sensors, and the eye-tracking device was calibrated. After a tutorial of the experimental task, participants completed a 5-min training scenario without a startling or surprising stimulus. Following an optional short break, they completed a single 7-min experimental scenario from one of the three conditions: startle, surprise, or combination of startle and surprise. After the experimental task, participants rated their level of startle and surprise regarding the startling or surprising stimulus on a 10-point Likert scale (Annex 1) and indicated their subjective workload using the NASA TLX (Hart & Staveland, 1988) (Figure 1).
Figure 1. Procedure of the experiment. Note that participants were each assigned to a single condition (startle, surprise, or startle and surprise).
Experimental Task
We used the OpenMATB (Open Multi-attribute Task Battery) software (Cegarra et al., 2020) to create our scenarios and run our experiment. OpenMATB is a clone of the classic MATB-II task that simulates multiple task management on the flightdeck (Comstock & Arnegard, 1992) and allows easy synchronization of task events with physiological measures.
MATB-II is composed of four sub-tasks:
• System Monitoring (Figure 2(a)): Participants monitor six visual indicators (two push buttons and four gauges) and correct anomalies (e.g., absence of the green light, presence of the red light, or gauge deviation) via keyboard inputs. Each anomaly is a system monitoring event (60 in total).
• Tracking (Figure 2(b)): Participants use a joystick to maintain a cursor within a target area.
• Communications (Figure 2(c)): Participants listen to instructions directed at them or not (identified by their code, e.g., FG659) and execute frequency changes on four radio channels (COM 1, COM 2, NAV 1, and NAV 2) using keyboard directional keys. Each communication, directed or not to the participant, is a communication event (20 in total).
• Resource Management (Figure 2(d)): Participants manage which pumps are on to optimize fuel levels of two tanks within a system of six tanks connected by eight pumps, with the fuel levels depleting naturally.
Figure 2. OpenMATB interface, left: normal mode; right: reverse video mode (surprising stimulus).

The experimental task lasted 7 min. We used the 3:30–3:45 period as a baseline (the “Reference period”) and the 5:00–5:15 period to assess poststimulus (startle, surprise, or combination of startle and surprise) performance (the “Study period”). To ensure a valid comparison, both periods and their preceding 30 s included identical system monitoring and communication events. The task difficulty was adjusted through preliminary studies to ensure it was challenging but manageable, with difficulty increasing over the first 2 min to reach the desired cognitive load.
Startle and Surprise Induction
Participants were randomly assigned to one of the three conditions (Figure 3): (1) startle, (2) surprise, or (3) combination of startle and surprise.
Figure 3. Timeline of the three different experimental conditions.

Apparatus and Signal Processing
Participants were equipped with over-ear headphones, a standard keyboard, a mouse, and a joystick to perform the MATB-II task on a BenQ 19-inch screen (Figure 4). Before the task, various physiological sensors were placed on participants to analyze their reactions to stimuli of interest (startle, surprise, or combination of startle and surprise). A three-lead electrocardiogram (ECG), placed according to the Einthoven triangle, measured cardiac activity. A two-electrode galvanic skin response (GSR) sensor on the index and middle fingers measured perspiration. A photoplethysmogram (PPG) earpiece measured blood flow variations from the left ear. These signals were recorded using the PsychoBIT Bitalino bundle. Additionally, a Tobii Pro Spectrum eye tracker (sample rate = 600 Hz) tracked participants’ gaze. A Logitech C920 camera filmed participants’ face, neck, and shoulders close-up to capture facial reactions. A Lab Streaming Layer program with its companion software, LabRecorder, recorded and synchronized all signals.
Figure 4. Experimental set-up with the reverse video mode (surprising stimulus) of the OpenMATB.
Dependent Variables
The measured variables for each participant were self-assessment of startle and surprise, perceived workload, MATB-II performances, occurrence of facial expression related to startle or surprise, gaze behavior (explore/exploit ratio, K coefficient, Lempel-Ziv complexity, stationary entropy), heart rate, and skin conductance.
MATB-II sub-tasks performances:
• System monitoring: We analyzed reaction times and success in detecting and correcting abnormal behavior on the six visual indicators. For comparison, the mean reaction time from the 30 s preceding the reference period was subtracted from the reaction times recorded during that period. The same procedure was applied to the study period.
• Tracking: We analyzed the average center deviation of the cursor during the first 10 s of the analyzed period. We then compared this deviation to the average center deviation from the 30 s preceding the analyzed period, as was done in system monitoring.
• Communication: We evaluated the success rate of entering the correct frequency in the correct channel.
• Resource management: Performance was not considered because the tank levels were not consistent across participants at the stimuli of interest occurrence, making comparisons complex.
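The baseline correction described for the system monitoring sub-task can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function name and the `(onset, reaction time)` event format are hypothetical, and only the period boundaries (3:30–3:45 and 5:00–5:15) and the 30 s pre-period window come from the text.

```python
from statistics import mean

# Period boundaries in seconds from scenario start (taken from the text).
REFERENCE_PERIOD = (210.0, 225.0)  # 3:30-3:45
STUDY_PERIOD = (300.0, 315.0)      # 5:00-5:15
PRE_WINDOW_S = 30.0                # window preceding each period


def corrected_reaction_times(events, period):
    """Subtract the mean pre-period reaction time from in-period reaction times.

    events: list of (onset_s, reaction_time_s) pairs (hypothetical format).
    period: (start_s, end_s) tuple, e.g. REFERENCE_PERIOD or STUDY_PERIOD.
    """
    start, end = period
    pre = [rt for t, rt in events if start - PRE_WINDOW_S <= t < start]
    during = [rt for t, rt in events if start <= t < end]
    if not pre:
        raise ValueError("no events in the 30 s preceding the period")
    baseline = mean(pre)
    return [rt - baseline for rt in during]
```

Applying the same function to `REFERENCE_PERIOD` and `STUDY_PERIOD` yields two sets of corrected values that can then be compared, in the spirit of the procedure above.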
Facial Expression
Two researchers independently assessed the facial expressions of participants. Facial expressions associated with startle were analyzed by observing eye closure, horizontal lip stretch, and head and trunk movements (Ekman & Friesen, 1978; Ekman et al., 1985). Facial expressions related to surprise were analyzed by observing raised eyebrows, widened eyes, and jaw drop or mouth opening (Ekman & Friesen, 1978; Schützwohl & Reisenzein, 2012). A startle or surprise reaction was scored as present if at least one of the corresponding markers was observed.
Gaze Behavior
We computed the explore/exploit ratio (Dehais et al., 2015), K coefficient (Krejtz et al., 2016), Lempel-Ziv complexity (Lounis et al., 2020), and stationary entropy (Krejtz et al., 2014) using the ArGaze library (Hogue et al., 2024). Each measure was corrected by subtracting its value computed over the 30 s preceding the stimulus of interest. Each MATB-II sub-task was defined as an Area of Interest (AOI). The gaze distribution between the different AOIs was also analyzed.
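To make the stationary entropy measure more concrete, the sketch below computes it from a sequence of fixated AOIs. It is an illustration only and does not reproduce the ArGaze API. It assumes a first-order transition matrix between AOIs, a stationary distribution obtained by a damped power iteration, and an unnormalized entropy in bits. The function name and the handling of an AOI with no outgoing transition are our own assumptions.

```python
import math
from collections import Counter


def stationary_entropy(aoi_sequence, max_iter=10_000, tol=1e-12):
    """Entropy (bits) of the stationary distribution of AOI transitions."""
    if not aoi_sequence:
        raise ValueError("empty AOI sequence")
    aois = sorted(set(aoi_sequence))
    idx = {a: i for i, a in enumerate(aois)}
    n = len(aois)
    visits = Counter(aoi_sequence)
    empirical = [visits[a] / len(aoi_sequence) for a in aois]

    # Count transitions between consecutive fixated AOIs.
    counts = [[0.0] * n for _ in range(n)]
    for src, dst in zip(aoi_sequence, aoi_sequence[1:]):
        counts[idx[src]][idx[dst]] += 1.0

    # Row-normalize; an AOI with no outgoing transition falls back to the
    # empirical AOI distribution (assumption made for this sketch).
    P = []
    for row in counts:
        total = sum(row)
        P.append([c / total for c in row] if total > 0 else list(empirical))

    # Damped ("lazy") power iteration, which also converges for periodic chains.
    pi = list(empirical)
    for _ in range(max_iter):
        stepped = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        new_pi = [0.5 * a + 0.5 * b for a, b in zip(pi, stepped)]
        converged = max(abs(a - b) for a, b in zip(pi, new_pi)) < tol
        pi = new_pi
        if converged:
            break
    return -sum(p * math.log2(p) for p in pi if p > 0)
```

A lower value indicates gaze concentrated on fewer AOIs, which is how the narrowing of attention is read in this paper.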
Cardiac Response
We analyzed the cardiac response to the stimulus of interest over 3 s epochs. Cardiac response was obtained by subtracting the mean heart rate calculated during the 30 s preceding the reference or study period from the mean heart rate calculated during the 0–3 s, 3–6 s, 6–9 s, 9–12 s, or 12–15 s epoch of the analyzed period.
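The epoch-wise computation can be sketched as below. This is a minimal illustration under stated assumptions: the heart rate is taken to be available as `(time, bpm)` samples, and the function name and data format are hypothetical. Only the 30 s baseline and the five 3 s epochs come from the text.

```python
from statistics import mean

EPOCHS_S = [(0, 3), (3, 6), (6, 9), (9, 12), (12, 15)]  # relative to period onset
BASELINE_S = 30.0


def cardiac_response(samples, period_onset):
    """Mean heart rate change per 3 s epoch relative to the 30 s baseline.

    samples: list of (time_s, heart_rate_bpm) pairs (hypothetical format).
    Returns one value per epoch, or None for an epoch without samples.
    """
    baseline = [hr for t, hr in samples
                if period_onset - BASELINE_S <= t < period_onset]
    if not baseline:
        raise ValueError("no samples in the 30 s baseline window")
    base = mean(baseline)
    changes = []
    for lo, hi in EPOCHS_S:
        epoch = [hr for t, hr in samples
                 if period_onset + lo <= t < period_onset + hi]
        changes.append(mean(epoch) - base if epoch else None)
    return changes
```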
Skin Conductance Response
We analyzed the skin conductance response by subtracting the mean skin conductance during the 30 s preceding the reference or study period from the skin conductance measured during the analyzed period. To determine if the stimulus of interest had a significant effect, we compared the mean skin conductance change on the 0.9–4 s period poststimulus of interest to the mean value during the reference period. For each participant, we then evaluated the latency and magnitude of the skin conductance response. Magnitude was defined as the largest response occurring between 0.9–4 s poststimulus of interest. Latency was defined as the time at which a change of 0.015 μS was first observed (Vrana, 1995).
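The magnitude and latency definitions above can be illustrated with the following sketch. It is not the processing pipeline used in the study. The `(time, μS)` sample format and the function name are hypothetical, and we assume that latency is searched from stimulus onset onward, which the text does not specify.

```python
from statistics import mean


def scr_features(samples, onset, window=(0.9, 4.0), threshold=0.015,
                 baseline_s=30.0):
    """Return (magnitude_uS, latency_s) of a skin conductance response.

    samples: list of (time_s, conductance_uS) pairs (hypothetical format).
    Magnitude: largest baseline-corrected value within the window after onset.
    Latency: first time after onset at which the baseline-corrected change
    reaches the threshold (search from onset is an assumption of this sketch).
    """
    baseline = [v for t, v in samples if onset - baseline_s <= t < onset]
    if not baseline:
        raise ValueError("no samples in the baseline window")
    base = mean(baseline)
    post = [(t - onset, v - base) for t, v in samples if t >= onset]
    in_window = [dv for dt, dv in post if window[0] <= dt <= window[1]]
    magnitude = max(in_window) if in_window else None
    latency = next((dt for dt, dv in post if dv >= threshold), None)
    return magnitude, latency
```

Either value is returned as `None` when no sample satisfies the corresponding criterion, for example when the change never reaches 0.015 μS.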
Data Analysis
For MATB-II performances, gaze behavior, and physiological data, we first compared the study period to the reference period to assess the stimuli of interest effects. We then conducted between-condition comparisons using one-way ANOVAs, with the experimental condition as a categorical variable (Shapiro-Wilk tests were used to ensure data normality). Pairwise comparisons were performed using Student’s t-tests with a Bonferroni-Holm adjustment for multiple comparisons (Abdi, 2010). Chi-square tests were used to analyze the performance of the Communication and System Monitoring sub-tasks.
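For readers unfamiliar with the Bonferroni-Holm adjustment mentioned above, a minimal pure-Python sketch of the step-down procedure is given below. The function name is ours, the p-values in the usage note are hypothetical, and this is not the statistical software used in the study.

```python
def holm_adjust(p_values):
    """Return Bonferroni-Holm adjusted p-values, in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, etc.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity: an adjusted p-value never falls below
        # the adjusted value of a smaller raw p-value.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted
```

For three hypothetical pairwise comparisons with raw p-values of .01, .04, and .03, the adjusted values are approximately .03, .06, and .06, so only the first comparison would remain significant at the .05 level.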
Results
Subjective Data
Anxiety and Stress
Preexperiment questionnaires indicated that participants experienced low anxiety (STAI-Y State: M = 33.35 ± 7.85) and low to moderate stress (PSS-14: M = 21.1 ± 8.20) with no outlier. A one-way ANOVA between conditions revealed no statistical difference in anxiety (state: F (2, 42) = 1.26, p = .29, η2 = 0.06, trait: F (2, 42) = 0.74, p = .48, η2 = 0.03) nor stress (F (2, 42) = 0.38, p = .68, η2 = 0.02).
Self-Assessment of Startle and Surprise by Participants
A one-way ANOVA between conditions showed a significant difference in the self-assessment of startle (F (2, 42) = 19.06, p < .001, η2 = 0.48). Post-hoc analysis revealed that the self-perceived startle was significantly stronger in the combination condition than in the startle (t (28) = −2.05, p = .049) or surprise (t (28) = −6.46, p < .001) conditions. The self-perceived startle was also significantly stronger in the startle condition compared to the surprise condition (t (28) = 3.63, p = .001) (Figure 5).
Figure 5. Tukey boxplots of the self-assessment of startle and surprise in all conditions.
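As a reading aid, and assuming the reported η2 is the classical ratio of between-condition to total sum of squares (the text does not state which variant was computed), the effect size of a one-way ANOVA can be recovered from the F statistic and its degrees of freedom. The worked values are those of the startle self-assessment reported above:

```latex
\eta^2 = \frac{SS_{\text{between}}}{SS_{\text{total}}}
       = \frac{F \cdot df_1}{F \cdot df_1 + df_2},
\qquad
\frac{19.06 \times 2}{19.06 \times 2 + 42} = \frac{38.12}{80.12} \approx 0.48
```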
A one-way ANOVA between conditions showed a significant difference across the conditions in the self-assessment of surprise (F (2,42) = 3.50, p = .039, η2 = 0.15). Post-hoc analysis revealed that the self-perceived surprise was significantly stronger in the combination condition than in the surprise condition (t (28) = −2.96, p = .0063) (Figure 5).
Perceived Workload
A one-way ANOVA on the NASA TLX score revealed no statistical difference in the perceived workload of participants (F (2,42) = 0.32, p = .72, η2 = 0.015).
Behavioral data
MATB-II Performances
System Monitoring
Means and Standard Deviations of the Reaction Time for Detecting Abnormal Behavior in the System Monitoring Sub-task.
Tracking
The center deviation was significantly higher in the combination condition (t (28) = 2.40, p = .02) in comparison to the reference period. No significant effect of the stimulus was found in the startle and surprise conditions. A one-way ANOVA on center deviation revealed no significant difference among the three conditions.
Communication
The Chi-square test revealed that response accuracy in the communication sub-task was significantly worse in the combination (χ2 (2) = 16.34, p < .001) and startle (χ2 (2) = 9.69, p = .007) conditions compared to the reference period. A large number of misses occurred in the combination condition. Between conditions, only the comparison between the combination and surprise conditions showed a statistical difference (χ2 (2) = 6.53, p = .038). The combination-startle and startle-surprise comparisons showed no statistical difference (p > .05) (Figure 6).
Figure 6. Communication accuracy in all conditions.
Gaze Behavior
Stationary Entropy
A t test revealed a significantly lower stationary entropy in the startle (t (28) = −2.35, p = .02) and combination (t (28) = −2.00, p = .05) conditions compared to the reference period (Figure 7). However, a one-way ANOVA on the entropy revealed no significant difference among the three conditions.
Figure 7. Tukey boxplots of the stationary entropy during the study and the reference periods.
Gaze Distribution
The gaze distribution was significantly different from the reference period in the combination condition. Participants gazed significantly more on the “System Monitoring” AOI (t (28) = 3.76, p < .001) and significantly less on the communication AOI (t (28) = −2.59, p = .01). The difference was nonsignificant in the startle and surprise conditions (p > .05). A Chi-squared test was performed to compare the gaze distribution among the three conditions and no statistical difference was found.
K-Coefficient, Lempel-Ziv Complexity, and Explore/Exploit Ratio
The analysis of the K-coefficient, the Lempel-Ziv complexity, and the explore/exploit ratio showed no significant effect of the stimuli of interest compared to the reference period (p > .05 in all conditions). One-way ANOVAs on these different gaze metrics revealed no significant difference among the three conditions.
Facial Expression
In the “Surprise” condition, 2 out of 15 participants exhibited facial markers of surprise. In the startle condition, 12 out of 15 participants displayed at least one facial marker of startle. In the combination condition, all participants showed at least one facial marker of startle, and none displayed facial markers of surprise.
Physiological Data
Heart Rate
The heart rate over the whole study period was significantly higher in the startle (t (28) = 2.33, p = .02) and combination conditions (t (28) = 4.18, p < .001) in comparison to the reference period (Figure 8).
Figure 8. Tukey boxplots of the mean heart rate change during the study and reference periods.
Means, Standard Deviations, and Student’s t-tests for Heart Rate Comparing Study Period to Reference Period for Each Condition on 3s-epochs.
A one-way ANOVA comparing the conditions revealed a significant effect on the 0–3 s (F (2, 42) = 7.26, p = .002, η2 = 0.20) and 3–6 s periods (F(2, 42) = 3.70, p = .04, η2 = 0.09) and no significant difference on the other periods. During the 0–3 s period, post-hoc tests showed a significantly higher heart rate in the startle (t (28) = 3.03, p = .005) and combination (t (28) = 3.65, p = .001) conditions compared to the surprise condition. No significant difference was observed between startle and combination conditions.
During the 3–6 s period, post-hoc tests showed a significantly higher heart rate in the combination condition compared to the surprise condition (t (28) = 2.42, p = .02). No significant difference was observed between the startle and combination conditions or between the startle and surprise conditions (Figure 9).
Figure 9. Mean heart rate variation to the three stimuli of interest in comparison to the reference over the 15 s study period.
Skin Conductance Response
Considering the mean skin conductance value between 0.9 s and 4 s after the stimulus of interest, a t test comparing each condition to the reference period showed a statistical difference for each condition (startle: t (28) = 5.83, p < .001; surprise: t (28) = 2.53, p = .01; combination: t (28) = 7.05, p < .001) (Figure 10).
Figure 10. Skin conductance response to the three stimuli of interest in comparison to the reference.
Means and Standard Deviations for the Skin Conductance Response in all Conditions.
Discussion
Our study investigated the separate and combined effects of startle and surprise with a task designed to represent piloting activities. In the startle condition, participants were exposed to a loud noise with prior knowledge of its occurrence to trigger a startle response while minimizing surprise. In the surprise condition, a reverse video effect was applied without prior warning to induce surprise. The combination condition involved both startling and surprising stimuli simultaneously to elicit both startle and surprise effects. We will first review the results in relation to our hypotheses, then discuss the implications, and finally address some limitations.
As predicted (H1), the surprising stimulus elicited a skin conductance response similar to that observed in studies by Landman et al. (2017b) and Reisenzein et al. (2012), indicating that the reverse video effectively provoked an emotional reaction (Christopoulos et al., 2019). We also expected an increase in reaction time due to the reframing cost following the surprising stimulus, but no such effect was observed in the monitoring task (the only task for which reaction time was measured). Several reasons could explain this null result. First, as revealed by the self-assessment of surprise, the surprising stimulus created only a moderate level of surprise. Second, anomalies in the monitoring task were scripted 2 s after the surprising stimulus, whereas Meyer et al. (1991) found that the impact of a surprising event is most significant when actions follow the event by 0.5 s as it competes with ongoing task processing. When the event occurs 1 or 2 s later, the surprise effect on task processing is weaker, and reaction times are less impacted (Meyer et al., 1991). Thus, in our experiment, even if the subject was focused on the monitoring task at the time of the surprising event, the effect of the surprise may have already diminished drastically when anomalies were triggered. Finally, participants may have been engaged in another sub-task when the anomaly occurred, reducing the attentional impact of the surprise on the monitoring task. It is important to note that in real-life scenarios, surprising events can have a much longer-term impact, especially if they involve significant reframing or are life-threatening, as seen in the AF447 accident (BEA, 2021).
As hypothesized (H2), the startling stimulus had a more substantial impact than surprise on task performance, gaze behavior, and physiological responses. A decrease in accuracy on the communication sub-task highlighted the significant impact of startle on cognitive abilities, consistent with previous research (Thackray, 1983; Vlasak, 1969; Woodhead, 1958). However, as with the surprising stimulus, there was no increase in reaction time in the monitoring task, possibly for the reasons outlined above. After the startling stimulus, participants exhibited narrower attention, as evidenced by lower gaze entropy, reflecting attention tunneling. This finding aligns with recent studies showing that high stress can affect attentional distribution, leading to missed hazards and information (Pooladvand & Hasanzadeh, 2023). Concerning the physiological response, the skin conductance increase was consistent with previous studies (Bradley, 2009; Landman et al., 2017a; Vrana, 1995), as was the heart rate increase occurring in the first seconds following the startling stimulus (Adkins et al., 2019; Chou et al., 2014; Deuter, 2012; Holand et al., 1999). This physiological response indicates that startle indeed triggers a reaction of the body, which has been interpreted as a preparation of the body to defend against a threat by activating the autonomic nervous system (Öhman & Wiens, 2003).
As hypothesized (H3), the combination of startle and surprise had the most pronounced impact compared to the independent effects of startle and surprise. Compared to surprise alone, the combination led to a greater decline in communication sub-task accuracy, a higher self-assessment of startle and surprise, a higher heart rate and skin conductance response, and additional significant effects on gaze behavior. Compared to startle alone, the combination had a similar impact on skin conductance response, with a significant increase a few seconds after the combination stimulus, as well as on entropy, with a decrease during the first 15 s. However, for the self-perceived startle, the tracking sub-task performance, the gaze distribution, and the heart rate, participants were significantly more impacted by the combination of startle and surprise than by startle alone. The combination of startle and surprise resulted in a gaze distribution significantly different from the reference period, shifting significantly towards system monitoring and away from the communication sub-task. Concerning cardiac activity, the expected heart rate increase was observed during the first 6 s post combination stimulus, and the peak was higher (7.8 ± 12.4 bpm) than in the startle condition (4.2 ± 8.5 bpm). Interestingly, the heart rate increase lasted longer than in the startle condition, suggesting a longer emotional effect of the combination condition. Confirming the H3 hypothesis could imply that surprise enhances the startle response. However, existing research on startle potentiation related to emotions (Bradley & Lang, 2000; Lang et al., 1990; Vrana, 1995) suggests that it is emotional valence that influences the startle reaction, heightening it with negative emotions like fear and diminishing it with positive emotions like joy. In our experiment, the surprising stimulus was relatively neutral (a reverse video), yet it still produced a heightened startle response.
This indicates that surprise might intensify the startle reaction even when the emotional valence is neutral. Since startle responses can impair cognitive functions, they might hinder the process of reframing after a surprise, thereby amplifying the overall impact. In high-stakes environments like a flight deck, where quick decisions are crucial, this could potentially lead to catastrophic consequences.
Aviation, as well as other safety-critical domains, could benefit from a better characterization of startle and surprise. A deeper understanding and more precise modeling of these effects could enable intelligent assistants to detect them accurately. The intelligent assistant developed by Duchevet et al. (2024) serves as a notable example of this approach. Thanks to machine learning, it detects the startle effect based on pilots’ physiological parameters and automatically activates its support. It assists pilots in maintaining situational awareness by drawing their attention to overlooked and deviating flight parameters through an eye-tracking system. Additionally, it helps pilots recover from startle and surprise more quickly by encouraging controlled breathing at a specific rhythm and providing haptic feedback on the wrist, simulating a 60 bpm heartbeat. Such an assistant could serve as an alternative to the proactive approach through resilience training promoted by Hancock et al. (2022). Enabling machines to assess the pilot’s state offers the potential to develop reactive resilience to unexpected events through human–machine teaming, similar to how crew resource management functions in today’s cockpits.
Limitation and Future Research
The startling and surprising stimuli used could be improved. Indeed, in the startle condition, participants reported feeling a certain amount of surprise even though they had been warned that a sound would be played. One potential solution to reduce residual surprise is the use of a warning before the startling stimulus, as demonstrated in the Woodhead (1958) study. However, this approach carries the risk of distracting participants. Another option could involve informing participants that a loud sound will occur following a specific action, as done in the Foss et al. (1989) study. In the surprise condition, participants reported feeling surprised, albeit with a moderately high intensity. However, subjective measures did not confirm that the intended differences in surprise between conditions were successfully achieved. This may be because nonexperts often struggle to differentiate between the two concepts. It is also possible that the surprise stimulus did not elicit a sufficiently pure effect of surprise. Directly inspired by the work of Meyer et al. (1991), the key advantage of this stimulus was that it did not interfere with the task. Developing a reliable method to elicit significant surprise without disrupting the task would be a compelling avenue for future research.
One other limitation of this study is the inclusion of multiple dependent measures. While this approach allowed for a comprehensive analysis, we cannot exclude the occurrence of false positives despite the adjustment for multiple comparisons. Future research may focus on replicating these findings with a more targeted set of variables to ensure the robustness of the observed results.
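To illustrate why the number of dependent measures matters, the short sketch below computes how the probability of at least one false positive grows with the number of tests, and applies a Holm step-down correction. This is only an illustration: the text above does not state which adjustment was used, so the Holm procedure, the independence assumption, the function names, and the p-values are all hypothetical choices made for the example.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false positive across m independent tests,
    each run at level alpha, when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** m


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure (an assumed correction, for illustration).

    Returns one boolean per input p-value: True means the corresponding
    null hypothesis is rejected after adjustment.
    """
    m = len(p_values)
    # Visit the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The threshold loosens as more hypotheses are rejected.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every remaining (larger) p-value is retained
    return reject


if __name__ == "__main__":
    # Hypothetical p-values for four dependent measures.
    example = [0.01, 0.04, 0.03, 0.005]
    print(familywise_error_rate(0.05, 10))
    print(holm_reject(example))
```

Under these assumptions, ten uncorrected tests at the .05 level would carry roughly a 40% chance of at least one false positive. A correction controls this rate under its own assumptions, but it does not replace replication with a narrower set of variables.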
Future research could also involve conducting the experiment with a simpler task to more accurately analyze the effects on performance (particularly reaction time) and over a longer period than with MATB-II. The multi-task nature of MATB-II may lead participants to develop different long-term strategies, making extended analysis challenging, especially regarding reaction time increases. Adding one surprising and one startling event to a continuous mental subtraction task or a schema-discrepant task, as done by Meyer et al. (1991), could be an interesting avenue.
Alternatively, setting up a similar experiment with real pilots in a simulator closer to the operational world could be a valuable follow-up. This would better reflect real-world reactions, allowing finer modeling of pilot responses and possibly better automatic state detection. However, it would also be essential to gather pilots’ perspectives on the feasibility and potential implications of such automated detection in startle and surprise conditions. Their input could help identify possible drawbacks or unintended consequences of automated countermeasures, which designers should consider. Additionally, a more realistic simulation would allow for a more intense surprising stimulus to create a fundamental surprise, as described by Lanir (1984).
Conclusion
We conducted a study to characterize the independent and combined effects of startle and surprise, hypothesizing that their combination would have a greater impact on performance during a simulated piloting task and on the physiological response. Our findings confirmed this greater impact, with a greater variety of significant results such as higher self-perceived startle, lower performance on the communication sub-task, a narrowing of visual attention, and a more marked physiological response, particularly in heart rate. Such results allow a better characterization of the individual contributions of the startle and surprise effects on a simulated piloting task. This paves the way toward the automatic detection of such states in aircraft pilots to further improve safety in aviation.
Key Points
• To date, no study has clearly examined the separate and combined effects of startle and surprise.
• Forty-five participants performed the MATB-II task and were exposed either to a startling stimulus, a surprising stimulus, or the combination of both stimuli.
• Startle and surprise combined yielded more numerous significant effects on subjective, behavioral, and physiological measures than startle and surprise independently.
Supplemental Material
Supplemental Material - Investigating the Independent and Combined Effects of Startle and Surprise in a Simulated Flight Task
Supplemental Material for Investigating the Independent and Combined Effects of Startle and Surprise in a Simulated Flight Task by Alexandre Duchevet, Jean-Paul Imbert, Jérémie Garcia, Benoît Lamirault, and Mickaël Causse in Human Factors.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was done as part of the HAIKU project. This project has received funding from the European Union’s Horizon Europe research and innovation programme HORIZON-CL5-2021-D6-01-13 under Grant Agreement no 101075332.
Supplemental Material
Supplemental material for this article is available online.
