Abstract
As autonomous systems become more prevalent, understanding how humans interact with automation of varying levels and reliability is critical. This study investigates the effects of automation level and automation reliability on performance, mental workload, and situation awareness in hybrid automation systems. Using OpenMATB, 45 participants completed a multitasking simulation under different automation levels (manual to fully automated) and reliability conditions (50%, 70%, 99%). Results indicated significant effects of both automation level and reliability on task performance, situation awareness, and heart rate variability, with higher automation and reliability generally improving performance while lowering physiological markers of workload. Participants also tended to prefer lower automation levels when system reliability was low. Findings highlight the importance of designing adaptive systems that account for user preferences and reliability expectations. Future work may focus on modeling user transition behavior to inform adaptive automation strategies that support performance while maintaining engagement and situation awareness.
Introduction
With the increase in autonomy and artificial intelligence (AI)-enabled autonomous systems, human factors researchers must investigate how best to design the interplay between humans and automation in a hybrid format. Hybrid autonomous systems provide benefits to several industries, especially aerospace, as the industry becomes more complex. Autonomous agents enabled by AI are susceptible to failure because they may encounter situations they were not trained on. Thus, automation reliability may vary over time, depending on the operator and the contextual environment, underscoring the need to understand human behavior in hybrid autonomous systems when the autonomy may fail. Therefore, the objectives of this study are to (i) understand the variability in performance, mental workload, and situation awareness under varying automation levels and automation reliability; and (ii) investigate how human-preferred automation levels transition during human-automation interaction following exposure to an existing adaptive system.
Background
The development of autonomous systems in the aerospace domain has allowed complex and engaging tasks to be performed more efficiently and in less time, thereby promoting a better flight experience. However, operating these complex systems comes with caveats because of differences in individual and contextual characteristics. As automation increases, situation awareness decreases, and when automation fails and the operator must take over, the operator may not be able to make decisions and act on the system effectively (Fu et al., 2020). This can affect the mental workload of the individual, which can be effectively assessed using heart rate variability (HRV) features (Delliaux et al., 2019). Therefore, designers have shifted task control back to the human operator by adjusting the automation level in order to maintain situation awareness.
Automation level refers to the extent to which task performance is shared between a user and automation in a multitasking environment that involves the management of complex systems (Kaber & Endsley, 2004). Transitions between automation levels can be implemented as adaptive automation (the automation controls the transition), adaptable automation (the user controls the transition), or hybrid automation (the user and the automation collaborate on the transition; Endsley, 2018). Because of the longstanding automation-induced performance and cognitive issues that users experience with adaptive and adaptable automation, Calhoun (2022) suggests the use of hybrid automation, given its potential to apply a user-centered approach in operation while simultaneously leveraging automation capabilities. Therefore, to design hybrid autonomous systems effectively, it is crucial to understand changes in performance and cognitive attributes within existing adaptive autonomous systems, and to further understand the user by investigating the dynamics of human transition behavior between automation levels under varying automation reliability.
Approach
This study was approved by the Oklahoma State University Institutional Review Board (protocol IRB-24-342). A mixed experimental design was used to investigate the effects of automation level and automation reliability on situation awareness, performance, and automation transitions. Automation level was a within-subjects variable, and automation reliability was a between-subjects variable. The experiment was performed using OpenMATB, which is based on the NASA Multi-Attribute Task Battery.
Forty-five participants (age: M = 22.71, SD = 5.13) completed a set of tasks within the OpenMATB environment under varying automation reliability for a total duration of 30 min. The study involved five scenarios, one for each of five levels of automation, and each scenario lasted 6 min. Tasks in each scenario included the System Monitoring, Communications, Resource Management, and Tracking tasks. Automation levels ranged from fully manual (level 0) to fully automated (level 4), depicting adaptive automation. The automation level assigned to each scenario was randomized to minimize learning effects. Participants were also randomly assigned to one of three automation reliability groups: low (50%), high (70%), and near-perfect (99%) reliability. The random assignment was arranged so that the gender distribution was approximately equal across reliability groups. Although participants were informed beforehand that the automation was not 100% reliable, they were not told which reliability group they had been placed in.
The same set of events was presented in each scenario and was developed following the studies of Novak et al. (2024) and Huang et al. (2024). Within each 6-min scenario, 36 scales and 36 lights failed in the System Monitoring task; 9 own calls and 9 distractions were heard in the Communications task; and 17 pumps failed and 9 pumps shut off in the Resource Management task. In the Tracking task, the cut-off frequency was set to 0.06 Hz. The Tracking task was selected as the main task for performance assessment because it requires continuous dynamic control, which has been shown to be one of the most engaging forms of activity (Spence & Feng, 2010). Only the automation of the Tracking task failed, and when this occurred, the participant was informed by a visual indicator on the OpenMATB interface.
After each scenario, the Situation Awareness Rating Technique (SART) and a Preference questionnaire were presented to the participants within the OpenMATB interface. In the Preference questionnaire, participants selected the automation level they would have transitioned to, based on task demand, had they been in control of choosing the automation level. At the end of the study, participants completed a post-study questionnaire on which combination of tasks they would prefer to be automated if given the chance to design the automated system. A Polar H10 sensor was worn around the chest to record HRV.
Responses to the Preference and SART questionnaires and the center deviation in the Tracking task were extracted from the OpenMATB outputs. SART scores were computed as Understanding + Attentional Supply - Attentional Demand. Root mean squared error (RMSE) was computed from the center deviation as the measure of performance. The HRV features standard deviation of NN intervals (SDNN), low frequency (LF), and high frequency (HF) were extracted from the RR intervals using the HRVanalysis package in R to assess mental workload (Pichot et al., 2016). A MANOVA was conducted with automation level and reliability group as the independent variables and performance, SART score, and HRV features as the dependent variables, with follow-up tests performed where needed. The responses to the Preference questionnaire, collected as new scenarios were introduced, were used to develop a transition matrix showing preferred transitions between automation levels, and the information obtained from the post-study questionnaire was summarized.
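As an illustration of the scoring steps described above, the following is a minimal sketch of the SART score, the tracking RMSE, and SDNN. It is not the study's analysis pipeline (HRV features in the study were extracted with HRVanalysis); the function names, the use of the sample standard deviation for SDNN, and the example values are assumptions made for illustration only.

```python
import math
import statistics


def sart_score(understanding, attentional_supply, attentional_demand):
    """SART = Understanding + Attentional Supply - Attentional Demand."""
    return understanding + attentional_supply - attentional_demand


def tracking_rmse(center_deviations):
    """Root mean squared error of the cursor's deviation from the target center."""
    squared = [d * d for d in center_deviations]
    return math.sqrt(sum(squared) / len(squared))


def sdnn(nn_intervals_ms):
    """Standard deviation of NN intervals in ms (sample SD is an assumption here)."""
    return statistics.stdev(nn_intervals_ms)


# Hypothetical values, not data from the study.
example_sart = sart_score(understanding=5, attentional_supply=4, attentional_demand=3)
example_rmse = tracking_rmse([3.0, -3.0, 3.0, -3.0])
example_sdnn = sdnn([800, 810, 790, 800])
```

Lower RMSE corresponds to better tracking performance, since it measures how far the cursor drifted from the target center on average.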
Outcome
Results from the MANOVA showed a significant main effect of automation level on performance, situation awareness, and HRV features (F(20, 736) = 4.926, p < .001). A significant main effect of automation reliability on the dependent variables was also found (F(10, 364) = 6.634, p < .001). No significant interaction between automation level and reliability was observed (F(40, 925) = 0.718, p = .905).
ANOVA test showed a higher performance with increased reliability and automation level (F(4,195) = 20,
Also, there was a significant increase in the situation awareness with automation levels (F(4,195) = 62.69,
Results showed that users consistently transitioned to a lower automation level when the system was observed to fail repeatedly. Additionally, the post-study survey showed a preference for automating the Tracking and Resource Management tasks, suggesting that users favor automating frequently manipulated tasks.
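The preferred-transition matrix described in the Approach can be sketched as a table of counts from the experienced automation level to the level the participant said they would have chosen, with each row normalized to proportions. This is a minimal sketch under stated assumptions: the pair format, the function name, and the example responses are hypothetical and do not reproduce the study's data or exact procedure.

```python
def transition_matrix(pairs, n_levels=5):
    """Build count and row-normalized matrices from (experienced, preferred) level pairs.

    Levels are integers 0 (fully manual) to n_levels - 1 (fully automated).
    """
    counts = [[0] * n_levels for _ in range(n_levels)]
    for experienced, preferred in pairs:
        counts[experienced][preferred] += 1

    proportions = []
    for row in counts:
        total = sum(row)
        # Rows with no observations are left as zeros rather than dividing by zero.
        proportions.append([c / total if total else 0.0 for c in row])
    return counts, proportions


# Hypothetical responses: after level 4, two of three responses prefer level 2.
example_pairs = [(4, 2), (4, 2), (4, 4), (0, 1)]
example_counts, example_props = transition_matrix(example_pairs)
```

Each row of the normalized matrix can be read as the empirical distribution of preferred levels given the level just experienced, which is the form a stochastic (for example, Markov-style) model of transition behavior would take as input.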
Conclusion
This study investigated the dynamics of task efficiency and user transition behavior in an adaptive environment under varying automation reliability. The key finding is that both automation reliability and automation level affect human-automation interaction. The study also highlights the stochastic nature of human-automation interaction, showing a shift away from higher automation levels when reliability is low, with users favoring manual control in cases of minimal manipulation and optimal engagement. Future work could involve developing a stochastic model of human-automation dynamics in complex systems. Overall, this research offers a user-centered approach to developing complex systems that optimize task engagement and reduce mental workload, with the goal of creating a hybrid automated system.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
