Abstract
In high-stakes environments like Unmanned Aerial Vehicle (UAV) operations, effective verbal team communication is critical yet often hindered by environmental noise. This study explores how verbal communication and real-time gaze sharing (visualizing a teammate’s gaze through fixation trails) affect team performance. Twenty-four two-person teams completed simulated UAV tasks under three conditions: verbal communication without gaze sharing, gaze sharing without verbal communication, and both gaze sharing and verbal communication. Results revealed that combining gaze sharing with communication significantly improved team performance and reduced mental workload. This combination also improved visual attention strategies, as evidenced by reduced saccadic activity and more stable fixations. Participants also reported improved shared awareness and coordination. These findings underscore the complementary benefits of integrating verbal and non-verbal communication tools. By highlighting the value of gaze sharing in complex team settings, this work informs the design of adaptive interfaces that support collaborative decision-making and shared awareness in operationally demanding environments.
Introduction
In complex, time-sensitive domains such as unmanned aerial vehicle (UAV) command-and-control (C2) operations, team success depends on shared mental models—common understandings of tasks, roles, and system states (Cannon-Bowers et al., 1993). These models are primarily constructed and maintained through verbal communication, which conveys critical information and instructions (Nawaz et al., 2021). However, verbal communication alone may be insufficient, especially in high-stress and high-workload environments. Challenges such as noisy operational centers, communication delays, and misinterpretation can disrupt information flow, compromising performance (Baker et al., 2021).
One promising approach to overcoming these limitations is gaze sharing, the real-time visualization of where teammates are looking (Atweh, Hazimeh, & Riggs, 2023; Atweh, Hayek, & Riggs, 2023; Atweh & Riggs, 2024a). By integrating eye tracking data into shared displays, gaze sharing introduces a powerful non-verbal communication channel that can support coordination and enhance shared awareness. Unlike static cues or alerts, gaze visualizations provide continuous insight into a teammate’s attention and intent, potentially complementing or even replacing verbal exchanges when they are delayed or infeasible (D’Angelo & Schneider, 2021).
Despite its promise, the role of gaze sharing in UAV C2 operations remains underexplored. Previous research has largely focused on the design of gaze visualizations (e.g., fixation trails vs. point markers) and their effects on cognitive load, performance, and situation awareness (SA; Atweh & Riggs, 2024a). These studies suggest that fixation trails—temporal visualizations of gaze paths—may better support team coordination than instantaneous markers. However, a critical gap remains in understanding how gaze sharing interacts with verbal communication. Specifically, does gaze sharing improve or hinder verbal communication strategies? Can it reduce mental workload by offloading the need for verbal updates?
This study addresses these questions by investigating how real-time gaze sharing, when paired with open verbal communication, affects team performance, visual attention, and workload in a simulated UAV C2 environment. Through this work, we aim to better understand how multimodal communication strategies can be leveraged to support collaboration in complex systems.
Related Work
Gaze Sharing
Eye tracking is a method used to record and analyze eye movements to determine where individuals direct their visual attention (Poole & Ball, 2006). Eye trackers, whether desktop-mounted or head-mounted, collect data in the form of points of regard (POR), which indicate the location of gaze on a screen. From these data, researchers derive fixations—periods of stable gaze associated with visual processing—and saccades—rapid movements between fixations during which visual intake is minimal (Mao et al., 2021). These patterns of movement form scanpaths, which reveal visual strategies employed during a task.
In team contexts, real-time gaze sharing enables operators to view one another’s gaze locations, offering a non-verbal channel for coordination (Sung et al., 2021). This can be accomplished through various visualization methods such as dots, trails, or heatmaps projected onto individual or shared displays. Of particular interest are fixation trails that provide gaze paths over time to help teammates anticipate each other’s actions (D’Angelo & Schneider, 2021; Newn et al., 2017).
In UAV operations, where tasks are complex, dynamic, and tightly coupled, there is value in knowing where a teammate is looking to improve teamwork. Gaze sharing may improve coordination by reducing the need for verbal updates, aiding in task division, and decreasing redundant scanning (Zhang et al., 2017). However, gaze visualizations must be carefully designed because excessive or poorly integrated visual cues can increase mental workload, particularly if they disrupt a user’s existing visual processing strategies (Atweh & Riggs, 2024a, 2025a; McCarley et al., 2021).
Communication in UAV Operations
In UAV C2 settings, verbal communication is central to real-time coordination. Operators rely on radios, intercoms, and/or VoIP systems to exchange critical information. Structured communication protocols, such as the use of standard phraseology and confirmation techniques, help reduce ambiguity and ensure accuracy of information transmission (Cummings & Guerlain, 2007). These strategies are especially vital in high-stakes operations where timing, precision, and SA are critical.
However, as task complexity and operational tempo increase, the cognitive demands placed on team members can strain verbal exchanges. Mental workload—defined as the amount of cognitive effort required to perform a task—can fluctuate based on task load, time pressure, and communication burden. Under high workload conditions, communication breakdowns are more likely, particularly when teams rely solely on verbal channels. These breakdowns can manifest as delayed responses, overlapping speech, or failures to clarify or confirm task-relevant information.
In large or distributed UAV teams, this challenge is amplified by hierarchical roles and task interdependencies. As teams increase in size and responsibility is distributed across more sensor operators, mission commanders, and pilots, the volume and complexity of communication increases (Nawaz et al., 2021). In such environments, overreliance on verbal strategies may reduce efficiency and increase error potential, especially when operators lack shared visual context.
To address these limitations, non-verbal tools such as flashing alerts or haptic feedback have been introduced to convey critical updates. Yet, these cues often lack the contextual depth that gaze sharing can provide (Duan et al., 2019). By revealing visual attention in real time, gaze sharing offers a more nuanced form of non-verbal communication that can complement verbal strategies. This may be especially useful in high-load conditions where minimizing verbal exchanges could reduce cognitive load.
This study explores how gaze sharing influences team visual strategies and cognitive workload during high-demand scenarios. Specifically, we examine gaze sharing implemented through fixation trails. Given the potential of gaze sharing to reduce ambiguity and improve shared attention, understanding its effects on performance and workload is essential for designing collaborative tools in C2 environments (Atweh & Riggs, 2025b, 2025c). We hypothesize that the integration of gaze sharing with open verbal communication reduces workload and supports efficient visual strategies, offering insights into multimodal communication design for UAV teams.
Methodology
Participants
Twenty-four teams (48 participants total) of undergraduate and graduate students from the University of Virginia were recruited for this study (M = 24.5 years, SD = 4.36 years). Each pair consisted of one male and one female participant who did not previously know each other; this held gender composition and prior familiarity constant across teams to reduce potential confounds. The experiment was approved by the University of Virginia Institutional Review Board (protocol number 3480).
Experimental Setup
The design of the experimental testbed was based on the “Vigilant Spirit Control Station” that the U.S. Air Force uses to develop interfaces for controlling multiple UAVs (Feitshans et al., 2008). The testbed was developed using Unity and ran on two desktop computers (27”, 2560 × 1440 monitor; Figure 1). Teams were collocated, but each participant viewed a separate monitor and used a separate mouse to input responses. The testbed was networked so participants could see inputs made by their teammates in real time (e.g., when Participant 1 clicked on the target button, Participant 2 could see the response in real time). However, participants could not see the real-time cursor movements of their teammates. Two desktop-mounted FOVIO eye trackers with a sampling rate of 60 Hz were used to collect point-of-gaze data from each participant. The average error of the FOVIO eye tracker, as reported by the manufacturer, is 0.78° (SD = 0.59°).

Figure 1. Experimental setup with the two networked desktop computers side-by-side with an external microphone in between.
UAV Tasks
Each pair was responsible for completing four tasks (one primary task and three secondary tasks) for up to 16 UAVs (Figure 2). Although all tasks were the pair’s responsibility, only one participant from each pair had to complete each instance of a task. The primary task was target detection, in which pairs monitored UAV video feeds and indicated whether a target (i.e., a semi-transparent cube) was present. The three secondary tasks were a rerouting task (rerouting UAVs to keep them from flying through no-fly zones), a fuel leak task (maintaining each UAV’s health), and a chat message task (responding to chat messages). These tasks and their structure emulate the multitasking, dynamic nature of a UAV C2 environment.

Figure 2. Screenshot of the UAV simulation with its panels labeled (clockwise): Map, Video Feed, Health, Chat Message, and Reroute Panels.
Experimental Design
A within-subjects fractional factorial study was conducted in which all teams completed three 10-min scenarios in counterbalanced order, one for each of the following conditions: (a) no gaze sharing with verbal communication, (b) gaze sharing with no verbal communication, and (c) both gaze sharing and verbal communication. In conditions where communication was allowed, participants could communicate freely without any restrictions. Gaze sharing was visualized using a real-time fixation trail that represented the preceding two seconds of gaze (Atweh & Riggs, 2024a; Figure 3). The number of targets, reroutings, fuel leaks, and chat messages was held constant across conditions, and task instances were randomized within each condition. Performance was measured using a point system; Table 1 shows the point value associated with each task. The point values were assigned to encourage participants to prioritize certain tasks (i.e., target detection). Each participant completed a subjective workload rating at the end of each trial using the NASA Task Load Index (NASA-TLX; Hart & Staveland, 1988).
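The counterbalancing scheme is not described beyond the scenario order being counterbalanced. As an illustration only, and assuming full counterbalancing (all six possible orders of the three conditions, cycled across the 24 teams), order assignment could be sketched as follows. The condition labels and the `assign_orders` helper are hypothetical and are not taken from the study materials.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, Tuple

# Hypothetical short labels for the three conditions described above.
CONDITIONS = (
    "communication_only",
    "gaze_sharing_only",
    "gaze_sharing_and_communication",
)


def assign_orders(n_teams: int) -> Dict[int, Tuple[str, ...]]:
    """Cycle through every possible condition order, one order per team."""
    orders = list(permutations(CONDITIONS))  # 3! = 6 distinct orders
    return {team: orders[(team - 1) % len(orders)] for team in range(1, n_teams + 1)}


schedule = assign_orders(24)
usage = Counter(schedule.values())
print(len(usage))           # 6 distinct orders
print(set(usage.values()))  # {4}: each order is used by four teams
```

With 24 teams and six orders, this assumed scheme gives each order to exactly four teams, so every condition appears equally often in each serial position.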

Figure 3. The fixation trail gaze sharing visualization technique.
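To make the two-second trail concrete, the following is a minimal sketch of a sliding-window buffer that keeps only the most recent two seconds of a teammate’s fixations for drawing. It is not the testbed’s Unity implementation; the `FixationTrail` class, its method names, and the tuple layout are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List, Tuple

# (timestamp in seconds, x in pixels, y in pixels); layout assumed for illustration
Fixation = Tuple[float, float, float]


class FixationTrail:
    """Holds the teammate fixations that fall within the last `window_s` seconds."""

    def __init__(self, window_s: float = 2.0) -> None:
        self.window_s = window_s
        self._points: Deque[Fixation] = deque()

    def add(self, t: float, x: float, y: float) -> None:
        """Append a new fixation and discard any that are now too old."""
        self._points.append((t, x, y))
        self._prune(t)

    def _prune(self, now: float) -> None:
        # The oldest fixations sit at the left end of the deque.
        while self._points and now - self._points[0][0] > self.window_s:
            self._points.popleft()

    def visible(self, now: float) -> List[Fixation]:
        """Return the fixations to draw at time `now`, oldest first."""
        self._prune(now)
        return list(self._points)


trail = FixationTrail(window_s=2.0)
trail.add(0.0, 100.0, 100.0)
trail.add(1.0, 400.0, 250.0)
trail.add(2.5, 900.0, 300.0)
print(trail.visible(2.5))  # the fixation at t = 0.0 has aged out of the window
```

A renderer would connect the returned points in order, so the trail shortens on its own as older fixations fall outside the window.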
Table 1. Point System for Scoring Performance.
Eye Tracking Data Analysis
The eye tracking data was first filtered to remove invalid entries. The gaze data was then screened against the data quality requirements outlined in ISO/TS 15007-2:2014-09, which states that 15% data loss is acceptable. Data loss across all participants and trials averaged 8.6% (SD = 2.2%). Fixations and saccades were detected using code developed by the authors (Atweh et al., 2024). This code serves two main purposes: (1) filtering the eye tracking dataset and (2) detecting fixations and saccades based on Nyström and Holmqvist’s (2010) velocity-based, data-driven adaptive algorithm. After filtering, the code passes the data through a Savitzky-Golay smoothing filter and calculates angular velocities. A data-driven iterative procedure then updates the velocity threshold until the absolute difference between the newly calculated threshold and the previous one is less than 1°/s. Event detection consists of five main steps: peak velocity detection, saccade onset detection, saccade offset detection, fixation detection, and saccade detection. Saccades are identified using velocity constraints, and fixations using spatial and duration constraints. See Atweh et al. (2024) for more details on the preprocessing and event detection process.
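The iterative threshold update can be sketched as follows. This is not the authors’ code (see Atweh et al., 2024, for that); it is a minimal illustration of the adaptive step in the style of Nyström and Holmqvist (2010), assuming the angular velocities (in °/s) have already been smoothed and computed. The starting threshold of 200°/s, the multiplier of 6 standard deviations, the use of the population standard deviation, and the function name are assumptions for this sketch.

```python
from statistics import mean, pstdev
from typing import List


def adaptive_peak_threshold(
    velocities: List[float],
    initial_threshold: float = 200.0,  # assumed starting value, deg/s
    sd_multiplier: float = 6.0,        # assumed multiplier on the standard deviation
    tolerance: float = 1.0,            # stop when the threshold changes by < 1 deg/s
    max_iter: int = 100,
) -> float:
    """Iteratively estimate a peak-velocity threshold from the data itself."""
    threshold = initial_threshold
    for _ in range(max_iter):
        # Samples below the current threshold are treated as non-saccadic.
        below = [v for v in velocities if v < threshold]
        if len(below) < 2:
            break  # too few samples to estimate a spread
        new_threshold = mean(below) + sd_multiplier * pstdev(below)
        converged = abs(new_threshold - threshold) < tolerance
        threshold = new_threshold
        if converged:
            break
    return threshold


# Toy data: slow samples at 10 and 20 deg/s plus a few fast, saccade-like samples.
velocities = [10.0] * 50 + [20.0] * 50 + [400.0] * 5
print(adaptive_peak_threshold(velocities))  # 45.0
```

In this toy example the threshold settles at the mean of the slow samples (15°/s) plus six standard deviations (6 × 5°/s), and only the 400°/s samples exceed it.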
After detecting fixations and saccades, we calculated six eye tracking metrics for each participant in each trial (i.e., number of fixations, fixation duration, number of saccades, saccade duration, saccade velocity, and saccade amplitude). We then averaged each metric across the two teammates to obtain one value per team for each condition.
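The aggregation step can be illustrated with a short sketch, assuming each participant contributes one value per metric per condition. The row layout, condition labels, and function name are hypothetical, and the numbers are made up for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# (team id, condition label, metric value for one participant); layout assumed
Row = Tuple[int, str, float]


def team_condition_means(rows: Iterable[Row]) -> Dict[Tuple[int, str], float]:
    """Average a metric over the teammates within each team and condition."""
    grouped: Dict[Tuple[int, str], List[float]] = defaultdict(list)
    for team_id, condition, value in rows:
        grouped[(team_id, condition)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


# Made-up fixation durations (ms) for the two members of one team.
rows = [
    (1, "gaze_sharing_and_communication", 300.0),
    (1, "gaze_sharing_and_communication", 320.0),
    (1, "gaze_sharing_only", 350.0),
    (1, "gaze_sharing_only", 370.0),
]
print(team_condition_means(rows))
```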
Experimental Procedure
Participants read and signed the consent form and were briefed about the study goals, the tasks that needed to be completed as a team, and how the testbed was networked. Afterwards, the participants completed a 10-min training session together in which they had to achieve at least 70% accuracy across all tasks. They were also given the opportunity to strategize and divide the tasks at their discretion. We adopted this approach so that we could observe how participants would naturally incorporate gaze sharing into their communication strategies. Before the start of each condition, participants had the opportunity to review their strategy and determine task division based on the constraints of each condition. At the conclusion of the study, participants filled out a debriefing questionnaire. The experiment session lasted 60-80 min and participants were compensated $15 for their time.
Results
Performance
Figure 4 shows the mean and standard error of the performance scores across the 24 teams for each condition based on the designated scoring convention (Table 1). Pairs who completed the tasks using both gaze sharing and communication yielded the highest total scores (mean = 40,566 points), followed by the “gaze sharing and no communication” condition (mean = 38,076 points) and the “no gaze sharing and communication” condition (mean = 37,558 points). A repeated measures univariate ANOVA showed that the score differed significantly between the three conditions (F(2, 46) = 6.22, p = .004, partial η² = 0.21). Post hoc tests using Bonferroni correction revealed that the total score was significantly higher in the “gaze sharing and communication” condition than in the “no gaze sharing and communication” condition (p = .003). No other pairwise comparison was statistically significant (all p > .05).
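As a cross-check on the reported effect size, partial η² can be recovered from an F statistic and its degrees of freedom with the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The helper below is an illustrative sketch and not part of the study’s analysis code.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Performance score ANOVA reported above: F(2, 46) = 6.22
print(round(partial_eta_squared(6.22, 2, 46), 2))  # 0.21
```

The result matches the reported value of 0.21.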

Figure 4. Overall performance scores for each condition. An asterisk (*) indicates significance.
NASA-TLX Scores
Figure 5 shows the mean and standard error of the NASA-TLX scores for each of the six dimensions. We decided to analyze the six dimensions separately based on recent recommendations in the literature (i.e., Bolton et al., 2023). A one-way repeated measures MANOVA was conducted to check for any statistical difference between the NASA-TLX scores of the three conditions across the different dimensions. A significant multivariate effect was observed for the gaze conditions, F(12, 82) = 2.39, p = .04; Wilks’ Λ = 0.34; partial η² = 0.25. Six follow-up repeated measures univariate ANOVAs, evaluated at a Bonferroni-adjusted α of .0083, showed that the mental demand dimension (F(2, 46) = 5.61, p = .007, partial η² = 0.20) differed significantly between the three conditions. Pairs expressed significantly lower mental demand when using both gaze sharing and communication compared to the “no gaze sharing and communication” condition (p = .038) and the “gaze sharing and no communication” condition (p = .023). There was no statistical difference in mental demand between the “no gaze sharing and communication” condition and the “gaze sharing and no communication” condition (p = .9).
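The adjusted α of .0083 follows from dividing the conventional .05 level by the six follow-up tests, one per NASA-TLX dimension. A minimal sketch of that arithmetic, assuming a familywise α of .05 (the function name is illustrative):

```python
def bonferroni_alpha(familywise_alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction."""
    return familywise_alpha / n_tests


alpha_adjusted = bonferroni_alpha(0.05, 6)  # six NASA-TLX dimensions
print(round(alpha_adjusted, 4))  # 0.0083
print(0.007 < alpha_adjusted)    # True: the mental demand effect (p = .007) passes
```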

Figure 5. NASA-TLX scores for each dimension by condition. An asterisk (*) indicates significant main effects for a dimension.
Eye Tracking Metrics
Figure 6 shows the mean and standard error of the six eye tracking metrics (number of fixations, fixation duration, number of saccades, saccade duration, saccade velocity, and saccade amplitude) across the 24 teams for each condition. Six repeated measures univariate ANOVAs revealed that the mean number of fixations (F(1.28, 29.52) = 9.59, p = .002, partial η² = 0.29), the mean number of saccades (F(2, 46) = 7.6, p = .001, partial η² = 0.25), saccade duration (F(1.028, 23.65) = 14.82, p < .001, partial η² = 0.39), and saccade velocity (F(2, 46) = 8.67, p < .001, partial η² = 0.27) differed statistically significantly between the three conditions. For the number of fixations and saccade duration, we applied the Greenhouse-Geisser correction to the ANOVAs due to the violation of the sphericity assumption. This correction adjusts the degrees of freedom to reduce the risk of Type I error, resulting in the decimal degrees of freedom reported.
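The decimal degrees of freedom arise because the Greenhouse-Geisser correction multiplies both the effect and error degrees of freedom by an estimate ε, which can be no larger than 1. The sketch below back-calculates ε from the reported effect degrees of freedom, assuming k = 3 conditions and n = 24 teams. The function names are illustrative, and ε itself is not reported in the text.

```python
from typing import Tuple


def epsilon_from_df(df_effect_corrected: float, n_levels: int) -> float:
    """Back out the sphericity estimate from a corrected effect df."""
    return df_effect_corrected / (n_levels - 1)


def greenhouse_geisser_df(
    epsilon: float, n_levels: int, n_subjects: int
) -> Tuple[float, float]:
    """Corrected (effect, error) degrees of freedom for a one-way RM ANOVA."""
    df_effect = epsilon * (n_levels - 1)
    df_error = epsilon * (n_levels - 1) * (n_subjects - 1)
    return df_effect, df_error


eps_fixations = epsilon_from_df(1.28, n_levels=3)
print(round(eps_fixations, 2))  # 0.64
print(greenhouse_geisser_df(eps_fixations, n_levels=3, n_subjects=24))
```

For the number of fixations this gives an error df of about 29.4, close to the reported 29.52. The small gap is consistent with the effect df having been rounded to 1.28 before reporting. The same calculation for saccade duration (effect df of 1.028) gives an error df of about 23.6.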

Figure 6. Eye tracking metrics for each condition. An asterisk (*) indicates significant main effects for a metric (ms = milliseconds, ° = degrees visual angle).
Post hoc analyses with a Bonferroni adjustment showed significant differences between conditions. Participants made significantly fewer fixations in the “no gaze sharing and communication” condition than in both the “gaze sharing and no communication” condition (p = .004) and the “gaze sharing and communication” condition (p = .007). Participants made significantly more saccades in the “no gaze sharing and communication” condition than in the “gaze sharing and communication” condition (p < .001). Saccade duration was significantly longer in the “no gaze sharing and communication” condition than in both the “gaze sharing and no communication” condition (p = .002) and the “gaze sharing and communication” condition (p = .003). Saccade velocity was significantly higher in the “no gaze sharing and communication” condition than in the “gaze sharing and communication” condition (p < .001). Other pairwise comparisons were not statistically significant (p > .05).
Debriefing Questionnaire
Most participants (75%) reported that the combination of real-time gaze sharing and verbal communication improved shared awareness. Additionally, 67% felt that this combination helped them better understand their teammate’s focus of attention. When asked about coordination, 50% indicated that task coordination improved under the combined condition. However, a small subset found the added information to be either ineffective (8%) or distracting (6%), suggesting potential individual differences in how such tools are perceived. When asked specifically about their ability to predict their teammate’s actions, a key component of SA, 44% of participants stated that the fixation trail helped even in the absence of verbal communication. Conversely, 17% felt that gaze sharing did not affect predictability, indicating some variability in gaze sharing use.
Participants were also asked to rank the three conditions in terms of overall preference. Over half (56%) ranked the combination of trail gaze sharing with verbal communication as the most effective configuration, emphasizing the complementary value of visual and verbal cues. In contrast, 23% preferred trail gaze sharing without verbal communication, often citing the streamlined simplicity of relying solely on visual cues. Notably, 60% ranked the “no gaze sharing with verbal communication” condition as the least effective, highlighting difficulties in maintaining coordinated attention without access to visual cues.
Discussion and Conclusion
This study examined how real-time gaze sharing and verbal communication interacted to influence team performance, workload, and visual attention strategies in a simulated UAV C2 environment. The results support the hypothesis that the combination of gaze sharing and open verbal communication yields the most effective collaborative outcomes. Teams in this condition not only achieved the highest performance scores but also reported significantly lower mental workload and showed more efficient visual behaviors, as evidenced by fewer saccades and more stable fixations.
These findings suggest that gaze sharing paired with verbal communication can reduce cognitive demands. Here, gaze sharing likely reduced the need for constant verbal updates, allowing team members to coordinate implicitly through the visual cues it provided. At the same time, the availability of verbal communication may have clarified or disambiguated the meaning of gaze cues, reducing the cognitive effort required to interpret any ambiguous information. This complementary interaction likely explains why only the combined condition, and not gaze sharing or verbal communication alone, significantly lowered mental workload (Baker et al., 2021; Nawaz et al., 2021).
The eye tracking data further reinforces this interpretation. The reduction in saccadic activity and increase in fixation stability under the combined condition indicate that participants adopted more focused and deliberate visual strategies. In contrast, teams lacking either gaze sharing or verbal communication showed signs of visual inefficiency. This was evidenced by more frequent and faster saccades, which may reflect compensatory scanning due to uncertainty about a teammate’s focus or actions.
The debriefing questionnaire responses also supported the objective findings. Most participants indicated that the combination of gaze sharing with communication improved shared awareness, and half indicated that it improved task coordination. Most also noted that the combination provided a better understanding of their partner’s attention. Notably, the least preferred configuration was verbal communication without gaze sharing, further emphasizing the limits of verbal communication in managing coordination, especially under high workload. Together, these findings highlight the importance of designing team systems that support team communication through multiple channels (Atweh & Riggs, 2024a, 2024b). Gaze sharing, when thoughtfully integrated with verbal communication, has the potential to improve team collaboration by reducing workload and improving visual coordination (Zhang et al., 2017).
This study has several limitations that should be acknowledged. Although the tasks were carefully designed to reflect operational UAV demands, the findings may not generalize to other domains involving different team structures or communication constraints. One notable limitation is the use of college students as participants, rather than trained UAV operators, which may affect the ecological validity of the results. Additionally, the experimental conditions were constrained to three configurations and did not capture the full range of possible interaction effects between communication strategies and gaze sharing modalities. Future work could expand this scope by systematically manipulating additional variables, such as timing, frequency, or the gaze sharing technique used, to further understand how to support collaboration (Atweh et al., 2022; Cannon-Bowers et al., 1993).
Another key limitation involves the static implementation of the fixation trail. While many participants found the trail helpful, a small subset reported that it occasionally obstructed other visual elements or was not consistently useful. Future research could explore dynamic, user-controllable features, such as an on/off toggle or adaptive display logic, to better integrate gaze sharing without introducing unnecessary clutter or distraction. Moreover, long-term studies are needed to assess how teams adapt to gaze sharing over time and whether its benefits persist or evolve with experience.
Expanding this work to larger teams and more complex operational settings would also be valuable, as coordination challenges scale with team size and task interdependence (Atweh et al., 2022). One particularly promising avenue is to explore how gaze sharing supports recovery from interruptions, a common and disruptive feature of real-world C2 operations. The ability to rapidly regain SA by observing a teammate’s visual focus may prove especially beneficial in high-interruption environments, supporting team resilience and minimizing performance breakdowns.
Ultimately, this work contributes to a growing understanding of how multimodal communication tools can enhance team cognition and coordination in high-demand settings. As collaborative systems grow more distributed, dynamic, and data-rich, designing interfaces that reduce cognitive workload and promote shared attention will be critical. The insights gained from this study underscore the potential of gaze sharing technologies to advance more adaptive, resilient, and efficient human-machine teaming, paving the way for better performance not just in UAV operations, but across the many domains where effective teamwork is vital.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the National Science Foundation (NSF grant: #2008680; Program Manager: Dr. Dan Cosley).
