Objects on a Collision Path With the Observer Demand Attention

Abstract

How observers distribute limited processing resources to regions of a scene is based on a dynamic balance between current goals and reflexive tendencies. Past research showed that these reflexive tendencies include orienting toward objects that expand as if they were looming toward the observer, presumably because this signal indicates an impending collision. Here we report that during visual search, items that loom abruptly capture attention more strongly when they approach from the periphery rather than from near the center of gaze (Experiment 1), and target objects are more likely to be attended when they are on a collision path with the observer rather than on a near-miss path (Experiment 2). Both effects are exaggerated when search is performed in a large projection dome (Experiment 3). These findings suggest that the human visual system prioritizes events that are likely to require a behaviorally urgent response.

One of the problems people face every waking moment is how to prioritize processing. Should one step into the crosswalk when the traffic light turns green, or first monitor the street for cars that may run a red light? The solution seems to be a dynamic tension between the need to maintain goals while still being prepared for change. Here we show that people have a default priority setting for dynamic events signaling possible threat.

Processing priority is often studied using search tasks in which a prespecified target is arrayed among distractor objects and one randomly chosen object is unique in color, luminance, time of appearance, or motion (Egeth & Yantis, 1997). Critically, participants know that the unique property provides no information about the location of the target. The index of whether the unique property captures attention is whether participants search the unique object with priority even though there is no goal-directed reason to do so.

Franconeri and Simons (2003) demonstrated that one of the strongest attention-capturing events is an object that expands in size, as though it were looming and about to collide with the observer. This finding led Franconeri and Simons to propose that dynamic changes are most likely to capture attention when they signal potential behavioral urgency. From this perspective, the abrupt appearance of an object, a sudden change in color, and a looming object may have a reflexive influence on processing priority because they are likely to require evaluation for possible action. However, a weakness of this proposal is that it can too easily be applied in a circular fashion. In the experiments reported here, we tested the urgency hypothesis directly by comparing the priority given to events that were equated in their physical features, but differed in their threat potential.

Experiment 1 examined the priority given to an object that suddenly increased in size, comparing the priority of the object when it was in the periphery versus near the center of gaze. We predicted that a looming item in the periphery would be of greater behavioral urgency because there is a need to evaluate events that have been detected but not yet identified. Experiment 2 compared priority for looming items that had identical motion trajectories, but differed in their direction with respect to the viewer. Some items loomed on a collision path with the participant, whereas others loomed on a near-miss path. The urgency hypothesis predicts that greater priority will be given to looming items on a collision path because these can be linked most directly to the need for imminent evasive action. In Experiment 3, the displays were projected over a much larger visual field than typically used in experiments on attentional capture. We expected that the pattern of results would be exaggerated with these more realistic displays.

EXPERIMENT 1: LOOMING IN THE PERIPHERY HAS PRIORITY OVER LOOMING IN THE CENTER

In Experiment 1, the participants' task was to indicate the orientation of the oval-shaped target among two or five spherical distractors (see Fig. 1). One sphere (the looming item) expanded rapidly in size through the first three frames of motion, and another sphere (the target) became slightly oval shaped in the fourth and final frame. The looming item and the target item were determined randomly and independently. Spheres appeared at two different distances from the center so that the influence of looming could be compared for these locations.

Fig. 1. Illustration of the visual search displays in Experiment 1. The preview display consisted of three frames of motion in which one item expanded rapidly in size (looming item); other items were stationary. In the following search display, a randomly determined object (the target) was deformed so that it took on an oval shape. Items were presented equally often in the near and far locations.

If looming influences search priority, then search should be fastest when the target is also the looming item. We predicted that looming distractors in the periphery would slow search more than those near the center because of the increased threat potential of objects that are detected but not clearly visible. To ensure that search difficulty was not confounded with target location, we varied the difficulty of target discrimination in each location. In one condition (A), the target was deformed the same amount in both locations, so that center targets were generally easier to find than peripheral targets. In a second condition (B), peripheral target ovals were deformed more than center targets, so that peripheral targets were easier to find.

Method

Participants

Thirty-six undergraduates (26 females, 10 males) received course credit for participating in a 1-hr session. All reported normal or corrected-to-normal visual acuity and maintained an overall search accuracy better than 90%.

Displays

Displays were presented on a 43-cm (diagonal) screen (1024 × 768 resolution, refreshed at 89 Hz) in a room with typical office lighting. Participants sat with their eyes about 50 cm from the screen. Figure 1 illustrates a typical display sequence. The background of the displays was gray (15 cd/m²). Display items consisted of discs (5.0° visual angle) filled with a linear shading gradient that ran from white (29.5 cd/m²) in the upper left to black (0.1 cd/m²) in the lower right, giving the impression that they were spheres lit from above and to the left.
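The display geometry above can be turned into physical and pixel sizes with the usual visual-angle relation. The following is a minimal sketch, not the authors' stimulus code: it assumes square pixels (so that the 1024 × 768 raster fills a 4:3 screen) and a stimulus centered on the line of sight, and the function and variable names are illustrative only.

```python
import math


def visual_angle_to_cm(angle_deg, distance_cm):
    """Physical extent (cm) of a stimulus subtending angle_deg at distance_cm,
    assuming the stimulus is centered on the line of sight."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


VIEWING_DISTANCE_CM = 50.0       # from the text ("about 50 cm")
SCREEN_DIAGONAL_CM = 43.0        # from the text
H_PIXELS, V_PIXELS = 1024, 768   # from the text

# Assumption: square pixels, so screen width / diagonal = 1024 / hypot(1024, 768).
screen_width_cm = SCREEN_DIAGONAL_CM * H_PIXELS / math.hypot(H_PIXELS, V_PIXELS)
pixels_per_cm = H_PIXELS / screen_width_cm

disc_cm = visual_angle_to_cm(5.0, VIEWING_DISTANCE_CM)  # 5.0 deg disc from the text
disc_px = disc_cm * pixels_per_cm

print(f"disc diameter: {disc_cm:.2f} cm, about {disc_px:.0f} px")
```

Under these assumptions a 5.0° disc is a little over 4 cm across on the screen; a different aspect ratio or viewing distance would change the pixel figure accordingly.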

Each display began with a small fixation marker for 500 ms, followed by a preview display of either three or six spheres. In the preview display, one of the spheres expanded rapidly and uniformly in all directions from a small size (1.2°) to the standard size in three frames of motion (each frame = 56 ms). All spheres were arrayed on two imaginary circles, with radii of 9.0° (near) and 22.0° (far). Items on the near circle could appear at the clock positions 12:00, 3:00, 6:00, and 9:00; items on the far circle could appear at 1:30, 4:30, 7:30, and 10:30. The fourth frame was the search display, which remained on view until participants responded or 4,500 ms elapsed. A 22-ms blank screen was inserted between the three preview frames and the final search frame to help mask the local deformation that occurred when one of the spheres (i.e., the target) became oval shaped in the final frame. This change was the only difference between the third preview frame and the search display. In condition A, all ovals were deformed by 17%; in condition B, the near ovals were deformed by 17%, and the far ovals were deformed by 24%.
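The frame durations quoted above are close to whole numbers of screen refreshes at 89 Hz. The sketch below makes that arithmetic explicit; it assumes that each stimulus frame lasted an integer number of refreshes, which the text does not state directly, and the helper name is hypothetical.

```python
REFRESH_HZ = 89.0                 # from the text
refresh_ms = 1000.0 / REFRESH_HZ  # duration of one screen refresh


def refreshes_for(duration_ms):
    """Nearest whole number of refreshes for a nominal duration (assumption:
    frames were locked to the refresh cycle)."""
    return max(1, round(duration_ms / refresh_ms))


frame_refreshes = refreshes_for(56.0)  # each preview frame, nominally 56 ms
blank_refreshes = refreshes_for(22.0)  # blank interval, nominally 22 ms

# Time from the first preview frame to the onset of the search display.
preview_ms = 3 * frame_refreshes * refresh_ms
search_onset_ms = preview_ms + blank_refreshes * refresh_ms

print(f"{frame_refreshes} refreshes per preview frame, {blank_refreshes} for the blank")
print(f"search display onset after about {search_onset_ms:.0f} ms")
```

On this reading, the looming motion and the blank interval together occupy under 200 ms before the search display appears.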

Procedure

Participants were told to look for the “oval” in each display and to indicate its orientation as rapidly as possible by pressing one of two keys. A small plus sign (correct), minus sign (incorrect), or circle (no response) provided feedback and served as the fixation point for the next display. Participants were instructed to maintain an accuracy of at least 90%.

Prior to testing, participants received 36 practice trials. Participants were not told that one of the items would loom in each display. Each participant was tested on a total of 576 trials, in eight blocks of 72 trials. Blocks were separated by brief breaks.

Results

Figure 2 shows mean response time (RT) for trials on which participants responded correctly, as well as mean error rates. Search speed, as indexed by RT slopes, was fastest when the target loomed (mean = 13 ms/item for the easier of the two target locations in each condition), slower when the near distractor loomed (mean = 61 ms/item), and slowest when the far distractor loomed (mean = 72 ms/item). This pattern held both when the near targets were easier to find and when the far targets were easier to find.
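The RT slopes reported here express how much mean RT grows per additional display item. A minimal sketch of that computation follows. The RT values in the example are hypothetical, chosen only to illustrate a 13 ms/item slope; they are not data from the experiment, and the function name is illustrative.

```python
def search_slope(rt_by_display_size):
    """Least-squares slope (ms/item) of mean RT against display size.

    With only two display sizes this reduces to (RT_large - RT_small) divided
    by the difference in the number of items.
    """
    sizes = list(rt_by_display_size)
    n = len(sizes)
    mean_size = sum(sizes) / n
    mean_rt = sum(rt_by_display_size.values()) / n
    numerator = sum((s - mean_size) * (rt_by_display_size[s] - mean_rt) for s in sizes)
    denominator = sum((s - mean_size) ** 2 for s in sizes)
    return numerator / denominator


# Hypothetical mean RTs (ms) for displays of three and six items.
example_rts = {3: 700.0, 6: 739.0}
slope = search_slope(example_rts)
print(f"slope = {slope:.1f} ms/item")
```

A shallower slope means that adding distractors costs less time, which is why the slope is read as an index of search efficiency.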

Fig. 2. Results from Experiment 1: error rate and mean response time for trials on which participants responded correctly as a function of display size, target location (near vs. far), and looming item (far distractor vs. near distractor vs. target). Results are shown separately for condition A (left panel; near targets easier to find than far targets) and condition B (right panel; far targets easier to find than near targets). Error bars indicate standard errors of the means.

The factors of looming item (target, near distractor, far distractor), display size (three items, six items), target location (near, far), and condition (A: easy near targets, B: easy far targets) were examined with analysis of variance (ANOVA). Significant main effects were found for looming item (RT was smaller for looming targets than for looming distractors), F(2, 68) = 166.09, p < .001, MSE = 11,745, and display size (RT was smaller for three than for six items), F(1, 34) = 309.34, p < .001, MSE = 9,068. An interaction of target location and condition, F(1, 34) = 197.99, p < .001, MSE = 14,574, confirmed that RT was generally smaller for near than for far targets in condition A (mean difference = 159 ms) and that this pattern was reversed in condition B (mean difference = 168 ms). A significant interaction of looming item and display size, F(2, 68) = 70.80, p < .001, MSE = 2,269, indicated that the RT slope was smaller when the target loomed than when the near distractor loomed, F(1, 68) = 75.95, p < .001, and also smaller when the near distractor loomed than when the far distractor loomed, F(1, 68) = 7.07, p < .01. The correlation between mean RT and error rate across the 24 combinations of display size, looming item, target location, and condition was significant, r(22) = .49, p < .02, indicating that larger RTs were associated with a greater number of errors.
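The RT-error correlation is computed over the 24 design cells, so its degrees of freedom are 24 − 2 = 22. As a rough check on the reported significance level, r can be converted to a t statistic and compared with standard two-tailed critical values. This is a sketch under the assumption of an ordinary Pearson correlation test; the critical values are hard-coded from a standard t table for 22 degrees of freedom, and the names are illustrative.

```python
import math


def t_from_r(r, df):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt(df / (1.0 - r * r))


# display size (2) x looming item (3) x target location (2) x condition (2)
N_CELLS = 2 * 3 * 2 * 2
df = N_CELLS - 2

# Standard two-tailed critical t values for df = 22 (from a t table).
T_CRIT_DF22 = {0.05: 2.074, 0.02: 2.508, 0.01: 2.819}

t_value = t_from_r(0.49, df)  # r = .49 as reported in the text
significant_at = [alpha for alpha, crit in T_CRIT_DF22.items() if t_value > crit]

print(f"t({df}) = {t_value:.2f}; exceeds critical value at alpha = {significant_at}")
```

The resulting t falls between the .02 and .01 critical values, which agrees with the reported p < .02.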

EXPERIMENT 2: LOOMING ON A COLLISION VERSUS A NEAR-MISS PATH

This experiment compared the influence of looming items that had identical motion trajectories, and appeared in the same spatial positions in the final display frame, but differed in their implied motion with respect to the participant. Half of the looming items were on a collision path with the viewer's head in the preview displays; the other half were on a near-miss trajectory.

Method

Twenty-two undergraduates (14 females, 8 males) participated in a 1-hr session. The displays were similar to those in Experiment 1 with the exception that the spheres were slightly smaller (4.0°) and the looming items expanded asymmetrically. Looming items expanded in a manner that was consistent with either a collision with the viewer's head or a near miss about 30° from the viewer's head. We ensured that the motions were otherwise identical by using a collision loom at any given clock position (e.g., 9:00) as the near-miss loom in the opposite position (e.g., 3:00).
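The yoking of collision and near-miss motions can be pictured as a lookup from each clock position to its diametric opposite. The sketch below illustrates only that bookkeeping. It assumes that the eight clock positions of Experiment 1 were reused here, which the text implies but does not state, and it represents half-hour positions as decimals (1:30 as 1.5); all names are hypothetical.

```python
def opposite_clock_position(hours):
    """Clock position diametrically opposite on a 12-hour dial."""
    opposite = (hours + 6.0) % 12.0
    return 12.0 if opposite == 0.0 else opposite


# Assumption: same positions as in Experiment 1.
NEAR_POSITIONS = [12.0, 3.0, 6.0, 9.0]
FAR_POSITIONS = [1.5, 4.5, 7.5, 10.5]

# For a near-miss loom shown at a given position, the position whose collision
# loom supplies the motion sequence.
near_miss_source = {
    pos: opposite_clock_position(pos) for pos in NEAR_POSITIONS + FAR_POSITIONS
}

for pos, source in near_miss_source.items():
    print(f"near miss at {pos}: uses collision loom from {source}")
```

Because every position's opposite lies on the same ring, each motion sequence serves once as a collision loom and once as a near-miss loom, which is what equates the two conditions physically.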

Figure 3 illustrates the difference between looms on a collision path and looms on a near-miss path. For most clock positions, a near-miss loom missed not only the viewer's head, but also his or her body (as illustrated in the top panel). The one exception was the 6:00 position, where the near-miss loom was on a collision path with the viewer's torso (as illustrated in the bottom panel). Later, we discuss the unplanned effect of these 6:00 targets.

Fig. 3. Illustration of a participant viewing the two types of looming motion (collision vs. near-miss path) in Experiments 2 and 3. For most clock positions, a near-miss loom missed not only the viewer's head, but also his or her body (upper right). The one exception was the 6:00 position, where the near-miss loom was on a collision path with the viewer's torso (bottom right). The relative sizes of the displays and the observer in this illustration are similar to the conditions in Experiment 3; displays were smaller in Experiment 2.

The procedure was otherwise the same as in Experiment 1. Participants were not told about the two different types of looming, and 20 of the 22 participants were surprised to learn after testing that there had been two paths of looming motion (collision vs. near-miss).

Results

The left half of Figure 4 shows mean RTs for trials on which participants responded correctly, as well as mean error rates. Search was much more efficient when the oval target was moving on a collision path with the observer than when it was moving on a near-miss path or was stationary. The increased efficiency for targets on a collision path was evident both in absolute search time (more than 200 ms faster for collision targets than for near-miss or stationary targets) and in the estimated rate of search, as indexed by RT slope (19 ms/item for collision targets vs. 66 ms/item for near-miss targets and 67 ms/item for stationary targets). Thus, search efficiency increased 3.5-fold for collision targets relative to stationary targets. Participants were very accurate in their search in all these conditions, with an error rate below 3% overall.
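The 3.5-fold figure follows from the ratio of the reported slopes, since search efficiency is the inverse of the time cost per item. A short sketch of that arithmetic, using only the slopes given above (the function names are illustrative):

```python
# RT slopes (ms/item) reported in the text for Experiment 2.
SLOPE_MS_PER_ITEM = {"collision": 19.0, "near_miss": 66.0, "stationary": 67.0}


def items_per_second(slope_ms_per_item):
    """Search rate implied by a slope, treating the slope as time per item."""
    return 1000.0 / slope_ms_per_item


def efficiency_gain(baseline, condition):
    """How many times more efficient `condition` is than `baseline`."""
    return SLOPE_MS_PER_ITEM[baseline] / SLOPE_MS_PER_ITEM[condition]


gain = efficiency_gain("stationary", "collision")
print(f"collision vs. stationary: {gain:.1f}-fold")
print(f"collision targets: about {items_per_second(19.0):.0f} items/s")
```

Treating the slope as a per-item processing time is a simplification; it is used here only to show where the ratio comes from.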

Fig. 4. Results from Experiments 2 (left) and 3 (right): error rate and mean response time for trials on which participants responded correctly as a function of display size, direction of loom (collision vs. near-miss path), and looming item (far distractor vs. near distractor vs. target other than 6:00 target vs. 6:00 target). Error bars indicate standard errors of the means.

An ANOVA on the RT data from correct trials indicated main effects of looming item (RT was smallest when the target was the looming item), F(2, 34) = 11.78, p < .001, MSE = 27,822; direction of loom (RT was smaller when the looming item was on a collision path than when it was on a near-miss path), F(1, 17) = 82.82, p < .001, MSE = 7,463; target location (RT was smaller for near than for far targets), F(1, 17) = 39.38, p < .001, MSE = 14,998; and display size (RT was smaller for three than for six items), F(1, 17) = 95.12, p < .001, MSE = 29,028. A three-way interaction of looming item, direction of loom, and display size, F(2, 34) = 9.02, p < .001, MSE = 7,642, indicated smaller RT slopes for looming targets on a collision path than for looming distractors on a collision path, F(1, 34) = 21.08, p < .001, but not for looming targets on a near-miss path relative to looming distractors on a near-miss path, F(1, 34) < 1.0. The correlation between RT and error rate across the 24 combinations of display size, looming item, direction of loom, and target location was not significant, r(22) = .12, p = .57, indicating that there was no trade-off between response speed and accuracy.

Note that although all target positions were included in these analyses, Figure 4 separates out the 6:00 position because of an unexpected finding. Post hoc inspection of each clock position separately revealed that the 6:00 position was unique in showing no difference in search rate between targets looming on a collision path (14 ms/item) and targets looming on a near-miss path (18 ms/item, p > .20). This is presumably because a looming object on a near-miss path at this position signaled a potential collision with the observer's body. The search rate for near-miss 6:00 targets was significantly increased (as indicated by the shallower RT slopes) over the average search rate in all other clock positions when the target was stationary (Fisher's least significant difference, p < .01).

EXPERIMENT 3: SEARCHING IN A WIDE-ANGLE SCENE

Attentional capture during search is of interest in large part because it simulates many aspects of searching for objects during everyday skilled activities, such as driving and team sports. Because previous studies used small displays, they leave many questions unanswered about how their results apply to more realistic situations. In Experiment 3, we used a projection dome to test attentional capture when looming items are much larger and are projected farther into the visual periphery.

Method

Eighteen undergraduates (11 females, 7 males) participated in a 1-hr session. The displays of Experiment 2 were used again, but this time they were projected onto a wide-angle Elumens VS3 hemispheric dome measuring 3.47 m (width) × 2.37 m (height) × 1.47 m (depth). Elumens software transformed the displays to prevent image distortion. Participants were seated with their eyes 2.70 m from the screen, such that the projection area subtended about 88° (width) × 66° (height). The spheres subtended 11°, and the item locations were 16° (near) and 40° (far) from the center of gaze. The procedure was otherwise the same as in Experiment 2.

Results

The right half of Figure 4 shows mean RTs for trials on which participants responded correctly, as well as mean error rates. The main finding was that the priority given to collision targets over near-miss or stationary targets was even greater for the large displays in this experiment than for the displays in Experiment 2. Compared with search in the smaller displays (Experiment 2), search was slowed by 100 ms for collision targets, by 200 ms for near-miss targets, and by more than 300 ms for stationary targets. The estimated rate of search as indexed by the RT slope was 28 ms/item for collision targets, 65 ms/item for near-miss targets, and 100 ms/item for stationary targets. Thus, search efficiency increased 3.7-fold for collision targets relative to stationary targets. Participants also made more than 2% fewer errors on trials with collision targets than on trials with near-miss or stationary targets.

An ANOVA on the RT data from correct trials indicated main effects of looming item (RT was smallest when the target was the looming item), F(2, 30) = 60.65, p < .001, MSE = 39,104; direction of loom (RT was smaller when the looming item was on a collision path than when it was on a near-miss path), F(1, 15) = 50.42, p < .001, MSE = 17,034; target location (RT was smaller for near than for far targets), F(1, 15) = 58.83, p < .001, MSE = 26,011; and display size (RT was smaller for three than for six items), F(1, 15) = 131.37, p < .001, MSE = 44,569. A three-way interaction of looming item, direction of loom, and display size, F(2, 30) = 8.80, p < .001, MSE = 5,863, indicated a difference in RT slopes between looming targets and looming distractors on a collision path with the participant, F(1, 34) = 93.2, p < .001, and a smaller difference in RT slopes between looming targets and looming distractors on a near-miss path, F(1, 34) = 16.9, p < .01. The correlation between RT and error rate across the 24 combinations of display size, looming item, direction of loom, and target location was significant, r(22) = .68, p < .001, indicating that larger RTs were associated with a greater number of errors.

Post hoc comparison of looming targets in the 6:00 position indicated no difference between targets on a collision path (4 ms/item) and those on a near-miss path (13 ms/item, p > .20). In addition, the search rate for looming targets in the 6:00 position was significantly faster than the average search rate for stationary targets in all other positions (Fisher's least significant difference, p < .01).

An ANOVA comparing RTs on correct trials with small and large displays (Experiment 2 vs. Experiment 3) revealed several interactions. The significant interaction between experiment and display size, F(1, 32) = 7.22, p < .01, MSE = 18,146, indicated that search rates were generally slower in Experiment 3. The significant interaction between experiment and looming item, F(2, 64) = 16.01, p < .001, MSE = 16,580, indicated that the RT difference between looming distractors and looming targets was larger in Experiment 3. The significant three-way interaction of experiment, looming item, and display size, F(2, 64) = 4.79, p < .01, MSE = 4,985, indicated that search rates were slower when distractors were looming than when targets were looming and that this difference was larger in Experiment 3 than in Experiment 2. There was also a significant three-way interaction of experiment, looming item, and loom direction, F(2, 64) = 3.62, p < .03, MSE = 4,231, indicating that RT-slope differences between collision and near-miss targets were larger in Experiment 3 than in Experiment 2. Finally, even the RT difference between far and near distractors was larger in Experiment 3 than in Experiment 2, F(1, 32) = 8.66, p < .01, MSE = 7,224.

GENERAL DISCUSSION

In Experiment 1, an object that was not predictive of the target received greater attentional priority when it loomed in the periphery than when it loomed near the center of gaze. We interpret this as support for the urgency hypothesis because of how this finding relates to other known aspects of peripheral vision. For instance, brain regions devoted to the retinal periphery are populated by fewer neurons, and neurons with larger receptive fields, than those near the fovea. This means that events in the periphery are not represented with the same detail in shape, color, and motion (Metha, Vingrys, & Badcock, 1994; Tyler, 1985). In the laboratory, this imbalance can be corrected by magnifying peripheral objects (Gurnsey, Poirier, Bluett, & Leibov, 2006), but even after such magnification, peripheral vision is more susceptible to interference by crowding than central vision is (Thompson, Hansen, Hess, & Troje, 2007). To compensate in the everyday world, humans and other animals orient their head and eyes toward events detected in the visual periphery so that those events can be evaluated with central vision. The novel result here is that the strength of this orienting effect is positively correlated with the degree of retinal eccentricity.

The results of Experiment 2 provide even more direct support for the urgency hypothesis. The special priority given targets on a collision path with the observer is consistent with the need to evaluate such objects for possible evasive action. This finding is also noteworthy because the final display positions of the targets were identical in the collision and near-miss conditions, meaning that the implied future of these items, rather than any immediate physical difference, was responsible for the different outcomes. We were also surprised that 90% of participants reported being unaware that the simulated motion on half of the trials was consistent with a near miss, whereas motion on the other half of trials was consistent with a collision with the observer. This finding suggests that conscious processes are not required to complete the evaluation of the need for urgent action in response to looming objects.

It is notable that the priority given to an object on a collision path was evident only when the looming object was the target. Distractors on a collision path and distractors on a near-miss path were equally effective in drawing attention away from a stationary target. Finding a differential influence of collision versus near-miss paths only for targets implies that shape discrimination for an object in motion is enhanced when the object is on a collision path, rather than a near-miss path. This finding also implies that the differential influence of these two kinds of looming motion occurs not when motion in an object is first detected, but only at a later stage when the looming objects are evaluated more closely. This differential sensitivity to type of looming motion for targets versus distractors therefore deserves further study, as it implies that the accuracy with which objects in motion are perceived may depend on whether the objects are evaluated as task relevant or not.

Another unanticipated finding in Experiment 2 was a consequence of the way we equated motion sequences while varying behavioral urgency. When targets in the 6:00 position loomed on a near-miss path, they were nevertheless on a collision path with the viewer's torso. As a result, 6:00 targets looming on a near-miss path with the observer's head (but on a collision path with the observer's torso) were given the same priority as 6:00 targets on a collision path with the observer's head. This makes us wonder whether priority is given reflexively to any part of the observer's body schema that appears to be threatened by a looming object, and perhaps even to an observer's extended body schema, which might include a tennis racquet or even parts of the car the observer is driving.

In Experiment 3, we replicated the findings of increased sensitivity to looming in the periphery, of differential sensitivity to targets looming on a collision versus a near-miss path, and of sensitivity to looming targets aimed at both the head and the body of the observer, using a display that more closely simulated real life. When projected displays covered much of a participant's field of view, the differential priority favoring objects on a collision path and looming objects in the periphery increased significantly.

These findings are consistent with what has long been known about the sensitivity of vision in many species to looming motion. Confronted with expanding visual patterns, insects show a hiding response (Hassenstein & Hustert, 1999), rhesus monkeys raise their limbs defensively (Schiff, Caviness, & Gibson, 1962), and newborn human infants show a startle reflex (Ball & Tronick, 1971). Electrophysiological studies have revealed individual neurons with fine-tuned sensitivity to motion consistent with a collision to the head (Regan, Beverley, & Cynader, 1979; Wang & Frost, 1992). Psychophysical studies in adults also show great sensitivity in discriminating objects on a collision path from those on a near-miss path (Poljac, Neggers, & van den Berg, 2006). However, our study is the first to demonstrate that differential priority is given to the other attributes of an object according to whether the object is on a collision or near-miss path. Our search task focused on shape discrimination, and future studies will be needed to determine whether other attributes (e.g., color or surface markings) also receive this preferential treatment.

These results also suggest the possibility of cross talk between the ventral and dorsal visual pathways (Milner & Goodale, 1995; Ungerleider & Mishkin, 1982). Specifically, our findings are consistent with the possibility that the dorsal stream may first do a “quick and dirty” analysis, possibly using low-spatial-frequency information, before guiding the ventral stream in its analysis of the finer details of shape and color (Bar, 2003). This implied direction of communication is predicted by some, but not all, theories of attentional control. Consistent with the present results, dual-systems theory (Milner & Goodale, 1995) claims that rapid action can be initiated without its visual basis being made accessible to awareness. For example, in the present experiments, guidance of attentional orienting mechanisms to looming objects appears to have been unconscious; only the subsequent evaluation of these objects as either task-relevant targets or distractors to be ignored was made available for conscious awareness. Other theories propose a direction of influence that is opposite to the one implied in our interpretation of the present findings, positing that shape processing guides action analysis (VanRullen & Koch, 2003).

Our findings are also relevant to the emerging research on priority given to fear-relevant images, such as pictures of snakes, spiders, and angry human faces (Öhman, 2005). What is new in our study is that participants responded to possible threat signaled by the action of an object, rather than by an object's identity. We hope this study prompts further research on how the actions of objects may contribute to the assignment of attentional priority, over and above the influence of object identity.

Acknowledgements

This research was supported by a Discovery Grant (Natural Sciences and Engineering Research Council of Canada) to J.T. Enns.

References

Ball & Tronick (1971). Infant responses to impending collision: Optical and real. Science, 171, 818–820.

Bar (2003). A cortical mechanism for triggering top-down facilitation in visual object recognition. Journal of Cognitive Neuroscience, 15, 600–609.

Egeth, H.E., & Yantis (1997). Visual attention: Control, representation, and time course. Annual Review of Psychology, 48, 269–297.

Franconeri, S.L., & Simons, D.J. (2003). Moving and looming stimuli capture attention. Perception & Psychophysics, 65, 999–1010.

Gurnsey, Poirier, F.J.A.M., Bluett, & Leibov (2006). Identification of 3D shape from texture and motion across the visual field. Journal of Vision, 6, 543–553.

Hassenstein & Hustert (1999). Hiding responses of locusts to approaching objects. Journal of Experimental Biology, 202, 1701–1710.

Metha, A.B., Vingrys, A.J., & Badcock, D.R. (1994). Detection and discrimination of moving stimuli: The effects of color, luminance, and eccentricity. Journal of the Optical Society of America A, 11, 1697–1709.

Milner, A.D., & Goodale, M.A. (1995). The visual brain in action. London: Oxford University Press.

Öhman (2005). The role of the amygdala in human fear: Automatic detection of threat. Psychoneuroendocrinology, 30, 953–958.

Poljac, Neggers, & van den Berg, A.V. (2006). Collision judgment of objects approaching the head. Experimental Brain Research, 171, 35–46.

Regan, Beverley, & Cynader (1979). The visual perception of motion in depth. Scientific American, 241, 140–145.

Schiff, Caviness, J.A., & Gibson, J.J. (1962). Persistent fear responses in rhesus monkeys to the optical stimulus of “looming.” Science, 136, 982–983.

Thompson, Hansen, B.C., Hess, R.F., & Troje, N.F. (2007). Peripheral vision: Good for biological motion, bad for signal noise segregation? Journal of Vision, 12, 1–7.

Tyler, C.W. (1985). Analysis of visual modulation sensitivity. II. Peripheral retina and the role of photoreceptor dimensions. Journal of the Optical Society of America A, 2, 393–398.

Ungerleider & Mishkin (1982). Two cortical visual systems. In Ingle, D.J., Goodale, M.A., & Mansfield, R.J.W. (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.

VanRullen & Koch (2003). Visual selective behavior can be triggered by a feed-forward process. Journal of Cognitive Neuroscience, 15, 209–217.

Wang & Frost (1992). Time to collision is signaled by neurons in the nucleus rotundus of pigeons. Nature, 356, 236–238.