Abstract
Theory suggests that the vision-for-perception and vision-for-action processing streams operate under very different temporal constraints (Glover, 2004; Goodale, Jakobson, & Keillor, 1994; Graham, Bradshaw, & Davis, 1998; Hu, Eagleson, & Goodale, 1999). In the present study, children and young adults were asked to estimate how far a cued target was from a response target in immediate and response-delay conditions. Target locations in peripersonal and extrapersonal space were created based on each participant’s maximum reach. ANOVA results for accuracy indicated differences between Age within Condition and Space. Overall, adults were more accurate than children. Analysis revealed that delays of 2 s or longer affected performance in all groups, most notably the 5- and 7-year-olds. In summary, these findings suggest that young children have greater difficulty processing allocentric cues in the context of reach in delay paradigms.
Understanding the perception-to-action developmental dynamics involved in reaching and grasping an object constitutes one of the most mystifying issues in motor behavior research. One of the many steps in planning reach actions is to identify characteristics of the object (e.g., size, texture) and background independent of the actor, as associated with an allocentric frame of reference; understanding this step is the aim of the current study. The perceptual representation of the target and its surroundings is predominantly memory-based and derived from allocentric cues. In other words, allocentric representation relates to targets coded as a function of the surrounding visual cues (visual contextual cues) that are presumed to be independent of the participant’s position (Blouin et al., 1993; Lemay & Proteau, 2003; Neggers, Van der Lubbe, Ramsey, & Postma, 2006). This frame of reference allows the target location to be represented as a function of the surrounding visual cues rather than the relatively transient position of the body. In this respect, as long as the relation between the target and the context remains the same, the position of the observer may be changed because body position relative to the target is of secondary importance. With the allocentric information, the individual can create a visual mapping of the effector to be used to perform an action, that is, an egocentric frame of reference. The egocentric frame can be referenced from a retinotopic, head-centered, or body-centered frame. It is through these representations that the external environment can be processed (Burgess, 2006).
Obviously, the use of visual information is critical in the processing of both frames of reference. It has been proposed that allocentric representations that aid in recognizing (for example) object shape and location are found mainly along the ventral processing stream, whereas egocentric cues associated with intended action are processed in the dorsal stream brain areas (Goodale, Kroliczak, & Westwood, 2005; Goodale & Westwood, 2004). Often associated with these processing streams is the idea of the perception-action model [PAM; vision-for-perception and vision-for-action] (Goodale & Milner, 1992; Goodale et al., 2005). We do acknowledge and wish to point out that, although a large body of research suggests that the two visual pathways have distinct roles, for action processing, the two systems are not hermetically sealed from one another—perception and action are intimately linked. From a motor behavior perspective, this idea is commonly described as a ‘coupling’ of perception and action (Goodale & Milner, 1992; Goodale et al., 2005).
Moreover, the perception and action streams are speculated to operate under very different temporal constraints, a proposition that was tested in the present study. It is suggested that memory-driven actions make use of a perceptual representation of the target generated by the ventral stream. Encoding an object in an allocentric frame allows for a perceptual representation of the object while preserving the relationship between the object and its surroundings. In contrast, real-time (visually-guided) movements depend on pathways from the early visual areas that are relatively encapsulated visuomotor mechanisms in the dorsal stream. These dedicated visuomotor mechanisms, together with motor centers in the premotor cortex and brainstem, compute the estimated metrics of the target and its position via egocentric coordinates of the effector used to perform the action. It is through this (egocentric) frame of reference that the computations of the ever-changing relationships between observer and object are updated moment to moment. Research testing the temporal aspects of the visual system shows that it is imperative that the required coordinates for action be computed immediately before the movements are initiated (Goodale, Jakobson, & Keillor, 1994; Graham, Bradshaw, & Davis, 1998; Hu, Eagleson, & Goodale, 1999).
As noted earlier, perceptual tasks (e.g., recognizing size and target location) may be separated from action-based tasks (e.g., reaching for an object), and differences in performance may be explained by functional dissociations of two independent streams, the ventral and the dorsal streams located in the temporal and parietal lobes, respectively. Behavioral experiments have demonstrated that some patients can perform perceptual tasks but not visually-guided behavioral tasks (see Goodale et al., 1994). On the other hand, others are capable of performing normally on visuomotor tasks, but not perceptual tasks (see Goodale, Milner, Jakobson, & Carey, 1991). Furthermore, extensive neuroimaging and neurophysiology work supports the separation of the two streams (e.g., Buneo, Jarvis, Batista, & Andersen, 2002; Cohen, Cross, Tunik, Grafton, & Culham, 2009).
Of the few studies that have examined the development of visual processing streams for action, the results are somewhat conflicting. Using a visual (Ebbinghaus) illusion task with grasping, Hanisch, Konczak, and Dohle (2001) provided evidence for non-specific use of an allocentric (ventral stream) and/or egocentric (dorsal stream) frame of reference while making perceptual judgments or planning motor acts during childhood. That is, children were relying on both visual streams during perceptual and visuomotor activities, therefore suggesting that these pathways are not functionally segregated during childhood. The researchers did note that the reliance on visual feedback decreased with increasing age (5 to 12 years).
In a later paper, using the Duncker illusion with a pointing task, Rival, Olivier, Ceyte, and Bard (2004) found that the ability to encode different types of spatial information (location versus distance) to program motor responses matures early during childhood. Their data suggested that before 7 years of age, children use mainly egocentric object representations when performing motor tasks. Conversely, when making only perceptual judgments, children preferentially use allocentric cues. Furthermore, the researchers contend that by 7 years, children use the same attentional strategy as adults. They reasonably noted that the differences between their results and those of Hanisch and colleagues might lie in the task used. In essence, those findings raise the question of, and the need for further study on, the developmental status of the visual pathways in children. The reports are somewhat conflicting: one notes that the pathways “are not” functionally segregated until after 12 years, whereas the other concludes that the systems are relatively mature and segregated by 7 years of age.
An interesting approach to investigating the nature of the visual representation in action control is to introduce a temporal (response) delay between stimulus presentation and response, an idea that complements the notion described earlier that the visual systems operate under different temporal constraints. This paradigm has been used with adults with overt reach and pointing tasks (Bradshaw & Watt, 2002; Elliott & Madalena, 1987; Heath, Westwood, & Binsted, 2004; Rossit et al., 2009; Westwood, Heath, & Roy, 2003) and prehension tasks (e.g., Hu et al., 1999). Experimentally, the use of a temporal delay has been shown to modify the features of visuomotor responses. For example, Bradshaw and Watt (2002) found that a 2 s delay was sufficient to disturb reach movements. That is, participants exhibited reaches with lower peak velocities and lower peak apertures. Other researchers report similar results, with decrements in movement behavior at delays ranging from 1 s (Graham et al., 1998) to 5 s (Hu et al., 1999). Moreover, and of specific relevance to the present study, Bradshaw and Watt used a perceptual-matching condition and found that accuracy was less affected after imposing a temporal delay. Those findings support the notion that the visuomotor pathway has limited memory, and that responses after a delay may be sustained by representations stored in memory through the perceptual stream.
With the intent of gaining insight into the representation and planning of reach actions via distance-estimation judgments, we examined the age-related ability to estimate object location independent of the self. These properties were explored via allocentric (perceptual) cues in real-time (visually-guided) and response-delay (memory-guided; 1 s, 2 s, and 4 s) conditions. Based on reports that the perceptual stream is memory dependent, our assumption was that there would be no differences between the no-delay and delay conditions, in contrast to movement execution studies in which an imposed delay compromised movement execution (Bradshaw & Watt, 2002; Graham et al., 1998; Hu et al., 1999). Our assumption is based on the notion that when objects are coded in allocentric frames of reference via the perceptual stream, object identity (size, shape, color lightness, and relative location) can be maintained over time. From an age-related perspective, we expected that younger children would have more difficulty due “in part” to immaturity of the ventral processing stream, a state that speculatively could affect mental representation and subsequently action planning. To our knowledge, allocentric estimation has not been tested in children; this gap is one of our primary reasons for conducting this study.
Method
Participants
The study involved 83 right-handed participants representing age groups of 5- (n = 17; females = 9), 7- (n = 14; females = 7), 9- (n = 18; females = 7), and 11- (n = 17; females = 7) year-olds and a group of adults (n = 17; females = 8). Mean ages for each group were 5.61, 7.72, 9.47, 11.36, and 21.53 years, respectively. All participants were screened using a questionnaire (filled out by the parent in the case of children) to ensure normal vision and that none had a history of past or present sensorimotor impairment. Handedness was identified via manual performance rather than questionnaire, using the Lateral Preference Inventory (Coren, 1993).
Paradigm
The perceptual task required participants to estimate object location (red target) relative to the initially cued blue target (Figure 1). The blue target was programmed as a representation of the individual participant’s actual maximum reach. To facilitate the visual representation of the target, participants were trained and instructed to use visual imagery (VI). In order to facilitate its use, both limbs rested on the participant’s lap under the table. This position minimizes the unintentional engagement of the effector, such as used in tasks using motor imagery (adopted from Sirigu & Duhamel, 2001; Stinear et al., 2006). We wish to emphasize that in contrast to motor imagery (MI), VI is linked with the spatial component of the perceived environment via the ventral stream. Furthermore, theory suggests that MI operates in real-time, whereas VI has a memory component; which is relevant to our delay paradigm (Stevens, 2005).

Figure 1. Experimental set-up.
Apparatus
Actual maximum reach (used as the cue [blue] target) was collected via a projection system linked to a PC programmed with Visual Basic and Java. Visual images were systematically projected onto a table surface at midline (90°). The table was constructed on a sliding bracket frame, allowing it to be moved backward and forward for adjustment to the participant. Participants sat in a fixed, adjustable ergonomic chair aligned with the midline of the table and of the projected image. Seat pan height (the surface was metal and non-depressible) was set to 105% of the participant’s popliteal height. Popliteal height was the distance from the underside of the foot to the underside of the thigh at the knees. Table height was then adjusted to the midpoint between seat pan height and seated eye height. Table and seat pan positioning were modified from Carello, Grosofsky, Reichel, Solomon, and Turvey (1989) and Choi and Mark (2004). To aid in establishing actual reach limitations for a 1-df action (described in the next section), a commercial seatbelt system was modified and secured to the back of the chair. The room was darkened with the exception of light from the computer monitor and visual images projected onto a black tabletop; reach targets consisted of white 2 cm diameter circles. The fixation point was projected onto a rectangular box (with a 45° angled surface) placed at midline approximately 45 cm from the most distal target. This paradigm has been used with children and adults (e.g., Gabbard, Cordova, & Lee, 2009a, 2009b).
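The seat and table adjustments described above reduce to two simple calculations. The following Python sketch illustrates them. The function names and the example measurements are hypothetical, and the sketch assumes that all heights are measured from the floor in centimeters, which the text does not state explicitly.

```python
def seat_pan_height(popliteal_height_cm: float) -> float:
    """Seat pan height set to 105% of the participant's popliteal height."""
    return 1.05 * popliteal_height_cm


def table_height(seat_pan_cm: float, seated_eye_height_cm: float) -> float:
    """Table height set to the midpoint between seat pan and seated eye height.

    Assumption: both inputs are measured from the floor.
    """
    return (seat_pan_cm + seated_eye_height_cm) / 2.0


if __name__ == "__main__":
    # Hypothetical measurements for one participant (not taken from the study).
    popliteal_cm = 40.0
    eye_height_cm = 110.0

    seat_cm = seat_pan_height(popliteal_cm)
    table_cm = table_height(seat_cm, eye_height_cm)
    print(f"seat pan: {seat_cm:.1f} cm, table: {table_cm:.1f} cm")
```

Keeping the two steps as separate functions mirrors the order in which the adjustments were made: seat first, then table.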
Procedure
To begin, participants were systematically positioned in the chair and introduced to the task for determining “actual” maximum reach—full extension of the right limb and middle finger to slide forward a penny using a 1-df reach (Carello et al., 1989), representing the blue target. A 1-df reach involved a comfortable effort of the hand, forearm, and upper arm acting as a single functional skeletal unit.
The task required that the participant focus on the cued blue target, then estimate the location (oral response of distance estimation) of the red target in reference to the blue target. We wish to point out that participants were not to estimate the target to “self” but rather object to object (object based-allocentric). The red target was positioned randomly at one of three 2cm circle locations above or below the blue target (Figure 1); providing seven targets with a range of −3 to +3. Each block of trials began with a 5 s “Ready!” signal—immediately followed by a central fixation point lasting 3 s, at the end of which the participant heard a tone and the blue target appeared for 1 s. Then according to the specified delay (or no-delay), the red target appeared for 1 s followed immediately with a second tone, signaling the participant to respond. Theoretically, VI aided the participant in “remembering” the relative location of the blue target, which was used to estimate the red target location. For example, after the designated delay or no-delay, the participant would state “+2,” which was an estimate of two targets above the cued (blue) target. Three trials at each of the seven target sites were presented randomly. During the training phase of the study, participants were shown a diagram of the seven targets in order to familiarize themselves with object and space characteristics.
Four blocks of task conditions were administered, representing delays of 0 s (no-delay), 1 s, 2 s, and 4 s. Conditions were counterbalanced between participants, and each condition began with three practice trials. No feedback was available to participants about the accuracy of performance. As a precaution against general fatigue, and especially eye fatigue due to fixation, the experimenter provided breaks between trials. A second experimenter served to reinforce instructions regarding imagery technique and refocusing on the central fixation point with each trial. Testing required one session of approximately 30 minutes covering all four conditions. Each participant completed 84 trials (4 conditions × 21 trials).
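The trial structure described above (four delay blocks, each with three randomly ordered trials at each of the seven target sites) can be sketched in Python as follows. This is an illustration only: the counterbalancing scheme actually used is not reported, so the cyclic rotation of block order in `delay_order` is an assumption, as are all function and variable names.

```python
import random

DELAYS_S = [0, 1, 2, 4]                   # delay conditions in seconds
TARGET_OFFSETS = [-3, -2, -1, 0, 1, 2, 3]  # red target position relative to blue
TRIALS_PER_TARGET = 3


def block_trials(rng: random.Random) -> list:
    """Return 21 target offsets (3 per site) in random order."""
    trials = TARGET_OFFSETS * TRIALS_PER_TARGET
    rng.shuffle(trials)
    return trials


def delay_order(participant_index: int) -> list:
    """Hypothetical counterbalancing: rotate the block order per participant."""
    k = participant_index % len(DELAYS_S)
    return DELAYS_S[k:] + DELAYS_S[:k]


def session(participant_index: int, seed: int = 0) -> list:
    """Return a list of (delay, trials) blocks for one participant."""
    rng = random.Random(seed)
    return [(d, block_trials(rng)) for d in delay_order(participant_index)]


if __name__ == "__main__":
    blocks = session(participant_index=2, seed=1)
    total = sum(len(trials) for _, trials in blocks)
    print("block order (s):", [d for d, _ in blocks])
    print("total trials:", total)
```

Practice trials are omitted from the sketch, so the total matches the 84 scored trials per participant.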
We wish to note that participants were trained in use of VI; that is, holding (remembering) the blue target location used to estimate red target distance. For obvious reasons, special attention was given to the younger age groups. A total of five children were dropped due to immaturity in understanding task instructions.
Data analysis
Descriptive statistics and analysis of variance (ANOVA) procedures were employed. The dependent variable was derived from error measurements. There were two classifications of error. The first was absolute error (cm); that is, the magnitude of the distance error. For example, if target 5 was presented and the participant’s response was “20,” an error of +2 cm was recorded. All errors for the absolute error measurement were converted to absolute values, allowing us to examine the magnitude of error. Targets were shown in two spaces: peripersonal space and extrapersonal space. Peripersonal space was defined as the target locations that were within reach (targets 1 through 4), whereas extrapersonal space was defined as the target locations that were beyond reach (targets 5 through 7). Absolute error was used for the 5 (Age) × 4 (Delay) × 2 (Space) ANOVA with repeated measures on Delay and Space. In addition to absolute error, constant error was computed to determine the bias of the error; that is, cue (blue) target from response (red) target. In contrast to the absolute error measurement, where error was expressed as an absolute value, here we kept the direction of the error. For example, if target 5 was presented and the participant responded with “+2,” there was a −2 cm error. As appropriate, pairwise comparisons (Bonferroni) and post hoc analyses (Duncan’s Multiple Range tests) were performed (p < .05).
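The distinction between the two error measures, and the assignment of targets to spaces, can be illustrated with a short Python sketch. It uses the conventional definitions (constant error as the mean signed error, absolute error as the mean unsigned error). The sign convention of response minus target, the use of target steps rather than centimeters, and all function names are assumptions made for illustration, not the scoring procedure of the study.

```python
from statistics import mean


def signed_errors(responses, targets):
    """Signed error per trial (assumed convention: response minus target)."""
    return [r - t for r, t in zip(responses, targets)]


def constant_error(responses, targets):
    """Mean signed error: keeps direction, so over- and underestimates cancel."""
    return mean(signed_errors(responses, targets))


def absolute_error(responses, targets):
    """Mean unsigned error: magnitude only, direction discarded."""
    return mean(abs(e) for e in signed_errors(responses, targets))


def space_of(target_number: int) -> str:
    """Targets 1-4 are within reach (peripersonal); 5-7 are beyond reach."""
    if not 1 <= target_number <= 7:
        raise ValueError("target number must be between 1 and 7")
    return "peripersonal" if target_number <= 4 else "extrapersonal"


if __name__ == "__main__":
    # Hypothetical responses and true offsets, in target steps.
    responses = [2, -1, 0]
    targets = [1, 1, 0]
    print("constant error:", constant_error(responses, targets))
    print("absolute error:", absolute_error(responses, targets))
    print("target 5 lies in", space_of(5), "space")
```

In this hypothetical example the signed errors partly cancel, so the constant error is smaller in magnitude than the absolute error, which is why the two measures are reported separately.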
Results
Absolute error
Figure 2 shows the absolute error across age. ANOVA results showed a three-way interaction effect, F(12, 234) = 4.68, p < .001, partial η² = 0.19. Simple effects for Delay within Age indicated that all age groups were affected by the delay conditions: 5 yrs, F(3, 76) = 55.43, p < .001, partial η² = 0.687; 7 yrs, F = 55.89, p < .001, partial η² = 0.69; 9 yrs, F = 28.37, p < .001, partial η² = 0.53; 11 yrs, F = 20.02, p < .001, partial η² = 0.44; and adults, F(1, 78) = 24.07, p < .001, partial η² = 0.28. Post hoc analysis showed that at 2 s, the 5- (M = 19.06) and 7-year-olds (M = 17.86) did not differ significantly from each other but differed from the older age groups (9- and 11-year-olds and adults; M = 11.33, 10.82, and 5.41, respectively). Results also revealed that the 9- and 11-year-olds did not differ significantly from each other but differed from adults. Post hoc analysis for the 4 s delay condition revealed that the adults (M = 16.82) differed from the three younger groups (5-, 7-, and 9-year-olds; M = 31.88, 37.43, and 26.33, respectively) but not from the 11-year-olds (M = 22.82).
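The degrees of freedom reported for the three-way interaction can be checked against the design with a short Python sketch. It applies the standard formulas for a mixed ANOVA with one between-subjects factor and fully crossed within-subjects factors, and it assumes no sphericity correction was applied; the function name is hypothetical.

```python
# Group sizes as reported in the Participants section (5, 7, 9, 11 yrs, adults).
GROUP_SIZES = [17, 14, 18, 17, 17]


def interaction_df(n_participants: int, n_groups: int, within_levels: list) -> tuple:
    """Return (effect df, error df) for the highest-order interaction.

    Effect df = (groups - 1) * product of (levels - 1) over within factors.
    Error df  = (N - groups) * product of (levels - 1) over within factors.
    """
    within_product = 1
    for levels in within_levels:
        within_product *= levels - 1
    effect_df = (n_groups - 1) * within_product
    error_df = (n_participants - n_groups) * within_product
    return effect_df, error_df


if __name__ == "__main__":
    n = sum(GROUP_SIZES)  # 83 participants
    # Age (5 groups) x Delay (4 levels) x Space (2 levels)
    print(interaction_df(n, len(GROUP_SIZES), [4, 2]))  # -> (12, 234)
```

With 83 participants in five age groups, the formulas give 12 and 234 degrees of freedom, which agrees with the reported F(12, 234).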

Concerning Space, post hoc analysis showed that in peripersonal space, the 5- (M = 18.24) and 7-year-olds (M = 16.29) did not differ significantly from each other but differed from the older age groups (9- and 11-year-olds and adults; M = 10.78, 10.12, and 5.18, respectively) in the 2 s delay condition. There were no differences between the age groups in the 0 s and 1 s delay conditions. Results also revealed that the 9- and 11-year-olds did not differ significantly from each other but differed from adults. Similar main effects were shown at 4 s for the children: 5 yrs, F(1, 78) = 14.79, p < .001; 7 yrs, F = 44.19, p < .001; 9 yrs, F = 50.53, p < .001; 11 yrs, F = 17.83, p < .001; but not for the adults, F = 3.55, p > .05. Post hoc analysis for the 4 s delay condition revealed that the adults (M = 9.88) differed from the other age groups (5-, 7-, 9-, and 11-year-olds; M = 18.94, 24.43, 18.56, and 14.71, respectively). Differences also emerged in extrapersonal space in the 4 s delay condition. The 5- and 7-year-olds (M = 12.94 and 13.41, respectively) did not differ significantly from each other but differed from the other age groups (9- and 11-year-olds, and adults; M = 7.78, 8.12, and 6.94, respectively).
Constant error
Constant error showed similar trends. The ANOVA results indicated a three-way interaction effect, F(12, 234) = 5.85, p < .01, partial η² = 0.231. Closer inspection revealed that the 2 and 4 s delay interaction was largest for the younger age groups (5- and 7-year-olds). Post hoc analysis showed differences between the 5- (M = −19.65) and 7-year-olds (M = 6.57). The 5- and 7-year-olds also differed from the other age groups (9- and 11-year-olds, and adults; M = −3.11, 0.12, and −2.00, respectively). No differences were indicated between the older age groups (9- and 11-year-olds, and adults).
Discussion
Our specific aim was to examine the age-related ability to estimate object location independent of the self. These properties were explored via allocentric (perceptual) cues in real-time (visually-guided) and response-delay (memory-guided) conditions. Our assumption was that since the perceptual stream (via allocentric cues) has a memory component, performance after delay would not decline. Furthermore, we predicted that young children would be more affected by delay due in part to immaturity and/or ineffective use of ventral processing. Based on previous work, we also expected that some children might have difficulties due to initial use of an egocentric frame of reference (Hanisch et al., 2001). Although memory might aid performance to a point (as the literature supports), our assumption was that there would be an age-related difference due to processing stream development. We also acknowledge that other factors may play a role, especially with young children.
As a general observation, our results indicated that all age groups were affected by delays of ≥2 s, with 5- and 7-year-olds being significantly different from the other groups. In regard to the younger groups’ performance, two perhaps interrelated factors may have been involved: maturity of the visual systems, namely the ventral pathway, and/or effective use of visual information. Hanisch and colleagues found that children were relying on both visual streams during perceptual and visuomotor activities, therefore suggesting that these pathways are not functionally segregated during childhood (up to age 12). Although Rival et al. (2004) used a different task, their data suggested that before 7 years of age, the streams are used independently; children use mainly egocentric object representations when performing motor tasks and preferentially use allocentric cues when making only perceptual judgments. Compared to the aforementioned studies, which did not compare their results to adults or use response delay, our findings revealed that young children’s responses were similar to those of adults in the immediate and 1 s delay conditions. We can only speculate that the younger groups’ difficulty in conditions with more delay was attributable to immaturity of the memory-based visual system and general information processing ability. Because brain activity was not monitored, we cannot determine whether ventral processing was the primary mechanism involved, if at all. On the other hand, it could be that age-related differences were due to attention, memory strategies, and experience.
Whereas the younger two groups (5- and 7-year-olds) displayed more difficulty with delays of 2 or more seconds, it seems that the older groups were able to sustain certain levels of accuracy. It was only in the 4 s condition that adults and 11-year-olds displayed superior performance (concerning constant error) over the other age groups. Our assumption was that the adults and the oldest children’s age group represented the mature visual processing model (PAM) for use of allocentric cues. To reiterate, the PAM asserts that the ventral visual pathway plays the major role in building a stable perceptual representation of our visual world. Our adults were able to mediate and maintain the representations of the targets via the ventral visual pathway up to a certain point; namely, less than 2 s. This finding complements previous work using action-based paradigms (Bradshaw & Watt, 2002; Elliott & Madalena, 1987; Glover, 2004; Hu et al., 1999; Hu & Goodale, 2000; Lemay, Bertram, & Stelmach, 2004; Milner, Paulignan, Dijkerman, Michel, & Jeannerod, 1999). Thus, our results confirm that there are limitations in the system. That is, like the children’s, adults’ performance decreased substantially with a delay of ≥2 s. Moreover, as noted in the age-related comparison, although adults’ performance declined, it was significantly better than that of the 5- and 7-year-olds at 2 s and all groups at 4 s.
At this point we can speculate that with longer delays the system becomes more reliant on allocentric sources of information, but at the cost of perceptual accuracy. Our results seem to reaffirm studies measuring the kinematics of movement, which have shown that delays cause allocentric information to affect movement execution (e.g., Heath & Westwood, 2003; Hu et al., 1999; Westwood & Goodale, 2003; Westwood et al., 2003).
Another finding that warrants mention is the observation that more error was displayed in peripersonal space compared to extrapersonal space. Although to our knowledge this study is the first to describe peripersonal and extrapersonal space in a perceptual task paradigm, this finding contrasts sharply with children’s performance in an egocentric reach paradigm using motor imagery in which better accuracy was noted in peripersonal space (Gabbard, Cacola, & Cordova, 2009a, 2009b; Gabbard, Cordova, & Ammar, 2007; Gabbard, Cordova, & Lee, 2007). However, the fact that the task here was perceptual (allocentric) rather than an egocentric reach activity appears to account for the difference. In other words, unlike egocentric referencing based on body and effector information, our task required participants to use visual cues from the surroundings independent of the body. What is interesting is that our results showed differences when estimates were made in peripersonal space compared to extrapersonal space. On closer inspection, our data do not reaffirm previous studies (Danckert & Goodale, 2001; Losier & Klein, 2004; Previc, 1998) showing that accuracy (target recognition and location) is better when objects are presented in the lower visual field than in the upper visual field. Although this study was not a test of upper and lower visual field, it seems that the task inadvertently tested this notion.
In conclusion, our findings revealed that children as young as 5 years of age are capable of effectively responding to a perceptual task requiring use of allocentric cues in conditions of immediate response and minimal response delay. However, performance for children and adults declined with delays of ≥2 s. In regard to the age-related comparison, whereas all groups were affected by delays of ≥2 s, the effect was significantly more pronounced with 5- and 7-year-olds. Once again we acknowledge that with children, especially younger ones, in addition to visual stream development, general information processing limitations (e.g., attention and memory ability in general) could play a role in their ability to locate and remember the location of objects in space.
Acknowledgements
We would like to thank all of the participants and parents for their participation in the study.
