The past two decades have seen dramatic advancements in our knowledge about the cortical representation of the visual world. Research has revealed an increasing number of topographic maps representing the various parameters that can be extracted from visual input, such as orientation, spatial frequency, or color. Some represent space finely, others coarsely, but any topographic map must account for a principal problem for its representation of space: how to cope with saccades. Should the map represent items independently of where they fall on the retina, or update to take account of the saccade? We might think of these two alternative ways of representing the visual world as being either world centered, with the map being invariant to where one fixates at any one moment, or eye centered, with the map representing the moment-by-moment location of the item on the retina. In vision research, these two alternatives are typically labeled as spatiotopic and retinotopic, respectively.
The human visual system uses both spatiotopic and retinotopic mapping. For example, primary visual cortex demonstrates a near-perfect retinotopic representation of the visual world, whereas portions of area MT contain a spatiotopic coordinate system of motion (d'Avossa et al., 2007). There is a good deal of research exploring the updating of visual information across saccades, or the relationship between covert spatial attention and overt saccadic eye movements, but whether and how we update spatial attention across saccades remains a largely unanswered question. It is this question that Golomb et al. (2008) address in their article: how do we maintain a top-down attentional focus across saccades?
Traditionally, studies of covert attention orienting have required subjects to maintain fixation at a single point while their spatial attention is directed to a different location on the screen (Posner, 1980). In this circumstance, the enhancement derived from spatial attention could be mediated via a retinotopically arranged, or a spatiotopically arranged, salience map. Golomb and colleagues present a neat paradigm for distinguishing these two possibilities.
Subjects fixated at a specific point on the screen. A cue then appeared in a nearby portion of space, cueing subjects to covertly attend to this location; this cue then disappeared, but subjects held its location within visual short-term memory. On a subset of trials, a probe stimulus (a single bar) then appeared in this cued location, and subjects were asked to report the orientation of this probe stimulus [Golomb et al. (2008), their Fig. 1B]. On some trials, subjects were cued to make a saccade to a new fixation location, and then back to the original fixation location, before the onset of the probe [Golomb et al. (2008), their Fig. 1C]. In both of these trial types, as one would expect, subjects showed faster performance to probes appearing in the cued location than to those appearing at a control location [Golomb et al. (2008), their Fig. 2A,B]. Subjects used the initial cue to allocate spatial attention to a specific portion of the screen, and attention was maintained across saccades. However, these trial types did not distinguish whether spatial attention was operating on a retinotopically or a spatiotopically arranged map.
Golomb and colleagues included a third trial type that enabled them to tease apart predictions based on retinotopically and spatiotopically arranged maps. Moreover, they varied the delay of the probe between 75 and 600 ms, enabling them to track the time course of these two predictions [Golomb et al. (2008), their Fig. 1A]. On this third trial type, after an initial fixation, subjects made a single saccade to a new location, and the probe appeared before they could saccade back. The probe was presented in the exact location of the previous cue (one-half of trials), or to the portion of the retina that the previous cue had fallen on (one-quarter of trials), or in a control location (one-quarter of trials). This critical third trial type, with these three probe locations, enabled the authors to test whether subjects' spatial attention was operating retinotopically or spatiotopically and, importantly, whether this changed with time.
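The probe proportions described for this third trial type can be sketched in code. The following is a minimal illustration only, not the authors' randomization procedure: the function name `build_probe_block`, the block size, and the seed are hypothetical, and the only values taken from the text are the proportions (one-half, one-quarter, one-quarter).

```python
import random


def build_probe_block(n_trials, seed=0):
    """Return a shuffled list of probe locations for one block of trials.

    Proportions follow the text: one-half spatiotopic, one-quarter
    retinotopic, one-quarter control. Block size and seed are assumptions.
    """
    if n_trials % 4 != 0:
        raise ValueError("n_trials must be a multiple of 4")
    locations = (
        ["spatiotopic"] * (n_trials // 2)
        + ["retinotopic"] * (n_trials // 4)
        + ["control"] * (n_trials // 4)
    )
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    rng.shuffle(locations)
    return locations


# Hypothetical block of 16 trials.
block = build_probe_block(16)
counts = {name: block.count(name)
          for name in ("spatiotopic", "retinotopic", "control")}
```

Fixing the counts per block, rather than sampling each trial independently, keeps the spatiotopic location exactly twice as likely as the retinotopic one within every block.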
The distinction between these two alternatives can be seen in Figure 1: subjects' covert attention is initially cued to the right of fixation [Golomb et al. (2008), their Fig. 1A]. This enhancement is also shown graphically on the salience map in Figure 1A. When subjects make a saccade to a new location on the screen (Fig. 1B), there are two alternative salience maps that might best represent the allocation of spatial attention: if the attentional enhancement was spatiotopic, then the peak would be unaffected by the saccade. However, if the salience map is retinotopic, then the peak would shift with the change in fixation point. Salience maps representing these two alternative possibilities are also shown in Figure 1B.
Figure 1. A, A point of fixation on the visual display shown to subjects, marked with a + (at the x–y coordinates −2, −1), with the attention cue appearing to the right of this fixation point (at the x–y coordinates −1, −1). Below this is a salience map, with salience measured in arbitrary units on the z-axis (from 0 to 40). The x-axis and y-axis of the map show the same coordinates as the visual display above, with a peak in salience at the location of attentional focus. B, The screen after the saccade, with the gray plus sign representing the previous fixation point and the black plus sign the new fixation point. The two salience maps below represent the two alternative possibilities that Golomb et al. are now able to distinguish: a retinotopic and a spatiotopic locus of spatial attention.
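The difference between the two predictions can be made concrete as a coordinate computation. The sketch below is illustrative only: the fixation and cue coordinates are those given in the caption, but the new fixation point is a hypothetical value (the caption does not state it), and the names `Point` and `probe_locations` are assumptions introduced here.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float


def probe_locations(old_fix, cue, new_fix):
    """Return the screen location predicted by each coordinate system."""
    # Retinal offset of the cue relative to the original fixation point.
    offset = Point(cue.x - old_fix.x, cue.y - old_fix.y)
    return {
        # Spatiotopic: the peak stays at the cue's screen location.
        "spatiotopic": cue,
        # Retinotopic: the peak keeps its retinal offset, so it moves
        # with the eye to the same offset from the new fixation point.
        "retinotopic": Point(new_fix.x + offset.x, new_fix.y + offset.y),
    }


old_fix = Point(-2, -1)  # fixation point from the caption
cue = Point(-1, -1)      # cue location from the caption
new_fix = Point(-2, 1)   # hypothetical post-saccade fixation point
locations = probe_locations(old_fix, cue, new_fix)
```

With these values the spatiotopic prediction remains at (−1, −1), whereas the retinotopic prediction moves to (−1, 1), one unit to the right of the new fixation point.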
The results of Golomb et al.'s experiments clearly demonstrated that subjects' initial coordinate system for spatial attention was retinotopic; subjects' performance was faster and less error prone when probes were presented in the “retinotopically enhanced” portion of the screen than when they appeared in the control locations [Golomb et al. (2008), their Fig. 2C]. However, this effect was short lived: the difference was absent when the probe was delayed by >200 ms [Golomb et al. (2008), their experiment 1]. In contrast, at longer delays, subjects' allocation of attention was more in keeping with a spatiotopic coordinate system; subjects' performance was faster and less error prone when probes were presented in the “spatiotopically enhanced” portion of the screen than when they appeared in the control locations [Golomb et al. (2008), their Fig. 2C]. A subsequent experiment reinforced this finding: although there was no evidence of spatiotopic enhancement at a probe delay of 75 ms, there was a strong effect at 400 ms, whereas there was strong evidence of a retinotopic enhancement at a probe delay of 75 ms but no such effect at 400 ms [Golomb et al. (2008), their Fig. 4A,B, experiment 2]. This early retinotopic enhancement is especially surprising; not only were subjects instructed to maintain their spatial attention at the original presaccade location, but probes were twice as likely to appear in this spatiotopic location as in the retinotopic location.
In a final experiment, Golomb et al. demonstrated that subjects can retain the retinotopic enhancement if instructed to do so [Golomb et al. (2008), their experiment 3], suggesting that the transfer from a retinotopic to a spatiotopic coordinate system was not obligatory. This manipulation of the stimulus onset asynchrony of the probe stimulus enabled the researchers to track the transition between retinotopic and spatiotopic coordinate systems in spatial attention in a way that had never been attempted before. This in turn gave fresh insight into both the nature of our spatial attention and the means by which our attention is maintained across saccades. But like any new paradigm, it also raised questions and potential controversies deserving of future experiments.
Given these data, we might question whether our spatial attention relies on multiple maps simultaneously or on a single map that can be updated after a saccade. Subjects might have a single salience map, initially representing the retinotopic location of the cue, which, after the saccade, is altered in a top-down manner to represent the task-relevant location. In Figure 1B, this would look like a smooth transition from the retinotopic enhancement map on the left to the spatiotopic enhancement map on the right. Presumably, at some point the salience map would contain both peaks. Alternatively, subjects might have two (or more) distinct maps: one that is always exogenously triggered by the retinotopic location of the cue, is fragile, and is prone to rapid decay, and a second that is constructed endogenously, according to task relevance (Desimone and Duncan, 1995), and is more durable. According to this account, the two salience maps in Figure 1B would exist simultaneously and independently, with subjects switching from one to the other. In short, spatial attention could operate on a single map or across multiple maps, and Golomb et al.'s data do not distinguish these possibilities.
A second interesting question relates to the transition from using a retinotopic to using a spatiotopic coordinate system (whether this requires multiple salience maps or just one). Golomb et al. claim that their data support the view that both retinotopic and spatiotopic enhancement can occur simultaneously, but this cannot be convincingly determined by looking at their group-averaged results. The apparent co-occurrence of retinotopic and spatiotopic enhancement at certain delays, with all subjects putatively being midtransition, could instead result from one-half of the subjects using one coordinate system and the other half using the other. That is, individual differences in the rate at which subjects switch from a retinotopic to a spatiotopic coordinate system might lead one to conclude, wrongly, that there is a smooth transition from one to the other, with each subject using both at some point. This issue could be easily resolved by rerunning Golomb et al.'s second experiment using more probe delay intervals. Careful analysis of the within-subjects data might reveal whether there is ever a period during which individual subjects use both spatiotopic and retinotopic coordinate systems, or whether the data are more characteristic of a rapid shift from one to the other. In short, one should always be careful when drawing conclusions on the basis of group-averaged data alone; individual subjects might not demonstrate the behavioral pattern that the group demonstrates.
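The averaging concern can be illustrated with a toy simulation. This is a sketch under strong assumptions and does not model Golomb et al.'s data: each hypothetical subject switches abruptly from a purely retinotopic to a purely spatiotopic benefit at an individual switch time, and the switch times, the intermediate delay, and the unit benefit sizes are all invented for illustration.

```python
def individual_benefits(delay_ms, switch_ms):
    """Return (retinotopic, spatiotopic) benefit for one all-or-none subject."""
    if delay_ms < switch_ms:
        return (1.0, 0.0)  # before the switch: retinotopic only
    return (0.0, 1.0)      # after the switch: spatiotopic only


def group_mean(delay_ms, switch_times):
    """Average the two benefits across subjects at one probe delay."""
    pairs = [individual_benefits(delay_ms, s) for s in switch_times]
    n = len(pairs)
    return (sum(p[0] for p in pairs) / n, sum(p[1] for p in pairs) / n)


# Hypothetical switch times (ms) for six subjects.
switch_times_ms = [100, 150, 200, 250, 300, 350]
# 75 and 400 ms appear in the text; 225 ms is an invented intermediate delay.
delays_ms = [75, 225, 400]
group_curve = {d: group_mean(d, switch_times_ms) for d in delays_ms}
```

At the intermediate delay the group mean shows both benefits at half strength, even though no simulated subject ever shows both at once. Only the within-subject pattern separates an abrupt switch from a genuine period of co-occurrence.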
As with any new paradigm, Golomb et al.'s data offer fresh insight into oft-studied phenomena and raise interesting questions for future research. In this case, the paradigm reveals the temporal relationship between retinotopic and spatiotopic attentional biases, prompting the following questions: can they co-occur, and how do we move from one to the other? Psychophysicists and neuroscientists will doubtless seek to address these questions in the future.
Footnotes
- Editor's Note: These short, critical reviews of recent papers in the Journal, written exclusively by graduate students or postdoctoral fellows, are intended to summarize the important findings of the paper and provide additional insight and commentary. For more information on the format and purpose of the Journal Club, please see http://www.jneurosci.org/misc/ifa_features.shtml.
- D.E.A. is supported by a postdoctoral research fellowship from the Economic and Social Research Council, UK.
- Correspondence should be addressed to Duncan E. Astle, Department of Experimental Psychology, University of Oxford, South Parks Road, Oxford OX1 3UD, UK. Duncan.astle{at}psy.ox.ac.uk