Abstract
We investigated how attention shifts from one object to another by recording neuronal activity in the primary visual cortex. Monkeys performed a contour-grouping task in which they had to select a target curve and ignore a distractor curve. Some trials required a shift of attention, because the target and distractor curves were switched during the course of the trial. We monitored the dynamics of this attention shift in area V1, in which neuronal responses evoked by the target curve are stronger than those evoked by the distractor. The reallocation of attention was associated with a rapid and strong enhancement of responses to the newly attended curve, followed, after ∼60 ms, by a weaker suppression of responses to the curve from which attention was removed. We conclude that attention can be rapidly allocated to a new object before it disengages from the previously attended one.
Introduction
We sample the visual scene by moving our eyes and by covertly shifting attention from one object to the next. Psychological theories proposed that an attention shift consists of three distinct phases: (1) attention disengages from an item, (2) it moves to a new location, and (3) it engages a new item (Shulman et al., 1979; Tsal, 1983; Posner et al., 1984). It is not clear, however, how long it takes to redirect attention, because estimates have been highly variable across studies. Behavioral evidence from the visual search paradigm in humans, for example, suggested that attention shifts rapidly, approximately once every 50 ms (Wolfe, 1994). However, behavioral data from another task, in which observers had to identify two targets in a sequence, suggested that an attention shift takes several hundred milliseconds (Duncan et al., 1994). These discrepancies can be attributed to differences between tasks but also to difficulties in isolating the contribution of attention shifts from other factors that may also influence the subjects' accuracy and reaction time. To overcome these problems, recent studies used event-related potentials to directly measure neuronal correlates of attention shifts. Woodman and Luck (1999) measured the N2pc component of the event-related potential and observed that attention shifts within ∼150 ms. However, another study that used a different event-related potential technique (the steady-state visually evoked potential) reported that a shift of attention takes several hundred milliseconds (Müller et al., 1998). This suggests that the relationship between the event-related potentials and the neuronal events underlying attention shifts is too indirect to resolve the debate.
The present study therefore directly measures the neuronal correlates of attention shifts in the monkey visual cortex. Previous studies showed that neuronal activity evoked by attended objects is enhanced relative to activity evoked by unattended objects in areas of the visual cortex (for review, see Desimone and Duncan, 1995; Treue, 2001). However, only one study (Motter, 1994) measured neuronal activity during shifts of visual attention. The allocation of attention was found to cause a fast enhancement of neuronal activity in area V4, whereas the removal of attention caused a suppression of activity that was delayed, although the time course of these effects was not quantified. In the present study, we therefore directly determine the relative timing of attentional events and test the model of Posner et al. (1984) by investigating whether it is necessary to disengage attention before it can engage a new object.
We trained monkeys to trace a target curve while ignoring a distractor curve (see Fig. 1). Previous studies showed that neuronal responses in area V1 to the target curve are enhanced relative to responses to the distractor (Roelfsema et al., 1998, 2003; Khayat et al., 2004). This response enhancement provides a correlate of visual attention that is directed to the target curve (Scholte et al., 2001). On some trials, the target and distractor curves were switched during the course of the trial, forcing the monkey to shift attention. We will evaluate neuronal activity in area V1 during the attention shift.
Materials and Methods
Two macaques took part in the experiments. Standard surgical and electrophysiological techniques were used to record multiunit activity in area V1 (Roelfsema et al., 1998; Supèr and Roelfsema, 2005). All experimental procedures complied with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and were approved by the institutional animal care and use committee of the Royal Netherlands Academy of Arts and Sciences.
Figure 1. Curve-tracing task. a, b, Sequence of events during normal (a) and switch trials (b). The monkey held fixation until the fixation point disappeared 800 ms after stimulus onset. Then, a saccade (arrow) was made to the circle located at the end of the target (T) curve. On switch trials, the target and distractor (D) curves switched 400 ms after stimulus appearance.
Behavioral task. A trial started as soon as the monkey's eye position was within a 1° square window centered on a 0.2° fixation point in the middle of a monitor. After 300 ms, the stimulus appeared (Fig. 1a), but the monkey had to maintain fixation. The stimulus consisted of two white curves (luminance, 24 cd/m²) on a black background (luminance, 0.5 cd/m²) and two red circles that subtended 0.4° of visual angle. After 800 ms, the fixation point was extinguished, and the monkey was required to make a single saccade to the circle that was connected to the fixation point by a curve. This curve will be referred to as the target curve (Fig. 1a, T). The other curve was a distractor (Fig. 1a, D). The length of the curves varied across recording sessions and equaled 5.8° on average (range, 3.4–8.9°). The size of the small “cue segment” that connected the fixation point to the target curve ranged from 0.1 to 1.1° (average, 0.4°). On most trials (normal trials, 75%), both curves were revealed to the monkey from the start of the trial and remained on the screen until the animal's response (Fig. 1a). On 25% of trials, however, the stimulus changed after 400 ms (Fig. 1b). On these switch trials, the connection to the fixation point (i.e., cue segment) switched so that the curve that was the distractor at the start of the trial became the target curve. In this condition, the monkey had to make a saccade to the circle located at the end of the new target curve. The stimuli of Figure 1 were interleaved with complementary stimuli, where the other curve was connected to the fixation point (see Fig. 2a), in a pseudorandom sequence. Trials in which the monkeys broke fixation before fixation point offset were terminated. Trials with a correct saccade that fell within a 2° square window centered on the saccade target were rewarded with apple juice. The monkeys' performance was >95%.
Recording and data analysis. We recorded multiunit activity from electrodes that were chronically implanted in area V1 (available at www.jneurosci.org as supplemental material). Receptive field (RF) dimensions were determined with an automatic plotting procedure, using moving light bars. Median RF size was 0.98° (range, 0.45–1.9°), and average eccentricity equaled 3.7° (range, 1.1–6.9°). Neuronal responses were obtained from neurons that had their RF on segments of the target or distractor curve (see Fig. 2a, rectangle). The curves were designed in such a way that the segment in the RF matched the preferred orientation of the cell, but attentional effects were also observed for neurons that were stimulated with suboptimal orientations (available at www.jneurosci.org as supplemental material). Only correct trials were included in the analyses. The responses at individual recording sites to the various stimuli were normalized to the average peak response, after subtraction of spontaneous activity. This procedure preserves differences between peak responses to different stimuli (Roelfsema et al., 1998, 2003). The significance of differences in response strength evoked by the target and distractor curve was determined in a computational window from 200 to 400 ms and from 600 to 800 ms after stimulus onset, using the Mann–Whitney U test. Population responses were computed by averaging across the normalized responses at different recording sites, and a paired t test was used to investigate the significance of differences between responses. Because V1 RFs are small, differences in eye position between conditions during fixation may influence firing rate. To exclude this possibility, we first removed trials with microsaccades and then applied a stratification procedure to factor out these eye position effects (see supplemental material, available at www.jneurosci.org). 
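The normalization step described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes that the "average peak response" is the mean of the peak responses across the stimuli shown at one recording site, and the function name `normalize_site` and the example firing rates are hypothetical.

```python
def normalize_site(responses, spontaneous_rate):
    """Normalize the responses of one recording site.

    responses: dict mapping a stimulus name to a list of firing rates
               (one value per time bin). Hypothetical data layout.
    spontaneous_rate: spontaneous activity of the site, in the same units.
    """
    # Step 1: subtract spontaneous activity from every time bin.
    corrected = {
        stim: [rate - spontaneous_rate for rate in trace]
        for stim, trace in responses.items()
    }
    # Step 2: divide all traces by ONE common factor, the peak response
    # averaged across stimuli (assumption about "average peak response").
    # A common factor keeps the ratio between peak responses intact.
    avg_peak = sum(max(trace) for trace in corrected.values()) / len(corrected)
    return {
        stim: [rate / avg_peak for rate in trace]
        for stim, trace in corrected.items()
    }


# Made-up firing rates (spikes/s) for illustration only.
example = {
    "target": [10.0, 50.0, 40.0],
    "distractor": [10.0, 40.0, 30.0],
}
normalized = normalize_site(example, spontaneous_rate=10.0)
```

Because both traces are divided by the same factor, the target-to-distractor ratio of the peaks (40:30 in this made-up example) is unchanged, which is the property the text attributes to the procedure.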
The latency of attentional modulation was determined by fitting a curve to the difference between responses. This method is not sensitive to the amount of available data and therefore provides a more reliable estimate of the latency than methods that compute the latency from the significance of the difference between neuronal responses. The latency was defined as the time that the fitted function reached 33% of its maximum value. We note that qualitatively similar results were obtained with other criteria (e.g., 25, 50, or 75%). (A complete description of this method is available at www.jneurosci.org as supplemental material.)
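The latency criterion can be illustrated with a short sketch. The form of the fitted function is given only in the supplemental material, so the cumulative Gaussian used here is purely an assumption, as are the function names and parameter values. Only the rule itself (time at which the fitted function reaches 33% of its maximum) is taken from the text.

```python
import math


def cumulative_gaussian(t, mu, sigma, amplitude):
    """Hypothetical stand-in for the fitted function (not the authors' model)."""
    return amplitude * 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))


def latency_at_fraction(times, fitted, fraction=0.33):
    """Return the first time at which `fitted` reaches `fraction` of its maximum.

    times and fitted are equally long lists; linear interpolation is used
    between the two samples that bracket the threshold.
    """
    threshold = fraction * max(fitted)
    if fitted[0] >= threshold:
        return times[0]
    for i in range(1, len(fitted)):
        if fitted[i - 1] < threshold <= fitted[i]:
            step = (threshold - fitted[i - 1]) / (fitted[i] - fitted[i - 1])
            return times[i - 1] + step * (times[i] - times[i - 1])
    return None  # threshold never reached


# Made-up parameters for illustration; times are in ms after stimulus onset.
times = [float(t) for t in range(0, 401)]
fitted = [cumulative_gaussian(t, mu=160.0, sigma=30.0, amplitude=0.2) for t in times]
latency_33 = latency_at_fraction(times, fitted, fraction=0.33)
latency_50 = latency_at_fraction(times, fitted, fraction=0.50)
```

Raising the criterion (for example from 33% to 50%) moves the estimate later along the rising flank of the fitted function, which is why the text checks that the results hold for several criteria.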
Results
Neuronal response strength
Figure 2a illustrates the location of the RF of a V1 recording site relative to the stimuli on normal trials. The stimuli were configured in such a way that the RF of the neurons was located on a segment of the target curve (T) or on a segment of the distractor curve (D). The activity evoked by the target curve (Fig. 2b, black trace) at this recording site, as well as across the entire population of recording sites (n = 59) (Fig. 2c), was stronger than the activity evoked by the distractor curve (red trace) (single site, U test, $p < 10^{-6}$; population data, paired t test, $p < 10^{-6}$). This response difference was observed for the majority of recording sites in area V1. To determine the strength of the response difference, a modulation index (MI) was computed for all sites (n = 59) in a window from 200 to 400 ms after stimulus onset. MI was defined as the ratio between rate enhancement (or reduction) and the average firing rate: $(R_T - R_D)/([R_T + R_D]/2)$, where $R_T$ and $R_D$ are the responses to the target and distractor curve, respectively. The distribution of MI was shifted to positive values (median, 0.22) (Fig. 2d), which indicates that most cells fire more action potentials if their RF is on the target curve. The response difference between the target and distractor curve was maintained during the entire period of fixation. The curve segment in the RF was identical for the two complementary stimuli. Thus, the response difference reflects an influence from outside the RF (Gilbert, 1993), in accordance with previous studies suggesting that it is a correlate of visual attention that is directed to the target curve (Roelfsema et al., 1998, 2003; Scholte et al., 2001; Houtkamp et al., 2003). The bottom panels in Figure 2, b and c, show a curve that was fitted to the difference between responses to the target and distractor curve, to measure the latency of attentional modulation (also available at www.jneurosci.org as supplemental material). The latency equaled 121 ± 3 ms (estimate ± SD) at the example V1 recording site, and 145 ± 2 ms at the population level.
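As a worked illustration of the modulation index defined above, the following sketch computes the MI for a few recording sites and takes the median across sites. The response values are made up for illustration and are not the recorded data; the function name is hypothetical.

```python
import statistics


def modulation_index(r_target, r_distractor):
    """MI = (R_T - R_D) / ((R_T + R_D) / 2), as defined in the text."""
    return (r_target - r_distractor) / ((r_target + r_distractor) / 2.0)


# Made-up (R_T, R_D) pairs: mean normalized responses in the
# 200-400 ms window, one pair per hypothetical recording site.
sites = [(1.2, 0.9), (1.0, 1.0), (1.1, 0.9)]
mi_values = [modulation_index(r_t, r_d) for r_t, r_d in sites]
median_mi = statistics.median(mi_values)
```

A positive MI means a stronger response to the target curve than to the distractor curve, an MI of zero means no difference, and the index is symmetric in the sense that swapping the two responses only flips its sign.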
Switch trials (Fig. 2e) began as normal trials. Accordingly, the initial part of the responses at the example recording site (Fig. 2f), as well as across the population of recording sites (n = 59) (Fig. 2g), was the same as that on normal trials. After 400 ms, however, the connection to the fixation point (i.e., the cue segment) switched. If the RF was initially located on the target curve, it fell on the distractor curve after the switch (Fig. 2e, T→ D). The opposite occurred for the other stimulus (Fig. 2e, D→ T). The switch changed neuronal responses in area V1 (Fig. 2f,g). Responses to the new target curve were enhanced, whereas responses to the new distractor curve were suppressed (single site, U test, $p < 10^{-6}$; population data, paired t test, $p < 10^{-6}$). The cue-segment switch occurred outside the RF. However, to exclude the possibility that the switch in neuronal response modulation represents a direct influence of the cue-segment switch on the RF, we investigated the relationship between response modulation and the distance between cue segment and RF (Fig. 2h). The analysis shows that the MI (computed from 200 to 400 ms after the stimulus switch) did not depend on the distance of the RF from the cue segment (linear regression coefficient, –0.08, t(57) = –1.28, p > 0.2) (Khayat et al., 2004). These results together imply that the change in neuronal responses after the stimulus switch is caused by the allocation of attention to the previously unattended curve and the removal of attention from the previously attended curve.
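The distance control analysis can be sketched as an ordinary least-squares regression of MI on RF-to-cue-segment distance, followed by a t statistic for the slope. This sketch assumes that the reported "linear regression coefficient" is the slope and that the test has n - 2 degrees of freedom, which matches t(57) for 59 sites. The function name and the example values are hypothetical.

```python
import math


def slope_t_statistic(x, y):
    """Least-squares slope of y on x with its t statistic and degrees of freedom."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    # Standard error of the slope; t = slope / SE.
    # Note: a perfect fit (rss == 0) would make the SE zero.
    standard_error = math.sqrt(rss / df / sxx)
    return slope, slope / standard_error, df


# Made-up distances (deg) and modulation indices for illustration only.
distance = [0.4, 1.0, 1.7, 2.5, 3.3, 4.8]
mi = [0.30, 0.18, 0.25, 0.21, 0.27, 0.19]
slope, t_value, df = slope_t_statistic(distance, mi)
```

A slope whose t statistic is small relative to the critical value for the given degrees of freedom indicates no detectable dependence of the MI on distance, which is the pattern the text reports.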
Figure 2. Neuronal responses during normal and switch trials. a, Complementary stimuli on normal trials. The rectangle depicts the location of an RF of a single recording site (eccentricity, 2.8°). The RF was located on a segment of the target curve (T, top), or on a segment of the distractor curve (D, bottom). b, Responses to the target (black) and distractor curve (red) at this recording site while the monkey maintained fixation. Responses are aligned on stimulus onset. Gray area, Difference between these responses. The bottom panel shows a curve that was fitted to the response difference, to determine the latency of response modulation (121 ± 3 ms; green arrow). The black bar on the x-axis shows 95% confidence interval of the latency. c, Responses averaged across all recording sites (n = 59). Attentional modulation appeared at 145 ± 2 ms after stimulus onset. d, Distribution of the modulation index of all recording sites during normal trials. Positive values indicate an enhancement of response to the target curve. Light blue, Cases for which the response difference between stimuli was significant (p < 0.05). Dark blue, Highly significant cases (p < 0.0005). Gray, Nonsignificant cases. The arrow indicates the median (0.22). e, Complementary stimuli on switch trials. The RF was on the distractor curve at stimulus onset but fell on the target curve after 400 ms (D→ T, top). The opposite occurred for the other stimulus (T→ D, bottom). f, g, Neuronal responses reflected the change in the stimulus at the single recording site (f), as well as at the population level (g). Black, Responses to T→ D stimulus. Red, Responses to D→ T stimulus. h, Relationship between the modulation index on switch trials and the distance (Dist.) separating the RF and the cue segment. The distance between the RF and the cue segment was 1.7° for the single example recording site and on average 1.8° for the population data (range, 0.4–4.8°). Red, Significant cases (p < 0.05; n = 39). Black, Nonsignificant cases (n = 20).
The dynamics of attention shifts
On switch trials, attention shifts from one curve to the other. To measure the dynamics of attentional reallocation, we compared the response to a curve that became a target because of the switch to the response to a curve that remained a distractor (Fig. 3a, D→ T and D, red traces). Similarly, to examine the time course of attentional withdrawal, we compared the response to the curve that switched to a distractor with the response to the curve that remained a target (Fig. 3a, T→ D and T, black traces). The switch from distractor to target (Fig. 3a, D→ T) was associated with a response enhancement ($p < 10^{-6}$, t test) (Fig. 3a, light gray area), whereas the opposite switch (T→ D) resulted in response suppression ($p < 10^{-6}$) (Fig. 3a, dark gray area). The effect of a D→ T switch was markedly different from the effect of a T→ D switch, both in magnitude and in time course.
Magnitude of attention shifts
The magnitude of the response enhancement evoked by the curve that switches from distractor to target (Fig. 3a, light gray area) was larger than the magnitude of the response suppression evoked by the curve that became a distractor (dark gray area). To quantify the strength of this response enhancement and suppression, an MI was computed in a window from 200 to 400 ms after the stimulus switch. MI was defined as the difference in response strength normalized to the average response on normal trials ($[R_T + R_D]/2$). At the population level, the response enhancement caused by the allocation of attention yielded an MI of 0.41 ($[R_{D\to T} - R_D]/\{[R_T + R_D]/2\}$), whereas the response suppression caused by the removal of attention yielded an MI of 0.16 ($[R_T - R_{T\to D}]/\{[R_T + R_D]/2\}$). To gain insight into the consistency of these effects, we analyzed the MI at individual recording sites. Here, we only included cases that exhibited a significant (p < 0.05) response difference between the target and distractor curve during normal trials (n = 39 of 59) (Fig. 3b). The MI was stronger for the enhancement on the vast majority (n = 34 of 39) of these recording sites (p < 0.0001, two-sided sign test). Moreover, the MI for the enhancement and suppression did not depend on the distance between the RF and the cue segment (enhancement: linear regression, –0.1, t(37) = –1.45, p > 0.2; suppression: linear regression, –0.04, t(37) = –1.62, p > 0.1). Thus, the allocation of attention causes a strong response enhancement, whereas the withdrawal of attention causes a weaker response suppression in area V1.
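The two switch-trial indices and the sign test can be sketched as follows. The response values below are made up and were chosen only so that the formulas return the reported population indices (0.41 and 0.16); they are not the recorded data. The sign test is written as an exact two-sided binomial test with a success probability of 0.5, which is the standard form of the test and is assumed to match the one used here.

```python
from math import comb


def switch_modulation(r_t, r_d, r_d_to_t, r_t_to_d):
    """Return (MI_enh, MI_supp) normalized to the normal-trial average."""
    baseline = (r_t + r_d) / 2.0
    mi_enh = (r_d_to_t - r_d) / baseline   # allocation of attention
    mi_supp = (r_t - r_t_to_d) / baseline  # removal of attention
    return mi_enh, mi_supp


def two_sided_sign_test(n_positive, n_total):
    """Exact two-sided sign test: p value under a fair-coin null hypothesis."""
    k = max(n_positive, n_total - n_positive)
    tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return min(1.0, 2.0 * tail)


# Illustrative normalized responses (hypothetical values).
mi_enh, mi_supp = switch_modulation(r_t=1.1, r_d=0.9, r_d_to_t=1.31, r_t_to_d=0.94)

# 34 of 39 sites showed a larger enhancement than suppression (from the text).
p_value = two_sided_sign_test(34, 39)
```

With 34 of 39 sites, this test gives a p value of roughly 2 × 10⁻⁶, consistent with the reported p < 0.0001.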
Figure 3. Effects of attention shift on neuronal responses. a, Population responses (n = 59) during switch trials (thick traces) aligned on the stimulus switch, superimposed on the responses during normal trials (dotted traces). The magnitude of response enhancement (light gray area) evoked by the new target curve is stronger than that of the response suppression (dark gray area) evoked by the new distractor curve. Bottom, Curves that were fitted to the response difference between stimuli D and D→ T (red) and stimuli T and T→ D (black) to determine the latency of enhancement and suppression, respectively. b, Comparison of the MI for the enhancement ($\mathrm{MI}_{\mathrm{enh}}$) and MI for the suppression ($\mathrm{MI}_{\mathrm{supp}}$) for all sites that yielded a significant response modulation during normal trials (p < 0.05; n = 39 of 59). c, Comparison of the latency of enhancement ($\mathrm{Lat}_{\mathrm{enh}}$) and the latency of suppression ($\mathrm{Lat}_{\mathrm{supp}}$) for all sites at which the difference in response between D and D→ T trials and the difference in response between T and T→ D trials were both significant (p < 0.05; n = 21).
Time course of attention shifts
To measure the latency of the response enhancement and suppression, a curve was fitted to the difference between responses (Fig. 3a, bottom, red, D→ T minus D; black, T minus T→ D). The switch from distractor to target curve yielded an increase in activity that occurred at a latency of 144 ms after the switch. The switch from target to distractor, in contrast, caused a suppression in activity that occurred 66 ms later, at a latency of 210 ms. We used a bootstrapping method to compute a 95% confidence interval for the difference in timing between enhancement and suppression (description available at www.jneurosci.org as supplemental material). The lower and upper bound of this confidence interval were 39.8 and 79.7 ms, respectively. The difference in latency between response enhancement and suppression was also evident at individual recording sites. We compared the timing of these effects at recording sites for which the enhancement and suppression of responses were both significant (p < 0.05; n = 21) (Fig. 3c). For the majority of sites (n = 19 of 21), the latency of enhancement ($\mathrm{latency}_{\mathrm{enh}}$) was earlier than the latency of suppression ($\mathrm{latency}_{\mathrm{supp}}$) (p < 0.001, two-sided sign test; median $\mathrm{latency}_{\mathrm{enh}}$, 134 ms; median $\mathrm{latency}_{\mathrm{supp}}$, 201 ms; latency difference, 67 ms; SEM, 13 ms). Moreover, the latency difference did not depend on the distance between the RF and the cue segment (linear regression, 9.5; t(19) = 0.78, p > 0.4). These results together show that the enhancement of neuronal activity associated with a switch from distractor to target is faster and stronger than the suppression of activity associated with a switch from target to distractor.
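The bootstrapping procedure itself is described only in the supplemental material, so the following is a generic percentile bootstrap rather than the authors' method. It assumes, for illustration, that per-site latency differences (suppression minus enhancement) are resampled with replacement and that the mean difference is the statistic of interest. The function name, the number of resamples, and the example values are all hypothetical.

```python
import random


def bootstrap_ci_mean(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = []
    for _ in range(n_boot):
        # Resample n values with replacement and store the resample mean.
        sample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2.0) * n_boot)]
    upper = means[int((1.0 - alpha / 2.0) * n_boot) - 1]
    return lower, upper


# Made-up latency differences in ms (suppression minus enhancement),
# one value per hypothetical recording site.
latency_differences = [40.0, 55.0, 60.0, 62.0, 67.0, 70.0, 71.0, 80.0,
                       85.0, 90.0, -5.0, 66.0, 120.0, 30.0, 75.0]
ci_lower, ci_upper = bootstrap_ci_mean(latency_differences)
```

If the resulting interval excludes zero, the latency difference between suppression and enhancement is unlikely to be a sampling artifact, which is the logic behind the reported interval of 39.8 to 79.7 ms.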
Discussion
The present results show that a shift of attention from one curve to another changes neuronal activity in area V1. We found that allocating attention to one curve and removing attention from another curve are two processes that are not complementary. Instead, the enhancement of neuronal activity caused by the allocation of attention is approximately two times stronger and 66 ms faster than the suppression caused by the removal of attention. It is as if the representation of the curve that becomes relevant first “lights up” quickly, and then the representation of the curve that becomes a distractor “fades away.” Previous models suggested that attention first has to disengage from an object before it can move and engage another object (Shulman et al., 1979; Tsal, 1983; Posner et al., 1984). Our results are incompatible with these models, because they demonstrate that attention can be allocated to a new object before it disengages from the previously attended object.
There is considerable variation in previous estimates of the time required for an attention shift. Some studies suggested that shifting attention from one object (or location) to another requires ∼500 ms (Duncan et al., 1994; Müller et al., 1998), whereas others indicated that attention can shift rapidly, every 50–150 ms (Wolfe, 1994; Woodman and Luck, 1999). These studies used different tasks and measured the attention shift indirectly, either by evaluating the accuracy of performance or by recording EEG from the scalp. It is therefore possible that the variation in the estimates of the switch time is caused by differences in methodology as well as by differences between tasks. Our results provide support for a rapid switch of attention, by showing that neuronal responses in area V1 are enhanced 150 ms after the cue to shift attention. A similar estimate was also found in area V4 (Motter, 1994) and in the frontal eye fields (Murthy et al., 2001), although these studies did not provide a precise analysis of the time course of the response switch. Interestingly, the pattern of activity on switch trials is similar to that reported in area V4 during a feature selection task in which a distractor color became the target color and vice versa (Motter, 1994). Motter (1994) found that the response suppression associated with a switch from target to distractor was delayed relative to the enhancement associated with the opposite switch. Thus, our results combined with these previous results demonstrate that switching attention occurs through a relatively fast target facilitation followed by a delayed distractor suppression in various tasks and in many areas of visual cortex.
In curve tracing, the monkey reports which of two circles is connected to the fixation point. We hypothesize that this task is solved by evaluating perceptual grouping criteria, such as connectedness and collinearity, that support grouping of all segments of the target curve into a coherent representation of an elongated curve (Roelfsema et al., 2000). During curve tracing, attention is directed to all segments of the target curve (Scholte et al., 2001; Houtkamp et al., 2003). At a neurophysiological level of description, all segments of the target curve are “labeled” by a response enhancement in area V1. This activity is presumably mediated by horizontal connections in area V1 and feedback connections from higher visual areas. The selectivity of horizontal connections can explain why collinear and connected contour elements are labeled by the enhanced response, because they predominantly interconnect neurons with nearby RFs that are in collinear configurations (Bosking et al., 1997; Schmidt et al., 1997).
In addition to the role of horizontal connections, previous studies implicated the parietal cortex as a source for attentional top-down signals. In the human imaging literature, the superior parietal lobe is consistently implicated in the maintenance of attention on objects, as well as in shifts of attention between objects (Vandenberghe et al., 2001; Yantis et al., 2002). In monkeys, the intraparietal sulcus and area 7a appear to fulfill a similar role (Bisley and Goldberg, 2003; Constantinidis and Steinmetz, 2001a,b). Parietal neurons may provide feedback to lower areas to enhance the representation of the target curve and to suppress the representation of the distractor.
In conclusion, we showed that attention shifts in area V1 are associated with a fast enhancement of responses to the newly attended object followed, after ∼60 ms, by a weaker suppression of responses to the object from which attention is removed. Attention thus rapidly engages a new object, even before it disengages from the previously attended one.
Footnotes
This work was supported by a Human Frontier Science Program grant (to P.R.R.). We thank K. Brandsma and J. C. de Feiter for technical assistance.
Correspondence should be addressed to Paul S. Khayat, Department of Physiology, McGill University, 3655 Promenade Sir W. Osler, Montreal, Quebec, Canada H3G 1Y6. E-mail: paul.khayat{at}mcgill.ca.
DOI:10.1523/JNEUROSCI.2784-05.2006
Copyright © 2006 Society for Neuroscience 0270-6474/06/260138-05$15.00/0