Abstract
Visual perception is influenced by attention deployed voluntarily or triggered involuntarily by salient stimuli. Modulation of visual cortical processing by voluntary or endogenous attention has been extensively studied, but much less is known about how involuntary or exogenous attention affects responses of visual cortical neurons. Using implanted microelectrode arrays, we examined the effects of exogenous attention on neuronal responses in the primary visual cortex (V1) of awake monkeys. A bright annular cue was flashed either around the receptive fields of recorded neurons or in the opposite visual field to capture attention. A subsequent grating stimulus probed the cue-induced effects. In a fixation task, when the cue-to-probe stimulus onset asynchrony (SOA) was <240 ms, the cue induced a transient increase of neuronal responses to the probe at the cued location during 40–100 ms after the onset of neuronal responses to the probe. This facilitation diminished and disappeared after repeated presentations of the same cue but recurred for a new cue of a different color. In another task to detect the probe, relative shortening of monkey's reaction times for the validly cued probe depended on the SOA in a way similar to the cue-induced V1 facilitation, and the behavioral and physiological cueing effects remained after repeated practice. Flashing two cues simultaneously in the two opposite visual fields weakened or diminished both the physiological and behavioral cueing effects. Our findings indicate that exogenous attention significantly modulates V1 responses and that the modulation strength depends on both novelty and task relevance of the stimulus.
SIGNIFICANCE STATEMENT Visual attention can be involuntarily captured by a sudden appearance of a conspicuous object, allowing rapid reactions to unexpected events of significance. The current study discovered a correlate of this effect in monkey primary visual cortex. An abrupt, salient flash enhanced neuronal responses to a subsequent visual probe stimulus at the same location and shortened the animal's reaction time to that probe. However, the enhancement of the neural responses diminished after repeated exposures to this flash if the animal was not required to react to the probe. Moreover, a second, simultaneous flash at another location weakened the neuronal and behavioral effects of the first one. These findings reveal effects of exogenous attention in the brain beyond those reported so far.
Introduction
Information processing in the brain is regulated by selective attention. Attended targets are processed faster and in greater detail than unattended ones. The benefits of top-down attentional control have been widely demonstrated behaviorally and physiologically (for recent reviews, see Carrasco, 2011; Bisley, 2011; Gilbert and Li, 2013). Attentional modulation of neuronal responses is seen even at the earliest stage of visual cortical processing (Haenny and Schiller, 1988; Motter, 1993; Roelfsema et al., 1998; Ito and Gilbert, 1999; McAdams and Reid, 2005; Khayat et al., 2006; Roberts et al., 2007; Chen et al., 2008; Thiele et al., 2009; Chalk et al., 2010; Pooresmaeili et al., 2010, 2014; Briggs et al., 2013). In particular, response properties of V1 neurons can be specifically altered by perceptual tasks to convey more information about task-related stimulus features (Li W, et al., 2004, 2006, 2008; McManus et al., 2011).
Whereas endogenous attention facilitates processing of behaviorally relevant information, exogenous attention can be more potent and faster acting, and it operates regardless of the ongoing task (Müller and Rabbitt, 1989; Nakayama and Mackeben, 1989), allowing for rapid reactions to unexpected stimuli of potential significance. By definition, a stimulus that can capture attention exogenously is said to be salient. Hence, exogenous attention is often investigated by comparing the behavioral effects of a salient stimulus (cue) on a task involving a subsequent stimulus (probe) at the cued versus an uncued location. Although psychophysical studies have consistently shown improved behavioral performance at the cued location, knowledge about the neural correlates of exogenous attention is largely limited to brain regions associated with attentional guidance such as the frontal and parietal areas (e.g., Schall and Hanes, 1993; Gottlieb et al., 1998; Bisley and Goldberg, 2003; Buschman and Miller, 2007; Katsuki and Constantinidis, 2012) and the superior colliculus (e.g., McPeek and Keller, 2002; Fecteau et al., 2004; Fecteau and Munoz, 2005). In the current study, we examined the effects of a sudden onset of a salient, novel cue on neuronal responses to a subsequent probe in the primary visual cortex (V1) of awake monkeys performing either a fixation or a detection task.
Materials and Methods
Animal preparations.
Four adult male monkeys (Macaca mulatta, 4–9 years, 7.8–15 kg, named MH, MI, MJ, and M7, respectively) were used. All procedures were in compliance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and were approved by the Institutional Animal Care and Use Committee of Beijing Normal University.
Surgical procedures were performed in an aseptic environment and under general anesthesia induced with ketamine (10 mg/kg) and maintained with isoflurane (1.0–2.5% mixed in O2). Heart rate, tidal CO2, blood oxygen, and body temperature were continuously monitored and carefully maintained during the surgery. Antibiotics (ceftriaxone sodium) and analgesics (tramadol hydrochloride) were given after the surgery.
A biocompatible titanium head restraint (a small post on a cross-shaped pedestal with screw holes) was first attached to the animal's skull with titanium bone screws. After full recovery, the monkeys were trained to perform a simple fixation task for ∼2 weeks, with the head post attached to a customized primate chair for restraining head movements. One monkey (Monkey M7) was further trained in a detection task (see Visual stimuli and behavioral tasks) for 1 month, with several thousand trials per day. After the behavioral training was completed, a microarray of electrodes (Blackrock Microsystems, 6 × 8 electrodes spaced 0.4 mm apart, ∼0.5 mm in length, 0.5–1.0 MΩ impedance at 1 kHz) was implanted, for each monkey, in the V1 regions corresponding to eccentricities between 2 and 5° in the lower visual field (see Fig. 1A,B).
Visual stimuli and behavioral tasks.
Visual stimuli were generated by a stimulus generator (ViSaGe MKII, Cambridge Research Systems) on a 22-inch CRT monitor (Iiyama HM204DTA, 1200 × 900 pixels at 100 Hz, 100 cm viewing distance). We adapted a classical stimulus paradigm that is known to reliably activate exogenous attention in humans (Posner and Cohen, 1984; Müller and Rabbitt, 1989). The two possible locations for the cue were marked on the monitor throughout an experimental session by two ring-shaped placeholders (outer diameter 2°, inner diameter 1.5°; see Fig. 1C), which were distinguished from the homogeneous background (4.12 cd/m2) by pixelation using 0.05° square dots with random luminance values between 4.12 and 12.37 cd/m2 (mean 8.25 cd/m2). One ring was centered on an arbitrarily chosen receptive field (RF) to encompass most of the recorded V1 sites (see Fig. 1B); the other ring was placed symmetrically around the fixation point in the opposite visual-field quadrant (see Fig. 1C).
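The placeholder geometry described above can be sketched in a few lines. This is only an illustration under stated assumptions: the dot luminances are drawn from a uniform distribution (the text says only "random"), a dot belongs to the ring when its center falls between the inner and outer radii, and the name `make_placeholder` is hypothetical rather than taken from the stimulus software actually used.

```python
import random

BACKGROUND = 4.12                  # cd/m^2, homogeneous background
LUM_RANGE = (4.12, 12.37)          # cd/m^2, range of random dot luminances
DOT_DEG = 0.05                     # side length of one square dot (deg)
OUTER_DIAM, INNER_DIAM = 2.0, 1.5  # ring diameters (deg)


def make_placeholder(seed=0):
    """Return a square grid of luminances (cd/m^2) holding one pixelated ring.

    Assumption: uniform sampling of dot luminance; the source does not
    specify the distribution.
    """
    rng = random.Random(seed)
    n = round(OUTER_DIAM / DOT_DEG)  # dots per side (40 for these values)
    half = OUTER_DIAM / 2
    grid = []
    for row in range(n):
        y = (row + 0.5) * DOT_DEG - half  # dot center relative to ring center
        line = []
        for col in range(n):
            x = (col + 0.5) * DOT_DEG - half
            r = (x * x + y * y) ** 0.5
            if INNER_DIAM / 2 <= r <= OUTER_DIAM / 2:
                line.append(rng.uniform(*LUM_RANGE))  # pixelated ring dot
            else:
                line.append(BACKGROUND)               # untouched background
        grid.append(line)
    return grid
```

With uniform sampling, the expected mean luminance of the ring dots is the midpoint of the range, (4.12 + 12.37) / 2 ≈ 8.25 cd/m2, which agrees with the mean reported in the text.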
To avoid contamination of neuronal responses by potential top-down factors, such as selective attention, anticipation, and perceptual task, which have been shown to markedly modify V1 responses (Li W, et al., 2004, 2006, 2008; McManus et al., 2011), three monkeys (MH, MI, and MJ) were only trained to maintain fixation during V1 recordings, rendering the cue and probe stimuli entirely irrelevant to any perceptual tasks and thus allowing isolation of possible modulatory effects by exogenous attention.
A trial began when the monkey directed its gaze into a fixation window of 0.6° in radius centered on the fixation point. The trial was led by a fixation time (see Fig. 1C, ΔT), which was a random number varied between 600 and 1200 ms in most of the experiments but was fixed at 600 ms for Monkey MJ in the tests shown in Figures 2 and 6. After this period of fixation time, one or the other of the two pixelated rings (placeholders), with equal probability, was abruptly turned into a uniform and much brighter ring (the cue, 39.29 cd/m2), which lasted 60 ms before returning to the dim, pixelated pattern. With a delay relative to the cue onset (stimulus onset asynchrony [SOA]), a small square-wave grating patch appeared concentrically within each ring for 500 ms as the probe stimulus (1.0–1.2° diameter; 2.0 cycle/° spatial frequency; 20.62 cd/m2 mean luminance against the CRT background of 4.12 cd/m2; stationary for Monkeys MI and MJ, drifting at 2.0 Hz for Monkey MH). Unless stated otherwise, the probe gratings were fixed at 10% Michelson contrast, and the cue-to-probe SOA was 150 ms. The grating orientation matched the preferred orientation of the chosen RF that defined the stimulus center (see above). The monkey had to maintain fixation within the fixation window throughout the trial in exchange for a drop of liquid reward. An infrared tracking device (Matsuda et al., 2000) was used to sample eye positions at 30 Hz. This eye tracking system is able to detect a systematic gaze shift as small as 0.05° (Li W, et al., 2006).
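As a quick check on the stimulus parameters above, the sketch below converts the probe's Michelson contrast into bar luminances and the stimulus timings into video frames on the 100 Hz display. It assumes that the mean luminance is the midpoint of the bright and dark bars and that all durations are whole numbers of frames; the function names are illustrative only.

```python
def grating_extremes(mean_lum, michelson):
    """Bright- and dark-bar luminances of a square-wave grating.

    Michelson contrast is C = (Lmax - Lmin) / (Lmax + Lmin). Assuming the
    mean luminance equals (Lmax + Lmin) / 2, this gives
    Lmax = mean * (1 + C) and Lmin = mean * (1 - C).
    """
    return mean_lum * (1 + michelson), mean_lum * (1 - michelson)


def ms_to_frames(duration_ms, refresh_hz=100):
    """Number of video frames spanning a duration at the given refresh rate."""
    frames = duration_ms * refresh_hz / 1000
    if abs(frames - round(frames)) > 1e-9:
        raise ValueError("duration is not a whole number of frames")
    return round(frames)


l_max, l_min = grating_extremes(20.62, 0.10)  # probe at 10% contrast
print(round(l_max, 3), round(l_min, 3))       # 22.682 18.558
print(ms_to_frames(60), ms_to_frames(150))    # 6 15 (cue duration, default SOA)
```

At 10% contrast the bars differ from the mean by only about 2 cd/m2, whereas the probe's mean luminance (20.62 cd/m2) is well above the 4.12 cd/m2 background.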
To ensure that our stimulus paradigm was able to generate, in monkeys, behavioral effects similar to those reported in humans performing exogenous cueing tasks, another monkey (Monkey M7) was extensively trained for 1 month on a detection reaction time (RT) task before implantation of the microelectrode array in V1. The experimental design was similar to that shown in Figure 1C, except for the following modifications. The animal pulled a lever to start a trial. The mean luminance of probe gratings was set to the CRT background (4.12 cd/m2) instead of 20.62 cd/m2 used in the fixation task. Only one probe stimulus was presented, randomly and with equal probabilities, either on the recorded RFs or in the opposite visual field. To ensure that the animal would respond to the probe rather than the cue, any block of trials always included more than one cue-to-probe SOA. Different cueing conditions were randomly interleaved with equal probabilities: valid cue (the probe at the cued location), invalid cue (the probe in the visual field quadrant opposite to the cue), and, in some sessions, both cues (two simultaneous cues in the two opposite quadrants), or no cue at all. The animal had to release the lever for a reward within 800 ms after the probe onset while maintaining fixation until the lever release; only RTs that fell within ±3 SD of the mean RT were included in data analysis. This behavioral design is similar to those used in studies of attentional orienting in humans (Posner et al., 1980; Berger et al., 2005).
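The RT inclusion rule and the resulting behavioral cueing effect can be illustrated as follows. This is a minimal sketch, not the analysis code used in the study: it assumes the sample standard deviation, assumes trimming is applied separately to each condition, and uses hypothetical function names.

```python
from statistics import mean, stdev


def trim_rts(rts_ms, n_sd=3.0):
    """Keep only RTs within mean ± n_sd * SD of the supplied trials.

    Assumption: sample SD (n - 1 denominator); the source does not say
    which estimator was used.
    """
    m, sd = mean(rts_ms), stdev(rts_ms)
    return [rt for rt in rts_ms if abs(rt - m) <= n_sd * sd]


def cueing_benefit(valid_rts_ms, invalid_rts_ms):
    """Mean invalid-cue RT minus mean valid-cue RT (ms), after trimming.

    Positive values mean faster responses at the cued location.
    """
    return mean(trim_rts(invalid_rts_ms)) - mean(trim_rts(valid_rts_ms))
```

Note that a ±3 SD rule can only reject a trial when the sample is reasonably large; with fewer than about a dozen trials, no single value can lie more than 3 sample SDs from the mean.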
Electrophysiological recording.
Multiunit activities of superficial layer V1 neurons were recorded with a 128-channel data acquisition system (Cerebus, Blackrock Microsystems). Spikes were detected by applying a voltage threshold with a signal-to-noise ratio of 3.5, and their waveforms were sampled and saved at 30 kHz for offline analyses. We did not intend to isolate single units, but this would not affect our results.
Before the exogenous cueing trials in each day, the RFs of V1 recording sites were mapped as rectangular minimum responsive fields using drifting square-wave gratings seen through a narrow aperture (0.3° wide, for details see Chen et al., 2014) when the animal was doing the fixation task. In brief, for each electrode, the mean firing rates as a function of aperture location were fitted by a Gaussian function. The goodness of fit was estimated using R2; only recording sites with R2 ≥ 0.7 were considered to have a clear RF profile. The RF center was defined as the center of the Gaussian, and its width as 2 × 1.96 SD of the Gaussian with the aperture width (0.3°) subtracted. After mapping the RF, a circular grating patch (6° in diameter) was used to determine the orientation tuning properties by a similar Gaussian fitting.
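The RF criteria described above reduce to a few formulas, sketched here. The fitting step itself is omitted; the code assumes that Gaussian parameters have already been estimated by some least-squares routine, and the parameter names (`amp`, `center`, `sd`, `baseline`) are illustrative, since the text does not state the exact parameterization.

```python
import math

APERTURE_DEG = 0.3  # width of the mapping aperture (deg)
R2_CRITERION = 0.7  # minimum goodness of fit for a clear RF profile


def gaussian(x, amp, center, sd, baseline):
    """Gaussian response profile as a function of aperture location."""
    return baseline + amp * math.exp(-0.5 * ((x - center) / sd) ** 2)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rf_width(sd, aperture=APERTURE_DEG):
    """RF width: 2 x 1.96 SD of the fitted Gaussian minus the aperture width."""
    return 2 * 1.96 * sd - aperture


def has_clear_rf(r2):
    """Apply the inclusion criterion R2 >= 0.7."""
    return r2 >= R2_CRITERION
```

For example, a fitted SD of 0.25° gives a width of 2 × 1.96 × 0.25 − 0.3 = 0.68°.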
Results
Multiunit activities from V1 superficial layers (∼0.5 mm in depth from the cortical surface) were recorded with the implanted microelectrode arrays. The sizes of aggregated RFs of individual recording sites ranged between 0.2° and 1.3°. Modulation of V1 responses by the salient cue stimulus in the fixation task was examined in Monkeys MH, MI, and MJ; simultaneous recordings of behavioral and neuronal data in the RT task were conducted in Monkey M7.
Cue-induced transient facilitation of V1 responses in the fixation task
For the physiological experiment (Fig. 1C), we defined the cue-on and cue-away conditions as those in which the cue was flashed around the recorded RFs or away in the opposite visual field, respectively. These two conditions were randomly mixed in a block of trials, each condition for 50 trials in a typical recording session. At 150 ms cue-to-probe SOA and 10% probe grating contrast (with a mean luminance of 20.62 cd/m2 against the CRT background of 4.12 cd/m2), the cue stimulus induced a transient enhancement of V1 responses to the center probe stimulus in the cue-on condition relative to the cue-away condition (Fig. 2A–C). This facilitation was delayed by ∼40 ms relative to the onset of neuronal response to the probe, or ∼80 ms after the probe onset, and lasted ∼100 ms.
To quantify the cue-induced response modulations, we divided the peristimulus time histogram (PSTH) into three consecutive time windows: 30–80, 80–180, and 180–400 ms after probe onset (Fig. 2A–C). These time windows, which were determined by eye based on rather consistent population data across the 3 animals, corresponded to the initial burst of neural responses to the probe, the transient period containing the noticeable cue-induced response enhancement, and the sustained response period afterward.
On average, the initial burst of neuronal responses to the probe (30–80 ms) was strong and little affected by the preceding cue: the PSTHs for the cue-on and cue-away conditions were nearly superimposed onto each other (Fig. 2A–C), and the mean cue-induced modulation, defined as the percentage of response change in the cue-on relative to the cue-away condition, was negligible (Fig. 2D). However, within the following time window (80–180 ms), a clear facilitatory modulation was seen in the population responses of all 3 animals (Fig. 2A–C), with a mean enhancement of 20%–50% (Fig. 2D). This enhancement, referred to as the cueing effect, was observed in most of the recording sites (Fig. 2E). The cueing effect was transient; it disappeared in the remaining part of sustained neuronal responses (180–400 ms). Therefore, the cue-induced facilitation was small within the entire period of neuronal responses (30–400 ms): 6.2% increment in the cue-on relative to the cue-away condition averaged across all recording sites from the 3 animals.
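The modulation measure defined above, the percentage of response change in the cue-on relative to the cue-away condition within each time window, can be sketched as follows. The PSTH layout is an assumption made for illustration: a list of firing rates whose first bin starts at probe onset, with a bin width that divides the window edges evenly.

```python
# Analysis windows relative to probe onset (ms), as defined in the text.
WINDOWS_MS = {"initial": (30, 80), "transient": (80, 180), "sustained": (180, 400)}


def window_mean(psth, bin_ms, start_ms, stop_ms):
    """Mean rate over bins whose start time lies in [start_ms, stop_ms).

    Assumes psth[0] begins at probe onset (0 ms), every bin is bin_ms wide,
    and the window edges are multiples of bin_ms.
    """
    values = psth[start_ms // bin_ms:stop_ms // bin_ms]
    return sum(values) / len(values)


def modulation_percent(psth_on, psth_away, bin_ms=10):
    """Percentage response change, cue-on relative to cue-away, per window."""
    result = {}
    for name, (start, stop) in WINDOWS_MS.items():
        on = window_mean(psth_on, bin_ms, start, stop)
        away = window_mean(psth_away, bin_ms, start, stop)
        result[name] = 100.0 * (on - away) / away
    return result
```

For instance, a cue-away rate of 20 spikes/s and a cue-on rate of 26 spikes/s within 80–180 ms correspond to a 30% modulation, within the 20%–50% range reported for the transient window.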
Because the cue-induced response modulations in V1 during the fixation task were qualitatively similar for all 3 animals (Fig. 2), we pooled the recording sites in many subsequent analyses.
The physiological cueing effect was observed by comparing the cue-on with the cue-away condition, analogous to comparing the behavioral difference between the valid-trial and invalid-trial conditions in the human psychological counterpart. Human studies have shown that multiple cues presented simultaneously at multiple locations drastically reduced or even eliminated the behavioral cueing effect (Posner and Cohen, 1984; Jingling et al., 2012). If the cueing effect in V1 is related to the psychological cueing effect rather than physical stimulation of V1 RFs by the cue per se, it should also become smaller when two identical cues are simultaneously presented at the two placeholder locations. We tested this cue-both condition, which was randomly interleaved with cue-on and cue-away conditions, in one animal (Monkey MH, Fig. 3). The cue-on and cue-both conditions evoked similar neural responses up to ∼80 ms after the probe onset, including the responses to the cue (−120 to −50 ms) as well as the initial responses to the probe (30–80 ms). However, only the cue-on but not the cue-both condition produced significant facilitatory modulation (relative to the cue-away condition) between 80 and 180 ms (Fig. 3B, right panel), even though identical stimuli were presented on the RF side. This suggests that the delayed, transient facilitation within this time window is unlikely due to a direct activation of the local RFs by the cue ring itself, but that the cueing effect is generated by a global mechanism that distinguishes between the cue-on and cue-both conditions. A deployment of exogenous spatial attention would be consistent with such a global mechanism. Similar results were also observed in another monkey (M7) performing the RT task instead of the simple fixation task (see data presented later).
SOA and contrast dependency of the cueing effect in V1 in the fixation task
It is well known that the behavioral effects induced by exogenous cueing in humans are dependent on the cue-to-probe SOA. We examined whether the physiological cueing effect that we observed in V1 also had a similar dependence. By randomly interleaving trials with SOA = 60, 120, 180, 240, 400, and 600 ms (Fig. 4A), we found that the strength of the cue-induced facilitation of V1 responses decreased with increasing SOA. This transient cueing effect was confined within a rather constant time window relative to the probe onset (∼80–180 ms) for SOAs ranging from 60 to 240 ms, and disappeared for larger SOAs. At a very short SOA of 60 ms, when the probe immediately followed the briefly flashed cue, the initial burst of neuronal responses to the probe (30–80 ms) was markedly suppressed in the cue-on relative to cue-away condition, suggesting a fast contextual inhibition by the ring-shaped cue stimulus; nevertheless, the cueing effect at 80–180 ms was still evident. These results are more clearly seen in Figure 4B, where the cue-induced modulations are plotted as functions of SOA.
The SOA dependence of the physiological cueing effect in V1, which was seen in the simple fixation task with minimal top-down interferences, is similar to the behavioral cueing effects reported in humans (Posner and Cohen, 1984; Müller and Rabbitt, 1989; Nakayama and Mackeben, 1989).
Awake monkey studies have shown a complex interaction between top-down attentional modulation and stimulus contrast in visual cortical areas (Reynolds et al., 2000; Martínez-Trujillo and Treue, 2002; Williford and Maunsell, 2006; Thiele et al., 2009). We next examined whether the cueing effect in V1 was also contrast dependent. By randomly interleaving trials of different probe contrasts (at zero contrast, the probe appeared as a homogeneous luminance disk), we found that the cueing effect within the time window of 80–180 ms decreased rapidly with increasing probe contrast (Fig. 5).
Similar dependencies of the V1 cueing effect on SOA and probe contrast were also observed in Monkey M7 performing the RT task (see results presented later).
Dependence of the cueing effect in V1 on the novelty of the cue in the fixation task
In the first animal (Monkey MH) used in the current study, after the aforementioned SOA and contrast tests (Figs. 4, 5), we noticed that an identical white cue with a fixed cue-to-probe SOA (150 ms) could reliably elicit the cueing effect in V1 within a small block of probing trials. However, if we repeated these probing tests with identical stimulus settings, there was a rapid reduction of the cueing effect (Fig. 6A, left group of empty bars). Because the cue and probe stimuli were irrelevant to, and even distracting for, the fixation task during V1 recordings, we speculated that the animals could learn to suppress them after repeated exposures, and that once the stimuli became familiar, they would summon less exogenous attention. We repeated similar tests with small blocks of probing trials in the other two animals (Monkeys MI and MJ) in the fixation task and observed consistent trends of habituation (Fig. 6B,C, left group of empty bars). We also noticed that, when the small blocks of identical trials were separated by other experiments with mixed trial conditions, such as those for the SOA and contrast tests, the habituation of the V1 cueing effect was much slower (Fig. 6B, left empty bars).
Interestingly, the cueing effect in V1 that had disappeared was revived immediately after we changed the color of the cue in Monkey MH (Fig. 6A, cue 2–4), further suggesting the dependence of the cueing effect on the novelty of the cue. Similar phenomena were observed in the other two animals (Fig. 6B, cue 2–4; Fig. 6C, cue 2). The revived cueing effect in V1 was also subject to habituation if the new cue was repeatedly presented (Fig. 6A, cue 2; Fig. 6B, cue 4). Furthermore, when we switched back to the very first cue (cue 1, white) after testing with novel cues, a similar revival of the V1 cueing effect was seen (Fig. 6A,C, the right group of empty bars). In one monkey (Monkey MJ), we repeated the small blocks of probing trials with the original white cue, and again we observed a rapid decay of the cueing effect (Fig. 6C, empty bars on the right).
The series of consistent observations above (summarized in Fig. 6D) indicates that, in the absence of top-down influences other than a task demand to maintain fixation, the transient facilitation of neuronal responses in V1 is induced by novel stimuli that can capture exogenous attention.
Cue-induced modulations in a detection RT task
To examine whether our stimulus paradigm could evoke a corresponding behavioral effect, we trained another monkey (M7) in an RT task to respond to the probe onset (for more details, see Materials and Methods). Behavioral RTs and V1 neuronal responses were simultaneously recorded when the monkey performed the task.
The first experiment examined the effect of cue-to-probe SOA by randomly interleaving different SOAs within the same block of trials. Similar to that in the fixating monkeys, the cueing effect in Monkey M7's V1 (Fig. 7A,B) was substantial for small SOAs (<200 ms) and diminished for larger SOAs (compare with Fig. 4). Furthermore, this physiological cueing effect paralleled a behavioral one (compare Fig. 7B with Fig. 7D). At short SOAs, a significant reduction in RT was observed for the probe at the cued location (i.e., valid-cue) relative to the uncued location (i.e., invalid-cue) (Fig. 7C,D).
In the second experiment, we examined whether the cueing effects depended on the contrast of the probe stimulus. Here we chose a cue-to-probe SOA of 150 ms, at which the cueing effect was around its peak in the previous experiment (Fig. 7B,D); however, to prevent the animal from simply responding to the cue onset over repeated trials with a single SOA, we mixed in each session 40%–50% catch trials with 600 ms SOA, which was significantly longer than the typical RTs of the animal. Similar to the fixating monkeys, Monkey M7's cueing effect in V1 decreased rapidly with increasing contrast of the probe gratings (Fig. 8A,B, compare with Fig. 5); this physiological cueing effect also paralleled the animal's behavioral performance (compare Fig. 8B with Fig. 8D).
The largest cueing effect was seen at 5% probe contrast in both V1 responses and the animal's behavior. From the PSTH at this low contrast (Fig. 8A, top left panel), one may argue that the cueing effect could simply reflect the cue-evoked responses per se. To exclude this possibility, we conducted, in a separate session, the single- versus double-cue experiment on Monkey M7 under the detection task condition (Fig. 9). To maximize the cueing effect, we chose 150 ms cue-to-probe SOA and 5% probe contrast according to the results shown in Figures 7 and 8. Again, in this experiment, we mixed 50% catch trials with an SOA of 600 ms to force the animal to respond to the probe rather than the cue. Similar to our observations in the fixating monkeys (Fig. 3), even though the physical stimuli were identical in the vicinity of the recorded RFs, the V1 cueing effect was significantly larger in the cue-on than the cue-both condition (Fig. 9A,B). Consistent with the V1 modulations, Monkey M7's shortest RT to detect the probe was in the valid, single-cue condition, whereas there was no significant difference between the RTs for the invalid-cue and both-cue conditions (Fig. 9C).
We verified that the monkey seldom released the lever prematurely (before the probe onset) even for SOAs longer than the average RT, indicating that the monkey indeed responded to the probe rather than the cue. In particular, in the experiment examining the effects of cue-to-probe SOAs (Fig. 7), the percentage of lever releases that were within 800 ms after the probe onset was between 93.4% and 96.3% for every SOA including 600 ms and was 94.1% for no-cue trials in which the probe was presented without a preceding cue.
Analyses of eye tracking data excluded the possibility that the transient facilitation of V1 responses in the cue-on relative to the cue-away condition could result from eye movements reflexively locked to the onset of cue or probe. We calculated the mean eye traces and mean eye positional jitters averaged across the cue-on and cue-away trials, respectively, from the cue onset at −150 ms until 180 ms after the probe onset. For each monkey, considering all the recording sessions with SOA = 150 ms, the magnitudes of mean cueing effect (within 80–180 ms) were not significantly correlated (linear regression, test for significance of correlation coefficient, p > 0.05) with the differences of mean eye traces (or mean eye position jitters) between the cue-on and cue-away trials. Moreover, for all monkeys, the mean eye traces did not deviate systematically across trials by >0.05° at any time within −150 to 180 ms.
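The correlation check described above can be sketched as follows. The text specifies only a linear regression with a significance test on the correlation coefficient, so the details here are assumptions: one data point per recording session, a Pearson coefficient, and the standard t statistic with n − 2 degrees of freedom. The p-value lookup is left to a statistics table or package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))
```

Here `x` would hold the per-session cue-on minus cue-away difference in mean eye position (or positional jitter) and `y` the per-session cueing effect within 80–180 ms; a t value below the critical value for n − 2 degrees of freedom corresponds to the reported p > 0.05.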
Discussion
An abrupt onset of a stimulus is known to capture exogenous attention, which facilitates subsequent processing at its location. We showed that flashing a task-irrelevant cue during fixation induced a transient enhancement of V1 responses to a subsequent low-contrast probe at the cued location, and that such a cue in the detection task was able to shorten the RTs to the probe. The physiological and behavioral cueing effects had a similar range of effective cue-to-probe SOAs, resembling exogenous attentional effects on human perception (Posner and Cohen, 1984; Müller and Rabbitt, 1989; Nakayama and Mackeben, 1989).
Modulation of V1 responses by exogenous attention
The cue-induced facilitation in V1 could not be simply due to the cue-evoked responses for several reasons. First, a single cue near the RFs, compared with when an additional cue was simultaneously flashed in the opposite visual field, produced significantly stronger facilitation (Figs. 3, 9), even though the RFs were stimulated by identical stimuli. Second, in the fixation task, the cue had little effect on the initial V1 responses to the probe for SOA ≥ 120 ms and even suppressed these responses for SOA = 60 ms while facilitating a later response component within a rather constant time window relative to probe onset (Fig. 4). Third, the cueing effect required the cue to be novel if the stimuli were task irrelevant (Fig. 6).
The cueing effect in V1 during the fixation task was locked to the probe onset; the cue-to-probe SOA mainly affected its strength. We speculate that exogenous attention is triggered by the cue, but its manifestation in V1 is mediated by a subsequent stimulus presented within an effective time window, making the modulatory effect contingent on the probe stimulus. Interestingly, in the detection task, the cue-induced enhancement of V1 responses was not that transient (Fig. 9A), probably due to top-down modulatory effects imposed by the detection task; nevertheless, the enhancement still depended on the SOA in a way similar to that in the fixating monkeys and similar to the SOA dependence of the animal's RT.
Selective attention interacts with stimulus contrast through complex gain-control mechanisms, such as contrast gain that shifts the dynamic range of neuron's contrast-response function (Reynolds et al., 2000; Martínez-Trujillo and Treue, 2002); response gain that scales the contrast-response function (Williford and Maunsell, 2006); and additive gain that is largely contrast independent (Thiele et al., 2009; Pooresmaeili et al., 2010). The V1 cueing effect that we observed was much stronger at low probe contrast (Figs. 5, 8), suggesting that it serves contrast-gain control. This is consistent with human behavior, in which attention captured reflexively by a cue boosts the apparent contrast of low-contrast gratings (Carrasco et al., 2004). Computationally (Zhaoping, 2014), enhanced neural responses should boost signal-to-noise or Fisher information in neural decoding for visual tasks, thereby improving task performance.
The temporal dynamics of the cueing effect observed in the fixation task, with minimal top-down confounding factors, are distinct from those reported in top-down attentional effects. The cue-induced V1 facilitation was transient, whereas top-down influences can last throughout the entire period of neuronal responses, as is seen in V1 (Li W, et al., 2004; Khayat et al., 2006; Thiele et al., 2009; Chalk et al., 2010; Pooresmaeili et al., 2010) and higher-order cortical areas (Treue and Martínez-Trujillo, 1999; Reynolds et al., 2000; Martínez-Trujillo and Treue, 2002; Williford and Maunsell, 2006). Moreover, the onset of the V1 cueing effect relative to probe onset (∼80 ms) is much earlier than the time required for a back propagation of top-down attentional signals from higher-order areas (Buffalo et al., 2010; Poort et al., 2012). The shorter latency and transient nature of the V1 cueing effect are similar to an effect of exogenous attention reported in monkey cortical area MT (Busse et al., 2008): a salient distracter can rapidly and transiently disrupt sustained, goal-directed modulation of MT neuronal responses.
The time course of cue-induced facilitation of V1 neuronal responses is comparable with that seen in human event-related-potential studies. Attention captured by a cue does not affect the earliest event-related-potential component to a probe stimulus but modulates the later P1 component (110–140 ms in Fu et al., 2005; 120–150 ms in Hopfinger and West, 2006). Similarly, we showed that, in fixating monkeys, V1 responses to the probe were unaffected by the cue in the initial burst (∼30–80 ms after probe onset) but were facilitated afterward (∼80–180 ms).
The exogenous cueing effects on human behavior may be affected by the nature of the concurrent behavioral task, for example, whether the task is detection or discrimination and whether the task is easy or difficult. However, the dependence of these effects on cue-to-probe SOAs is similar across different tasks (Müller and Rabbitt, 1989; Nakayama and Mackeben, 1989; Berger et al., 2005; Wilschut et al., 2011), suggesting a common modulatory mechanism. This is in line with the similarity between the V1 cueing effects observed in the fixation and detection tasks in terms of their dependencies on SOA and probe contrast. However, this similarity merely suggests an association of the underlying brain network for exogenous attentional influence rather than a direct causality from the V1 physiology to the RT behavior. Indeed, at high probe contrasts, there is a discrepancy between an insignificant cueing effect in V1 and a still robust cueing effect in RT (Fig. 8B,D).
Bottom-up and top-down interactions
Our results support the view that both physical saliency and behavioral relevance matter for attention capture (Corbetta and Shulman, 2002; Burnham, 2007). An attention-grabbing stimulus and its influence, if behaviorally irrelevant and distracting to an ongoing task (e.g., fixation), could be suppressed after frequent exposures. In contrast, if a salient stimulus is associated with a subsequent task-relevant stimulus, the exogenous attentional effect could remain. It has been suggested that the parietofrontal network implements a priority map by incorporating both bottom-up saliencies of stimuli and their behavioral relevance to guide top-down selection of targets (Bisley and Goldberg, 2003; Ipata et al., 2006; Bisley, 2011; Katsuki and Constantinidis, 2012). This priority map may shape V1 responses via feedback modulations. In monkeys performing an RT task, neural correlates of exogenous cueing effects observed in the superior colliculus are also subject to top-down influences because they change with the predictability of upcoming stimulus locations (Fecteau et al., 2004; Fecteau and Munoz, 2005).
Considering the complex interactions between bottom-up and top-down factors in attentional capture, a simple fixation task, which does not require detection or discrimination of the probe stimulus after the cue, helps to minimize top-down contamination of the exogenous effects of interest. Indeed, after introducing the RT task, we observed the following new features of V1 responses. First, the cueing effects did not disappear after 1 month of extensive training in the RT task using the same cue. Second, the cue-induced enhancement in V1 was no longer delayed and transient but could start from the beginning of neuronal responses and remain until the animal responded to the probe stimulus. These new features are likely due to an interaction between the exogenous attentional effects evoked by the cue and the top-down influences imposed by the RT task. Nevertheless, the V1 modulatory effects in the fixation and detection tasks depended similarly on the cue-to-probe SOA, and this dependency resembles the SOA dependence of the animal's detection RT.
Exogenous attention captured by an oddball stimulus among others is also subject to the influences of top-down attention and past experience; for instance, it can be suppressed by focusing attention elsewhere (Joseph et al., 1997; Belopolsky and Theeuwes, 2010) or enhanced by detection training (Sireteanu and Rettenbach, 1995; Sigman and Gilbert, 2000). Neural correlates of these effects have been observed as early as V1 in human imaging (Sigman et al., 2005) and monkey electrophysiological (Lee et al., 2002) studies. In particular, after monkeys are trained in an oddball (Lee et al., 2002) or contour (Li W et al., 2008; Yan et al., 2014) detection task, a late response component, closely correlated with the perceptual saliency of the stimulus, emerges in V1.
Converging evidence supports the idea that even V1, the earliest stage of visual cortical processing, serves as an adaptive processor for more efficient processing of task-relevant and familiar stimuli (Haenny and Schiller, 1988; Roelfsema et al., 1998; Li W et al., 2004, 2008; McManus et al., 2011; Gilbert and Li, 2013; Yan et al., 2014; Poort et al., 2015). The current study revealed another aspect of the neural dynamics: modulation of V1 responses by task-irrelevant but novel and salient stimuli. This exogenous cueing effect in V1 is not surprising given the accumulated evidence for a bottom-up saliency map in V1 to guide exogenous attention (Li Z, 2002; Zhaoping, 2008; Zhang et al., 2012). It is then natural to speculate that the effect of exogenous attention guided by this saliency map could be exerted as early as V1, rather than being postponed, less economically, to higher visual areas. Future studies need to explore the origin of the cueing effect in V1, its consequences for visual information encoding and decoding, and its interaction with top-down influences.
Footnotes
This work was supported by the National Key Basic Research Program of China Grant 2014CB846101 to W.L., the National Natural Science Foundation of China Grants 91432102 and 31125014 to W.L. and 31500851 to Y.Y., the Gatsby Charitable Foundation to L.Z., and the 111 Project B07008. We thank Xibin Xu for technical assistance.
The authors declare no competing financial interests.
Correspondence should be addressed to either of the following: Dr. Wu Li, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, #19 Xinjiekouwai Street, Beijing 100875, China, liwu{at}bnu.edu.cn; or Dr. Li Zhaoping, Department of Computer Science, University College London, London WC1E 6BT, UK, z.li{at}ucl.ac.uk