Introduction
What we perceive depends critically on where we direct our attention. For example, attention to a location dramatically improves the accuracy and speed of detecting a target at that location. Attention has been shown not only to increase perceptual sensitivity for target discrimination but also to reduce the interference caused by nearby distracters. Moreover, attention is highly flexible and can be deployed in a manner that best serves the organism's momentary behavioral goals: to locations, to visual features, or to objects. Attention can also be based on internal goals (e.g., finding a familiar face in the crowd) or depend on the external environment (e.g., as when a loud alarm sounds).
A large body of evidence exists from both neurophysiology and functional magnetic resonance imaging (fMRI) concerning the neural mechanisms subserving attention. In this review, we focus on the parallels and differences that exist between findings revealed by these two techniques. We also highlight important contributions to the field of attention that have been brought about by fMRI studies. We discuss the effects of attention on activity within areas of visual cortex, as well as the evidence for source areas that exert top-down attentional control and are thought to provide the biasing signals observed in target areas in visual cortex.
In general, in linking single-cell physiology and fMRI studies, it is assumed that the effects observed with fMRI directly reflect the summed responses of large populations of neurons. However, recent studies, including the seminal work by Logothetis and colleagues (Logothetis et al., 2001; Logothetis, 2003), have shown that the mapping between these two domains is not so simple. It has been suggested that blood oxygenation level-dependent (BOLD) signals reflect the incoming inputs into an area as well as the processing of this input information by the local cortical circuitry, including excitatory and inhibitory interneurons. Thus, in some cases fMRI may reveal significant activation that may have no counterpart in single-cell physiology. In such cases, local field potentials (LFPs), which also reflect both input signals and local intracortical processing, appear to correlate better with fMRI signals. Interestingly, the acquisition of LFPs by physiologists is becoming more prevalent (Fries et al., 2001) and may help bridge the gap between results from single-cell and fMRI studies.
The effect of attention on sensory processing
Visual attention has been studied extensively in single-cell recording experiments in monkeys. These experiments have shown that attention affects activity in areas of the brain that process stimulus features, such as color, motion, texture, and shape. In a typical experiment, two conditions are compared: an attended condition, in which the monkey focuses attention on a visual stimulus that is placed within the receptive field (RF) of a cell, and an unattended condition, in which the same visual stimulation conditions are present but the monkey focuses attention elsewhere in the visual field (e.g., at a fixation spot outside the RF). The paradigmatic finding is that when attention is directed to a single stimulus in the RF, there is an increase in the firing rate of neurons that respond to the attended stimulus (Motter, 1993, 1994). For instance, Reynolds et al. (2000) showed that a faint grating did not elicit a response in area V4 when it was ignored but elicited clear responses when it was attended. These studies reveal that attention increases the activity evoked by a stimulus at a given location, relative to that at an unattended location.
Neuroimaging studies of attention have also shown that attended locations are favored. Heinze and colleagues (Heinze et al., 1994) demonstrated with positron emission tomography (PET) that spatial attention leads to stronger activations in extrastriate cortex in the hemisphere contralateral to the attended side of a stimulus array spanning both hemifields. With the advent of fMRI and its greater spatial resolution, it has been shown that spatial attention exhibits fine retinotopy (Tootell et al., 1998; Brefczynski and DeYoe, 1999). In fact, the cortical topography of purely attention-driven activity precisely matches the topography of activity evoked by visual targets as revealed by retinotopic mapping.
Although the neuroimaging studies of spatially directed attention were driven primarily by findings from single-cell physiology, our understanding of how attention can be applied to visual features, such as color or motion, has been mainly advanced by imaging research. In a seminal paper, Corbetta et al. (1991) used PET to identify the neural systems involved in discriminating the shape, color, and speed of a visual stimulus. Their results demonstrated that selective attention to different features modulates activity in distinct regions of extrastriate cortex that are specialized for processing the selected feature. The results by Corbetta and colleagues (1991) cannot be solely accounted for in terms of spatial attention because all of the visual features were present in the same locations and were properties of the same objects. Thus, their results indicate that attention can directly affect the selection of specific visual features from spatial locations (Beauchamp et al., 1997; Clark et al., 1997; Wojciulik et al., 1998; Saenz et al., 2002). Interestingly, attention to features has only recently been investigated at the single-cell level (Treue and Martinez-Trujillo, 1999).
The physiology literature has reported attentional modulation of neuronal responses in many extrastriate cortical areas, including V2, V4, temporal–occipital area (TEO), and middle temporal area (MT). Relatively few reports, however, provide evidence that attentional modulation occurs in the primary visual cortex (V1). A similar picture also emerges from the literature on event-related potentials (ERPs). In sharp contrast, several recent fMRI studies have reported robust effects of attention on V1 responses, demonstrating that attentional selection operates very early in the visual pathway. Why are these effects commonly observed with fMRI but not with other techniques? Recent studies that have combined the high spatial resolution of fMRI with the high temporal resolution of EEG and magnetoencephalography (Noesselt et al., 2002) have demonstrated that the effects of attention on V1 activity do not take place during the initial stimulus-related response (∼60–90 msec). Instead, longer-latency activity (in the time range of 150–250 msec) was found to be strongly modulated by attention. Therefore, it appears that V1 is “reactivated” in the 150–250 msec time range after a stimulus within the focus of spatial attention (Martinez et al., 1999). Because the temporal resolution of fMRI is low (typically on the order of seconds), BOLD signals appear to reflect both initial feedforward processes and longer-latency feedback influences from other areas. For this reason, modulations of V1 activity may have been observed more reliably with fMRI. Interestingly, a recent fMRI study has shown that attention also modulates responses in the lateral geniculate nucleus of the thalamus in humans (O'Connor et al., 2002), contrary to previous single-cell studies in monkeys. 
Again, such discrepancies may be attributable to the relatively long latency of the attentional effect or the fact that functional imaging signals may preferentially reflect changes in the underlying neurophysiology other than spiking (Logothetis, 2003; Lauritzen and Gold, 2003).
Competition and the effects of attention
Although neural effects of attention are observed when single objects are viewed, the full expression of attention is perhaps best revealed in more challenging situations. In general, because the processing capacity of the visual system is limited, selective attention to one part of the visual field comes at the cost of neglecting other parts (Broadbent, 1958). Thus, several investigators have proposed that there is competition for neural resources (Grossberg, 1980; Bundesen, 1990; Desimone and Duncan, 1995). One instance of this proposal is the biased competition model of attention (Desimone and Duncan, 1995), according to which the competition among stimuli for neural representation, occurring within visual cortex itself, can be biased in several ways. One way is by bottom-up sensory-driven mechanisms, such as stimulus salience. For example, stimuli that are colorful or of high contrast will be at a competitive advantage. Another way is by attentional top-down feedback, which is generated in areas outside the visual cortex. For example, directed attention to a particular location in space facilitates processing of stimuli presented at that location. Stimuli surviving the competition for neural representation will have further access to memory systems for mnemonic encoding and to motor systems for guiding behavior.
Evidence for neural competition and how competition is biased by attention comes from single-cell studies in monkeys. It has been shown, for example, that in the absence of attention the neuronal response to a single effective stimulus in area V4 is reduced when an additional, ineffective stimulus is present in the RF (Reynolds et al., 1999). The reduced response to the paired stimuli suggests that the two stimuli within the RF interact with each other in a mutually suppressive way (Fig. 1A). Such correlates of competitive interactions have been observed in many visual processing areas of the ventral and dorsal processing streams (Moran and Desimone, 1985; Miller et al., 1993; Rolls and Tovee, 1995; Recanzone et al., 1997).
In human cortex, fMRI studies have revealed similar competitive interactions (Kastner et al., 1998, 1999, 2001). Complex, colorful visual stimuli were presented eccentrically in four nearby locations of the upper right quadrant of the visual field while subjects maintained fixation (Kastner et al., 1998). Because subjects counted the occurrence of T's or L's at fixation, the visual stimuli were unattended. The stimuli were presented under two different presentation conditions: sequential and simultaneous. In the sequential presentation condition, a single stimulus appeared in one of the four locations, then another appeared in a different location, and so on. In the simultaneous presentation condition, the same four stimuli appeared in the same four locations, but they were presented together. Thus, integrated over time, the physical stimulation parameters were identical in each of the four locations in the two presentation conditions. However, sensory suppression among stimuli within RFs could take place only in the simultaneous and not in the sequential presentation condition. Consistent with the results in monkeys, simultaneous presentations evoked weaker responses than sequential presentations in all activated visual areas, which included V1, V2, V4, TEO, V3A, and the MT complex (hereafter called area MT) (Fig. 1C). Importantly, the difference in activations between sequential and simultaneous presentations was smallest in V1 and increased in magnitude toward ventral extrastriate areas V4 and TEO and dorsal extrastriate areas V3A and MT. This increase in magnitude of the competitive effects across visual areas suggests that the sensory interactions were scaled to the RF size of neurons within these areas. That is, the small RFs of neurons in V1 and V2 would encompass only a small portion of the visual display, whereas the larger RFs of neurons in V4, TEO, V3A, and MT would encompass all four stimuli. 
Therefore, competitive interactions among the stimuli within RFs could take place most effectively in these more anterior extrastriate visual areas.
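The comparison between sequential and simultaneous presentations can be summarized with a normalized contrast index. The following is a minimal sketch, assuming an index of the form (sequential − simultaneous) / (sequential + simultaneous); the function name and the response values are hypothetical illustrations chosen to mimic the qualitative pattern described above, not data from the studies cited.

```python
def suppression_index(r_seq, r_sim):
    """Normalized difference between sequential and simultaneous responses.

    0 means no suppression; larger positive values mean that simultaneous
    presentation evoked a weaker response than sequential presentation.
    """
    total = r_seq + r_sim
    if total == 0:
        raise ValueError("responses sum to zero; index is undefined")
    return (r_seq - r_sim) / total


# Hypothetical percent-signal-change values (sequential, simultaneous).
# These numbers are made up for illustration only.
hypothetical_responses = {
    "V1": (1.00, 0.95),
    "V2": (1.00, 0.85),
    "V4": (1.00, 0.60),
    "TEO": (1.00, 0.55),
}

indices = {
    area: suppression_index(seq, sim)
    for area, (seq, sim) in hypothetical_responses.items()
}

for area, value in indices.items():
    print(f"{area}: suppression index = {value:.3f}")
```

With these illustrative values the index grows from V1 toward V4 and TEO, which is the ordering one would expect if suppressive interactions scale with RF size.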
Further evidence that competitive interactions are scaled to RF size was obtained in a subsequent study in which the spatial separation between the stimuli was increased. In this manner, the suppressive interactions were progressively eliminated for areas V2, V4, and TEO for increasingly larger spatial separations (Kastner et al., 2001). Thus, by systematically varying the spatial separation among the stimuli and measuring the magnitude of suppressive interactions, it was possible to obtain an estimate of average RF sizes across several areas in the human visual cortex [see also Smith et al. (2001)]. From these experiments, it was estimated that, at an eccentricity of ∼5°, RF sizes were less than 2° in V1, in the range of 2–4° in V2, and ∼6° in V4. In TEO, the RFs were larger than in V4, but still confined to a single quadrant of the contralateral hemifield. Remarkably, these estimates are very close to those recorded in single cells in the homologous monkey visual areas (Gattass et al., 1981, 1988).
Single-cell recording studies have also demonstrated that spatially directed attention can bias the competition among multiple stimuli in favor of one of the stimuli by modulating competitive interactions (Reynolds et al., 1999). In particular, in areas V2 and V4 it has been shown that spatially directed attention to an effective stimulus within the RF of a neuron counteracts the suppressive influence induced by a second, ineffective stimulus presented within the same RF (Fig. 1B). In parallel fMRI studies, it has been shown that similar mechanisms occur in the human extrastriate visual cortex. As in monkeys, the effect of spatially directed attention in humans is to reduce the suppressive effect exerted by multiple competing visual stimuli. Kastner et al. (1998) studied the effects of spatially directed attention on multiple competing visual stimuli in a variation of the paradigm used to examine competitive interactions among simultaneously presented stimuli, described above. In addition to the two different visual presentation conditions, sequential and simultaneous, two different attentional conditions were tested, such that the eccentrically presented stimuli were either unattended or attended. As before, in the unattended condition, subjects maintained fixation and counted the occurrences of T's and L's. In the attended condition, subjects maintained fixation but covertly shifted attention to the stimulus location closest to fixation. The same areas in striate and extrastriate cortex were activated during both the unattended and attended conditions, namely, V1, V2, V4, TEO, V3A, and MT. However, in the attended condition, the extent of activation increased significantly in V4, TEO, V3A, and MT. These enhanced responses were found for both sequentially and simultaneously presented stimuli. 
More importantly, however, and in accordance with the findings from monkey physiology, directed attention led to greater increases of fMRI signals to simultaneously presented stimuli than to sequentially presented stimuli (Fig. 1C). Additionally, the magnitude of the attentional effect scaled with the magnitude of the suppressive interactions among stimuli, with the strongest reduction of suppression occurring in areas V4 and TEO, suggesting that the attentional effects, like the suppression effects, scaled with RF size. These findings support the idea that directed attention enhances information processing of stimuli at the attended location by counteracting suppression induced by nearby stimuli, which compete for limited processing resources. In essence, as a result of attention, irrelevant distracting information is effectively filtered out.
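The logic of suppression and its attentional release can be illustrated with a toy weighted-average model. This is a simplified sketch in the spirit of the biased competition account, not the model fitted in any of the studies cited: the response to a stimulus pair is taken as a weighted average of the responses to each stimulus alone, and attention is assumed to raise the weight of the attended stimulus. The firing rates, the weights, and the function name are all hypothetical.

```python
def pair_response(r_eff, r_ineff, w_eff=1.0, w_ineff=1.0):
    """Weighted average of the responses to two stimuli in one RF.

    r_eff, r_ineff: responses to the effective and ineffective stimulus
                    when each is presented alone (arbitrary units).
    w_eff, w_ineff: competitive weights; attention is modeled here as an
                    increase in the weight of the attended stimulus.
    """
    return (w_eff * r_eff + w_ineff * r_ineff) / (w_eff + w_ineff)


# Hypothetical single-stimulus responses (spikes/s), for illustration only.
R_EFFECTIVE = 40.0
R_INEFFECTIVE = 10.0
ATTENTION_GAIN = 5.0  # assumed weight of an attended stimulus

unattended = pair_response(R_EFFECTIVE, R_INEFFECTIVE)
attend_effective = pair_response(R_EFFECTIVE, R_INEFFECTIVE, w_eff=ATTENTION_GAIN)
attend_ineffective = pair_response(R_EFFECTIVE, R_INEFFECTIVE, w_ineff=ATTENTION_GAIN)

print(f"pair, unattended:         {unattended:.1f}")
print(f"pair, attend effective:   {attend_effective:.1f}")
print(f"pair, attend ineffective: {attend_ineffective:.1f}")
```

In this sketch the unattended pair response falls between the two single-stimulus responses (suppression relative to the effective stimulus alone), attending to the effective stimulus moves the pair response back toward the effective-alone response, and attending to the ineffective stimulus moves it toward the ineffective-alone response. As the assumed gain grows, the pair response approaches the response to the attended stimulus presented alone.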
It therefore appears that, at the neural level, an important consequence of attention is to enhance the influence of behaviorally relevant stimuli at the expense of irrelevant ones, providing a mechanism for the filtering of distracting information in cluttered visual scenes. In all, the predictions from single-cell studies were confirmed in the fMRI investigations with humans, revealing similar types of interactions and effects of attention in monkey and human cortex. But what is the source of these attentional effects? It is here that fMRI has expanded our knowledge of attentional processes beyond that provided by single-cell studies by revealing brain regions that appear to control attention, i.e., generate signals that modulate activity in visual processing areas. We now turn to this topic.
Attentional control areas
A major advance in the study of attention comes from fMRI studies that have revealed a distributed system of brain regions that control attention. As reviewed above, neuroimaging studies of visual attention have helped to elucidate how the processing of visual information is enhanced for attended compared with unattended information. In addition to examining activations within visual cortex, it has proven informative also to examine whether other brain regions are routinely recruited by attentional tasks. From single-cell physiology, one might anticipate that at least parietal cortex would be involved in attentional control (for review, see Goldberg et al., 2002). Interestingly, frontal and parietal regions, including areas in the superior parietal lobule (SPL), the intraparietal sulcus (IPS), the frontal eye field (FEF), and the supplementary eye field (SEF) have been consistently activated in various tasks involving spatially directed attention (Corbetta et al., 1993; Fink et al., 1997; Nobre et al., 1997; Vandenberghe et al., 1997; Corbetta et al., 1998; Culham et al., 1998; Gitelman et al., 1999; Kim et al., 1999; Rosen et al., 1999). In addition, but less consistently, activations in the lateral prefrontal cortex in the region of the middle frontal gyrus (MFG) and the anterior cingulate cortex (ACC) have also been observed. A common feature among the visuospatial tasks in the experiments revealing such a frontoparietal network of regions is that subjects were asked to maintain fixation at a central fixation point and to direct attention covertly to peripheral target locations to detect a stimulus (Corbetta et al., 1993, 1998; Nobre et al., 1997; Gitelman et al., 1999; Kim et al., 1999; Rosen et al., 1999), to discriminate it (Fink et al., 1997; Vandenberghe et al., 1997; Kastner et al., 1999), or to track its movement (Culham et al., 1998). 
Thus, there appears to be a general spatial attention network that operates independently of the specific requirements of the visuospatial task.
Imaging results revealing a frontoparietal network of regions involved in attention were presaged by neuropsychological research with patients with brain lesions (Mesulam, 1981; Posner and Petersen, 1990). Visuospatial neglect (or “extinction”), which involves a disruption of spatial attention in the contralesional visual hemifield, may follow unilateral lesions at very different sites, including the parietal lobe, especially its inferior part and the temporoparietal junction (Vallar and Perani, 1987), regions of the frontal lobe (Heilman and Valenstein, 1972; Damasio et al., 1980), and the anterior cingulate cortex (Janer and Pardo, 1991). Lesions of prefrontal regions disrupt at least three types of interrelated attentional processes (Stuss and Knight, 2002): task switching, filtering out of irrelevant information, and gating sensory signals. At the same time, bilateral parietal damage may result in Balint's syndrome (Rafal, 1997), a neurological disorder with symptoms that include severe problems initiating and executing saccades (psychic paralysis of gaze) and a tendency to be unaware of more than one visual target at a time (simultanagnosia). Thus, studies of patients with brain damage suggested the existence of a distributed network of higher-order areas in frontal and parietal cortex that are important for attentional processing. However, because lesions are usually large and it is often difficult to determine the sites responsible for the impairments, the precise delineation of the attentional network primarily comes from the results of fMRI studies. Interestingly, parietal and frontal regions implicated in attentional control overlap those activated in eye-movement tasks, suggesting that the associated mechanisms invoke similar neural systems (Corbetta et al., 1998; Beauchamp et al., 2001).
Although activations outside of visual cortex in attention tasks have been indicative of regions involved in attentional control, in many of these studies it was not possible to separate signals associated with visual cues that prime the subject to expect potential subsequent visual targets from signals associated with the attended targets themselves. This was because cues and targets typically follow each other in rapid succession. More recent neuroimaging studies, however, have attempted to explicitly investigate top-down modulation in attentional paradigms by disentangling cue- and target-related activity by introducing, for instance, a longer interval between the two (Kastner et al., 1999; Hopfinger et al., 2000). In this way, the effects of attention in the presence and absence of visual stimulation can be assessed. The reasoning is that purely target-related activity should be observed in visual processing areas that respond to the specific stimulus (e.g., area MT to moving stimuli). By contrast, expectation- or cue-related activity that is uncontaminated by ensuing target-related activity should reflect mainly “top-down” signals and be observed in regions of the brain that control attention.
Attentional biasing signals in the human visual cortex in the absence of visual stimulation were investigated by adding a third experimental condition to the design used to study competitive interactions and their modulation by attention described above (Kastner et al., 1999). In addition to the two visual presentation conditions, sequential and simultaneous, and the two attentional conditions, unattended and attended, an expectation period preceding the attended presentations was introduced. The expectation period was initiated by a marker (the cue) presented briefly next to the fixation point 11 sec before the onset of the stimuli. At the appearance of the marker, subjects covertly shifted attention to the peripheral target location in anticipation of a target stimulus that would appear there. In this way, the effects of attention in the absence and presence of visual stimulation could be identified. Directed attention in the absence of visual stimulation activated the same distributed network of areas as directed attention in the presence of visual stimulation and consisted of the FEF, the SEF, and the SPL. The increase in activity in these frontal and parietal areas during the expectation period (in the absence of visual input) was sustained throughout the expectation period and the attended visual presentations. These results suggest that the activity reflected the attentional operations of the task per se and not the effects of attention on visual processing.
Converging evidence for a frontoparietal network of regions involved in attentional control comes from additional imaging studies. For example, by using a spatial attention Posner-type task, Corbetta et al. (2000) showed that the IPS was uniquely active when attention was directed toward and maintained at a relevant location (preceding target presentation), suggesting that the IPS is a top-down source of biasing signals observed in visual cortex. When studying attention to motion, the same investigators (Shulman et al., 1999) found cue-related activity in the FEF, as well as in several sites in the IPS. Finally, Hopfinger et al. (2000) obtained evidence for a wider attentional control network, including the superior frontal gyrus, mid-frontal gyrus, SPL, and IPS, as well as superior temporal gyrus. This latter region has been identified in lesion studies as a key site responsible for neglect (Karnath et al., 2001). Interestingly, a recent fMRI study indicates that the top-down control of attention to visual features draws on a network of areas that essentially overlaps with the one revealed by spatial attention tasks (Giesbrecht and Mangun, 2003; Giesbrecht et al., 2003). This raises the possibility that the control of attention may rely on a common network of brain regions, regardless of the attribute attended [see also Vandenberghe et al. (2001)].
Recent event-related fMRI studies are also beginning to clarify the dynamics of attentional control. In one study, Yantis et al. (2002) showed that shifting the location of attention produces a rapid, transient increase in SPL activation. Such transient activity in SPL was also observed when subjects viewed a stream of spatially superimposed houses and faces and switched their attention from the house stream to the face stream or vice versa (Yantis and Serences, 2003; J. T. Serences, J. Schwarzbach, S. M. Courtney, X. Golay, S. Yantis, unpublished observations). Thus, it appears that the SPL is transiently engaged during a “switch event” and may provide a biasing signal to visual cortex that affects on-going processing to favor the attended location or object.
Taken together, the above studies provide evidence that a distributed frontoparietal attentional network may be the source of feedback that generates the top-down biasing signals modulating activity in visual cortex. This would explain the finding that functional brain imaging studies using different visuospatial attention tasks have described very similar attentional networks. Recently, Corbetta and Shulman (2002) have proposed that there exist two, and not just one, anatomically segregated but interacting networks for spatial attention. According to their scheme, a dorsal frontoparietal system is involved in the generation of attentional sets associated with goal-directed stimulus-response selection. Key nodes within this essentially bilateral network would include the SPL, IPS, and the FEF. A second, ventral system, which is strongly lateralized to the right hemisphere, is proposed to detect behaviorally relevant stimuli and to work as an alerting mechanism for the first system when these stimuli are detected outside the focus of processing. This latter network is thought to involve the temporoparietal junction (at the intersection of the inferior parietal lobule and the superior temporal gyrus), thought to be a key region affected in neglect, and the middle and inferior frontal gyri. Overall, the dorsal and ventral networks can be viewed as subserving, respectively, endogenous and exogenous spatial attention functions. Although previous studies explicitly comparing these two functions revealed primarily overlapping networks (Kim et al., 1999; Rosen et al., 1999), because of the interacting nature of the two networks, more subtle event-related designs may have been required to reveal the anatomical specificities of the two.
Interactions between control and target areas
A key hypothesis of the work reviewed above is that regions that control attention in frontal and parietal cortex are involved in the generation and control of attentional top-down feedback signals that bias processing in favor of attended items. Consistent with this proposal, there exists an anatomical substrate for such top-down influences as revealed by tract-tracing studies in monkeys, which have demonstrated direct feedback projections to extrastriate areas V4 and TEO from parietal area LIP and to inferior temporal cortex area TE from prefrontal cortex, as well as indirect feedback projections to areas V4 and TEO from prefrontal cortex via area LIP (Cavada and Goldman-Rakic, 1989; Ungerleider et al., 1989; Webster et al., 1994).
What is the evidence that the top-down biasing signals generated in frontal and parietal areas produce a change within visual cortex so that visually evoked activity there is enhanced? Single-cell recording studies have shown that spontaneous (baseline) firing rates are 30–40% higher for neurons in areas V2 and V4 when a monkey is cued to attend covertly to a location within the RF of the neuron in expectation of a stimulus but before it is presented there, that is, in the absence of visual stimulation (Luck et al., 1997; Reynolds et al., 1999). A similar effect was demonstrated for neurons in dorsal stream area LIP (Colby et al., 1996). This increased baseline activity, termed the “baseline shift,” has been interpreted as a direct demonstration of a top-down signal that feeds back from higher-order control areas to lower-order processing areas. Such a shift in baseline activity in visual cortex would presumably “sensitize” neurons with RFs at the attended location, so that when a stimulus subsequently appears at that location there would be enhanced visually evoked activity. In this way, stimuli at attended locations would be biased to “win” the competition for processing resources at the expense of stimuli appearing at unattended locations.
In the fMRI study with an expectation period described above (Kastner et al., 1999), in addition to the observed frontal and parietal activations, visual processing areas were also activated. This latter activity was related to directing attention to the target location in the absence of visual stimulation, in anticipation of the target stimulus that would appear there. Notably, the increase in activity during these expectation periods was topographically specific, inasmuch as it was seen only in visual areas with a representation of the attended spatial location, specifically, in areas V1, V2, V4, and TEO. The increase of baseline activity during the expectation period was followed by a further increase of activity evoked by the onset of the stimulus presentations. Increases in baseline activity are not only spatially specific, they also depend on the type of visual feature attended to. For instance, Chawla et al. (1999a) showed that baseline activity in motion- and color-sensitive areas of human visual cortex (MT and V4, respectively) was enhanced by selective attention to these visual attributes, although the stimuli did not actually move or change color [see also Shulman et al. (1999)].
The baseline increases found in human visual cortex (Chawla et al., 1999a; Kastner et al., 1999; Shulman et al., 1999) are thought to reflect increases in the spontaneous firing rate similar to those found in the single-cell recording studies (Colby et al., 1996; Luck et al., 1997) but summed over large populations of neurons. By increasing baseline activity, attention would increase sensitivity to a stimulus at a given location (in the case of spatial attention) or to the stimulus feature. Although increasing background activity might be thought to hamper the discriminability of a transient signal from noise, it has been shown that increased baseline activity may increase response sensitivity by increasing postsynaptic effects (Chawla et al., 1999b), thereby providing a competitive advantage for attended locations or visual features.
Interestingly, the baseline shift revealed by imaging is much larger than that observed at the neuronal level. One reason for this discrepancy is that, in the case of fMRI, a large population of neurons is being sampled. In this manner, small single-cell differences may summate, yielding greater effects. It is also possible that the larger increase in fMRI signal relative to unit activity reflects the contributions of enhanced synaptic activity, which is evident in fMRI signals but not in unit recordings (Logothetis, 2003). Future, combined single-cell and fMRI experiments are needed to resolve this issue.
In summary, neuroimaging studies reveal a distributed network of areas in frontal and parietal cortex that appear to be involved in the control of attention. This network exhibits great overlap with the set of regions implicated in visuospatial neglect in studies of patients with brain lesions (Mesulam, 1981; Rafal, 1998). A puzzling finding, however, is that many nodes of this distributed network appear to be coactivated across a wide variety of tasks. Parsing out the specific functional contributions that each region provides is a challenge for the years to come. In this context, imaging studies with humans with focal brain lesions, in which specific territories are compromised, may provide a fruitful avenue of research. The use of transcranial magnetic stimulation, in which cortical territories are transiently compromised, provides yet another strategy for addressing this problem.
Attention is needed to process visual stimuli
An implicit prediction of the biased competition model is that only items that survive the competition for neural representation in visual processing areas will affect subsequent memory and motor systems. A related, but stronger, proposal has been advanced by Lavie (1995), who has suggested that the extent to which unattended objects are processed depends on the available capacity of the visual system. If, for example, the processing load of a target task exhausts available capacity, then stimuli irrelevant to that task would not be processed at all. Hence, such stimuli may not even reach awareness.
Consistent with this idea, psychophysical studies in the past decade have demonstrated that processing outside the focus of attention is attenuated and may be eliminated under some conditions. Rock and colleagues (1992) showed that even the simplest visual tasks are compromised when attention is taken up elsewhere, a phenomenon they termed “inattentional blindness.” Furthermore, in a striking demonstration, Joseph et al. (1997) showed that so-called “pre-attentive” tasks, such as orientation pop-out, require attention to be successfully performed. The necessity of attention for perception is perhaps most compellingly illustrated by “change blindness” studies (Rensink et al., 1997; Simons and Levin, 1997; Rensink, 2002), in which subjects may miss even very large changes in complex scenes, provided the changes are not associated with stimulus transients that capture attention.
But what is the fate of unattended stimuli? As we have seen, in extrastriate areas V2 and V4, single-cell studies in monkeys have shown that when an effective and ineffective stimulus are placed within the RF of a neuron, spatially directed attention to the effective stimulus results in a response similar to the one elicited by the effective stimulus when presented alone. Remarkably, spatially directed attention to the ineffective stimulus results in a response similar to the one elicited by the ineffective stimulus when presented alone. In essence, it is as if the unattended stimulus, be it the effective or ineffective one, were not in the RF (Reynolds et al., 1999). These findings suggest that, at the neural level, responses evoked by unattended items may be eliminated.
Recent fMRI investigations suggest that this is indeed the case. In these studies, the stimulus-evoked fMRI response is essentially abolished when subjects are engaged in a competing task with high attentional load. In one study, Rees et al. (1997) showed that moving stimuli did not elicit fMRI activation in area MT when subjects performed a concurrent, highly demanding linguistic task (Fig. 2A). In a related study, Rees et al. (1999) showed that activations associated with words were not elicited when subjects performed a concurrent, highly demanding object working-memory task. Thus, like the processing of visual motion, even word processing seems to require attention, contrary to claims for full automaticity (Van Orden et al., 1988; Menard et al., 1996).
A major exception to the critical role of attention may be in the neural processing of emotion-laden stimuli, which are reported to be processed automatically, namely, without attention (Vuilleumier et al., 2001; Ohman, 2002). For example, it has been reported that subjects exhibit fast, involuntary autonomic responses to emotional stimuli, such as aversive pictures or faces with fearful expressions (Wells and Matthews, 1994; Globisch et al., 1999). Other behavioral studies suggest that such autonomic responses to facial expressions occur not only “automatically” (Stenberg et al., 1998) but may even take place without conscious awareness (Ohman et al., 1995). This conclusion is also supported by imaging studies of the neural processing of emotional stimuli in the amygdala, a structure that is known to be important in emotion, particularly the processing of fear (LeDoux, 1996; Aggleton, 2000; Lane and Nadel, 2000). Such studies report that the amygdala is activated not only when normal subjects view fearful faces, but even when these stimuli are masked and subjects appear to be unaware of their occurrence (Morris et al., 1998; Whalen et al., 1998). Using the backward masking paradigms developed by Ohman and colleagues (Esteves and Ohman, 1993), Whalen et al. (1998) showed that fMRI signals in the amygdala were significantly larger during viewing of masked, fearful faces than during the viewing of masked, happy faces. In another study, Morris et al. (1998) combined backward masking with classical conditioning to investigate responses to perceived and nonperceived angry faces. Although the participants never reported seeing the masked, angry stimuli, the contrast of conditioned and nonconditioned masked, angry faces activated the right amygdala.
The view has thus emerged that the amygdala is specialized for the fast detection of emotionally relevant stimuli in the environment and that this can occur without attention and even without conscious awareness. If this were indeed the case, amygdala activity would reflect an obligatory response independent of the locus of spatial attention. Vuilleumier et al. (2001) tested this prediction in an fMRI study in which subjects fixated a central cue and matched either two faces or two houses presented eccentrically. Both fearful and neutral faces were used. As in earlier studies (Haxby et al., 1994; Wojciulik et al., 1998), activity in the fusiform gyrus, which is known to respond strongly to faces (Kanwisher et al., 1997; Haxby et al., 2001), was enhanced by attention. At the same time, Vuilleumier et al. (2001) failed to see evidence that attention modulated responses in the amygdala, regardless of stimulus valence. Not surprisingly, these results were interpreted as further evidence for obligatory activation of the amygdala by negative stimuli.
In a recent study (Pessoa et al., 2002), an alternative possibility was tested, namely, that the neural processing of stimuli with emotional content is not automatic and instead requires some degree of attention, similar to the processing of neutral stimuli. It was hypothesized that the failure to modulate the processing of emotional stimuli by attention in previous studies was caused by a failure to fully engage attention by a competing task. In other words, activation in the amygdala by emotional stimuli should resemble activation in MT to moving stimuli; if the competing task is of sufficiently high load, activation should be reduced or absent. Therefore, fMRI was used and activations were measured in the amygdala and other brain regions that responded differentially to faces with emotional expressions compared with neutral faces; the effect of attention on these responses was then investigated.
fMRI responses evoked by pictures of faces with fearful, happy, or neutral expressions were measured when attention was focused on them (attended condition) and compared with the responses evoked by the same stimuli when attention was directed to oriented bars (unattended condition). In designing the bar orientation task, Pessoa et al. (2002) chose one that was sufficiently demanding to exhaust attentional resources, leaving little or none available for the faces, even though the faces were viewed foveally during the bar orientation task. It was found that attended compared with unattended faces evoked significantly greater activations bilaterally in the amygdala for all facial expressions. Importantly, there was a significant interaction between stimulus valence and attention. That is, the differential response to stimulus valence was observed only in the attended condition (Fig. 2B). Moreover, for the unattended condition, responses to all stimulus types were equivalent and not significantly different from zero. Thus, amygdala responses to emotional stimuli are not automatic and instead require attention. Interestingly, responses to faces in the fusiform gyrus, which is known to be highly responsive to face stimuli (Kanwisher et al., 1997; Haxby et al., 2001), also required attention (Fig. 2C). In fact, all brain regions responding to faces needed attention to elicit a response. Thus, as demonstrated in single-cell studies, unattended visual stimuli can be completely filtered out.
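The logic of a valence-by-attention interaction can be made concrete with a short sketch. The response values below are invented for illustration and are not the data reported by Pessoa et al. (2002); the function name and dictionary layout are likewise hypothetical. The interaction contrast is the valence effect (fearful minus neutral) in the attended condition minus the same effect in the unattended condition.

```python
from typing import Dict, Tuple

Responses = Dict[Tuple[str, str], float]


def valence_by_attention_interaction(resp: Responses) -> float:
    """Difference of valence effects across the two attention conditions."""
    attended_effect = resp[("fearful", "attended")] - resp[("neutral", "attended")]
    unattended_effect = resp[("fearful", "unattended")] - resp[("neutral", "unattended")]
    return attended_effect - unattended_effect


# Hypothetical percent-signal-change values, chosen only to mimic the
# qualitative pattern described in the text: a valence effect when faces
# are attended, and no response at all when they are unattended.
example: Responses = {
    ("fearful", "attended"): 0.30,
    ("neutral", "attended"): 0.15,
    ("fearful", "unattended"): 0.0,
    ("neutral", "unattended"): 0.0,
}

if __name__ == "__main__":
    print(valence_by_attention_interaction(example))
```

A contrast near zero would indicate that the valence effect is the same with and without attention, which is what an obligatory amygdala response would predict; a positive contrast indicates that the valence effect depends on attention.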
Conclusions
Although available for only slightly more than a decade, fMRI has greatly contributed to our understanding of attention. Although in many cases the imaging work followed leads provided by single-cell or ERP studies, the converse is also true. For instance, the PET studies by Corbetta et al. (1991) on attention to visual features (as opposed to spatial locations) antedated analogous demonstrations in physiology. Additionally, recent imaging studies have been instrumental in revealing a network of regions involved in the control of attention. Although presaged by studies of neglect and other attentional deficits, this work has isolated specific nodes of the network, including the SPL and IPS in parietal cortex, and the FEF, SEF, MFG, and ACC in frontal cortex. This work is certain to inspire physiologists to reevaluate the role of structures such as the FEF and SEF, classically considered to be involved in the planning of eye movements [but see Thompson et al. (1997) and Bichot and Schall (1999)], as well as that played by subregions of parietal cortex. Ultimately, however, a deeper understanding of the neural correlates of attention will emerge through the convergence of multiple techniques and approaches. Neuroimaging is certain to be a key player in this endeavor.
Footnotes
This work was supported by the National Institute of Mental Health Intramural Research Program. We thank Dr. Marcus Raichle for feedback on a previous version of this manuscript and David Sturman for assistance in the preparation of this manuscript.
Correspondence should be addressed to Dr. Luiz Pessoa, Laboratory of Brain and Cognition, 49 Convent Drive, Building 49, Room 1B80, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892-4415. E-mail: pessoa@ln.nimh.nih.gov.
Copyright © 2003 Society for Neuroscience 0270-6474/03/233990-09$15.00/0