Reward-related mesolimbic dopamine steers animal behavior, creating automatic approach toward reward-associated objects and avoidance of objects unlikely to be beneficial. Theories of dopamine suggest that this reflects underlying biases in perception and attention, with reward enhancing the representation of reward-associated stimuli such that attention is more likely to be deployed to the location of these objects. Using measures of behavior and brain electricity in male and female humans, we demonstrate this to be the case. Sensory and perceptual processing of reward-associated visual features is facilitated such that attention is deployed to objects characterized by these features in subsequent experimental trials. This is the case even when participants know that a strategic decision to attend to reward-associated features will be counterproductive and result in suboptimal performance. Other results show that the magnitude of visual bias created by reward is predicted by the response to reward feedback in anterior cingulate cortex, an area with strong connections to dopaminergic structures in the midbrain. These results demonstrate that reward has an impact on vision that is independent of its role in the strategic establishment of endogenous attention. We suggest that reward acts to change visual salience and thus plays an important and undervalued role in attentional control.
Reward plays a fundamental role in human cognition, but surprisingly few studies have examined the impact of reward on early cognitive processes, such as visual perception and attention (but see Della Libera and Chelazzi, 2009). This is reflected in theory: models of visual attention propose that selection is guided by automatic exogenous factors, which bias attention toward salient stimuli, and volitional endogenous factors, which direct attention toward task-relevant objects and locations (Treisman and Gelade, 1980; Wolfe et al., 1989). Reward plays no explicit role in this framework.
In sharp contrast, theories of the function of dopamine in reinforcement learning and animal approach behavior place reward firmly at the center of attentional control (Berridge and Robinson, 1998; Ikemoto and Panksepp, 1999; Redgrave et al., 1999; Wise, 2004; Alcaro et al., 2007). For example, the “incentive salience hypothesis” of Berridge and Robinson (1998) proposes that reward-related mesencephalic dopamine is specifically responsible for changing the perceptual representation of reward-conditioned stimuli such that they become salient and attention-drawing. Other theories propose a more general role for dopamine in reinforcement learning but also suggest that reward has a direct impact on vision (Schultz et al., 1997; Schultz, 2002) (for review and integration of dopamine models, see McClure et al., 2003).
The direct nature of this influence deserves emphasis; the idea is that reward automatically changes the visual salience of reward-associated perceptual features, and this is theoretically distinct from the known role of reward in the strategic establishment of attentional set (Maunsell, 2004). Unfortunately, much of the extant literature is based on an experimental paradigm that fails to differentiate these manners of influence. In this type of experiment, human or animal observers are presented with stimuli that predict reward outcome for the current experimental trial. Results show that visual processing of objects that predict good outcome is better than processing of other objects (Platt and Glimcher, 1999; Ikeda and Hikosaka, 2003; Roesch and Olson, 2003; Kiss et al., 2008; Peck et al., 2009). However, stimuli that predict reward have an inherent aspect of motivational significance; a liquid-deprived monkey treats a cue indicating forthcoming liquid as a type of reward in itself, even when this information has no bearing on task performance (Bromberg-Martin and Hikosaka, 2009). Anticipation of reward cues might therefore trigger endogenous biases in visual attention. These biases would facilitate detection and discrimination of cue stimuli by enhancing visual processing, possibly from very early in visual cortex (Hopf et al., 2004). This makes it unclear whether the facilitated visual processing of reward-predictive stimuli reflects the direct impact of reward or the strategic impact of endogenous attention.
The current study was designed to determine whether reward has a direct impact on vision that is distinct from its impact via endogenous attention. Our approach to this problem was to associate reward with visual features that characterized objects participants were actively trying to ignore. To gain insight into the mechanisms involved in translating reward outcome into visual salience, we recorded both behavior and electrical brain activity while participants completed our experimental task.
Materials and Methods
In our general paradigm, which was used in all experiments, participants searched for a uniquely shaped target presented among a number of homogeneous distractors (Fig. 1a) (Theeuwes, 1991). Response was based on the orientation of a small line contained within the target. In some trials, all the stimuli were of the same color, but more often one of the distractors had a color that was different from all other objects. “Color singletons” such as this are known to draw attention to their location during search for a unique shape (Hickey et al., 2006). Two critical parameters of the task varied from trial to trial. First, the colors of the target and distractor changed such that the distractor could be red, with all other stimuli including the target green, or vice versa. The colors could therefore swap between trials, with the color of the target becoming that of the distractor, or could remain the same. Second, participants received either high-magnitude or low-magnitude monetary reward after correct responses. Importantly, participants were instructed to maximize reward, but reward magnitude was actually randomized on a trial-by-trial basis.
Forty healthy adults were randomly assigned to take part in experiment 1 (mean ± SD age, 20 ± 2 years; three left handed; 16 men), eight to the control experiment (mean ± SD age, 21 ± 2 years; all right handed; three women), and 14 to experiment 2 (mean ± SD age, 21 ± 3 years; all right handed; six women). All gave informed consent before taking part in the experiment and reported normal or corrected-to-normal vision, and none had previous experience with the experimental task.
Stimuli and procedure.
Experimental stimuli were very similar to those used by Hickey et al. (2006) with the addition of monetary feedback after participant response. Briefly, in experiment 1, visual search arrays contained six object outlines (line thickness of 0.3° visual angle), each presented equidistant (9.1°) from a central fixation point and from each other. Objects could be diamonds (4.2° × 4.2°) or circles (3.4° diameter), with each display containing only one uniquely shaped item. This unique item could be a diamond, with all other stimuli circles, or a circle, with all other stimuli diamonds. In 80% of trials, one of the homogeneously shaped nontarget items was of unique color, either red with all other objects green or vice versa. Target and distractor colors were randomly determined for each trial. The onset of the visual search array was preceded by a fixation dot for a random duration of 400–1400 ms.
Each object contained a gray line (0.3° × 1.5°) randomly oriented vertically or horizontally. Responses were made on a standard computer keyboard; the “z” button pressed with the index finger of the left hand denoted a vertical target line, and the “m” button pressed with the index finger of the right hand denoted a horizontal target line. Correct responses to the search target were immediately followed by the replacement of the central fixation dot with an indication of reward feedback in blue text (65 point font; 5° height), either “+10,” denoting the receipt of 10 points, or “+1,” denoting the receipt of 1 point. The visual search display remained onscreen during the presentation of feedback, and the search display and feedback were presented together for 1000 ms. Incorrect responses resulted in the removal of 10 points, denoted by “−10.” Each point had a value of approximately 0.2 euro cent. Participants were paid based on the number of points received, but, because reward magnitude after correct response was random and most participants performed very well, there was little variability in compensation; across all experiments, participants never earned less than €8.50/h and never more than €9.25/h.
All parameters in the control experiment and in experiment 2 were as in experiment 1 with the following exceptions. In the control experiment, search arrays contained four objects (9.1° from fixation; 12.9° from each other), and, in experiment 2, arrays contained 10 objects (9.1° from fixation; 5.6° from each other). In experiment 2, the uniquely shaped item was always a circle (with all other stimuli diamonds), responses were made with a standard computer mouse (the left mouse button pressed with the index finger of the right hand denoted a vertical target line, and the right mouse button pressed with the middle finger of the right hand denoted a horizontal target line), and each reward point was worth approximately 0.4 euro cent (to compensate participants for the time required to set up the electroencephalographic recording).
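The inter-item distances reported for the three array sizes follow from placing the items at equal angular steps on a circle of 9.1° radius. The sketch below checks this geometry; the function name `inter_item_spacing` is ours and purely illustrative, and it assumes that "from each other" refers to the straight-line (chord) distance between neighbouring items.

```python
import math


def inter_item_spacing(n_items: int, eccentricity_deg: float) -> float:
    """Chord distance (deg) between neighbouring items evenly spaced on a circle.

    Assumes items sit at equal angular steps around fixation, so neighbours
    are separated by an angle of 2*pi / n_items.
    """
    if n_items < 2:
        raise ValueError("need at least two items")
    return 2.0 * eccentricity_deg * math.sin(math.pi / n_items)


if __name__ == "__main__":
    # Array sizes used in the control experiment, experiment 1, and experiment 2.
    for n in (4, 6, 10):
        print(n, "items:", round(inter_item_spacing(n, 9.1), 1), "deg")
```

With six items the chord equals the radius, which is why the experiment 1 stimuli are equidistant from fixation and from each other.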
Experiment 1 and the control experiment took place in a sound-attenuated room, and experiment 2 took place in a sound-attenuated Faraday cage. All stimuli were presented on a cathode ray tube monitor located 60 cm away from the eyes. Participants in experiment 1 and the control experiment completed 15 blocks of 30 trials for a total of 450 trials, which took ∼0.5 h, and participants in experiment 2 completed 45 blocks of 30 trials for a total of 1350 trials, which took ∼1.5 h. Feedback regarding accuracy and response speed followed the completion of each block. All participants were given detailed instructions regarding the experimental task. Importantly, in the predictive condition of experiment 1 and in the control experiment, participants were explicitly informed of the information encoded in reward magnitude. These participants were asked to describe how reward magnitude predicted features of the next trial and only began the experiment when they expressed clear understanding of this aspect of the experimental design.
Trials in which response occurred sooner than 100 ms after stimulus or later than 1500 ms after stimulus were rejected from all analysis, as were all trials resulting in error. Approximately 99% of correct responses occurred within this time window. All reaction times reflect mean averages of raw, untransformed data.
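As a minimal sketch of this exclusion rule, the snippet below keeps only correct responses made between 100 and 1500 ms and averages the raw latencies. The `Trial` record and the function names are hypothetical, and treating the window boundaries as inclusive is our assumption.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    rt_ms: float   # response latency relative to search display onset
    correct: bool  # whether the response matched the target line orientation


def usable_trials(trials, fastest_ms=100.0, slowest_ms=1500.0):
    """Drop errors and responses outside the 100-1500 ms window."""
    return [t for t in trials if t.correct and fastest_ms <= t.rt_ms <= slowest_ms]


def mean_rt(trials):
    """Mean of raw, untransformed latencies over the usable trials."""
    return mean(t.rt_ms for t in usable_trials(trials))


if __name__ == "__main__":
    # Hypothetical data: one anticipation, one late response, one error.
    demo = [Trial(80, True), Trial(600, True), Trial(700, True),
            Trial(1600, True), Trial(650, False)]
    print(mean_rt(demo))
```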
Electroencephalogram recording and analysis.
In experiment 2, electroencephalogram was recorded from 134 Ag/AgCl electrodes using the Biosemi Active2 system. This included 128 encephalic sites, two electrodes placed 1 cm lateral to the external canthi of each eye, two electrodes located 1 cm above and below the right eye socket, and two electrodes placed on the mastoids (500 Hz sample rate; encephalic electrode locations are illustrated in the figures). Independent component analysis (Bell and Sejnowski, 1995; Delorme and Makeig, 2004) was applied to the data and the results used for the rejection of trials tainted by eye movements (8 ± 4% of trials per participant, mean ± SD) and correction of artifacts stemming from blinks and muscle activity. Event-related potentials (ERPs) were calculated from the resulting data using standard signal averaging procedures (Luck, 2005), referenced to the average of mastoids, and digitally filtered (0.05–22 Hz; finite impulse response least-square kernels with 6 dB transition of 0.01 Hz for high-pass filtering and 6 dB transition of 2 Hz for low-pass filtering). Statistics were calculated before filtering.
All ERPs were baselined on a period beginning 100 ms before stimulus onset and ending 50 ms after. The baseline was calculated before filtering. Visual activity in sensory relay nuclei does not create a potential at scalp surface, and the first ERP evidence of visual activity in unimodal visual experiments occurs between 50 and 60 ms after stimulus in the C1 component, making the first 50 ms of the visual ERP a valid aspect of the baseline (Rugg and Coles, 1995). During the editorial process, an anonymous reviewer asked us to explicitly address the possibility that the 50 ms poststimulus interval contained evoked activity. A paired t test comparing activity in the 50 ms preceding stimulus onset to the 50 ms after onset revealed no trend toward a reliable difference (t(13) = 0.755, p = 0.464; collapsed across experimental conditions and based on unfiltered data recorded at the same lateral posterior electrodes used to generate the visual ERPs presented in the figures). Note that, in ERPs elicited at anterior electrode sites, the application of filters resulted in an apparent “ramping up” of the N1 that begins at ∼45 ms after stimulus; this early aspect of the N1 is a filter artifact and does not reflect actual evoked activity. Because baseline was calculated before filtering, this had no impact on the calculation of ERP amplitude.
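The baseline step can be sketched as follows: the mean voltage in the interval from 100 ms before to 50 ms after stimulus onset is subtracted from every sample of the epoch. This is an illustrative sketch rather than the analysis code used here; the function name and the list-based data layout are assumptions, and the 2 ms sample spacing simply reflects the 500 Hz sample rate.

```python
def baseline_correct(samples_uv, times_ms, start_ms=-100.0, end_ms=50.0):
    """Subtract the mean of the baseline window from every sample.

    samples_uv and times_ms are parallel lists: one voltage per time point,
    with time expressed relative to stimulus onset.
    """
    window = [v for v, t in zip(samples_uv, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the baseline window")
    offset = sum(window) / len(window)
    return [v - offset for v in samples_uv]


if __name__ == "__main__":
    # 500 Hz sampling gives one sample every 2 ms.
    times = list(range(-100, 400, 2))
    # Hypothetical epoch: a 5 uV offset, then a 3 uV step after 50 ms.
    epoch = [5.0 if t <= 50 else 8.0 for t in times]
    corrected = baseline_correct(epoch, times)
    print(corrected[0], corrected[-1])
```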
For the purposes of both topographic mapping and source analysis, data were referenced to the average of all 128 encephalic signals. Topographic maps are based on spherical spline interpolation (Perrin et al., 1989). Source analysis was conducted using BESA version 5.1 (Megis Software). The BESA algorithm operates by recursively seeding electrical sources in a three-shell elliptical model of a human head, calculating an estimate of the electrical potential at scalp surface generated by the model and comparing this with the recorded activity. The dipoles are adjusted in terms of position and orientation until the model optimally fits the recorded data, and a good fit is reflected in a low measure of residual variance.
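Residual variance is commonly defined as the proportion of the recorded signal power that the source model fails to reproduce. The sketch below uses that common definition; whether BESA computes the quantity in exactly this way is an assumption on our part, and the function name is hypothetical.

```python
def residual_variance(recorded, modeled):
    """Fraction of recorded signal power left unexplained by the model.

    recorded and modeled are parallel lists of scalp potentials (for example,
    one value per electrode). Returns 0.0 for a perfect fit.
    """
    if len(recorded) != len(modeled):
        raise ValueError("recorded and modeled must have the same length")
    unexplained = sum((r - m) ** 2 for r, m in zip(recorded, modeled))
    total = sum(r ** 2 for r in recorded)
    if total == 0:
        raise ValueError("recorded data contain no signal")
    return unexplained / total


if __name__ == "__main__":
    # Hypothetical potentials at four electrodes.
    print(round(100 * residual_variance([1.0, 2.0, 3.0, 4.0],
                                        [1.0, 2.0, 3.0, 3.0]), 2), "%")
```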
Reward guides search automatically
Our interest lay in how reward in one trial affected visual processing in the next, and we made two predictions based on the idea that reward directly impacts salience. First, high-magnitude reward should facilitate processing of the features that characterize a target such that visual attention is biased toward these features in the next trial (Berridge and Robinson, 1998; Ikemoto and Panksepp, 1999; Della Libera and Chelazzi, 2009). Participants should therefore respond quickly when the same color characterizes the target as did so in the preceding trial but should respond slowly when the colors swap. Second, low-magnitude reward should result in a relative devaluation of features that characterize a target such that attention is less likely to be deployed to these features (Berridge and Robinson, 1998; Frank et al., 2004; Della Libera and Chelazzi, 2009). Participants should therefore respond slowly when the same color characterizes the target as did so in the preceding trial, but quickly when the colors swap. As illustrated in Figure 1b, these predictions were borne out in mean reaction time (reward × color swap, F(1,38) = 11.86, p = 0.001; all other effects involving reward and color swap factors, F values <1).
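The two predictions amount to a crossover in the 2 × 2 table of mean latencies: a positive swap cost after high-magnitude reward and a negative one after low-magnitude reward. The sketch below computes these cell means and the resulting interaction contrast. It is a descriptive illustration only, not the ANOVA reported above, and the tuple layout `(previous_reward, swapped, rt_ms)` and all function names are assumptions.

```python
from statistics import mean


def cell_means(trials):
    """Mean latency per (previous reward, colour swap) cell.

    trials: iterable of (previous_reward, swapped, rt_ms), where
    previous_reward is "high" or "low" and swapped is a bool.
    """
    cells = {}
    for previous_reward, swapped, rt_ms in trials:
        cells.setdefault((previous_reward, swapped), []).append(rt_ms)
    return {cell: mean(rts) for cell, rts in cells.items()}


def swap_cost(means, previous_reward):
    """Swap minus no-swap latency after a given reward magnitude."""
    return means[(previous_reward, True)] - means[(previous_reward, False)]


def interaction_contrast(means):
    """Difference between the swap costs after high and low reward."""
    return swap_cost(means, "high") - swap_cost(means, "low")


if __name__ == "__main__":
    # Hypothetical latencies (ms), chosen only to illustrate the crossover.
    demo = [("high", False, 600), ("high", False, 620),
            ("high", True, 660), ("high", True, 680),
            ("low", False, 650), ("low", False, 650),
            ("low", True, 630), ("low", True, 610)]
    print(interaction_contrast(cell_means(demo)))
```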
The pattern of results revealed in experiment 1 suggests that humans preferentially attend to objects with visual features associated with reward (Fig. 1b). It is important to note that the impact of reward is tied to color but that the target in this paradigm was defined by unique shape. Color was therefore task irrelevant, making it unlikely that the influence of reward reflects endogenous attentional set; there was no strategic motivation to pay color heed. However, the basic paradigm does not allow us to rule out the possibility that participants adopted a strategy of attending to stimuli characterized by reward-associated perceptual features, possibly motivated by the benefits this strategy provides in life outside the laboratory. To determine whether such a strategic account for the results was viable, we included a between-participant factor in the design of experiment 1. For half of participants, reward magnitude was unrelated to the likelihood that the target and distractor colors would swap between trials. For the others, high-magnitude reward was 80% predictive of a swap in colors, and low-magnitude reward was 80% predictive of no swap. These latter participants were informed of the relationship between reward and color swap and told to use this information to optimize performance.
Our reasoning was that participants in the predictive condition should discard the useless strategy of maintaining attentional set after high-magnitude reward and adopt the beneficial strategy of preparing for the colors to swap (and, correspondingly, discard the useless strategy of preparing for a color swap after low-magnitude reward and adopt the beneficial strategy of preparing for the target and distractor colors to remain the same). Importantly, were they to do so, the experimental results in this condition would be opposite those illustrated in Figure 1b: they would be faster when a color swap followed high-magnitude reward (rather than slower) and slower when a color swap followed low-magnitude reward (rather than faster). In fact, results from the two conditions were statistically indistinguishable (Fig. 2) (main effect of predictiveness, F(1,38) = 1.20, p = 0.28; all other effects involving predictiveness, and critically the three-way interaction, F values <1). The interaction between reward and color swap was reliable when both the nonpredictive condition (Fig. 2a) (F(1,19) = 7.24, p = 0.01; all other F values <1) and the predictive condition (Fig. 2b) (F(1,19) = 4.69, p = 0.04; all other F values <1) were examined in isolation. These results are inconsistent with the idea that the impact of reward on search reflects endogenous strategy; participants in the predictive condition were explicitly aware that high-magnitude reward meant that the colors would likely swap in the next trial. Under these circumstances, they had no strategic motivation to continue to attend to objects characterized by the same color. This strategy would in fact be counterproductive, guiding attention to an object that was unlikely to be the target. Despite this, the results show a continuing propensity to select the stimulus characterized by the color recently associated with reward.
There is a small possibility that results from the predictive condition of experiment 1 reflect an inability or unwillingness to extract information encoded in reward. Participants in this condition were made explicitly aware that reward predicted the likelihood of a color swap, and all participants could clearly describe this aspect of the experiment before beginning the task. However, if for any reason they did not take the information encoded in reward into account during experimental participation, this might account for the absence of any difference between the predictive and nonpredictive conditions. To ensure that this was not the case, we conducted a control experiment. Here, reward predicted target location rather than the likelihood of a color swap: high-magnitude reward indicated that the target in the next trial was more likely to be on the horizontal meridian, and low-magnitude reward indicated that it would more likely be on the vertical meridian (Fig. 3a). Spatial cues of this nature are known to have a strong impact on visual search (Posner and Cohen, 1984), and participants in this experiment were indeed faster to respond to targets at cued locations (Fig. 3b) (target location, F(1,7) = 1.54, p = 0.255; reward, F < 1; target location × reward, F(1,7) = 13.88, p = 0.007). This demonstrates that participants are able to extract strategic information encoded in reward and that they make an effort to use this information to improve performance. Participants in experiment 1 were presumably aware of the fact that high-magnitude reward predicted a color swap. Why does their behavior show no evidence of this knowledge? We believe that, in the predictive condition of experiment 1, the receipt of high-magnitude reward cued participants to strategically prepare for a color swap, but receipt of high-magnitude reward also initiated the automatic priming of perceptual features that characterized the target. This perceptual effect was strong enough to overwhelm and negate the impact of endogenous strategy.
Electrophysiological indices of the impact of reward on perception and attention
Because behavior reflects the outcome of processing in multiple cognitive stages and change in response can reflect modulation at any point in this sequence, measures such as response latency and accuracy provide relatively coarse insight into the cognitive stages affected by experimental manipulations. In contrast, noninvasive measures of brain electrical activity, ERPs (Luck, 2005), can sometimes be used to demonstrate change in discrete processing stages. With this in mind, in a second experiment, we had 14 new participants complete the nonpredictive version of our visual search task while we recorded ERPs from the scalp surface. As illustrated in Figure 4, behavioral results from this experiment replicated those of experiment 1 (reward × color swap, F(1,13) = 6.30, p = 0.026). Participants in experiment 2 additionally responded more rapidly when colors did not swap between trials, although not reliably so (color swap, F(1,13) = 2.75, p = 0.121; all other F values <1).
To index changes in perceptual and attentional processing, we analyzed lateralized ERPs elicited over visual cortex by the onset of search displays. We concentrated on ERPs elicited when the target and distractor were presented to opposite visual hemifields because lateralized effects observed under these circumstances can be associated with relative increase in the perceptual and attentional processing of one or the other of the two salient objects. This is the case because of the contralateral and retinotopic nature of the visual system; effects associated with target processing will be primarily evident at electrode locations contralateral to the target, whereas effects associated with distractor processing will be evident at electrode locations contralateral to the distractor (Luck and Hillyard, 1994a,b; Hillyard et al., 1998; Woodman and Luck, 1999; Luck, 2005; Hickey et al., 2006). It is important to note that this design does not make it possible to link a signal to one discrete stimulus; ERPs have inherently poor spatial resolution, and, because of this, a minority of the signal observed contralateral to the target may reflect distractor processing. However, because our experimental design allows us to compare the lateralized potentials elicited by identical stimulus arrays under varying conditions of reward, it provides perspective on the relative degree to which perceptual and attentional resources are being applied to either the target or distractor.
As illustrated in Figure 5, a and b, multiple effects were apparent in the ERPs elicited after high-magnitude reward, beginning with an increase in the amplitude of the lateral P1 component at ∼100 ms after stimulus. This increase in P1 reflects an amplification of early visual processing stages in extrastriate visual cortex (for review, see Hillyard et al., 1998). Critically, in no-swap trials after high-magnitude reward, the P1 increase was contralateral to the target (Fig. 5a), demonstrating an increase in sensory and perceptual processing of the target, but in swap trials after high-magnitude reward, the P1 increase was contralateral to the distractor (Fig. 5b), demonstrating an increase in sensory and perceptual processing of the distractor (electrode laterality × swap condition, F(1,13) = 4.98, p = 0.044; all other F values <1; analysis based on peak amplitude difference). Notice that, in both cases, it is the stimulus characterized by the color reinforced with high reward on the immediately preceding trial that elicited the enhanced P1 effect. No corresponding pattern was observed in the ERPs elicited after low-magnitude reward (Fig. 5c,d), consistent with the absence of a behavioral effect in this condition (Fig. 4).
There are two important points to be made in the context of this P1 effect. First, the displays that elicited all four of the ERPs presented in Figure 5 were identical, and therefore the change in lateralized P1 amplitude observed in the high-magnitude reward condition cannot be a product of changes in visual input. Second, the lateral P1 is not normally sensitive to endogenous attentional set for visual features in search; the lateral P1 elicited by a unique item that is the target is not any larger than the lateral P1 elicited by a unique item that is a distractor (Luck and Hillyard, 1994b). The increase in lateral P1 amplitude in response to an object characterized by a reward-associated color thus suggests that reward can have an impact on perceptual processing that cannot be accounted for as a product of endogenous attentional set. The P1 results support the idea that high-magnitude reward facilitates subsequent processing of stimuli characterized by reward-associated visual features from very early in the visual processing sequence.
A similar pattern was evident in the N2pc component. The N2pc is an increase in negative ERP amplitude from 200 to 300 ms after stimulus at posterior electrode sites contralateral to an attended object (Luck and Hillyard, 1994a,b), and it constitutes a reliable index of visuospatial attention (Luck and Hillyard, 1994a,b; Woodman and Luck, 1999; Hickey et al., 2009). In the high-reward no-swap condition, the N2pc was elicited contralateral to the target (Fig. 5a), reflecting rapid target selection, but in the high-reward swap condition, the N2pc was elicited contralateral to the distractor, demonstrating the deployment of attention to the distractor location (Fig. 5b) (electrode laterality × swap condition, F(1,13) = 7.29, p = 0.018; all other F values <1; simple effect electrode laterality in no-swap condition, t(13) = 2.42, p = 0.015; simple effect electrode laterality in swap condition, t(13) = 2.14, p = 0.026; all N2pc statistics based on mean amplitude 240–255 ms after stimulus). No corresponding effects were observed in the ERPs elicited after low-magnitude reward, although the target-elicited N2pc in this condition was numerically larger in the swap condition, possibly reflecting better target selection when the target has not been associated with suboptimal outcome (Fig. 5c,d) (electrode laterality × swap condition, F(1,13) = 2.04, p = 0.177). Together, the P1 and N2pc results reinforce the conclusions we took from behavior, namely that reward primes early visual processing such that objects characterized by reward-associated features are more likely to be attended.
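A lateralized measure of this kind can be sketched as the mean amplitude in a fixed window at the contralateral electrode minus the same measure at the ipsilateral electrode. The snippet below is a simplified illustration, assuming list-based waveforms and using the 240–255 ms window by default; the function names are ours. A negative result indicates a more negative contralateral signal, as expected for an N2pc.

```python
def mean_amplitude(wave_uv, times_ms, start_ms, end_ms):
    """Mean voltage of a waveform within an inclusive time window."""
    window = [v for v, t in zip(wave_uv, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the measurement window")
    return sum(window) / len(window)


def lateralized_difference(contra_uv, ipsi_uv, times_ms,
                           start_ms=240.0, end_ms=255.0):
    """Contralateral minus ipsilateral mean amplitude in the window.

    "Contralateral" is defined relative to the object of interest
    (the target or the salient distractor).
    """
    return (mean_amplitude(contra_uv, times_ms, start_ms, end_ms)
            - mean_amplitude(ipsi_uv, times_ms, start_ms, end_ms))


if __name__ == "__main__":
    times = list(range(0, 400, 2))  # 2 ms steps, as at a 500 Hz sample rate
    # Hypothetical waveforms with a larger negativity contralaterally.
    contra = [-2.0 if 200 <= t <= 300 else 0.0 for t in times]
    ipsi = [-0.5 if 200 <= t <= 300 else 0.0 for t in times]
    print(lateralized_difference(contra, ipsi, times))
```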
Reward processing in anterior cingulate cortex predicts the magnitude of the impact of reward on the deployment of attention
A crucial goal of experiment 2 was to seek a direct signature of reward-related processing in the brain such that this activity could be localized and related to behavior. We approached analysis with the hypothesis that the mesolimbic dopamine system would play a role in creating the impact of reward on vision, and we examined the ERP elicited by reward feedback to determine whether this was the case. Midbrain dopaminergic structures are too deep in the brain to be detected in electrical recordings at scalp surface, but reinforcement-monitoring aspects of the dopamine system extend to the anterior cingulate cortex (Holroyd and Coles, 2002), consistent with the known connectivity between these areas (Williams and Goldman-Rakic, 1993). Reward processing in anterior cingulate cortex can be indexed in a midline anterior ERP component known as the medial frontal negativity (MFN). The MFN is apparent from 200 to 300 ms after reward feedback, and it appears to reflect activity involved in the assessment of motivational impact (Gehring and Willoughby, 2002). The MFN is generally larger for low-magnitude reward, as was the case in the present study (Fig. 6a) (t(13) = 4.20, p = 0.001; all MFN analyses are based on mean amplitude 275–295 ms after stimulus).
Figure 6b presents the scalp topography of the difference in ERP activity elicited by high-magnitude versus low-magnitude reward observed across the peak interval of the MFN (275–295 ms). This topography suggests that differences in MFN co-occurred with change in posterior cortical processing. Examination of ERPs elicited over visual cortex suggested that the posterior effect may reflect an occipital selection negativity (Harter and Aine, 1984). This is consistent with the idea that the attentive response to reward feedback varied as a function of the magnitude of reward denoted. We created an inverse dipole model (Scherg, 1992) of the difference in activity observed across the peak of the MFN (275–295 ms) to isolate anterior reward-related processing from concurrent activity in visual cortex. The best-fitting unconstrained model suggested four discrete sources (residual variance, 3.69%) (Fig. 7a). Two sources were located in occipital cortex. Another source was located in extracortical space in the vicinity of the right eye and likely reflects a combination of residual eye movement activity in the ERP and reward processing in orbitofrontal cortex (Rolls, 2000). A final source was located on the border of the anterior cingulate and medial frontal gyrus. By applying this model to the complete ERP interval, we calculated an activity waveform for this cingulate source (Fig. 7b) that was markedly similar in terms of onset and peak latency to the raw difference in MFN (Fig. 6a).
To determine whether anterior cingulate cortex was involved in creating the influence of reward on perceptual and attentional processing, we examined the relationship between the modeled cingulate activity and the impact of high-magnitude reward on behavior. We applied the dipole model to individual participant data, creating per-participant versions of the cingulate source waveform, and measured activity of the cingulate source for each participant across the peak interval identified in the grand-average MFN waveform (275–295 ms) (identified by broken box in Figs. 6a, 7b). We also quantified the behavioral impact of high-magnitude reward for each participant by calculating the mean latency difference between swap and no-swap responses. These measures correlated strongly (Fig. 7c) (Spearman's ρ = 0.618, p = 0.021). A similar relationship was identified when the MFN difference observed in the raw ERPs was used rather than the dipole model output (Fig. 7d) (Spearman's ρ = 0.662, p = 0.020).
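Spearman's ρ is the Pearson correlation computed on ranks, which makes it insensitive to the exact scaling of the neural and behavioral measures. The following is a minimal standard-library sketch of the statistic (tied values receive their average rank); it is not the software used for the reported analysis, and the example values are invented for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    return pearson_r(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical per-participant source amplitudes and swap costs (ms).
    amplitude = [1.0, 2.0, 3.0, 4.0, 5.0]
    swap_cost = [5.0, 6.0, 7.0, 8.0, 7.0]
    print(spearman_rho(amplitude, swap_cost))
```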
These correlations were primarily driven by variability in the cingulate response to high-magnitude reward. They show that as the difference between cingulate activity elicited by low- and high-magnitude reward decreases—driven by the increasing amplitude of the signal elicited by high-magnitude reward—the impact of high-magnitude reward on behavior becomes stronger. We interpret this as evidence of individual differences in sensitivity to the motivational impact of positive feedback. Participants who are sensitive to positive feedback—as reflected in a larger cingulate response to high-magnitude reward—show a correspondingly larger effect of reward on vision.
The results demonstrate that reward has a direct, non-volitional impact on human perception and attention that is independent of its impact on strategy and endogenous attentional control. Behavioral measures show that participants are fast to respond to a target characterized by a color recently associated with high-magnitude reward but slow to respond to a target when a distractor characterized by this color competes for attentional resources. In contrast, participants are slow to respond to a target associated with low-magnitude reward but fast to respond when it is the distractor color that has been associated with suboptimal outcome. This behavioral pattern is evident even when participants are aware that a strategy to select objects characterized by reward-associated features would be counterproductive and a much better strategy is made available to them. Electrophysiological measures confirm that this reflects changes in perceptual and attentional processing: the P1 ERP component is enhanced contralateral to an object characterized by a reward-associated color, reflecting facilitated perceptual activity, and this stimulus elicits an N2pc, indexing the deployment of attention to its location. Critically, these effects are observed regardless of whether the stimulus is the search target: when the salient distractor is characterized by the reward-associated color, visual resources are allocated to this task-irrelevant object.
Processing of reward feedback elicits an ERP component known as the MFN, and we find that the magnitude of the MFN elicited by positive feedback predicts the behavioral impact of reward on visual search on a per-participant basis. The MFN is thought to reflect neural processing involved in the evaluation of the motivational impact of an event (Gehring and Willoughby, 2002) and has been linked to other mediofrontal ERP components elicited by the commission of errors or feedback indicating erroneous performance (Holroyd et al., 2002). In general, these mediofrontal components are thought to reflect cortical processing in a system that involves midbrain dopamine neurons (Holroyd and Coles, 2002). Our experimentation was motivated by theoretical interpretations of the role of dopamine in animal approach behavior that suggest that reward-related mesolimbic dopamine acts to facilitate perceptual and attentional processing of stimuli with reward-conditioned features (Schultz et al., 1997; Berridge and Robinson, 1998; Ikemoto and Panksepp, 1999; Redgrave et al., 1999; Wise, 2004; Alcaro et al., 2007). Given the idea that the MFN might constitute an indirect index of activity in this system, we believe that the correlation between anterior cingulate activity and the impact of reward on performance reflects an underlying relationship between mesencephalic reward processing and activity in visual cortex. On this account, reward-related activity in the dopamine system initiates a series of events, one stage of which involves the anterior cingulate, that eventually leads to changes in sensory representation.
There is substantial circumstantial evidence for this idea in the connectivity and behavior of dopaminergic neurons. Dopaminergic nuclei such as the substantia nigra and ventral tegmental area project diffusely to the basal ganglia and cortex; in primates, the greatest density of cortical terminals is in medial frontal cortex, including the superior frontal gyrus and anterior cingulate (Williams and Goldman-Rakic, 1993). Anterior cingulate and surrounding cortex are known to be fundamentally involved in the control of attention and processing of attended stimuli (Mesulam, 1999; Hopfinger et al., 2000). Midbrain dopamine neurons themselves show a pattern of activity consistent with the creation of associations between stimuli and outcome: cells become active when unexpected reward is encountered but do not respond to expected reward and become less active than normal when an expected reward fails to materialize (Schultz et al., 1997; Schultz, 2002). Finally, recent results have demonstrated that reward expectation is represented in the activity of individual cells in primary visual cortex in the rat (Schuler and Bear, 2006), suggesting the existence of neural architecture necessary for the translation of reward processing to sensory modulation in low-level visual cortex.
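The three-part response pattern of midbrain dopamine neurons described above is often summarized as a reward prediction error. The following is a minimal sketch using a simple delta-rule (Rescorla-Wagner style) update, offered as a common textbook simplification rather than a model used in the present study. The function names, the learning rate, and the reward values are assumptions chosen for illustration.

```python
def prediction_error(reward, expected):
    """Difference between the reward received and the reward expected."""
    return reward - expected


def learn_expectation(rewards, learning_rate=0.2, initial=0.0):
    """Update a reward expectation trial by trial with a delta rule.

    Returns the final expectation and the prediction error on each trial.
    """
    expected = initial
    errors = []
    for reward in rewards:
        delta = prediction_error(reward, expected)
        errors.append(delta)
        expected += learning_rate * delta
    return expected, errors


# A stimulus is followed by a reward of 1.0 on 50 consecutive trials.
expected, errors = learn_expectation([1.0] * 50)

first_error = errors[0]   # unexpected reward: large positive error
late_error = errors[-1]   # fully expected reward: error close to zero
omission_error = prediction_error(0.0, expected)  # expected reward withheld: negative error

print(first_error, late_error, omission_error)
```

Under these assumptions, a positive error corresponds to increased firing to unexpected reward, an error near zero to the absence of a response to expected reward, and a negative error to the dip in activity when an expected reward fails to materialize.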
Results from the current study demonstrate that human vision operates according to principles that are strikingly similar to those that underlie approach behavior in non-primate animals (for review, see Berridge and Robinson, 1998; Ikemoto and Panksepp, 1999). Even a bee that has found a flower that is rich with nectar will spend the day searching for flowers of the same color (Menzel and Muller, 1996). This close correspondence across species suggests that the brain structures involved should be phylogenetically old and thus present in the brains of very different animals. The relationship of the impact of reward on attention to the anterior cingulate cortex is consistent with this; cingulate cortex is thought to have developed either before neocortex or shortly thereafter (Allman et al., 2001). The dopaminergic structures in the midbrain, which we suggest underlie the observed activity in cingulate cortex, are even older, long predating neocortex (Marín et al., 1998).
We emphasize the phylogenetic age of the brain areas responsible for the impact of reward on attention because we believe that the pattern identified in the present study reflects the action of a very old cognitive mechanism. Early in the evolution of the brain, a visual bias toward stimuli characterized by reward-conditioned features likely constituted the sole source of attentional control. This bias may continue to play a primary role in attentional control in animals with brains less complex than those of primates. Such a bias has clear adaptive benefits; environmental stimuli that have garnered primary rewards such as food are very likely to do so again in the future, and thus attending to them makes sense. Humans have acquired the ability to select stimuli in the absence of immediate external reinforcement, but the development of this ability does not necessarily preclude the continuing influence of the older system.
The current study complements and significantly extends a growing literature examining the impact of reward on the immediate deployment of attention and attentional learning in humans. In a previous study, lingering effects of attentional suppression of a distracting stimulus, known as negative priming, were only found after high rewards, indicating that persisting inhibition of visual representations is abolished by poor outcomes (Della Libera and Chelazzi, 2006). In a subsequent study designed to explore long-term effects of rewards, human participants became more efficient at selecting targets consistently associated with high-magnitude reward but relatively inefficient at ignoring the same stimuli when shown as distractors (Della Libera and Chelazzi, 2009). Interestingly, the ability to ignore a given distractor also improved when this was consistently followed by high (as opposed to low) rewards, whereas the ability to select the same items as targets became relatively impaired. Finally, the consistent association of stimuli with reward can improve their detectability, rendering them relatively immune to the attentional blink (Raymond and O'Brien, 2009).
As noted in the Introduction, studies investigating the effects of reward expectancy on attentional deployment are radically different from the present work because this manipulation does not allow for the dissociation of strategic and automatic effects related to reward (Maunsell, 2004). Nonetheless, previous work found an enhanced and earlier N2pc (the ERP component used as an index of selective attention in the present study) when elicited by targets associated with the expectation of high rewards (Kiss et al., 2009). Animal electrophysiology is just beginning to explore modulations of visual processing that result from controlled stimulus–reward associations (Schuler and Bear, 2006; Peck et al., 2009; Frankó et al., 2010), but the exact link to the present findings is still unclear. Despite the recent surge of interest, and as noted in the Introduction, the impact of reward on attention is not a prominent factor in models of visual search and attention (with the notable exception of Navalpakkam et al., 2009). It is clear that, in future refinements of attentional theory, the role of reward will need to be given full consideration.
In summary, the present results provide evidence that reward has a direct impact on human vision that is independent of its role in strategy and endogenous attentional set. Our results suggest that the anterior cingulate cortex—a cortical expression of the mesolimbic dopamine system—plays a crucial role in this source of attentional control.
L.C. is supported by Fondazione Cariverona.
Correspondence should be addressed to Clayton Hickey, van der Boechorststraat 1, 1055 AB Amsterdam, The Netherlands.