Abstract
Theories of reinforcement learning and approach behavior suggest that reward can increase the perceptual salience of environmental stimuli, ensuring that potential predictors of outcome are noticed in the future. However, outcome commonly follows visual processing of the environment, occurring even when potential reward cues have long disappeared. How can reward feedback retroactively cause now-absent stimuli to become attention-drawing in the future? One possibility is that reward and attention interact to prime lingering visual representations of attended stimuli that sustain through the interval separating stimulus and outcome. Here, we test this idea using multivariate pattern analysis of fMRI data collected from male and female humans. While in the scanner, participants searched for examples of target categories in briefly presented pictures of cityscapes and landscapes. Correct task performance was followed by reward feedback that could randomly have either high or low magnitude. Analysis showed that high-magnitude reward feedback boosted the lingering representation of target categories while reducing the representation of nontarget categories. The magnitude of this effect in each participant predicted the behavioral impact of reward on search performance in subsequent trials. Other analyses show that sensitivity to reward—as expressed in a personality questionnaire and in reactivity to reward feedback in the dopaminergic midbrain—predicted reward-elicited variance in lingering target and nontarget representations. Credit for rewarding outcome thus appears to be assigned to the target representation, causing the visual system to become sensitized for similar objects in the future.
SIGNIFICANCE STATEMENT How do reward-predictive visual stimuli become salient and attention-drawing? In the real world, reward cues precede outcome and reward is commonly received long after potential predictors have disappeared. How can the representation of environmental stimuli be affected by outcome that occurs later in time? Here, we show that reward acts on lingering representations of environmental stimuli that sustain through the interval between stimulus and outcome. Using naturalistic scene stimuli and multivariate pattern analysis of fMRI data, we show that reward boosts the representation of attended objects and reduces the representation of unattended objects. This interaction of attention and reward processing acts to prime vision for stimuli that may serve to predict outcome.
Introduction
Reward-predictive stimuli become salient and draw attention even when this is strategically counterproductive (Bromberg-Martin and Hikosaka, 2009; Hickey et al., 2010a, b; Anderson et al., 2011; Hickey and van Zoest, 2012; Della Libera and Chelazzi, 2009). In the same time frame, the neural response to reward shifts: initially a response to reward itself, it comes to be triggered by cues indicating the potential for reward (e.g., Schultz et al., 1997). One account of approach behavior, the “incentive salience hypothesis,” suggests a direct relationship between these observations. In this hypothesis, the neural response to reward initially sensitizes the visual system to proximal cues, causing them to draw attention. When still-earlier cues become available in the environment, the process iterates: the “reward” response, now triggered by the proximal cue, causes the still-earlier cue to itself become attention-drawing. Ultimately, this ensures that the animal is sensitive to the earliest predictors of reward in the environment. These draw attention, ensuring that the information that they convey guides behavior (Berridge and Robinson, 1998).
In the laboratory and in the real world, reward cues precede actual outcome. A key challenge for the incentive salience hypothesis is therefore to solve the “credit assignment problem” (Sutton and Barto, 1998). How is the representation of environmental stimuli impacted by feedback later in time? We and others have suggested that this might occur through the potentiation of lingering visual representations of environmental stimuli (Hickey et al., 2010a; Roelfsema et al., 2010; Weil et al., 2010; Schiffer et al., 2014; Hickey and Peelen, 2015). In this view, the neural response to reward triggers a process in which the representations of recently attended objects are “stamped in” to visual cortex. Vision becomes sensitized for these stimuli, making them salient and attention-drawing.
Here, we used fMRI to test this hypothesis in naturalistic human vision. We had participants search for examples of target categories—cars, people, and trees—in briefly presented images of real-world scenes (Fig. 1A). When participants completed this task accurately, we rewarded them with points that had cash value. Our interest lay in how the magnitude of reward affected lingering representations of objects that had been present in the scene. Crucially, many scenes contained objects of two types: targets and nontargets. This allowed us to test for a selective effect of reward outcome on previously attended versus ignored objects.
Figure 1. A, Example of experimental trial. B, Analytic approach.
In our design, reward feedback was randomly determined for each trial: so long as participants responded correctly, they were equally likely to receive 1 point as 100 points and they were explicitly aware of this fact. This design feature requires some explanation. When reward is linked to a discrete category, for example, if detecting “people” in a scene always results in high-magnitude reward, then humans and other animals will look out for these objects and this involves the establishment of top-down attentional set. Attentional set changes how stimuli are encoded and, though interesting in its own right, this effect is theoretically distinct from the direct, low-level, and nonstrategic impact of reward feedback on already-encoded representations that is the focus of the current study (Hickey et al., 2010a; Maunsell, 2004). Our use of random reward magnitude made it impossible for participants to establish attentional set for reward-predictive stimuli because stimuli had no predictive value. Accordingly, we were able to isolate the discrete low-level effect of reward feedback on object representations.
To index variance in the quality of object representations, we calculated measures of category information using multivariate pattern analysis (MVPA) of fMRI data (Seidl et al., 2012; Peelen and Kastner, 2011; Hickey and Peelen, 2015). Our technique involves a comparison of the scene-elicited voxelwise pattern in ventral visual cortex to benchmark data patterns in the same area collected during an independent localizer (Fig. 1B). In the localizer, people viewed isolated examples of the object categories of interest. If the scene-elicited pattern is similar to one or more of these benchmark patterns, we can infer that examples of the corresponding category are being strongly represented in visual cortex.
Our expectation was that high-magnitude reward would selectively boost the representation of previously attended objects, as reflected in an increase in the amount of category information in ventral visual cortex.
Materials and Methods
Procedures.
All procedures were approved by the ethics committee of the University of Amsterdam Department of Psychology.
Participants.
Twenty healthy volunteers with normal or corrected to normal vision gave informed consent before beginning the experiment and were financially compensated for their participation (seven male, one left-handed woman, mean 23.6 years ± 2.3 SD).
Experimental stimuli and design.
Before entering the scanner, each participant completed a native language version of the behavioral inhibition system/behavioral activation system (BIS/BAS) scale questionnaire (English: Carver and White, 1994; Dutch: Franken et al., 2005).
In the scanner, an experimental session consisted of six scanner runs of 500 s duration, each composed of six blocks of 20 trials. A run began with a 15 s fixation interval and ended with a 30 s fixation interval; 10 s fixation intervals occurred between each block. A trial (3.367 s) began with a fixation interval (800 ms) followed by brief presentation of a scene (50 ms; 3° × 4° visual angle), a mask (333 ms), the reappearance of fixation (900 ms), and reward feedback (1284 ms; Fig. 1). Natural scene images (n = 240) were selected from an online database (Russell et al., 2008) and rendered in black and white. Scenes contained examples of cars, trees, and people alongside a variety of other objects and textures. Of these three stimulus categories, two were identified during task instruction as target categories (T1 and T2 categories). The left response on an MR-compatible button box indicated the presence of an example of the T1 category and a right response an example of T2. The third category was task irrelevant and was not identified or discussed with the participant before experimental participation (nontarget category). The identity of the T1, T2, and nontarget categories was counterbalanced across participants and participants reported whether each scene contained examples of either of the target categories.
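The trial and run durations stated above can be checked with simple arithmetic. The following is a minimal sketch, assuming that the 10 s fixation intervals occur only between consecutive blocks (five per run); the function and variable names are illustrative and do not come from the original analysis scripts.

```python
# Event durations in milliseconds, as listed in the trial description.
TRIAL_EVENTS_MS = {
    "fixation": 800,
    "scene": 50,
    "mask": 333,
    "fixation_return": 900,
    "feedback": 1284,
}


def trial_duration_s(events_ms=TRIAL_EVENTS_MS):
    """Total duration of one trial in seconds."""
    return sum(events_ms.values()) / 1000.0


def run_duration_s(n_blocks=6, trials_per_block=20,
                   lead_in_s=15.0, lead_out_s=30.0, between_block_s=10.0):
    """Total duration of one run in seconds.

    Assumption: one fixation interval between each pair of
    consecutive blocks, so n_blocks - 1 intervals per run.
    """
    task = n_blocks * trials_per_block * trial_duration_s()
    gaps = (n_blocks - 1) * between_block_s
    return lead_in_s + task + gaps + lead_out_s


print(trial_duration_s())          # 3.367
print(round(run_duration_s(), 2))  # 499.04
```

Under these assumptions a run lasts about 499 s, which agrees with the stated 500 s run duration.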
In each experimental block, four scenes contained one or more examples of T1 without examples of T2 or nontarget, four contained T1 with one or more examples of the nontarget, four contained T2 without examples of T1 or nontarget, four contained T2 with one or more examples of the nontarget, and four contained no example of T1, T2, or the nontarget category. To remove the possibility of ordering effects, the order of trials was randomized within a block. Participants were instructed to make no response when the scene did not contain a target; they saw each scene three times throughout the course of the experiment and scenes were randomly selected from each of five scene groups without replacement such that all 240 images were presented before the selection procedure reset. Scenes were masked with one of 48 images created by generating white noise at varying spatial frequencies and superimposing a naturalistic texture.
During the feedback interval of each trial, points with cash value were awarded to the participant. Correct task performance resulted in the receipt of either 001 or 100 points, presented centrally, with incorrect performance always resulting in the loss of 50 points. However, because reward magnitude was randomly determined for each correct trial, total pay was determined solely by task accuracy and all participants received between €40 and €50 at the conclusion of the experiment.
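The feedback rule can be summarized in a few lines of code. This is a hedged sketch of the rule as described in the text, not the presentation software used in the experiment; `feedback_points` and `expected_points` are hypothetical names, and the 50/50 split follows from the statement that 1 and 100 points were equally likely.

```python
import random

HIGH_POINTS = 100
LOW_POINTS = 1
ERROR_POINTS = -50


def feedback_points(correct, rng=random):
    """Points awarded on one trial.

    Correct trials yield high or low reward with equal probability,
    independent of the stimulus; errors always cost 50 points.
    """
    if not correct:
        return ERROR_POINTS
    return HIGH_POINTS if rng.random() < 0.5 else LOW_POINTS


def expected_points(accuracy):
    """Expected points per trial for a given proportion of correct trials."""
    mean_correct = (HIGH_POINTS + LOW_POINTS) / 2
    return accuracy * mean_correct + (1 - accuracy) * ERROR_POINTS
```

Because reward magnitude does not depend on scene content, the expected payoff in this sketch is a function of accuracy alone, which is the property the design relies on.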
Object-selective cortex (OSC) localizer.
Two localizer experiments followed the primary experimental session. The first of these was designed to identify OSC and comprised two scanner runs of 336 s duration, each containing 20 blocks of 20 trials. Each trial began with fixation (383 ms) followed by a central image of either an isolated everyday object on a white background or a pixel-scattered version of such an image (383 ms; 3° × 3°). Participants monitored for repetition of an image, which occurred twice in each block. Block order was counterbalanced across runs and each run began and ended with a 15 s fixation block; fixation blocks additionally occurred after every four experimental blocks.
An OSC region of interest (ROI) was defined for each subject in native space by contrasting activity evoked by intact and scrambled objects and subsequently transformed to Talairach space. ROIs were generated by identifying occipital and temporal voxels in the ventral visual stream where this contrast garnered uncorrected p values <0.05. Mean OSC size was ∼87 cm³ (3209 voxels) ± ∼54 cm³ SD (2017 voxels).
Category localizer.
The second localizer experiment identified benchmark patterns of voxel activation in OSC associated with each of the three stimulus categories used in the main experiment. It comprised two runs of 380 s duration, each made up of 20 blocks of 20 trials and five fixation blocks. Each trial contained a fixation period (367 ms) followed by a central image (383 ms; 3° × 3°) of an isolated car, tree, or headless human body on a white background (see Fig. 1 for examples). Participants monitored for stimulus repetition, all trials in a block contained images from the same category, and every fourth block was a fixation block. Block order was counterbalanced across runs such that mean serial position of each condition was equal. Images of people were headless because, in the scenes used in the main experiment, people were commonly too small to resolve the face visually and we did not want the localizer data to be driven by face processing if this did not constitute a primary response to the scenes themselves.
Data acquisition and preprocessing.
Imaging data were collected with a 3 T Philips Achieva XT MRI scanner using a 32 channel head-coil (functional data: echo planar imaging, 37 slices, 3 × 3 × 3 mm voxel size with 0.3 mm gap, repetition time (TR) = 2 s, echo time (TE) = 27.68 ms, flip angle (FA) = 76.1°; structural data: T1-weighted MPRAGE, 220 slices, 1 × 1 × 1 mm voxel size, 240 × 240 matrix, TR = 8.2 ms, TE = 4.38 ms, FA = 8°). Functional data were slice time and motion corrected, low-frequency drift was removed with a 0.0006 Hz high-pass filter, and structural and functional data were transformed to Talairach space. Before transformation, results were spatially smoothed with a 6 mm full-width at half-maximum Gaussian kernel. This degree of smoothing has been found to improve correlation-based MVPA analysis (Op de Beeck, 2010). Data analysis relied on the AFNI software package (Cox, 1996), CoSMoMVPA toolbox (Oosterhof et al., 2016), and custom MATLAB (The MathWorks) and shell scripts.
Data analysis.
Initial analyses of category localizer and experimental imaging data were similar and began with the creation of general linear models (GLMs) for each participant with conditional predictors for each correctly completed trial. The experiment was motivated by the idea that reward may have an impact on the lingering representation of objects that had been present in the now-absent scenes. Therefore, in the GLM analysis of experimental results, predictors were time locked to the onset of reward feedback, not the onset of the scene. Predictors were convolved with a standard model of the hemodynamic response function with additional regressors to account for changes in mean signal across scanning runs and for head motion. For pattern analysis, the resulting t values were normalized by subtracting, for each voxel, the mean value observed in that voxel across conditions (Haxby et al., 2001). This eliminates voxelwise differences in hemodynamic response that are unrelated to experimental manipulations while retaining conditional variability.
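The voxelwise normalization described above amounts to mean-centering each voxel across conditions. A minimal sketch in NumPy follows, assuming the t values are arranged as a conditions × voxels array; the array layout and the function name `normalize_patterns` are our assumptions, not details taken from the original MATLAB scripts.

```python
import numpy as np


def normalize_patterns(t_values):
    """Subtract each voxel's mean across conditions.

    t_values: array of shape (n_conditions, n_voxels).
    Returns an array of the same shape in which every voxel (column)
    has zero mean across conditions, so only condition-specific
    variability remains.
    """
    t_values = np.asarray(t_values, dtype=float)
    return t_values - t_values.mean(axis=0, keepdims=True)


# Toy example: two conditions, two voxels.
patterns = normalize_patterns([[1.0, 2.0],
                               [3.0, 6.0]])
print(patterns)  # [[-1. -2.] [ 1.  2.]]
```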
Patterns of normalized t value observed in OSC during the experiment were correlated with patterns of normalized t value observed in the same area in the category localizer. Each condition of the main experiment thus had three associated values describing the similarity of the scene-elicited data to the car, tree, or person benchmarks derived from the localizer experiment (Fig. 1B). These correlations were Fisher transformed and organized as a function of whether each of the three categories was present in the scene, whether it acted as target or nontarget, and whether high or low magnitude reward was received in that trial. Category information was computed for categories that were present in the scene by subtracting the correlation between scene pattern and the benchmark for the target category that was absent from the scene from the correlation between scene pattern and benchmark for the respective stimulus present in the scene. For example, for a scene containing a person target and tree nontarget, identification of category information about people would begin with calculation of the correlation between scene pattern and people benchmark. The correlation between scene pattern and cars benchmark was subsequently subtracted. Similarly, identification of information specific to the tree nontarget would begin with correlation between scene pattern and tree benchmark followed by subtraction of correlation between scene pattern and cars benchmark.
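The category information measure can be illustrated with a short sketch. It assumes that each pattern is a one-dimensional vector of normalized t values over the same OSC voxels; the function names and the simulated data are hypothetical and serve only to show the subtraction of Fisher-transformed correlations described above.

```python
import numpy as np


def fisher_z_corr(a, b):
    """Pearson correlation between two voxel patterns, Fisher transformed."""
    r = np.corrcoef(a, b)[0, 1]
    return np.arctanh(r)


def category_information(scene, present_benchmark, absent_benchmark):
    """Similarity to the benchmark of a category present in the scene,
    minus similarity to the benchmark of the absent category."""
    return (fisher_z_corr(scene, present_benchmark)
            - fisher_z_corr(scene, absent_benchmark))


# Simulated example: a scene pattern that resembles the "people" benchmark.
rng = np.random.default_rng(1)
people = rng.normal(size=200)
cars = rng.normal(size=200)
scene = 0.8 * people + 0.2 * rng.normal(size=200)

# Positive values indicate information about the present category.
print(category_information(scene, people, cars) > 0)  # True
```

In the example from the text, `present_benchmark` would be the people (target) or tree (nontarget) pattern and `absent_benchmark` the cars pattern.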
Results
Multivoxel fMRI
Primary analysis began with examination of the impact of high-magnitude reward on target and nontarget category information in object selective cortex (OSC). As illustrated in Figure 2A, information about the target category increased after high-magnitude reward, both when the target was presented alongside an example of the irrelevant nontarget (solid red line) and when this nontarget was absent (broken red line). In contrast, information about the nontarget category decreased after reward (solid blue line). In a repeated-measures ANOVA of trials where both target and nontarget were present, this pattern expressed as a significant interaction (F(1,19) = 10.476, p = 0.004, ηp² = 0.355; main effects F < 1). Follow-up contrasts revealed a significant effect of reward on both target information (t(19) = 3.639, p = 0.002, Cohen's d = 0.825, Morris and DeShon, 2002) and nontarget information (t(19) = −2.260, p = 0.036, Cohen's d = −0.506).
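As a consistency check on the interaction effect size, partial eta squared can be recovered from the F statistic and its degrees of freedom. The sketch below uses the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error); the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Interaction reported in the text: F(1,19) = 10.476.
print(round(partial_eta_squared(10.476, 1, 19), 3))  # 0.355
```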
Figure 2. A, Results from pattern analysis in OSC. Reward differentially modulated the representations of targets and nontargets. B, Correlation of reward effect on OSC information with reward effect on behavior. The OSC effect is calculated as the point estimate of category information interaction of the solid red and blue lines in A. The behavioral effect reflects the impact of high-magnitude versus low-magnitude reward on performance in the next immediate trial, with the sole constraint that the nontarget category was present in each of these two different scenes. Error bars reflect within-participant SEM (Cousineau, 2005).
Behavior
High-magnitude reward appears to have caused a sharpening of the lingering target representation in OSC relative to the representation of other objects in the scene. To determine whether this had an impact on behavior, we looked to sequential effects on search performance, examining the impact of reward feedback in one trial on accuracy in the next (Hickey et al., 2010a, b; Hickey and Los, 2015; Hickey et al., 2015). When we examined trials in which the nontarget was present in both instances, high-magnitude reward in one trial was nominally associated with a small and nonsignificant cost to search accuracy in the next (77.1% to 75.6%; t(19) = 0.552, p = 0.587, Cohen's d = 0.124). However, as illustrated in Figure 2B, individual variability in this behavioral effect was predicted by the strength of reward's influence on stimulus representation in OSC (as measured in a point estimate of the interaction; r(19) = 0.459, p = 0.042). Pearson correlation values are sensitive to extreme values, so we additionally conducted a Studentized bootstrap analysis of this correlation (with 10⁴ permutations in outer bootstrap and 100 permutations in inner bootstrap). This analysis, which is much less sensitive to extreme values, also identified a reliable effect (rboot = 0.424, pboot = 0.041). Participants who showed a strong reward-related increase in the target representation relative to the nontarget representation thus showed a reduced cost to task accuracy (or even a benefit) on the second of sequential trials when the nontarget was present in both scenes. An independent analysis of trials in which the nontarget was absent in the second scene revealed no corresponding relationship (r(19) = 0.161; p = 0.499).
In sum, those participants whose imaging data showed a strong and selective effect of reward on lingering target representations relative to nontarget representations also demonstrated the greatest sequential benefits of reward on task performance. This appears to be a product of the variation in OSC representation of stimuli present in the first scene, rather than a more generic influence of reward because the relationship emerged only when we examined performance for scenes in which a nontarget object category had been repeated.
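For readers unfamiliar with the approach, the logic of a Studentized (bootstrap-t) test of a correlation can be sketched as follows. This is one common formulation and not necessarily the exact procedure used here: the outer bootstrap resamples participants, the inner bootstrap estimates the standard error of each outer resample, and the observed statistic is compared against the resulting distribution of Studentized deviations. All names, default values, and the two-sided p-value definition are our assumptions.

```python
import numpy as np


def _corr(x, y):
    return np.corrcoef(x, y)[0, 1]


def _boot_corrs(x, y, n_boot, rng):
    """Correlations from n_boot resamples (with replacement) of paired data."""
    n = len(x)
    idx = rng.integers(0, n, size=(n_boot, n))
    return np.array([_corr(x[i], y[i]) for i in idx])


def studentized_bootstrap_corr(x, y, n_outer=1000, n_inner=50, seed=0):
    """Bootstrap-t test of a Pearson correlation against zero.

    Returns (observed r, bootstrap standard error, two-sided p value).
    """
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    r_obs = _corr(x, y)
    outer_r = np.empty(n_outer)
    t_star = np.empty(n_outer)
    for b in range(n_outer):
        i = rng.integers(0, n, size=n)      # outer resample of participants
        xb, yb = x[i], y[i]
        outer_r[b] = _corr(xb, yb)
        # Inner bootstrap: standard error of this resample's correlation.
        se_b = _boot_corrs(xb, yb, n_inner, rng).std(ddof=1)
        t_star[b] = (outer_r[b] - r_obs) / se_b
    se_obs = outer_r.std(ddof=1)
    t_obs = r_obs / se_obs                  # Studentized statistic under H0: rho = 0
    ok = np.isfinite(t_star)                # drop degenerate resamples, if any
    p = float(np.mean(np.abs(t_star[ok]) >= abs(t_obs)))
    return float(r_obs), float(se_obs), p
```

With 20 participants, calling `studentized_bootstrap_corr(osc_effect, behavior_effect, n_outer=10_000, n_inner=100)` would match the resampling counts stated in the text, where `osc_effect` and `behavior_effect` are hypothetical per-participant vectors.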
Whole-brain correlation analysis
To identify brain areas involved in instantiating the reward effects in visual cortex described above, we adopted a whole-brain correlational technique that we have used in earlier work (Hickey and Peelen, 2015). This analysis began with a contrast of results for each participant across conditions in which feedback indicated high-magnitude versus low-magnitude reward outcome. For each voxel in brain space, we subsequently calculated Pearson correlation coefficients for the relationship between reward effect in that voxel and the impact of high-magnitude reward on target and nontarget information in OSC. This allowed us to identify voxels for which the strength of the univariate response to high-magnitude reward predicted the impact of reward on the target and nontarget representations in OSC across participants.
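The voxelwise step of this analysis can be sketched compactly. The example assumes a participants × voxels array holding each participant's high-minus-low reward contrast and a vector holding each participant's reward effect on OSC category information; these names and shapes are illustrative and are not taken from the AFNI or MATLAB pipeline.

```python
import numpy as np


def voxelwise_pearson(reward_contrast, osc_effect):
    """Across-participant Pearson correlation at every voxel.

    reward_contrast: array (n_participants, n_voxels), univariate
        high-minus-low reward response per participant and voxel.
    osc_effect: array (n_participants,), effect of high-magnitude
        reward on category information in OSC.
    Returns an array (n_voxels,) of correlation coefficients.
    """
    x = np.asarray(reward_contrast, dtype=float)
    y = np.asarray(osc_effect, dtype=float)
    xc = x - x.mean(axis=0)           # center each voxel across participants
    yc = y - y.mean()
    num = xc.T @ yc                   # covariance numerator per voxel
    den = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return num / den
```

The resulting map would then be thresholded or corrected for multiple comparisons in a separate step.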
This approach identified a small set of voxel clusters (Table 1), including areas in the vicinity of the substantia nigra (SN), ventral tegmental area (VTA), and nucleus accumbens (NAcc) (Fig. 3). Critically, correlations in SN/NAcc clusters did not appear to differ as a function of whether category information for targets or nontargets was examined. When we conducted equivalent analysis relating voxelwise activation to the specific increase of target representation, as reflected in a point estimate of the interaction illustrated in Figure 2A, no voxels showed a reliable relationship (all p > 0.0001). Activity in these midbrain areas therefore did not predict the sharpening of target representation described above and illustrated in Figure 2A, but rather appears to predict an increase in the strength of representation for both target and nontarget information.
Table 1. Results from whole-brain correlation analysis
Figure 3. A, Results from whole-brain correlation of reward response to reward effect on OSC target information. Marked in broken outline are the bilateral NAcc (anterior) and SN (posterior). Identified voxels show a relative increase in activity in response to high-magnitude reward feedback that predicts the reward-related benefit to target representation calculated across participants. Results in A are FDR corrected for multiple comparisons (Benjamini and Hochberg, 1995). B, Equivalent analysis with nontarget information. Results in B are not corrected for multiple comparisons, but are thresholded at p < 0.0001.
Personality inventory
Before beginning the experiment, all participants completed a personality inventory, the BIS/BAS scale (Carver and White, 1994). This scale is composed of 24 statements and participants rate their agreement to each statement on a four-point scale. The BAS subscale of this measure is thought to index a reward-sensitive motivational system that underlies approach behavior and it loads on agreement with statements such as “I go out of my way to get things I want” and “When I get something I want, I feel excited and energized.” The BIS subscale rather measures a punishment-sensitive system that underlies aversion and avoidance of negative situations and outcomes and it loads on statements such as “Criticism and scolding hurts me quite a bit” and “I worry about making mistakes.”
Consistent with results from our prior work (Hickey et al., 2010b; Hickey and Peelen, 2015), high BAS scores predicted the impact of high-magnitude reward on target category information in OSC (r = 0.567, p = 0.009; Fig. 4A), much as was observed in analysis of SN and NAcc activation described above. This relationship also emerged in analysis of the effect of high-magnitude reward on nontarget category information (r = 0.467, p = 0.038; Fig. 4B). Again, we calculated Studentized bootstrap tests of correlation to ensure that these relationships were not driven by extreme values; this suggested that, although the target relationship is reliable (rboot = 0.546, pboot = 0.016), the nontarget relationship should be interpreted carefully (rboot = 0.447, pboot = 0.059). No relationship with BIS was identified (target: r = −0.116, p = 0.628; nontarget: r = −0.239, p = 0.310). This does not appear to be a product of increased task motivation because BAS did not predict general task performance (r = −0.02 for the correlation of BAS to cross-conditional task accuracy, r = −0.07 for the correlation of BAS to cross-conditional reaction time). BAS and mean reward activation in the midbrain voxels identified in Figure 3 and Table 1 were positively correlated (r = 0.515, p = 0.020). Those participants with a reward-driven personality thus show an increase in sensitivity to reward feedback such that high-magnitude reward boosts the representation of both target and nontarget stimuli present in the scene.
Figure 4. A, Relationship between participant BAS score and the change in target information caused by high-magnitude reward. B, Same relationship, but with change in nontarget information.
Discussion
Reward's impact on object representations in visual cortex
How does reward cause real-world stimuli to become salient and attention-drawing? One possibility is that reward primes the lingering representation of attended environmental stimuli in visual cortex, causing vision to become sensitized to these objects. To test this, we had human participants report the presence of examples of target categories in natural scene images (Fig. 1). When participants responded correctly, the scene disappeared and was replaced with feedback indicating reward that randomly had either high or low magnitude. fMRI results showed that high-magnitude reward boosted the representation of targets and diminished the representation of nontarget objects that had been present in the scene (Fig. 2A). Reward thus selectively primed the representation of recently attended stimuli relative to ignored stimuli.
The total magnitude of this effect in each participant predicted the influence reward had on that person's visual search behavior. When examples of a nontarget category appeared in two sequential scenes, participants who showed a strong effect of high-magnitude reward on category information also showed a positive effect of high-magnitude reward on task accuracy in the next trial (Fig. 2B; cf. Weil et al., 2010). This relationship emerged only when scenes contained repeated examples of the same nontarget category, suggesting that a reduction in nontarget category information in one trial caused subsequent examples of this category to become easier to ignore.
When participants received high-magnitude reward, responses in visual cortex were strongly biased toward the target object relative to the nontarget object. Interestingly, after low-magnitude reward, no such bias was observed, so targets and nontargets were equally represented (Fig. 2A). At first glance, this may seem to contradict studies using similar stimuli and analysis to show enhanced processing of targets (Peelen and Kastner, 2014). An important difference, however, is that participants here performed a discrimination task (T1 vs T2), whereas, in previous studies, participants detected the presence of a single cued category. When searching for one category, participants are able to form strong top-down attentional templates that bias the processing of the scene in favor of the target category once the scene appears (Peelen and Kastner, 2014). Our results suggest that such top-down effects are weaker when participants look for multiple categories, which is consistent with behavioral studies (Houtkamp and Roelfsema, 2009; Stein and Peelen, 2017).
Another possible explanation is that the receipt of low-magnitude reward, a suboptimal outcome, was recognized by participants as a loss. We have found recently that loss-associated objects tend to be badly represented in ventral visual cortex, even under circumstances in which they are strategically useful (L. Barbaro, M.V. Peelen, C. Hickey, unpublished data). In the current study, participants did not know whether high- or low-magnitude reward would be received at the moment of scene presentation, suggesting that suboptimal outcome may trigger a reweighting of target and nontarget representations after these have been encoded in the visual system (that is, in visual memory; Gong and Li, 2014; Infanti et al., 2015). This relative down-weighting of loss-associated stimuli may be similar in nature to the inhibition of disgusting objects (Zimmer et al., 2015).
Our main finding of a target-specific effect of reward on visual cortex representation is consistent with other fMRI work investigating the credit assignment problem in vision. Schiffer et al. (2014) elegantly addressed this issue by examining univariate BOLD responses in the fusiform face area (FFA) and the parahippocampal place area (PPA). Participants reported whether they saw a house or a face in images that contained either degraded examples of these stimuli or were pure noise. When participants reported seeing a house or face in a pure noise image and were subsequently rewarded for this response, results showed an activity increase in the corresponding specialized visual area (i.e., increase in PPA when house was reported, increase in FFA when face was reported). This is consistent with the current results in that we also saw an impact of reward on categorical information in ventral visual cortex.
A notable distinction with the current study is that the images used by Schiffer et al. (2014) contained only degraded examples of single objects and participants had to report which of two objects they saw. There is the potential that a strategic decision to make one of two possible responses could have generated a correlate in the visual system even in the absence of any corresponding perceptual experience when the noise stimulus was viewed. This introduces some uncertainty as to whether reward's impact on visual cortex in Schiffer et al. (2014) reflects a change to lingering perceptual representations or a later influence on the decision-making process. In contrast, in the current study, the perceptual experience was unambiguous and we observed effects on the representation of both targets and nontargets. The nontarget effects in particular suggest an impact on visual representations rather than a correlate of postperceptual decision making because these objects had no importance to strategic task responses.
In contrast to this observation of increased reward-related activity in visual cortex, Arsenault et al. (2013) found that reward decreased the representation of a reward-predictive cue. In this monkey fMRI study, animals received liquid reward that was commonly preceded by a cue. However, analysis was focused on trials in which reward was not predicted and thus not preceded by the visual stimulus. Results showed that unanticipated reward of this nature had a robust impact on areas of visual cortex responsible for cue representation (as identified in a separate experimental task). However, surprisingly, the effect direction was negative: cortical areas responsible for representation of the cue became less active when reward was unexpectedly received. This could have reflected the beginning of extinction, but results from further experimentation showed that the magnitude of this negative effect predicted the positive impact of reward on overt behavior.
Arsenault et al. (2013) suggest that this puzzling finding may reflect action of a mechanism in vision that accentuates the representation of a stimulus by “quietening” noise in the system. In this mechanism, the reduction in BOLD reflects an improvement in cue representation through noise suppression. There is room for further research to determine how this type of inhibitory mechanism acts in concert with the excitatory mechanism identified by Schiffer et al. (2014). It is worth noting, however, that both of these mechanisms could underlie the variance in multivariate category information observed in the current study: both a reduction in neural noise and a boost in target signal would cause an increase in MVPA category information.
Individual differences in dopaminergic midbrain activity and reward sensitivity
We conducted a whole-brain correlational analysis to identify the functional network involved in instantiating reward's impact on lingering category information in visual cortex. This identified a small set of clusters in which individual differences in BOLD responsivity to high-magnitude reward (vs low-magnitude reward) predicted individual differences in the increase of category information in ventral visual cortex (Table 1). Notable here were clusters in or close to the SN and VTA, midbrain nuclei known to contain high concentrations of dopaminergic neurons, and clusters in or around the NAcc, a primary dopamine target (Fig. 3). These results support the notion that the DA system is involved in mediating reward's impact on visual representations, which is consistent with the incentive salience hypothesis (Berridge and Robinson, 1998; Hickey and Peelen, 2015). However, it is important to note that we identified a relationship between midbrain activity and the representation of both targets and nontargets, not the sharpening of target representation identified in our primary analysis. Analyses targeted at identifying functional predictors of the differential reward-related effects observed in OSC (target vs nontarget) did not identify any brain areas that showed this relationship reliably.
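The logic of an across-participant correlational analysis of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used here: it takes one reward contrast value (high vs low magnitude) per participant at each voxel, correlates those values with each participant's reward-related change in category information, and omits the preprocessing and statistical thresholding that a real whole-brain analysis requires. The function names and the small data set are hypothetical placeholders, not results from the study.

```python
import math
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant input
    return sxy / math.sqrt(sxx * syy)


def voxelwise_correlation(
    contrast_by_voxel: Sequence[Sequence[float]],
    decoding_effect: Sequence[float],
) -> List[float]:
    """Correlate each voxel's per-participant contrast with the decoding effect.

    contrast_by_voxel[v][s]: high-minus-low reward response of participant s
    at voxel v. decoding_effect[s]: that participant's reward-related change
    in category information.
    """
    return [pearson_r(voxel, decoding_effect) for voxel in contrast_by_voxel]


# Placeholder values for four hypothetical participants and three voxels.
decoding_effect = [0.1, 0.2, 0.3, 0.4]
contrast_by_voxel = [
    [1.0, 2.0, 3.0, 4.0],    # rises with the decoding effect
    [4.0, 3.0, 2.0, 1.0],    # falls with the decoding effect
    [1.0, -1.0, -1.0, 1.0],  # unrelated to the decoding effect
]
r_map = voxelwise_correlation(contrast_by_voxel, decoding_effect)
print([round(r, 3) for r in r_map])
```

Voxels with reliably positive correlations across participants would be candidates for the kind of clusters reported in Table 1, subject to appropriate correction for multiple comparisons.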
Why do some people show a greater dopaminergic response to reward? Our results show that this is related to individual differences in personality. We find a correlation between BAS personality scores, reflecting trait sensitivity to reward feedback, and the impact of high-magnitude reward on target (Fig. 4A) and nontarget category information (Fig. 4B). High BAS participants were also those who had strong midbrain responses to reward feedback, as reflected in a positive correlation of these measures. This suggests a role for dopaminergic midbrain structures in the definition of this personality trait, as has been proposed by others previously (Beaver et al., 2006; Hahn et al., 2009).
We have shown that reward has a selective effect, differentially modulating representations of targets and nontargets in ventral visual cortex. Conversely, individual differences in reward responsivity and midbrain activity predict a nonspecific boost to both targets and nontargets in visual cortex. How can these results be reconciled? One possibility is that reward-sensitive participants may have attended to both targets and nontargets in the scenes. Participants were informed that there was no relationship between stimulus characteristics and reward in our design, but they may have nevertheless attempted to identify objects in the scenes that predicted outcome. This could be a strategic effort, reflecting disbelief in our description of the experimental parameters, or it could be automatic, reflecting a mechanism in visual cognition that may be active even when strategically unwarranted (Gottlieb et al., 2013). In either case, the result would be an attentive response to targets required to correctly complete the task, but also selection and processing of nontarget scene characteristics as participants searched for predictive relationships between scene features and outcome. A greater dopaminergic response to high-magnitude reward could in this way boost representations of both targets and nontargets, as was observed in our results.
Summary
We demonstrate that high-magnitude reward after visual search through images of real-world scenes creates a strong bias in the visual system to represent previously attended objects relative to ignored objects. Participants are subsequently less distracted by examples of the nontarget category, as expressed in a behavioral advantage. We interpret this as evidence of the assignment of credit to the target representation, causing the visual system to become sensitized for similar objects in the future.
Footnotes
This work was supported by the Netherlands Organization for Scientific Research (NWO; VENI Grant 016-125-283 to C.H.) and the Autonomous Province of Trento, Italy (Grandi Progetti 2012 Project: “Characterizing and Improving Brain Mechanisms of Attention–ATTEND”).
The authors declare no competing financial interests.
Correspondence should be addressed to Clayton Hickey, Center for Mind/Brain Sciences, University of Trento, Corso Bettini 31, 38068 Rovereto, Italy. clayton.hickey{at}unitn.it