Feedback-based learning involves making use of the information provided by the consequences of one’s actions. The caudate nucleus, which has been implicated in processing information about extrinsic rewards and punishments, such as monetary gains and losses (Delgado, Locke, Stenger, & Fiez, 2003; Delgado, Nystrom, Fissell, Noll, & Fiez, 2000; Mullette-Gillman, Detwiler, Winecoff, Dobbins, & Huettel, 2011; Tom, Fox, Trepel, & Poldrack, 2007), responds in a very similar manner to the intrinsic rewards and punishments of positive and negative feedback indicating correct and incorrect performance, respectively, with an increase in signal in response to positive feedback and a decrease in signal in response to negative feedback (Tricomi, Delgado, McCandliss, McClelland, & Fiez, 2006). Thus, in typical studies involving feedback-based learning, the brain responses associated with rewards are evoked by positive feedback, and responses associated with punishments are evoked by negative feedback (Elliott, Frith, & Dolan, 1997; Poldrack et al., 2001; Seger & Cincotta, 2005; Tricomi et al., 2006; Ullsperger & von Cramon, 2003). However, during learning tasks, negative feedback can provide useful information that can help a learner reach the goal of task mastery. If the definition of “reward” is expanded beyond extrinsic rewards to include any outcome that can serve as a goal of behavior, information itself can be construed as a reward when it aids in goal attainment. Negative feedback could therefore act as a “reward,” rather than as a punishment or the absence of an expected reward. In this case, the response of brain regions thought to represent reward-related information might be expected to reflect the amount of information provided by the feedback, rather than the valence of the feedback. Our experiment was designed to examine how the amount of information provided by feedback influences the way that the feedback is processed by the brain.

Information can be thought of as the deviation of an outcome from expectation or as a reduction in uncertainty (Shannon, 1948). When an outcome is fully predicted, it provides no information, whereas if there is a low probability that any given action is the correct one, positive feedback then provides a large amount of information. Phasic changes in the firing of dopaminergic neurons in the midbrain have been found to correspond to “prediction error” signals; that is, phasic changes encode the difference between a reward received and an expected reward (Schultz & Dickinson, 2000). Unexpected rewards produce the most neuronal activity; in contrast, the lack of an expected reward leads to a negative prediction error, and to a corresponding dip in dopaminergic firing rates (Schultz, Dayan, & Montague, 1997). More recent evidence, however, has presented a more complex picture of the functions of dopamine neurons. In addition to neurons coding for reward prediction error, other dopamine neurons fire in response to both liquid rewards and aversive air puffs to the eye, when these are unpredictable, and to stimuli predicting both of these outcomes (Matsumoto & Hikosaka, 2009). Furthermore, dopamine neurons and lateral habenula neurons, which influence dopamine neuron activity, have been found to encode “information prediction errors” in addition to reward prediction errors (Bromberg-Martin & Hikosaka, 2011). That is, these neurons appear to code for the value of information in a way that is analogous to the value of primary rewards, with increases in firing to the unexpected delivery of information or to cues linked with information delivery, and decreases in firing to denial of reward information. Thus, it is possible that the informational value of negative feedback could cause it to elicit an increase in dopamine neuron firing, despite its “negative” valence.
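
As a minimal illustration of the prediction error idea described above (a sketch only, not a model fitted or used in this study), the signal can be written as the difference between the reward received and the reward expected. The function name and the reward magnitudes are hypothetical.

```python
def prediction_error(received: float, expected: float) -> float:
    """Reward received minus reward expected."""
    return received - expected

# Hypothetical reward magnitudes in arbitrary units.
unexpected_reward = prediction_error(received=1.0, expected=0.0)  # positive error
omitted_reward = prediction_error(received=0.0, expected=1.0)     # negative error
predicted_reward = prediction_error(received=1.0, expected=1.0)   # no error
```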

A key target of dopaminergic neurons is the striatum, which in recent years has been heavily implicated in reward processing (Delgado, 2007; O’Doherty, 2004). Previous work indicated that reward-related activation of the caudate nucleus, a structure within the dorsal striatum, is very context dependent (e.g., Nieuwenhuis et al., 2005). In some cases, the same outcome can produce differing amounts of blood-oxygenation-level dependent (BOLD) activation in the striatum, depending on how the outcome is interpreted—or, in other words, its subjective value. For example, as tasks become well learned and positive feedback provides less information, caudate activation is attenuated (Delgado, Miller, Inati, & Phelps, 2005; Haruno et al., 2004; Law et al., 2005). Other factors, such as a sense of agency (Tricomi, Delgado, & Fiez, 2004), the delay to receive a reward (Kable & Glimcher, 2007; McClure, Laibson, Loewenstein, & Cohen, 2004), and whether an outcome is framed as a loss or a gain (Tom et al., 2007), also affect the degree to which the striatum is activated. The perceived value of an outcome is also affected by the other possible outcomes. Studies of “counterfactual comparisons” have indicated that reward-related brain regions respond to an outcome’s value relative to the alternative outcomes, rather than to its absolute value (Breiter, Aharon, Kahneman, Dale, & Shizgal, 2001; Lohrenz, McCabe, Camerer, & Montague, 2007; Nieuwenhuis et al., 2005; Ursu & Carter, 2005). For example, earning no money when the alternative is losing money produces a greater response in the caudate than when the alternative is winning money (Nieuwenhuis et al., 2005). 
Similarly, in a design that allowed participants to discover not only what the earnings resulting from their choices were, but also what their earnings would have been if they had chosen differently, activation in the striatum was found to be correlated with the “fictive error signal” provided by the information about the unchosen outcome (Lohrenz et al., 2007). Importantly, these studies demonstrated that the magnitude of reward-related changes in the caudate is influenced by the relative values of the possible outcomes.

Other research has demonstrated that the caudate will even become activated by responses that are independent of external reward presentation (Han, Huettel, Raposo, Adcock, & Dobbins, 2010; Mullette-Gillman et al., 2011; Zink, Pagnoni, Martin, Dhamala, & Berns, 2003). For example, caudate activation has been shown to be elicited by task-relevant target detection and by recognition of previously viewed images in an old–new memory task (Han et al., 2010; Mullette-Gillman et al., 2011). Additionally, “model-based” knowledge of task structure has been shown to influence striatal activity, demonstrating that the striatum is sensitive to complex cognitive information (Daw, Gershman, Seymour, Dayan, & Dolan, 2011). Taken together, these experiments suggested that a subjective sense of goal achievement may be as important for activating the caudate as the absolute reward value of an outcome (Han et al., 2010; Tricomi & Fiez, 2008). Thus, subtle contextual differences in experimental paradigms can sometimes lead to different patterns of activation in the caudate.

A previous experiment indicated that when positive and negative feedback provide equal amounts of information and do not indicate task success, little activation is observed in the caudate nucleus and no differences in caudate activation occur for positive versus negative feedback (Tricomi & Fiez, 2008). That study did not address, however, whether the caudate would become active if positive and negative feedback provided differing amounts of information. In this context, positive and negative feedback could be seen as having different values relative to one another, which could drive caudate activation. We tested this idea by using a feedback-based word association task similar to one used previously (Tricomi & Fiez, 2008), with the exception that the amount of information provided by the feedback was manipulated by altering the number of response options.

In this new task variation, participants chose which of several word options correctly formed a pair with a target word. The pairs were arbitrary, so that it was impossible for participants to discern the correct answer in advance of feedback. With only two response options, positive and negative feedback provided equal amounts of information about the correct answer; if the feedback was positive, the response chosen was correct, whereas if the feedback was negative, the response not chosen was correct. However, with more than two response options, positive feedback provided more information about the correct answer than did negative feedback, because the correct answer could not be deduced from negative feedback. If caudate activity tracks the informational content of feedback, regardless of its valence, negative feedback should produce activation similar to that for positive feedback when there are two choices, but should produce less activation than positive feedback when there are more than two choices. To test this hypothesis, we examined brain activation related to receiving feedback when either two or four response options were available to participants.
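
The asymmetry described above can be made concrete with a short calculation. The sketch below is an illustration rather than an analysis from the study: it assumes that every response option is equally likely to be correct before feedback and that exactly one option is correct, and it measures uncertainty about the correct answer in bits. The function name and dictionary keys are hypothetical.

```python
import math

def feedback_information(n_options: int) -> dict:
    """Uncertainty about the correct answer before and after feedback, in bits.

    Assumes (for illustration) a uniform prior over the options and
    exactly one correct option.
    """
    prior = math.log2(n_options)               # uncertainty before feedback
    after_positive = 0.0                       # chosen option is known to be correct
    after_negative = math.log2(n_options - 1)  # only the chosen option is ruled out
    return {
        "prior_bits": prior,
        "gain_positive": prior - after_positive,
        "gain_negative": prior - after_negative,
        "residual_after_negative": after_negative,
    }

two = feedback_information(2)
four = feedback_information(4)
```

Under these assumptions, positive feedback removes all uncertainty in both conditions. Negative feedback also removes all uncertainty with two options, but with four options it leaves about 1.58 bits unresolved, so the correct answer cannot be deduced.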

A secondary goal of this experiment was to investigate the link between brain-based measures of feedback processing and behavioral indices of learning and memory. Of particular interest was the question of whether the caudate, a brain region that is typically associated with reinforcement-based learning and the formation of procedural memories (Dayan & Balleine, 2002; Graybiel, 1995; Packard & Knowlton, 2002), may contribute to the mastery of new declarative knowledge. The paired-associate learning task that forms the basis for the present experiment is traditionally regarded as a task in which learning necessitates the involvement of a medial temporal lobe system that contributes to the formation of new declarative knowledge and to the encoding of episodic experiences (Law et al., 2005; Meltzer & Constable, 2005). However, there is growing recognition that memory performance can reflect the combined influences of multiple memory systems, each of which may contribute different types of learning signals that can promote task mastery (Dickerson, Li, & Delgado, 2011; Mattfeld & Stark, 2011; Sadeh, Shohamy, Levy, Reggev, & Maril, 2011; Tricomi & Fiez, 2008). In line with this view, in a previous experiment involving a similar paired-associate learning task (Tricomi & Fiez, 2008), we found evidence that brain activation in a number of regions during an initial encoding experience predicted the accuracy of subsequent memory retrieval events. The identified regions overlapped with those reported in prior investigations of subsequent memory effects and included regions associated with declarative memory processing (e.g., left inferior frontal gyrus and left fusiform gyrus). The standard subsequent memory effect was not observed in the striatum, but an exploratory analysis indicated that the magnitude of caudate activation during trials with positive feedback was associated with future reaction time decreases in response to the same stimuli. 
These findings were taken as preliminary evidence that the caudate may contribute to the “proceduralization” of declarative knowledge. The present study provided an opportunity to determine whether a link between caudate activation and the formation of declarative knowledge can be replicated.

Method

Participants

A group of 19 healthy, right-handed adults were recruited through posted advertisements and were paid $60 for their participation in the study. The analysis included data from 16 of these participants (11 women, 5 men; mean age ± SD: 22.9 ± 1.9 years); the data from the other 3 participants were excluded due to a program error (2 participants) or to excessive head motion (1 participant). All of the participants gave informed consent in accordance with the requirements of the Institutional Review Board at the University of Pittsburgh.

Materials

A 3-tesla Siemens head-only scanner and standard radio frequency coil were used for all of the magnetic resonance scanning sessions. Stimulus presentation and behavioral data acquisition were controlled using E-Prime software (Schneider, Eschman, & Zuccolotto, 2002) and the Integrated Functional Imaging System (Psychology Software Tools, Inc., Pittsburgh, PA).

Procedure

Scan session

Structural images were collected using a standard T1-weighted pulse sequence, in 42 contiguous slices (0.78125 × 0.78125 × 3.0 mm voxels) parallel to the AC–PC line. A total of 38 functional images were collected in the same locations as the middle 38 structural slices (3.125 × 3.125 × 3.0 mm voxels), providing full coverage of the cerebrum and partial coverage of the cerebellum. A one-shot echo-planar imaging (EPI) pulse sequence was used for image acquisition (TR = 2,000 ms, TE = 25 ms, FOV = 20 cm, flip angle = 79°).

Behavioral paradigm

This experiment involved a paired-associate word learning task similar to the one used in our previous work (Tricomi & Fiez, 2008). On each trial, participants saw a target word and either two or four choices of possible word matches, labeled as in a multiple-choice test (Fig. 1). The words were matched for word length and frequency at the trial level. The words contained 4–8 letters and 1–2 syllables, had Kučera–Francis frequencies of 20–650 words per million, and had imageability ratings of over 400 according to the MRC database (Coltheart, 1981). The words presented on the same trial had a score of less than 0.2 on the latent semantic analysis similarity matrix (Landauer, Foltz, & Laham, 1998), indicating low semantic relation, and they did not rhyme or begin with the same letter. Participants were asked to guess which response word went with the target word and to enter their response using the appropriate button on a response glove. After a 6-s response period, a feedback display was shown for 1 s to indicate whether the guess was correct (three green checkmarks) or incorrect (three red Xs). If no response was entered, three white hyphens were shown and the trial was excluded from later analysis. Participants were asked to use the feedback to try to remember the correct word pairs. The trial ended with a 13-s delay, indicated by a white fixation cross in the center of the screen.
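
The trial timing above can be checked against the 2,000-ms TR reported for the scan session. The following sketch simply adds up the stated durations. The variable names are hypothetical, and it assumes that trial onsets were aligned to volume acquisition, which the text does not state.

```python
RESPONSE_S = 6   # response period
FEEDBACK_S = 1   # feedback display
DELAY_S = 13     # fixation delay
TR_S = 2         # repetition time from the scan parameters

trial_s = RESPONSE_S + FEEDBACK_S + DELAY_S  # total trial length in seconds
volumes_per_trial = trial_s // TR_S          # 2-s time periods per trial

# Assuming trials start on a volume boundary, feedback onset (6 s into
# the trial) coincides with the start of this 1-indexed time period.
feedback_onset_period = RESPONSE_S // TR_S + 1
```

With these values each trial lasts 20 s and spans ten 2-s time periods, with feedback beginning at the start of the fourth period under the alignment assumption.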

Fig. 1 Experimental design. On each trial, a target word was presented along with options for possible word matches, labeled as in a multiple-choice test. After a 6-s response period, the display was replaced with a 1-s feedback display of three green checkmarks, indicating a correct response; three red Xs, indicating an incorrect response; or three white hyphens, indicating that no response was made. After a 13-s delay, during which a fixation cross was shown, the next trial began.

We presented a total of 60 trials with two response options and 120 trials with four options, randomly intermixed. The words used in the two conditions were matched on all parameters (e.g., word length, number of syllables, etc.). The experiment was designed such that participants were correct on 50% of the trials with two options and 25% of the trials with four options, to ensure that participants encountered positive and negative feedback in accordance with the laws of probability. The positive- and negative-feedback trials were randomly intermixed. This trial structure ensured that there were at least 30 “correct” trials for each trial type, while maintaining chance levels of performance. All 180 trials were distinct, with no words repeating across trials.
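
The trial counts described above can be sketched as a simple schedule generator. This is an illustration under stated assumptions, not the authors' E-Prime implementation: it assumes that the outcome of each trial was fixed in advance, and the function name, tuple layout, and random seed are hypothetical.

```python
import random

def build_schedule(seed: int = 0) -> list:
    """Return a shuffled list of (n_options, feedback) trials.

    Counts follow the text: 60 two-option trials (50% positive feedback)
    and 120 four-option trials (25% positive feedback). The seed and the
    tuple layout are illustrative choices, not details from the study.
    """
    trials = (
        [(2, "positive")] * 30 + [(2, "negative")] * 30
        + [(4, "positive")] * 30 + [(4, "negative")] * 90
    )
    rng = random.Random(seed)
    rng.shuffle(trials)  # randomly intermix trial types and feedback
    return trials

schedule = build_schedule()
```

Both conditions yield the same number of positive-feedback trials in this sketch, while the four-option condition yields three times as many negative-feedback trials as the two-option condition.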

Immediately following the scan, participants completed a computerized posttest. Instead of the multiple-choice format, the participants saw the target word and the word they had picked on the corresponding trial during the scan and were asked to respond as to whether the pairing was correct or incorrect. This procedure was used so that, for all trial conditions, there were two possible responses, which allowed for performance to be compared between the conditions. The participants were not made aware of how the words shown on the screen had been picked. Thus, although correct memory for the feedback would lead to correct performance on the posttest, the posttest did not explicitly test memory for the feedback paired with the target item. No feedback was given. Pairs from all 180 trials were presented, in random order. The posttest was self-paced, and participants made confidence judgments following each trial by choosing a number from 1 to 7 (1 = complete guess, 7 = completely sure). Participants returned to the lab and completed the posttest a second time approximately one week later (4–8 days following the scan session).

Data analysis

fMRI data

The NeuroImaging Software package (NIS 3.5) was used to analyze the fMRI data. The images were reconstructed and corrected for participant motion with the Automated Image Registration software (AIR 3.08; Woods, Cherry, & Mazziotta, 1992). Runs with head motion that exceeded 4 mm or 4° in any direction were not used in the analysis. The images were detrended to adjust for scanner drift within runs. The structural images of each participant were stripped to remove the skull and co-registered to a common reference brain, chosen from among the participants (Woods, Mazziotta, & Cherry, 1993). Functional images were transformed into the same common space, normalized by a mean scaling of each image to match the global mean image intensity across participants, and smoothed using a three-dimensional Gaussian filter (8-mm full width at half-maximum [FWHM]) to account for anatomical differences between participants. This set of data was then analyzed statistically.

Due to our a priori interest in the response of the caudate nucleus to feedback, we performed an ANOVA on the trial outcome phase (from the onset of feedback until the end of the trial) for a priori spherical regions of interest (ROIs; radius = 5 mm) centered in the left head of the caudate nucleus (−12, 8, 12) and the right head of the caudate nucleus (12, 8, 12). These coordinates are based on the Talairach Daemon database (Lancaster et al., 2000) and have been used previously in the literature (e.g., Zink et al., 2003). In addition, a whole-brain analysis was performed on the fMRI data, through a repeated measures three-way ANOVA with Participant as a random factor and Trial Type (two vs. four response options), Accuracy (correct vs. incorrect), and Time Period (2-s time periods T1–T10) as within-subjects factors. A voxel-wise significance threshold of p < .001 was used to identify additional ROIs; a contiguity threshold of 41 voxels was also applied as a precaution against Type I errors (Forman et al., 1995). This contiguity threshold was set using the AFNI AlphaSim program (Cox, 1996), such that the mapwise probability of a false detection remained lower than .05.
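
To give a sense of the size of a 5-mm spherical ROI, the sketch below counts the voxels whose centers fall within the sphere. It is not the NIS implementation. It assumes that the ROI center coincides with a voxel center and that the analysis grid uses the functional voxel size reported above (3.125 × 3.125 × 3.0 mm), which may not hold after transformation to the common space. The function name is hypothetical.

```python
import itertools
import math

def sphere_offsets(radius_mm, voxel_mm):
    """Voxel index offsets whose centers lie within radius_mm of the ROI center.

    Assumes the ROI center coincides with a voxel center.
    """
    reach = [int(radius_mm // size) for size in voxel_mm]
    offsets = []
    for i, j, k in itertools.product(*(range(-r, r + 1) for r in reach)):
        dist = math.sqrt(
            (i * voxel_mm[0]) ** 2 + (j * voxel_mm[1]) ** 2 + (k * voxel_mm[2]) ** 2
        )
        if dist <= radius_mm:
            offsets.append((i, j, k))
    return offsets

# Functional voxel size from the scan parameters; the grid used in the
# actual analysis may differ, so the resulting count is only indicative.
roi = sphere_offsets(radius_mm=5.0, voxel_mm=(3.125, 3.125, 3.0))
```

Under these assumptions the sphere contains the center voxel and its face and edge neighbors, but not the corner neighbors, whose centers lie slightly more than 5 mm away.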

A subsequent memory analysis was performed to investigate whether levels of activation during the scanning session could be associated with accuracy on the immediate and delayed posttests. As in previous work (e.g., Wagner et al., 1998), trials for which the correct answer was remembered with high confidence on the posttest following the scanning session were compared with trials corresponding to subsequent incorrect responses. Trials were coded as “high confidence” if the confidence score was greater than or equal to 5 on the 7-point scale. A repeated measures ANOVA was performed on the fMRI data with Participant as a random factor and Posttest Accuracy (high-confidence correct vs. incorrect) and Time (2-s time periods T1–T10) as within-subjects factors. Additionally, we had an a priori hypothesis that caudate activity might predict future confidence, based on our previous work (Tricomi & Fiez, 2008), so we extracted the outcome phase data from our left- and right-caudate a priori ROIs and compared the signal for trials corresponding to high posttest confidence versus trials corresponding to low posttest confidence (5–7 vs. 1–3 on the 7-point scale) using a paired t test.
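
The confidence coding and the paired comparison described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function names are hypothetical, and the inputs to the t statistic are assumed to be one mean ROI signal per participant for each confidence bin.

```python
import math
from statistics import mean, stdev

def confidence_bin(rating: int):
    """Map a 1-7 confidence rating to the bins described in the text."""
    if rating >= 5:
        return "high"
    if rating <= 3:
        return "low"
    return None  # a rating of 4 falls in neither bin

def paired_t(high, low):
    """Paired t statistic for per-participant mean signals (high vs. low)."""
    diffs = [h - l for h, l in zip(high, low)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
```

The statistic has n − 1 degrees of freedom, where n is the number of participants, which is consistent with the t(15) values reported for 16 participants.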

Behavioral data

The behavioral data for the posttests were analyzed in terms of accuracy, reaction time, and confidence measures. We calculated d' scores for each participant for each trial condition. Two-tailed t tests were used to determine whether performance exceeded chance for each trial type, and to determine whether performance differed between the trial types. Repeated measures ANOVAs were performed to see whether confidence differed across conditions. Finally, a mixed model linear regression analysis of reaction times on confidence, with Participant as a random factor, was performed for each posttest to determine whether differences in confidence judgments were associated with differences in reaction times.
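
For reference, d' is conventionally computed as the difference between the z-transformed hit rate and false-alarm rate. The sketch below makes two assumptions that the text does not confirm: a "hit" is a "correct" response to a pair that had received positive feedback, and a "false alarm" is a "correct" response to a pair that had received negative feedback. The added 0.5 per cell is a common correction for extreme rates and is also an assumption.

```python
from statistics import NormalDist

def d_prime(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Adds 0.5 to each cell (a log-linear correction) so that rates of 0 or 1
    do not produce infinite z scores. Whether any such correction was used
    in the study is not stated in the text.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

A d' of zero corresponds to chance performance, so testing whether performance exceeded chance amounts to testing whether the mean d' across participants differs from zero.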

Results

Behavioral results

The behavioral data for the posttests were analyzed in terms of accuracy, reaction time, and confidence measures. For the immediate posttest, the average d' scores were 0.81 (SD = 0.7) for the two-choice condition and 0.91 (SD = 0.85) for the four-choice condition. This performance was better than chance for both conditions [t(15) = 4.9, p < .001, for the two-choice condition; t(15) = 4.3, p < .001, for the four-choice condition], indicating that learning had occurred. The difference in performance between the conditions was not significant [t(15) = −0.64, p = .53, two-tailed]. (We report d' scores here to avoid problems related to response bias, but this precluded us from examining accuracy separately for positive- and negative-feedback trials; analyses of the raw percentages correct for the individual conditions, however, showed accuracy that was significantly above chance [p < .05] for both positive and negative feedback in both the two-choice and four-choice conditions.) Three of the participants performed below chance for one of the conditions (two- or four-choice), and none performed below chance for both conditions. Exclusion of those 3 participants from the analysis did not affect the results; therefore, we report the results with all participants included.

The average confidence judgment score (out of 7) on the immediate posttest was 4.5 (SD = 1.0). A repeated measures three-way ANOVA indicated a main effect of feedback type [F(1, 16.4) = 29.4, p < .001], a main effect of posttest accuracy [F(1, 15.5) = 31.6, p < .001], and a Trial Type × Feedback Type × Posttest Accuracy interaction [F(1, 17.5) = 4.8, p = .04] on the confidence ratings. As is shown in Fig. 2, confidence was higher for accurate than for inaccurate posttest responses [t(15) = 3.3, p = .004, two-tailed]. For the trials drawn from the two-choice condition, confidence was higher for posttest item pairs on which the participant had received positive feedback (i.e., made a correct choice) during the scan than for posttest item pairs on which the participant had received negative feedback (i.e., made an incorrect choice) [t(15) = 3.4, p = .004, two-tailed]; however, the size of this positive-feedback effect did not vary according to the accuracy of the posttest response [t(15) = 0.47, p = .6, two-tailed]. In contrast, for the trials drawn from the four-choice condition, the size of the positive-feedback effect did vary according to the accuracy of the posttest response. For accurate posttest responses, confidence was higher if the participant had received positive feedback for the item pair during the scan, rather than negative feedback [t(15) = 3.9, p = .001, two-tailed]; for inaccurate posttest responses, neither positive nor negative feedback during the scan led to differential confidence scores [t(15) = 0.26, p = .8, two-tailed]. Finally, increases in confidence judgments were associated with decreases in reaction time; a mixed-model linear regression analysis of reaction times on confidence, with Participant as a random factor, showed this effect to be significant [F(6, 2790) = 10.0, p < .001].

Fig. 2 Behavioral results from the immediate posttest. Mean confidence scores (normalized to the participant mean) were higher for accurate than for inaccurate trials, and higher for trials on which participants had received positive rather than negative feedback during the scan. For accurate trials, the latter effect was more pronounced in the four-choice than in the two-choice condition.

The results from the follow-up posttest that occurred one week after the scanning session mirrored the results from the immediate posttest. Performance remained better than chance for both conditions [t(15) = 3.2, p = .006, for the two-choice condition; t(15) = 3.1, p = .007, for the four-choice condition]. A repeated measures three-way ANOVA on the confidence scores indicated a main effect of feedback type [F(1, 15.3) = 8.5, p = .01], a main effect of posttest accuracy [F(1, 15.3) = 14.9, p = .002], and a Trial Type × Feedback Type interaction [F(1, 15.7) = 8.6, p = .01]. This pattern is remarkably similar to that observed for the immediate posttest. Confidence was highest for accurate trials containing item pairs for which the participants had received positive feedback during the scan, and this positive-feedback effect was greatest for the posttest trials drawn from the four-choice condition. Increases in confidence judgments were again associated with decreases in reaction times; a mixed-model linear regression analysis of reaction times on confidence, with Participant as a random factor, showed this effect to be significant [F(6, 2670) = 15.2, p < .001].

fMRI results: Trial Type effects

The primary goal of this experiment was to examine how the amount of information provided by feedback influenced the way that the feedback was processed by the brain. On the basis of our previous work with a paired-associate learning task (Tricomi & Fiez, 2008) and of prior findings in the literature (Delgado et al., 2005; Grahn, Parkinson, & Owen, 2008), we began with an a priori interest in the response profile of the caudate nucleus. For this reason, we performed an initial ANOVA on data extracted from a priori ROIs centered in the left (−12, 8, 12) and right (12, 8, 12) heads of the caudate nucleus (cf. Zink et al., 2003). In the left caudate, there were significant main effects of feedback type [F(1, 15) = 9.5, p = .008] and trial type [F(1, 15) = 5.8, p = .03], with a greater caudate response observed for trials with positive as compared to negative feedback, and a greater caudate response in the four-choice as compared to the two-choice condition (Fig. 3). The trial type effect was also significant in the right caudate [F(1, 15) = 5.7, p = .03]. A Feedback Type × Trial Type interaction approached significance [F(1, 15) = 4.3, p = .057] in the left caudate, and a similar (though nonsignificant) pattern was observed in the right caudate [F(1, 15) = 1.4, p = .25]. Although p was greater than .05 in the left caudate, we believe that ignoring this interaction would likely constitute a Type II error, due to the focal nature of activation in the caudate. Indeed, the size of the analyzed area was actually somewhat greater than the defined area, due to the spatial smoothing kernel of 8 mm FWHM. An alternative approach using a single a priori voxel, which has been utilized in past work (Cohen et al., 2002; Tricomi et al., 2004), yielded a Feedback Type × Trial Type interaction at p = .04.

Fig. 3 Caudate activation across conditions. a) The left caudate shows a Feedback Type × Time interaction in the four-choice condition (p < .001; contiguity threshold of 41 voxels). The image is left-right reversed. The green crosshair marks the center of the a priori ROI used for the analyses and for the graph in panel b. b) Mean intensity of the BOLD response during the trial outcome phase in the left caudate, using a sphere centered on a priori Talairach coordinates of (−12, 8, 12; cf. Zink et al., 2003). The signal is significantly greater following positive than following negative feedback in the four-choice condition but not in the two-choice condition, and the signal is significantly greater for negative feedback in the two-choice condition than in the four-choice condition.

Additionally, within each trial type condition, paired t tests of outcome phase activation demonstrated that the signal in the left caudate differed for positive and negative feedback in the four-choice condition [t(15) = 3.6, p = .002, two-tailed], but not in the two-choice condition [t(15) = 0.5, p = .6, two-tailed]. Looking across the trial type conditions, the signal was significantly greater for negative-feedback trials from the two-choice as compared to the four-choice condition [left caudate, t(15) = 3.1, p = .008, two-tailed; right caudate, t(15) = 2.6, p = .02, two-tailed], whereas the signals for positive-feedback trials were comparable across the two-choice and four-choice conditions [left caudate, t(15) = 0.04, p = .97, two-tailed; right caudate, t(15) = 0.37, p = .72, two-tailed]. These results are consistent with the hypothesis that the caudate response to feedback is modulated by the information value of the feedback (which varied across the two-choice vs. four-choice conditions for negative but not for positive feedback).

In order to ascertain the extent of these effects in the brain, we also performed a repeated measures ANOVA on the whole-brain data from the four-choice and two-choice conditions. For the two-choice condition, no regions showed a differential response to positive and negative feedback at our significance threshold. However, a number of regions (listed in Table 1) did exhibit a Feedback Type × Time interaction in the four-choice condition; in line with the results from the a priori analysis, the left head of the caudate nucleus was one of the identified regions, and it is depicted in Fig. 3. A similar region in the right head of the caudate nucleus was detected, but it was smaller (22 voxels) and did not survive our contiguity threshold.

Table 1 Four-choice condition: Feedback Type × Time (p < .001)

Several brain regions were identified as showing a Trial Type × Feedback Type × Time interaction (Table 2). The largest and most significant cluster that exhibited this effect was in the left prefrontal cortex. A similar cluster had also been identified as showing a Feedback Type × Time interaction effect in the four-choice condition. The cluster included voxels in the dorsolateral and ventrolateral prefrontal cortex (PFC), the inferior frontal gyrus, and the anterior insula. All of the subregions of the activated cluster displayed the same pattern of activation; Fig. 4 therefore depicts the results from the entire cluster showing the three-way interaction. Within this cluster, the interaction effect reflected the fact that there was no differentiation between trials with positive and negative feedback for the two-choice condition, whereas there was a greater response to positive than to negative feedback for the four-choice condition; additionally, the signal following negative feedback was greater in the two-choice condition than in the four-choice condition, whereas the signals following positive feedback were comparable across conditions.

Table 2 Trial Type × Feedback Type × Time (p < .001)

Fig. 4 Left prefrontal cortex activation across conditions. a) The left PFC (circled) shows a Feedback Type × Trial Type × Time interaction (p < .001; contiguity threshold of 41 voxels). The image is left-right reversed. b) Mean intensity of the BOLD response during the trial outcome phase in the left prefrontal cortex. There is a significant interaction of Trial Type and Feedback Type, with a greater signal following positive than following negative feedback in the four-choice condition but not in the two-choice condition. The signal is also significantly greater for negative feedback in the two-choice condition than in the four-choice condition. The plot was constructed using data from the entire left PFC cluster identified in our whole-brain analysis.

fMRI results: Links to future memory performance

A secondary aim of the present study was to explore the links between brain-based measures during the initial encoding of the stimulus items and subsequent behavioral measures of learning and memory. We focused on two questions that were motivated by a prior study involving a similar paired-associate learning task (Tricomi & Fiez, 2008). The first was whether we could find evidence for a subsequent memory effect, in which variability in the signal during the initial encoding of a trial could be shown to predict future accuracy on a posttest of memory performance. We performed a whole-brain analysis of posttest accuracy (high-confidence accurate posttest responses vs. incorrect posttest responses) and time (2-s time periods T1–T10) to investigate this question. Trials from both the two-choice and four-choice conditions were included, although it should be noted that there were more trials from the four-choice condition, since the study was designed with more trials in this condition, which was also associated with the larger number of high-confidence responses on the immediate posttest. Table 3 lists the regions that showed a subsequent memory (Posttest Accuracy × Time) effect. These areas showed greater activation on trials in which the subsequent posttest response was accurate, as compared to inaccurate. Among the regions displaying this pattern was the left inferior frontal gyrus, which has previously been associated with subsequent memory effects (Brewer, Zhao, Desmond, Glover, & Gabrieli, 1998; Wagner, Koutstaal, & Schacter, 1999).

Table 3 Subsequent memory effects: Posttest Accuracy × Time (p < .001)

The second analysis was motivated by a subsequent reaction time effect we had observed in our prior work (Tricomi & Fiez, 2008). In the prior study, an exploratory analysis found that the level of caudate activation for trials with positive feedback (i.e., correct trials) predicted the size of the savings in reaction times to the same stimulus items on their next presentation. Since both the present and prior work had found that reaction times were significantly correlated with confidence ratings, we investigated the relationship between the caudate signal and subsequent confidence ratings on the posttest. Trials with positive feedback were first divided into those with high versus low posttest confidence ratings (5–7 vs. 1–3 on a 7-point scale), and then the outcome phase data extracted from our left- and right-caudate a priori ROIs were compared with respect to posttest confidence. In the left caudate, the signal was significantly greater for trials with high as compared to low confidence on the posttest [t(15) = 2.3, p = .04, two-tailed], and a similar, but nonsignificant, pattern was observed in the right caudate [t(15) = 1.5, p = .14, two-tailed]. A similar analysis was performed for the trials with negative feedback, which were analyzed separately because in our prior work we had examined correct trials only. Although the effect was in the same direction, it was not significant for either the left caudate [t(15) = 0.8, p = .43, two-tailed] or the right caudate [t(15) = 0.43, p = .68, two-tailed].
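The form of the ROI comparison above can be illustrated with a short calculation. This is a minimal sketch, assuming a paired-samples t test on per-participant mean signals (an assumption consistent with, but not stated by, the reported 15 degrees of freedom). The function name is hypothetical, and the input values are placeholders rather than data from the study.

```python
import math
import statistics

def paired_t(high, low):
    """Paired-samples t statistic and degrees of freedom.

    high, low: per-participant mean ROI signals for the two posttest
    confidence bins, listed in the same participant order.
    """
    if len(high) != len(low) or len(high) < 2:
        raise ValueError("need two equal-length samples with at least 2 participants")
    diffs = [h - l for h, l in zip(high, low)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD, n - 1 denominator).
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / se, len(diffs) - 1

# Placeholder values for illustration only; these are NOT data from the study.
high_conf = [0.5, 0.7, 0.6, 0.9]
low_conf = [0.3, 0.6, 0.2, 0.7]
t_stat, df = paired_t(high_conf, low_conf)
print(f"t({df}) = {t_stat:.2f}")
```

Under this assumption, each participant contributes one difference score (high minus low confidence), and the degrees of freedom equal the number of participants minus one.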

Discussion

In this experiment, positive feedback indicated which answer was correct in both conditions, whereas negative feedback provided enough information to determine which answer was correct for the two-choice condition, but not the four-choice condition. The signal in the caudate mirrored the amount of information provided by feedback, with greater activation for positive than for negative feedback in the four-choice condition, but not in the two-choice condition. In the two-choice condition, negative feedback produced as strong a signal in the caudate as did positive feedback. This suggests that the caudate responds to the informational value of feedback in a manner analogous to its response to the reward value of an outcome. Thus, negative feedback in this experiment might not have been interpreted as a punishment, but rather as a reward that varied in magnitude depending on the amount of information it provided (i.e., small for the four-choice condition and large for the two-choice condition). The idea that the caudate responds to the informational value of feedback would also help reconcile two previous findings: negative feedback produces an increased striatal response relative to no feedback (Bischoff-Grethe, Hazeltine, Bergren, Ivry, & Grafton, 2009), yet it often produces a decreased striatal response relative to positive feedback (e.g., Tricomi et al., 2006).
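The informational asymmetry described above can be made concrete with a small calculation. The following is a minimal sketch, not part of the original analysis, assuming that a learner treats all response options as equally likely before feedback (a uniform prior, which the study does not state) and quantifying information as entropy reduction in bits (Shannon, 1948). The function name is hypothetical.

```python
import math

def feedback_information(n_choices: int, positive: bool) -> tuple[float, float]:
    """Return (bits gained, bits of uncertainty remaining) after one feedback event.

    Assumption (not from the study): before feedback, all n_choices options
    are equally likely to be correct.
    """
    prior_entropy = math.log2(n_choices)
    # Positive feedback identifies the answer; negative feedback only
    # eliminates the option that was chosen.
    remaining_options = 1 if positive else n_choices - 1
    remaining_entropy = math.log2(remaining_options)
    return prior_entropy - remaining_entropy, remaining_entropy

for n in (2, 4):
    for positive in (True, False):
        gained, remaining = feedback_information(n, positive)
        label = "positive" if positive else "negative"
        print(f"{n}-choice, {label}: gained {gained:.3f} bits, "
              f"remaining {remaining:.3f} bits")
```

Under these assumptions, both kinds of feedback leave no uncertainty in the two-choice condition, whereas negative feedback in the four-choice condition leaves three candidate answers (log2 3 ≈ 1.58 bits of uncertainty). The raw bits gained from positive feedback differ between conditions under this measure (1 vs. 2 bits); the equivalence emphasized in the text concerns whether the feedback suffices to identify the correct answer.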

As in previous work (Brewer et al., 1998; Reber et al., 2002; Tricomi & Fiez, 2008; Wagner et al., 1998), activation in regions such as the left inferior frontal gyrus was found to predict which trials would subsequently be remembered on the posttest following the scanning session. Additionally, in this experiment, when positive feedback was delivered, the caudate was more active for trials that would subsequently be associated with high posttest confidence. Furthermore, positive feedback led to higher confidence ratings on the posttests than did negative feedback, which is in line with the finding that a greater BOLD signal was elicited in the caudate for positive than for negative feedback in the four-choice condition. These results suggest that caudate activation may play a role in facilitating feedback-based learning, in accord with mounting evidence indicating that both the striatum and the declarative memory system can contribute simultaneously to learning (Dickerson et al., 2011; Mattfeld & Stark, 2011; Sadeh et al., 2011; Shohamy, Myers, Kalanithi, & Gluck, 2008; Tricomi & Fiez, 2008).

One quandary, however, is an apparent mismatch between the neural and the behavioral effects of our trial type manipulation. Neurally, trial type was found to modulate the signal in the caudate in response to negative feedback (which differed in information content across the two conditions) but to produce little difference in the signal in response to positive feedback (which had equivalent information value across the two conditions). Behaviorally, however, trial type disproportionately affected posttest confidence ratings for trials with positive feedback, and, as noted above, correlations between neural activity and subsequent confidence ratings were found for trials with positive but not with negative feedback. These findings suggest that a distinction must be made between the reward value signal and the learning signal produced in the caudate. Our results suggest that caudate activation indexes the subjective reward value of a stimulus, but that the link between the negative-feedback signal in the caudate and behavioral performance is not as tight as the link associated with positive feedback. It could be that the caudate is more effective at strengthening associations than at forming negative associations, or these results could be due to the fact that the response to negative feedback was not truly an “error” response in our experiment. Support for the former conclusion comes from other studies that have shown that learning rates are higher and reaction times are faster when positive feedback or monetary rewards are presented than when negative feedback or punishments are presented (Chase et al., 2010; Kim, Shimojo, & O’Doherty, 2006; Wächter, Lungu, Liu, Willingham, & Ashe, 2009).

The role of the striatal response to negative feedback as a learning signal remains uncertain. One theory is that learning from errors depends on the “dip” in firing of dopaminergic neurons following negative prediction errors caused by negative feedback (Frank, Seeberger, & O’Reilly, 2004; Holroyd & Coles, 2002). Negative feedback, however, may not necessarily lead to negative prediction errors. If structures in the brain’s reward circuitry code for informational value in a manner similar to the coding for primary reward value (Bromberg-Martin & Hikosaka, 2011), negative feedback could lead to a positive prediction error if it were to provide more information than expected. Since the prediction error signals conveyed by dopamine neurons are thought to have a major influence on striatal activity, it is unclear whether negative feedback can be more effective in promoting learning when it produces a decrease rather than an increase in striatal signal. It is possible that a “reward” response produced by negative feedback could actually lead to reinforcement of the incorrect response. Alternatively, if people are able to flexibly use the information provided by negative feedback, a positive striatal response to negative feedback could still support accurate learning, presumably in conjunction with other regions involved in learning, such as the prefrontal cortex and the medial temporal lobe.

Indeed, the results of our study indicate that the caudate is not acting in isolation in its involvement in feedback-based learning. Other regions, such as the left PFC, were also sensitive to the amount of information provided by feedback. A smaller PFC response was elicited by negative than by positive feedback in the four-choice condition, whereas there was no differentiation in the two-choice condition. Such a pattern of activation does not necessarily mean that the specific role of the left PFC is to code for the amount of information provided by feedback. Instead, the pattern could simply reflect the increased processing demands that occur when more information is available. For example, the dorsolateral PFC has been associated with a role in executive control processes, such as working memory, and it may be that with more information, there is an increased load on this system as it updates memory representations. The involvement of multiple memory systems could explain the interactive influences of trial type, feedback type, and memory accuracy on confidence ratings. For instance, variations in the amount of elaborative processing that is supported by regions such as the PFC in the two-choice versus the four-choice conditions could yield variations in the ability to recall specific episodic experiences that particularly bolster the confidence ratings for accurate memory judgments. On the other hand, the delivery of positive feedback may help to create stimulus–response associations that bolster stimulus familiarity, and subsequently confidence ratings, independent of whether the recall of a prior episodic experience with the stimulus was accurate or inaccurate.

In summary, the results of this experiment build upon previous research by demonstrating the strong role that context plays in guiding feedback-related caudate activation. The value of the outcome of one’s actions is determined relative to one’s current goals, and caudate activity reflects this (Grahn et al., 2008; Han et al., 2010; Tricomi & Fiez, 2008). Critically, in the four-choice condition of this experiment, positive and negative feedback carried different amounts of information, and thereby could be perceived by learners as aiding them in reaching their goal of task mastery to different degrees, leading to differential caudate activity. When negative feedback provided as much information as positive feedback, the two types of feedback elicited similar signals in the caudate. This implies that there can be positive aspects of “negative” feedback and suggests that the subjective value of the information provided by feedback must be taken into account when developing models of the neural mechanisms supporting learning and motivation. Additionally, our results suggest that declarative memory processing is influenced by the magnitude of the positive feedback signal in the caudate. Feedback processing in the caudate may thus be important for both episodic and nonepisodic forms of memory.