Abstract
The ability to learn from feedback is a key component of adaptive behavior. This type of learning is traditionally thought to depend on neural substrates in the striatum and not on the medial temporal lobe (MTL). Here we show that in humans the MTL becomes necessary for feedback-based learning when feedback is delayed. Specifically, amnesic patients with MTL damage were impaired at probabilistic learning of cue–outcome associations when response-contingent feedback was delayed by a few seconds, but not when feedback was immediate. By contrast, patients with striatal dysfunction due to Parkinson's disease demonstrated the opposite pattern: impaired learning when trial-by-trial feedback was immediate but not when feedback was delayed, indicating that the striatum is necessary for learning only when feedback is immediate. Together, these results reveal that multiple complementary learning processes support what appears to be identical behavior in healthy individuals and point to an important role for the MTL in feedback-driven learning.
Introduction
Studies of the neural bases of learning and memory suggest that the medial temporal lobe (MTL) and the striatum play distinct roles, supporting two dissociable memory systems. The MTL is thought to support long-term declarative, episodic memory. The striatum, by contrast, is thought to support incremental feedback-based learning of stimulus–response associations (Knowlton et al., 1996; Squire and Zola, 1996; Gabrieli, 1998).
A seminal study supporting this view demonstrated that amnesics with presumed MTL damage were intact at probabilistic learning, but had no episodic memory for events that took place during the experiment (Knowlton et al., 1996). By contrast, patients with striatal dysfunction due to Parkinson's disease were impaired at probabilistic feedback-based learning, but had intact episodic memory for experimental events. Later studies further revealed a selective role for the striatum in feedback-based learning, demonstrating that learning through observation of the correct outcome, instead of through trial-by-trial feedback, shifted learning from the striatum to the hippocampus (Poldrack et al., 2001; Shohamy et al., 2004). These findings are consistent with the traditional view that the MTL supports episodic memory, not feedback-based learning.
However, emerging findings raise questions about this view and suggest that the hippocampus may contribute to feedback-based learning. Neurons in the hippocampus respond to rewarding outcomes during learning (Johnson et al., 2007; Wirth et al., 2009); in humans, activation in the hippocampus has been shown to code learning-related prediction errors (Dickerson et al., 2011; Foerde and Shohamy, 2011) that are ubiquitously observed in the striatum (O'Doherty et al., 2003; for review, see Daw and Doya, 2006). Thus, the MTL, like the striatum, may contribute to feedback-based learning. However, evidence for the role of the MTL in feedback-based learning has been indirect: recordings in animals and fMRI in humans do not provide a test of whether the MTL plays a necessary and causal role in feedback-based learning. In addition, these findings raise questions about the specific circumstances under which the MTL supports feedback-based learning.
We hypothesized that the MTL supports learning when feedback is delayed, on the order of seconds, but not when feedback is immediate. This hypothesis was motivated by evidence that the hippocampus binds events across time (Thompson and Kim, 1996; Wallenstein et al., 1998; Davachi, 2006), and mounting evidence that the striatum is not well suited for learning when feedback is delayed (Maddox et al., 2003; Maddox and Ing, 2005; Roesch et al., 2007; Fiorillo et al., 2008; Kobayashi and Schultz, 2008). Indeed, a recent fMRI study demonstrated feedback-driven responses in the hippocampus when response-contingent feedback was delayed by several seconds, but not when feedback was immediate (Foerde and Shohamy, 2011).
In this study, we tested amnesic patients with MTL damage on a probabilistic learning task with delayed feedback, with the goal of determining whether the MTL plays a necessary role in incremental feedback-driven learning, and if so, under what circumstances. In addition, to determine whether the MTL and the striatum make distinct contributions to feedback learning, we compared the performance of amnesics to that of patients with Parkinson's disease.
Materials and Methods
Participants
Amnesics and their controls.
Seven amnesic patients (three females) with MTL lesions participated in the study (Table 1). The neuropsychological profiles of all patients indicate impairments isolated to the domain of memory. Six patients had an etiology of anoxia and one patient had an etiology of herpes encephalitis. The amnesic group had a mean verbal IQ score of 107.0, as measured by the Wechsler Adult Intelligence Scale-III (WAIS-III). Their attentional abilities, as measured by the Wechsler Memory Scale-III (WMS-III) Working Memory Index, were also intact, as indicated by a mean score of 97.3. Their memory functioning was severely compromised, as indicated by a mean General Memory Index of 58.0, a mean Visual Delay Index of 62.9, and a mean Auditory Delay Index of 59.9.
Table 1. Demographic and neuropsychological characteristics of participants
To assess the extent of patients' neural damage, structural MRI scans were collected for three of the anoxic patients. MRI could not be obtained for the remaining patients because of medical contraindications, but MTL pathology can be inferred based on etiology and neuropsychological profile, which indicated selective amnesia. Information about the acquisition and analysis of MRI scans was previously reported (Kan et al., 2007). For lesion volumetric analysis of MTL regions, the hippocampus and amygdala were individually segmented according to established parameters (Seidman et al., 2002). The parahippocampal gyrus was defined anteriorly by the isthmus of the temporal and frontal lobes, medially by the collateral fissure, laterally by the hippocampal fissure, and posteriorly by the anterior limit of the calcarine fissure. Regional brain volumes were determined by multiplying the number of voxels within a parcellation unit on a given coronal slice by the voxel volume, and summing across all slices in which each unit appeared.
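The volumetric computation described above (voxel count per coronal slice, multiplied by voxel volume, summed across slices) can be sketched as follows. This is a minimal illustration and not the authors' analysis pipeline: the function name `regional_volume_mm3`, the per-slice voxel counts, and the voxel dimensions are all hypothetical.

```python
def regional_volume_mm3(voxels_per_slice, voxel_volume_mm3):
    """Sum (voxel count x voxel volume) over every coronal slice containing the region."""
    if voxel_volume_mm3 <= 0:
        raise ValueError("voxel volume must be positive")
    return sum(count * voxel_volume_mm3 for count in voxels_per_slice)


# Hypothetical example: a parcellation unit that appears on four coronal
# slices, with 1.0 x 1.0 x 1.5 mm voxels (values chosen only for illustration).
example_counts = [120, 150, 140, 90]
example_voxel_volume = 1.0 * 1.0 * 1.5
example_volume = regional_volume_mm3(example_counts, example_voxel_volume)
```

Any correction for intracranial volume would be applied to the resulting value in a separate step.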
Quantitative analysis compared patients' regional brain volumes (corrected for intracranial volume) to volumes from eight age-matched and gender-matched control subjects. One of the anoxic patients (P04) had damage limited to the hippocampus and two of the anoxic patients (P02, P03) had damage to the hippocampus and to the surrounding parahippocampal gyrus (volume reductions > 2 SDs from the control mean). P04 had a unilateral reduction in right hippocampal volume of 27%, P02 had a bilateral reduction in hippocampal volume of 63%, and P03 had a bilateral reduction in hippocampal volume of 69%. To assess the possibility of additional damage outside the MTL, measurements of frontal, parietal, occipital, and lateral temporal cortex were also made. The hippocampus was the single area of damage observed across all participants and the only extra-MTL volume reductions were observed in the left lateral temporal lobe for P02.
Fifteen healthy controls (seven females) were matched to the patient group on mean age, education, and verbal IQ (Table 1). All subjects provided informed consent in accordance with the procedures of the Institutional Review Boards at Boston University and the VA Boston Healthcare System.
Parkinson's disease patients and their controls.
We compared patients with MTL damage to 15 patients (five females) with Parkinson's disease. Patients with a diagnosis of idiopathic Parkinson's disease were recruited through the Columbia University Medical Center Department of Neurology with the assistance of a neurologist (Dr. Lucien Cote) and through the online database Fox Trial Finder (data from six of these participants were reported previously by Foerde and Shohamy, 2011; a separate element of data from 14 of these participants was reported by Foerde et al., 2013). Only patients in mild or moderate disease stages (Hoehn and Yahr stage 1–3) were recruited. We tested Parkinson's disease patients off dopaminergic medication (either withdrawn from their medication, n = 9, or not yet receiving dopaminergic treatment, n = 6). This was done to focus on impairments related to Parkinson's disease rather than possible dysfunction caused by dopamine replacement therapies (Frank et al., 2004; Cools, 2006; Shohamy et al., 2006). Patients who were receiving medications had withdrawn from them overnight and were tested at least 16 h after their last medication dose. When on medication, five of the patients were being treated with l-Dopa, one was receiving a dopamine agonist, and three were receiving a combination of l-Dopa and dopamine agonists (three Parkinson's disease patients were also taking antidepressant medication).
Fifteen controls (10 females) were matched to Parkinson's disease patients on mean age, education, and verbal IQ (Table 1). All participants provided informed consent in accordance with the guidelines of the Institutional Review Board of Columbia University.
Parkinson's disease and amnesic patients did not differ in age, education, verbal IQ, or COWAT-FAS (all p values > 0.05). Patients and their respective control groups did not differ in age, education, or verbal IQ (p values > 0.11).
Task
All participants completed a probabilistic learning task adapted from prior studies (Knowlton et al., 1996; Poldrack et al., 2001; Shohamy et al., 2004; Foerde et al., 2006; Foerde and Shohamy, 2011). In this task, participants learn to associate cues with outcomes through trial and error. Because there is no one-to-one mapping between cues and outcomes, optimal learning involves the use of response-contingent feedback across multiple trials to incrementally learn the most probable outcome. Critically, here we manipulated the timing with which feedback was delivered during probabilistic learning (Foerde and Shohamy, 2011).
Feedback was delivered after 1 or 7 s. These particular feedback delays were chosen based on pilot testing and are consistent with other studies showing that delays of this magnitude affect striatal-dependent learning (Maddox et al., 2003; Maddox and Ing, 2005; Fiorillo et al., 2008; Kobayashi and Schultz, 2008; Gregorios-Pippas et al., 2009; Weinberg et al., 2012). Here, for the sake of simplicity we refer to these conditions as immediate feedback (feedback presented 1 s after a response), and delayed feedback (feedback presented 7 s after a response). However, in everyday situations, feedback delays are likely to lie along a continuum, and whether feedback is considered immediate or delayed may not depend on the absolute timing, but rather on the timing relative to other conditions with which it co-occurs.
On each trial, participants saw a cue (one of four different butterflies) and had to predict which of two outcomes (different colored flowers) was more likely (Fig. 1). Each butterfly was associated with one flower on 83% of trials and with the other flower on 17% of trials (Fig. 1C). Upon responding, a delay period of either 1 s (immediate condition) or 7 s (delay condition) followed before feedback was given. During the delay, the chosen flower and the butterfly remained on the screen to minimize working memory demands. Thus, the critical manipulation was the time interval between response and feedback (Fig. 1A). Immediate and delayed feedback trial types were interleaved throughout training in a randomized order. Feedback consisted of a verbal response (“Correct” or “Incorrect”) presented on the screen for 2 s. The assignment of cues to outcomes and conditions was counterbalanced across participants. Participants completed a short practice to ensure that they understood the task and were able to respond in the allotted time (7 s).
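The trial structure can be made concrete with a short sketch of how a Learning-phase schedule might be generated. This is an illustration under stated assumptions, not the authors' stimulus code: the function name is hypothetical, the 96-trial length is taken from the Learning phase description, and it is assumed that each butterfly appears equally often (24 trials), that two butterflies are assigned to each feedback-timing condition, and that the 83%/17% contingency is realized as exact counts (20 vs 4 trials per butterfly). The fixed cue-to-condition assignment here stands in for the counterbalancing across participants.

```python
import random


def make_learning_schedule(n_trials=96, n_cues=4, p_major=5 / 6,
                           delays_s=(1, 7), seed=0):
    """Build a randomized, interleaved Learning-phase schedule.

    Assumptions (not stated in the text): cues appear equally often, the
    first half of the cues gets the short delay, and the probabilistic
    contingency is realized as exact per-cue counts.
    """
    rng = random.Random(seed)
    per_cue = n_trials // n_cues
    n_major = round(per_cue * p_major)
    trials = []
    for cue in range(n_cues):
        delay = delays_s[0] if cue < n_cues // 2 else delays_s[1]
        likely = cue % 2  # index (0 or 1) of the flower this butterfly usually prefers
        outcomes = [likely] * n_major + [1 - likely] * (per_cue - n_major)
        for outcome in outcomes:
            trials.append({"cue": cue, "delay_s": delay,
                           "likely": likely, "outcome": outcome})
    rng.shuffle(trials)  # interleave cues and feedback-timing conditions
    return trials


def feedback_text(choice, trial):
    """Feedback depends on the trial's actual outcome, not on the likely one."""
    return "Correct" if choice == trial["outcome"] else "Incorrect"


schedule = make_learning_schedule()
```

Because feedback follows the sampled outcome, always choosing the more likely flower still yields "Incorrect" on the minority of trials, which is what makes the task probabilistic.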
Figure 1. Paradigm for probabilistic learning with immediate versus delayed feedback: Task structure and events. A, Participants used trial-by-trial feedback to learn which flower four different butterflies preferred (Learning phase). One set of butterflies was learned in the Immediate feedback condition wherein feedback was presented 1 s after a response was made. Another set of butterflies was learned in the Delayed feedback condition wherein feedback was presented 7 s after a response was made. Immediate and delayed feedback trials were randomly interleaved during learning. On each trial in the Learning phase, as soon as a response was made, participants' choices were displayed along with the butterfly until feedback was provided. B, After learning, participants completed a probe test in which they continued to make predictions about the butterflies' preferences (Test phase). However, they no longer received feedback, and the timing of all trial events was equal across trial types. C, Each butterfly was associated with one flower on 83% of trials and with the other flower on 17% of trials. The association between butterflies and Feedback timing condition was counterbalanced across participants.
The Learning phase (96 trials divided into 4 blocks with rest breaks in between) was followed by the Test phase (24 trials; Fig. 1B), which was the main focus of our analysis. In the Test phase, participants saw the butterflies from the Learning phase and were told to continue predicting which of the flower outcomes was more likely based on what they had learned. The Test phase structure resembled the Learning phase, with the exception that no feedback was given and, critically, the timing of all trial parts was equivalent across trial types. Previous work has shown that a test phase can be essential for revealing how information acquired during learning is used (Foerde et al., 2006; Foerde and Shohamy, 2011).
Performance on the probabilistic learning task was assessed in terms of making optimal choices (the degree to which participants selected the most likely outcome for each cue), as in previous studies (Knowlton et al., 1994, 1996; Poldrack et al., 2001; Gluck et al., 2002; Hopkins et al., 2004; Shohamy et al., 2004; Foerde et al., 2006).
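The optimal-choice measure can be sketched as below. This is a minimal illustration with a hypothetical function name and made-up responses, not the authors' scoring script. The key point it shows is that a choice is scored against the most likely outcome for the cue, not against the feedback that happened to be delivered on that trial.

```python
def proportion_optimal(choices, likely_outcomes):
    """Fraction of trials on which the chosen outcome was the more probable one.

    choices[i] is the outcome selected on trial i; likely_outcomes[i] is the
    outcome most often paired with the cue shown on trial i.
    """
    if len(choices) != len(likely_outcomes):
        raise ValueError("choices and likely_outcomes must have equal length")
    if not choices:
        raise ValueError("at least one trial is required")
    hits = sum(c == m for c, m in zip(choices, likely_outcomes))
    return hits / len(choices)


# Made-up responses for eight trials (illustration only).
example_choices = [0, 1, 1, 0, 1, 0, 0, 1]
example_likely = [0, 1, 0, 0, 1, 1, 0, 1]
example_score = proportion_optimal(example_choices, example_likely)
```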
Results
To assess whether the MTL contributes to feedback-driven learning, we compared Test phase accuracy of amnesics and their control group in a repeated-measures ANOVA with Group as a between-subject factor and Feedback timing as a within-subject variable. We found a significant interaction between Group and Feedback timing condition (F(1,20) = 4.45, p = 0.048) (Fig. 2A). The interaction reflected the fact that amnesics were significantly impaired at learning from delayed feedback (t(20) = −2.74, p = 0.013) but not from immediate feedback (t(20) = −0.006, p = 0.99). This pattern of performance suggests that the MTL is necessary for feedback learning, but only for the delayed feedback condition.
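With one between-subject factor and a two-level within-subject factor, the Group × Feedback timing interaction is equivalent to an independent-samples t test on each participant's delayed-minus-immediate difference score, with F(1, N − 2) = t². The sketch below illustrates that equivalence using only the Python standard library. The accuracy values are invented for illustration and are not the study's data, and the function name is hypothetical.

```python
from statistics import mean, variance


def interaction_f_from_differences(group_a, group_b):
    """Group x Timing interaction F for a 2 (between) x 2 (within) design.

    Each argument is a list of (immediate, delayed) accuracy pairs, one per
    participant. Returns (F, df_error), where F is the squared
    pooled-variance t statistic on delayed-minus-immediate differences.
    """
    diff_a = [delayed - immediate for immediate, delayed in group_a]
    diff_b = [delayed - immediate for immediate, delayed in group_b]
    n_a, n_b = len(diff_a), len(diff_b)
    df_error = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(diff_a)
                  + (n_b - 1) * variance(diff_b)) / df_error
    t = (mean(diff_a) - mean(diff_b)) / (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return t * t, df_error


# Invented percent-correct scores (immediate, delayed); illustration only.
toy_patients = [(80, 60), (90, 60), (70, 60)]
toy_controls = [(80, 80), (90, 100), (70, 60), (85, 85)]
f_value, df_error = interaction_f_from_differences(toy_patients, toy_controls)
```

Applied to 7 patients and 15 controls, the error term has 7 + 15 − 2 = 20 degrees of freedom, which matches the F(1,20) reported above.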
Figure 2. Learning from immediate versus delayed feedback in amnesia and Parkinson's disease. A, Amnesics' Test phase performance in the delayed feedback condition was impaired, but they performed as well as controls in the immediate feedback condition. B, Parkinson's disease patients' Test phase performance in the immediate feedback condition was impaired, but they performed as well as controls in the delayed feedback condition.
To assess the contribution of the striatum to feedback-based learning, we next compared Test phase performance of Parkinson's disease patients and their controls. We found a significant interaction between Group and Feedback timing (F(1,28) = 4.53, p = 0.042), as well as a main effect of condition (F(1,28) = 7.18, p = 0.012) (Fig. 2B). Examination of the interaction revealed that, in contrast to the amnesics, Parkinson's disease patients were significantly impaired at learning from immediate feedback (t(28) = −2.76, p = 0.01) but not from delayed feedback (t(28) = 0.12, p = 0.90) (Foerde and Shohamy, 2011).
Next, to directly compare amnesic and Parkinson's disease patients, we normalized each patient's performance with reference to their matched control group. The resulting z-scores were compared in a 2(Group; Parkinson's vs amnesics) × 2(Feedback timing; immediate vs delayed) ANOVA. This analysis revealed a significant interaction between Group and Feedback timing (F(1,20) = 17.2, p < 0.001) (Fig. 3), and no main effects of either Group or Feedback timing (F values < 1). An examination of the simple effects revealed that amnesics had significantly better performance than Parkinson's disease patients for the Immediate feedback condition (t(20) = −2.28, p = 0.034), whereas Parkinson's disease patients performed significantly better than amnesics on the Delayed feedback condition (t(20) = 2.83, p = 0.01).
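The normalization step can be sketched as follows: each patient's accuracy is expressed as a z-score relative to the mean and standard deviation of that patient's matched control group, separately for each feedback condition. The function name and the scores are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def normalize_to_controls(patient_scores, control_scores):
    """Express each patient score as a z-score relative to matched controls."""
    control_mean = mean(control_scores)
    control_sd = stdev(control_scores)  # sample SD (n - 1); an assumption
    if control_sd == 0:
        raise ValueError("control scores have zero variance")
    return [(score - control_mean) / control_sd for score in patient_scores]


# Invented percent-correct scores for one condition (illustration only).
toy_control_scores = [70, 80, 90]
toy_patient_scores = [60, 85]
toy_z = normalize_to_controls(toy_patient_scores, toy_control_scores)
```

Normalizing each patient group against its own controls puts the two groups on a common scale, so that the direct comparison is not driven by differences between the two control samples.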
Figure 3. Comparison of learning from immediate versus delayed feedback across amnesics and Parkinson's disease patients. For comparison between patient groups, performance of each patient was normalized to their own matched control group. Amnesics performed better than Parkinson's disease patients in the immediate feedback condition, whereas Parkinson's disease patients performed better than amnesics in the delayed feedback condition. The figure represents the z-scores of each group.
Notably, the same pattern of results was observed on an individual basis: six of seven amnesics, including a patient with restricted hippocampal damage, showed better performance in the immediate than delayed feedback condition, whereas 10 of 15 Parkinson's disease patients showed the opposite: better performance in the delayed than immediate feedback condition.
Analyses of response times during the Test phase revealed no significant differences between amnesics and Parkinson's disease patients or between each group and their matched controls (all p values > 0.05).
Next, we analyzed performance during the Learning phase to see whether we would find the same pattern of selective impairments. Performance in the Learning phase was substantially more variable in all participants, presumably reflecting trial-by-trial fluctuations based on recent probabilistic reward history, both within and between conditions. In addition, the one amnesic participant who did not show a benefit for immediate feedback during the Test phase showed particularly erratic performance during the Learning phase.
Nonetheless, we compared the performance of amnesics to their matched controls in the Learning phase in a 2(Group) × 2(Feedback timing) × 4(Block) ANOVA and found marginal effects of Group (F(1,20) = 3.81, p = 0.065) and Feedback timing (F(1,20) = 3.22, p = 0.088), but no significant effect of Block or interactions (p values > 0.16). Comparing Parkinson's disease patients and their controls during the Learning phase, we found a main effect of Block (F(3,84) = 15.26, p < 0.001) and a significant Group × Block interaction (F(3,84) = 2.90, p = 0.042), but no other significant effects or interactions (p values > 0.1). The comparison of amnesic and Parkinson's disease patients' normalized performance during the Learning phase yielded a significant effect of Block (F(3,60) = 3.22, p = 0.044) and an interaction between Block and Feedback timing (F(3,60) = 2.94, p = 0.04), but no other main effects or interactions (p values > 0.1).
Thus, analyses of results during the Learning phase did not reveal the significant interactions between Group and Feedback timing that were found in the Test phase. However, when the amnesic patient characterized as an outlier was removed from analyses, similar patterns were found across the Learning and Test phases. After removing this amnesic participant from analyses, a 2(Group; amnesics vs controls) × 2(Feedback timing) × 4(Block) ANOVA revealed a marginal interaction between Group and Feedback timing (F(1,57) = 3.34, p = 0.084), and no other significant effects or interactions (p values > 0.1). In the comparison between Parkinson's disease and amnesic patients, there was a significant effect of Block (F(3,57) = 4.06, p = 0.019), an interaction between Block and Feedback timing (F(3,57) = 3.27, p = 0.028), and a marginal interaction between Group and Feedback timing (F(1,57) = 3.49, p = 0.077).
In contrast to the Learning phase, the Test phase provides a relatively clean test of stimulus–response learning by controlling for feedback timing across trials and groups, as it requires making the same type of response to the same stimuli without any differences in timing. Although the pattern was qualitatively similar in both Learning and Test phases for most individuals, more variable trial-by-trial performance during Learning in a subset of patients weakened the power of the analyses comparing conditions during the Learning phase.
Because of the known memory deficits in amnesia and to rule out any strategic differences between the groups, we tested participants' awareness of the timing manipulation using postexperiment questionnaires. First, we asked whether participants noticed any variability in the timing between responses and feedback and then asked them to rank order cues according to how long they had to wait for feedback for that particular cue. Analysis of these reports suggested that participants were not aware that distinct cues were associated with specific feedback delays during learning. Six amnesic patients completed a posttest survey, but none reported correct timing estimates. Of the 15 (amnesic) control subjects, only three subjects reported correct timing estimates. Among Parkinson's disease patients and their controls, only one patient and one control reported correct stimulus-feedback timing associations. Thus, across subjects, although feedback timing had a significant impact on what was learned, we found no evidence that these differences might have been related to explicit representation of feedback timing.
Finally, we investigated correlations between impairments in probabilistic learning and measures of neuropsychological function and disease severity for each patient group. For the amnesic patients we found no significant correlations between task performance in the delayed feedback condition and neuropsychological measures (including WMS-III, WMS-III Working memory, Paired Associates, Visual Memory-immediate and Visual Memory-delayed; p values > 0.1). For the Parkinson's disease patients we found no significant correlations between task performance in the immediate feedback condition and UPDRS scores or disease duration (p values > 0.1).
Discussion
The current findings reveal that the MTL plays a critical role in feedback-driven learning and that the role of the MTL is distinct from that of the striatum. In a feedback-based learning task in which response-contingent feedback on each trial arrived either immediately or with a brief delay, we found a dissociation in the performance of patients with damage to the MTL versus the striatum. Specifically, amnesic patients with MTL damage were impaired at learning when feedback was delayed but not when feedback was immediate. Patients with striatal dysfunction due to Parkinson's disease demonstrated the opposite pattern: they were impaired at learning when feedback was immediate but not when feedback was delayed. This pattern of findings provides direct evidence that multiple learning systems are needed to support feedback-based learning and that subtle changes in the learning environment critically alter which system is engaged to support learning.
Understanding the neural mechanisms underlying feedback-driven learning has been the focus of intense research across multiple fields for the last two decades. This work has established an essential role for the striatum and its dopaminergic inputs in learning from reinforcement (Schultz et al., 1997; Schultz, 1998; Pessiglione et al., 2006) both in electrophysiological studies in animals (Schultz et al., 1997; Tremblay et al., 1998; Waelti et al., 2001) and using BOLD fMRI in humans (Delgado et al., 2000; O'Doherty et al., 2003; Daw and Doya, 2006).
However, the majority of this research has focused on learning from immediate outcomes and there is evidence that the striatum might not effectively support learning from delayed feedback. Several studies have demonstrated that delayed feedback can impair learning. In rodents, delayed feedback slows acquisition of instrumental responses (Perin, 1943; Grice, 1948; Lattal and Gleeson, 1990; Dickinson et al., 1992, 1996; Cheung and Cardinal, 2005); in humans, delayed feedback leads to impaired procedural learning of a perceptual categorization task (Maddox et al., 2003; Maddox and Ing, 2005; Worthy et al., 2013). In addition, recent electrophysiological data suggest that the responses of dopaminergic inputs to the striatum change significantly when outcomes are delayed on the order of seconds (Fiorillo et al., 2008; Kobayashi and Schultz, 2008). Similarly, in humans performing a gambling task, feedback-sensitive ERP waveforms distinguish between gains and losses when outcomes occur after 1 s but not when they occur after 6 s (Weinberg et al., 2012). These studies all point to limitations in the role of the striatum in learning from delayed feedback, consistent with the current results.
Such findings also raise questions about how learning from delayed outcomes is accomplished. The current results resolve this question by showing a necessary and selective role for the MTL in learning from delayed outcomes. These results are consistent with recent reports from fMRI demonstrating that BOLD signals in the hippocampus correlate with feedback prediction errors during learning (Dickerson et al., 2011; Foerde and Shohamy, 2011). Critically, the current results provide the first evidence that such patterns of activation in the hippocampus reflect a necessary role in acquiring representations rather than merely reflecting downstream inputs from regions that are critical for learning.
A role for the MTL in feedback-driven learning
The current results point to a causal role for the MTL in instrumental feedback-driven learning, which stands in contrast to predictions based on the traditional taxonomy of memory. The multiple memory systems framework emphasizes a distinction between declarative memory, which relies on the hippocampus and surrounding MTL, and feedback-driven or instrumental learning, which depends on the striatum (Knowlton et al., 1996; Squire and Zola, 1996).
Although the role of the MTL in instrumental feedback-driven learning may be surprising, the current results are consistent with theoretical and computational accounts of the learning and memory mechanisms supported by the hippocampus. In particular, the hippocampus is thought to play a critical role in relational memories, in which elements of experience that are discontiguous in time or space are bound together (Cohen and Eichenbaum, 1993; Davachi, 2006; Staresina and Davachi, 2009), such as in trace conditioning (Thompson and Kim, 1996; Clark and Squire, 1998; Büchel et al., 1999). Relational memory mechanisms were not previously considered in the context of feedback-based learning. However, evidence for a critical role for the hippocampus in relational memory in other domains (Mitchell et al., 2000; Hannula et al., 2006; Olson et al., 2006; Piekema et al., 2006) suggests that the hippocampus may play an analogous role in relationally integrating information across time, providing a bridge across the temporal gap between responses and feedback.
It should be noted that a trade-off between the striatum and the hippocampus as a function of feedback timing might not occur for all feedback-driven learning. For example, in feedback-driven perceptual categorization, delayed feedback leads to impaired learning compared with immediate feedback. Interestingly, model-based analyses of the learning data have suggested that when feedback is delayed, participants attempt to use rule-based strategies that have been associated with prefrontal cortex and the head of the caudate nucleus rather than procedural learning, which instead is suggested to rely on extrastriate cortex and the tail of the caudate nucleus (Maddox et al., 2003; Maddox and Ing, 2005; Worthy et al., 2013), although such strategies are less successful. Similar results might be expected for a variety of motor learning tasks. Thus, an important question for future research will be to understand the boundaries and task-domains for which multiple learning systems can trade off to support learning.
MTL contributions to delayed, but not immediate, feedback-driven learning
An important question that arises from our findings concerns why the hippocampus did not appear able to support learning from immediate feedback. Parkinson's disease patients exhibit a severe impairment in learning from immediate feedback, raising questions about why they could not leverage the MTL mechanisms that support learning from delayed feedback to ameliorate this deficit.
One possibility is that the hippocampus, like the striatum, can support learning from immediate feedback, but that there is interaction, even competition, between these two memory systems (Sherry and Schacter, 1987; Poldrack et al., 2001; White and McDonald, 2002; Poldrack and Packard, 2003). This idea suggests that in some situations where both systems can learn, there may be competition to control behavior. For example, effective connectivity analyses in an fMRI study using a learning task in which the striatum and hippocampus appear to be in competition showed that regions in the prefrontal cortex appeared to be the critical mediator of competition between the striatum and MTL (Poldrack and Rodriguez, 2004). By this view, it is possible that we do not see the MTL compensating for learning deficits when feedback is immediate because the prefrontal cortex is gating such a contribution. Such a process need not be implemented through an intentional cognitive control mechanism. Instead, the learning system in control could be determined implicitly by the degree of uncertainty associated with each learning system (Daw et al., 2005).
Another possibility is that the MTL is not suited to support learning from immediate feedback, analogous to the diminishing contribution of the striatum as feedback is delayed. Data from studies investigating the role of the MTL in both working memory and long-term memory are suggestive of sensitivity to increasing delays in the MTL. For example, a study of object working memory found increasing hippocampal activity during recognition with increasing retention delays (Picchioni et al., 2007), suggesting that the contribution of the MTL changes as delays between events change. Similarly, within long-term memory, responses to item repetitions in MTL are lag sensitive (Xiang and Brown, 1998; Brozinsky et al., 2005), again consistent with the notion that responses in the MTL unfold over time.
Interestingly, there is also evidence that the hippocampus contributes critically even at very short delays, in particular in support of relational binding and comparison (Ranganath and Blumenfeld, 2005; Olsen et al., 2012). Moreover, recordings from neurons in the MTL show rapid responses to reward or feedback related information (Watanabe and Niki, 1985; Liu and Richmond, 2000; Johnson et al., 2007; Wirth et al., 2009). However, due to the correlational nature of these data, it is not clear whether these responses reflect the building of representations that are necessary for task performance. A challenge for future studies is to elucidate the precise learning conditions under which the hippocampus can and cannot support performance, and to understand the importance of absolute versus relative temporal delays between responses and feedback in engaging distinct neural mechanisms.
Conclusion
The current results are broadly consistent with mounting evidence that points to a less stark division of labor between memory systems (Adcock et al., 2006; Shohamy and Wagner, 2008; Dickerson et al., 2011; Foerde and Shohamy, 2011; Seger et al., 2011). The striatum has long been known to support feedback-driven learning, whereas the hippocampus has garnered little attention for its possible contributions to feedback-based learning. The dissociation in the current study in the performance of patients with MTL and striatal damage demonstrates that the MTL is necessary for learning when feedback is delayed by a few seconds. In contrast, immediate feedback learning depends on the striatum. These results extend prior reports of feedback-related activity observed in the hippocampus with fMRI and point to a causal role for the hippocampus in feedback-driven learning under some circumstances. Further, they show that multiple processes support what appears to be identical behavior in healthy individuals.
Footnotes
This work was supported by the NIH (NIDA 1R03DA026957 to D.S., NINDS NRSA 5F32NS063632 to K.F., and F32NS073212 to E.R.), an NSF Career Development Award to D.S., and the Clinical Science Research and Development Service of the Department of Veterans Affairs. We are grateful to Dr. Lucien Cote for recruitment of participants with Parkinson's disease and to Erin Kendall Braun for assistance with data collection.
The authors declare no competing financial interests.
Correspondence should be addressed to either Karin Foerde or Daphna Shohamy, Schermerhorn Hall 312, 1190 Amsterdam Avenue, MC: 5501, New York, NY 10027. kf2265@columbia.edu or ds2619@columbia.edu