Previous neuropsychological findings have implicated medial temporal lobe (MTL) structures in retaining object-location relations over the course of short delays, but MTL effects have not always been reported in neuroimaging investigations with similar short-term memory requirements. Here, we used event-related functional magnetic resonance imaging to test the hypothesis that the hippocampus and related MTL structures support accurate retention of relational memory representations, even across short delays. On every trial, four objects were presented, each in one of nine possible locations of a three-dimensional grid. Participants were to mentally rotate the grid and then maintain the rotated representation in anticipation of a test stimulus: a rendering of the grid, rotated 90° from the original viewpoint. The test stimulus was either a “match” display, in which object-location relations were intact, or a “mismatch” display, in which one object occupied a new, previously unfilled location (mismatch position), or two objects had swapped locations (mismatch swap). Encoding phase activation in anterior and posterior regions of the left hippocampus, and in bilateral perirhinal cortex, predicted subsequent accuracy on the short-term memory decision, as did bilateral posterior hippocampal activity after the test stimulus. Notably, activation in these posterior hippocampal regions was also sensitive to the degree to which object-location bindings were preserved in the test stimulus; activation was greatest for match displays, followed by mismatch-position displays, and finally mismatch-swap displays. These results indicate that the hippocampus and related MTL structures contribute to successful encoding and retrieval of relational information in visual short-term memory.
- medial temporal lobes
- entorhinal cortex
- perirhinal cortex
- short-term memory
- working memory
- spatial memory
Converging evidence suggests that medial temporal lobe (MTL) structures are critically involved in long-term declarative memory (Cohen and Squire, 1980), with the hippocampus mediating memory for interitem relationships (Cohen and Eichenbaum, 1993). Notably, previous neuropsychological investigations also implicate the hippocampus in short-term retention of relational memory representations (Hannula et al., 2006; Olson et al., 2006; Hartley et al., 2007), although it remains unclear when and how the hippocampus supports accurate performance. As defined here, short-term memory refers to the collection of processes (i.e., encoding, maintenance, and retrieval) (Jonides et al., 2007) that support temporary retention of perceptual representations and comparison of those representations to current sensory experience.
Neuroimaging investigations have not consistently implicated MTL structures in short-term retention of relational information (but see Mitchell et al., 2000; Piekema et al., 2006); instead, results point to prefrontal or posterior parietal regions as the key players. For instance, Prabhakaran et al. (2000) reported greater activation in dorsolateral prefrontal cortex (DLPFC) for retention of letter-location bindings than for separate retention of letters and locations (see also Mitchell et al., 2000; Piekema et al., 2006), and two other previous investigations reported strong evidence for posterior parietal involvement in encoding and maintenance of object-location relationships (Todd and Marois, 2004) (see also Piekema et al., 2006; Xu and Chun, 2006). One possibility is that the relative absence of hippocampal activity in these experiments reflects use of strategies that obviate relational memory processing demands. For instance, participants may have retained object-location bindings in viewer-centered coordinates, a process that may be more dependent on prefrontal and posterior parietal cortices (Constantinidis and Wang, 2004; Curtis, 2006) than MTL structures.
Here, event-related functional magnetic resonance imaging (fMRI) was used to test whether the hippocampus supports accurate short-term retention of relational memory representations. On every trial, four objects were presented, each in one of nine possible spatial locations of a three-dimensional grid (see Fig. 1). Participants were to mentally rotate the grid and maintain the rotated representation in anticipation of the test stimulus, a rendering of the grid, rotated 90° from the original viewpoint. The test stimulus was either a match, in which object-location relations were intact, or a mismatch, in which one object occupied a new location (mismatch position), or two objects had swapped locations (mismatch swap). The viewpoint change between sample and test displays, in combination with instructions that required specific identification of each test stimulus, discouraged use of a viewpoint-dependent strategy.
It was predicted that the hippocampus and adjacent MTL structures would support successful formation and subsequent use of the relational memory representations, and that hippocampal recruitment would be most apparent during initial processing of object-location relationships, or early in the delay, when participants performed the viewpoint manipulation. Additionally, based on results showing that MTL neurons exhibit firing rate changes as a function of the match/mismatch status of a test stimulus (Deadwyler et al., 1996; Fried et al., 1997; Otto and Eichenbaum, 1992; Rolls et al., 1993; Suzuki, 1999; Wood et al., 1999), MTL activation was expected to differentiate among test displays.
Materials and Methods
Participants were 18 right-handed undergraduate students (10 female) from the University of California at Davis (UC Davis). Two of the participants were excluded from the analyses reported here; one participant was excluded because of a technical malfunction and the other because there were too few incorrect trials to perform the required analyses. All of the procedures were approved by the Institutional Review Board at the University of California, Davis.
Stimuli and design.
The stimuli used in this experiment were 126 rendered scenes created with Punch! Home Design Software (Kansas City, MO). Each scene consisted of a three dimensional 3 × 3 grid with four different objects positioned in four spatial locations. These four objects were selected from a set of nine possible objects, and every object was used equally often across scenes. The four-object combination presented in a given scene was always unique, and each object appeared no more than seven times in each spatial location.
Four distinct renderings of each scene were created: the original scene, captured as though the grid was being viewed from the front, and three manipulated versions of that scene, each rotated 90° to the right of the original viewpoint (i.e., a 90° counter-clockwise rotation of the scene with the viewer's position unchanged). Manipulated scenes were distinguishable on the basis of object location. As illustrated in Figure 1, the spatial positions of the objects were either unchanged (match displays), one of the objects was moved to a previously unoccupied spatial position (mismatch-position displays), or two of the objects had swapped locations (mismatch-swap displays). In position-change displays, the critical object always moved to an empty location immediately left or right of the originally occupied location; leftward and rightward position changes occurred equally often across scenes, and every object (from the set of nine) underwent a position change in ∼14 scenes. In swap displays, each object changed positions with every other object in either three or four scenes; the same was true for spatial locations.
One hundred and twenty scenes (from the set of 126) were assigned at random to one of eight lists; the remaining six scenes were used to illustrate the task in the instructions that were provided to participants before beginning the practice block. Each list contained 15 scenes, and lists rotated across blocks (i.e., one practice block and seven experimental blocks). For the purposes of this experiment, the original scene was always used as a sample stimulus and one of the three manipulated versions of that scene was used as the test stimulus; match, mismatch-position, and mismatch-swap displays were presented equally often and in random order within each block.
After informed consent was obtained from each participant, the experimenter provided instructions for the experimental task. The participant was told that every trial would begin with the presentation of a rendered scene (i.e., the “sample” stimulus), and that they should attempt to commit the objects and their spatial locations to memory. The participant was shown one of the original scenes (i.e., a scene that was not used in the experiment itself), and instructed that every scene would contain four items positioned in four different spatial locations; each participant also viewed the set of nine objects, and was told that these objects, and no others, would appear repeatedly in the scenes. Pre-exposure to the complete set of objects was meant to reduce the possibility that any observed differences in performance or neural activity across experimental blocks would be a consequence of learning the items. Next, the participant was told that an 11 s delay would follow the presentation of the sample stimulus. They were instructed to form a mental image of the scene rotated 90° to the right of the original viewpoint, and to maintain that representation in anticipation of a test stimulus; a visual example of the rotation was provided to ensure that the participant understood how they should attempt to mentally rotate the scene. Finally, the participant was told that after presentation of the test stimulus they were to indicate via button press whether: (1) the object-location bindings in the grid were intact (match), (2) one of the objects occupied a new, previously unfilled, location (mismatch position), or (3) two of the objects had swapped positions (mismatch swap). Examples of each type of test stimulus were provided, and the participant was encouraged to ask questions about the task at this time. 
Once the participant felt they understood the task, the experimenter presented six examples using the scenes that, as indicated earlier, were set aside for instructional purposes. Two scenes, a sample stimulus and the corresponding test stimulus, were presented side by side and the participant was to indicate verbally what type of test stimulus they were viewing. If they provided an incorrect answer, they were prompted to try again, and when they identified a mismatching scene as such, they were to indicate specifically what had changed. When it was clear that the participant understood the task, a block of practice trials was initiated. The practice trials were completed in a testing room outside of the scanner, but the procedure was identical to the experimental trials in all other regards.
During the MRI scanning session, the participant completed seven blocks of 15 trials for a total of 105 experimental trials (35 match, 35 mismatch position, and 35 mismatch swap). On each trial, a sample stimulus was presented for 3 s, followed by an 11 s delay, and finally, presentation of the test stimulus for 3 s (see Fig. 1). When the test stimulus was presented, the participant was to indicate, via right-handed button press, whether they were viewing a match, mismatch-position, or mismatch-swap display. Speed was emphasized, but not at the expense of accuracy. A fixation screen was presented for either 9, 11, or 13 s between subsequent trials. The total duration of each block (i.e., scanning run) was 432 s, including a 12 s unfilled interval at the beginning of each block.
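The stated run length can be checked against the trial timing. The short sketch below assumes that the three fixation durations (9, 11, and 13 s) occurred equally often within a block, which the text does not state explicitly; the variable names are illustrative.

```python
# Trial timing taken from the procedure described above (all values in seconds).
SAMPLE_S = 3
DELAY_S = 11
TEST_S = 3
ITI_OPTIONS_S = (9, 11, 13)   # fixation durations between trials
TRIALS_PER_BLOCK = 15
LEAD_IN_S = 12                # unfilled interval at the start of each block

# Assumption (not stated in the text): each fixation duration is used
# equally often, i.e., 5 times per 15-trial block.
repeats_per_iti = TRIALS_PER_BLOCK // len(ITI_OPTIONS_S)
total_iti_s = sum(ITI_OPTIONS_S) * repeats_per_iti

trial_events_s = (SAMPLE_S + DELAY_S + TEST_S) * TRIALS_PER_BLOCK
block_duration_s = LEAD_IN_S + trial_events_s + total_iti_s

print(block_duration_s)  # 432
```

Under that assumption, the 255 s of trial events, 165 s of fixation, and the 12 s lead-in sum to the reported 432 s per run.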
Image acquisition and preprocessing.
MRI data were acquired with a 3T Siemens (Erlangen, Germany) Trio scanner located at the UC Davis Imaging Research Center. Each participant was provided with ear plugs to help attenuate scanner noise and padding was used to reduce head movement. Stimuli were back-projected onto a screen positioned at the foot of the scanner bed and viewed through a mirror attached to the head coil.
Functional data were obtained with a gradient echoplanar imaging (EPI) sequence (repetition time, 2000 ms; echo time, 25 ms; field of view, 220 mm; 64 × 64 matrix); each volume consisted of 34 axial slices, each with a slice thickness of 3.4 mm, resulting in a voxel size of 3.4375 × 3.4375 × 3.4 mm. Coplanar and high-resolution T1-weighted anatomical images were also acquired from each participant and a simple motor-response task (Aguirre et al., 1998; Handwerker et al., 2004) was performed to estimate subject-specific hemodynamic response functions (HRF).
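The reported in-plane voxel dimension follows directly from the field of view and the matrix size. The sketch below assumes that the field of view is expressed in millimetres and that slices were acquired without an inter-slice gap (the latter is not stated in the text); the names are illustrative.

```python
FOV_MM = 220.0             # field of view (assumed to be in millimetres)
MATRIX_SIZE = 64           # 64 x 64 acquisition matrix
SLICE_THICKNESS_MM = 3.4
N_SLICES = 34

# In-plane voxel edge = field of view / number of matrix elements.
in_plane_mm = FOV_MM / MATRIX_SIZE
voxel_size_mm = (in_plane_mm, in_plane_mm, SLICE_THICKNESS_MM)

# Coverage along the slice axis, assuming no gap between slices.
slab_coverage_mm = N_SLICES * SLICE_THICKNESS_MM

print(voxel_size_mm)  # (3.4375, 3.4375, 3.4)
```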
Preprocessing was performed using Statistical Parametric Mapping (SPM5) software. EPI data were slice-timing corrected using sinc interpolation to account for timing differences in acquisition of adjacent slices, realigned using a six-parameter, rigid-body transformation, spatially normalized to the Montreal Neurological Institute (MNI) EPI template, resliced into 3 mm isotropic voxels, and spatially smoothed with an isotropic 8 mm full-width at half-maximum Gaussian filter.
fMRI data analysis.
Event-related blood oxygen level-dependent (BOLD) responses associated with qualitatively distinct components of each experimental trial (i.e., sample presentation, early delay, late delay, and test presentation) were deconvolved using linear regression (cf. Zarahn et al., 1997; Postle et al., 2000) (see supplemental Fig. 1a, available at www.jneurosci.org as supplemental material). This approach assumes that BOLD signal changes over the course of a short-term memory trial represent linear combinations of neural activity associated with temporally distinct neural events, and that these events can be independently convolved with the HRF. In the current work, the individual hemodynamic response functions, acquired during performance of the motor-response task, were averaged together and this group-average HRF was used in lieu of the canonical HRF to model the covariates of interest (supplemental Fig. 2, available at www.jneurosci.org as supplemental material). The covariates of interest were created by convolving this average empirically derived HRF with vectors of predicted neural activity for each trial component (supplemental Fig. 1b, available at www.jneurosci.org as supplemental material).
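The construction of a covariate from a vector of predicted neural activity can be illustrated with a minimal sketch. It is not the analysis code used in the study: the gamma-shaped `toy_hrf` is a hypothetical stand-in for the group-average empirical HRF, the early- and late-delay offsets are assumed values chosen only for illustration, and the first trial is assumed to begin immediately after the 12 s unfilled interval.

```python
import math

TR_S = 2.0      # repetition time (s), from the acquisition parameters
N_SCANS = 216   # 432 s run / 2 s TR

def toy_hrf(n_points=12, tr=TR_S):
    """Hypothetical gamma-shaped HRF sampled once per TR (unit peak).

    A stand-in for the group-average empirical HRF, which is not given here.
    """
    raw = [(i * tr) ** 5 * math.exp(-(i * tr)) for i in range(n_points)]
    peak = max(raw)
    return [v / peak for v in raw]

def neural_vector(onsets_s, n_scans=N_SCANS, tr=TR_S):
    """Predicted neural activity: 1.0 at the scan nearest each onset, else 0.0."""
    vec = [0.0] * n_scans
    for onset in onsets_s:
        vec[int(round(onset / tr))] = 1.0
    return vec

def convolve(signal, kernel):
    """Discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s:
            for j, k in enumerate(kernel):
                if i + j < len(out):
                    out[i + j] += s * k
    return out

# Offsets of each trial component from trial onset (s). The sample (0 s) and
# test (14 s = 3 s sample + 11 s delay) offsets follow the procedure; the
# early- and late-delay offsets are assumptions made for this illustration.
COMPONENT_OFFSETS_S = {
    "sample": 0.0,
    "early_delay": 4.0,
    "late_delay": 10.0,
    "test": 14.0,
}

# Assumed: the first trial starts right after the 12 s unfilled interval.
trial_onsets_s = [12.0]

hrf = toy_hrf()
covariates = {
    name: convolve(neural_vector([t + offset for t in trial_onsets_s]), hrf)
    for name, offset in COMPONENT_OFFSETS_S.items()
}
```

Each entry in `covariates` is one regressor for a single trial component. In a full design, the onset lists would be split further by accuracy and test type, as described in the next paragraph.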
Separate covariates were constructed to model responses for each trial component (sample, early delay, etc.) as a function of behavioral response accuracy (i.e., correct or incorrect), and test type (i.e., match, mismatch swap, or mismatch position). This classification scheme resulted in 10 distinct covariates of interest (i.e., sample correct, sample incorrect, early delay correct, early delay incorrect, late delay correct, late delay incorrect, match correct, mismatch-position correct, mismatch-swap correct, and test incorrect) that were used in the reported analyses. Additional covariates of no interest modeled spikes in the time series, global signal changes that could not be attributed to variables in the design matrix (Desjardins et al., 2001), scan-specific baseline shifts, and an intercept. Regression analyses were then performed on single-subject data using the general linear model with filters applied to remove frequencies above 0.25 Hz and below 0.005 Hz. These analyses yielded a set of parameter estimates for each participant, the magnitude of which can be interpreted as an estimate of the BOLD response amplitude associated with a particular trial component (e.g., the sample-correct component).
After single-subject analyses were completed, images for the contrasts of interest were created for each participant. These contrasts compared (1) parameter estimates for correct versus incorrect trials associated with each trial component (i.e., sample, early delay, late delay, and test), and (2) parameter estimates for correctly identified match versus mismatch test displays. Contrast images were entered into a second-level one-sample t test in which the mean estimate across participants for every voxel was tested against zero. Significant regions of activation were identified using an uncorrected threshold of p < 0.001 and a minimum cluster size of six contiguous voxels. With this voxelwise threshold, the mapwise false positive rate for the MTL (i.e., hippocampus, parahippocampal, perirhinal, and entorhinal cortices), estimated using a Monte Carlo procedure (as implemented in the AlphaSim program in the AFNI software package), was p < 0.01. Local maxima of activations identified in these contrasts are summarized in the supplemental Results section (supplemental Table 1, available at www.jneurosci.org as supplemental material). Suprathreshold clusters of voxels in the hippocampus, the adjacent MTL cortical structures, the prefrontal cortex (PFC), and posterior parietal regions were used to define regions of interest (ROIs) that were interrogated in subsequent analyses. Mean parameter estimates were extracted from these functional ROIs for trials that elicited correct versus incorrect behavioral responses, and for correctly identified match versus mismatch test displays. To further evaluate differences in activity as a function of test type, a separate second-level analysis was performed in which areas that exhibited suprathreshold activation in the test-correct versus test-incorrect contrast were interrogated for effects of test type (match, mismatch position, mismatch swap, and incorrect) using a repeated-measures ANOVA.
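The second-level test described above amounts to a one-sample t test, computed at each voxel, on the participants' contrast values. The sketch below shows that computation for a single voxel using only the Python standard library. The contrast values are hypothetical numbers invented for illustration (16 values, matching the number of participants retained for analysis), and the function name is not taken from any analysis package.

```python
import math
import statistics

def one_sample_t(values, popmean=0.0):
    """Return (t, df) for a one-sample t test of the mean against popmean."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample SD (n - 1 denominator)
    t = (mean - popmean) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical correct-minus-incorrect contrast values at one voxel,
# one value per participant (illustrative only, not data from the study).
contrast_values = [0.42, 0.10, 0.35, -0.05, 0.28, 0.51, 0.19, 0.07,
                   0.33, 0.26, -0.12, 0.44, 0.15, 0.38, 0.22, 0.09]

t_value, df = one_sample_t(contrast_values)
print(f"t({df}) = {t_value:.2f}")
```

The resulting t statistic would then be compared with the voxelwise threshold (here, p < 0.001 uncorrected) before the cluster-extent criterion is applied.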
Results
Accuracy was significantly different across test stimulus types (F(2,30) = 10.22, p < 0.001). Participants were better at identifying match displays (77.68% correct; SD, 12.27%) than both mismatch-position (65% correct; SD, 18.71%) and mismatch-swap (59.64% correct; SD, 25.21%) displays (t(15) = 3.75, p = 0.002 and t(15) = 3.89, p = 0.001, respectively). Differences in performance between the two types of mismatch display were not significant (t(15) = 1.28, p > 0.20). Critically, performance on the test of memory was significantly above chance (i.e., 33% correct; all p values ≤ 0.001), but remained below ceiling; this pattern of behavioral performance allowed us to analyze differences in activation as a function of accuracy in the fMRI analyses described in the next section.
Additional analyses were conducted to rule out the possibility that participants had adopted a response bias. Response bias (β) was calculated using signal detection theory (Macmillan and Creelman, 1991), with hits defined as trials on which match displays were correctly identified, and false alarms defined as trials on which participants incorrectly identified a mismatch display as a match. The obtained β value (β= 0.996) indicated that there was minimal, if any, evidence for bias in the patterns of behavioral responses (i.e., β-value of 1.0 indicates no bias).
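For reference, the standard signal detection formula for β in a yes/no design is the likelihood ratio at the criterion, β = exp[(z_FA² − z_H²)/2], where z_H and z_FA are the z-transformed hit and false alarm rates. The sketch below implements that textbook formula. It is an assumption that the reported value was computed this way, and the rates passed to the function are hypothetical, not the study's data.

```python
import math
from statistics import NormalDist

def response_bias_beta(hit_rate, false_alarm_rate):
    """Likelihood-ratio bias measure from yes/no signal detection theory.

    beta = exp((z_FA**2 - z_H**2) / 2); beta == 1.0 indicates no bias,
    beta > 1.0 a conservative criterion, beta < 1.0 a liberal criterion.
    Rates must lie strictly between 0 and 1.
    """
    z_hit = NormalDist().inv_cdf(hit_rate)
    z_fa = NormalDist().inv_cdf(false_alarm_rate)
    return math.exp((z_fa ** 2 - z_hit ** 2) / 2.0)

# Hypothetical rates chosen for illustration only.
unbiased = response_bias_beta(0.80, 0.20)      # symmetric rates, so beta is 1
conservative = response_bias_beta(0.60, 0.10)  # few "match" responses overall
liberal = response_bias_beta(0.90, 0.40)       # many "match" responses overall

print(round(unbiased, 3), conservative > 1.0, liberal < 1.0)
```

On this scale, a value just below 1.0, such as the 0.996 reported above, corresponds to a very slight tendency to respond "match".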
Response times (RTs, in ms) were significantly faster for correct (2070.56; SD, 93.36) than for incorrect (2214.25; SD, 201.18) trials (F(2,30) = 7.97, p = 0.01), but there were no significant RT differences across test types (match, 2079.89; SD, 121.52; mismatch position, 2071.27; SD, 152.39; mismatch swap, 2134.27; SD, 154.67; F(2,30) = 1.92, p = 0.16), nor was there a significant accuracy by test type interaction (F(2,30) = 1.81, p = 0.18).
Based on the hypothesis that the hippocampus is critically involved in processing and representation of relational memories, it was predicted that hippocampal activity would be greater on correct than on incorrect trials, and that hippocampal recruitment would be most apparent either during processing of the sample stimulus, or early in the delay, when participants were to encode and mentally rotate the relational memory representation into a viewpoint compatible with the upcoming test stimulus. Furthermore, and consistent with single-unit recording investigations (Deadwyler et al., 1996; Fried et al., 1997; Otto and Eichenbaum, 1992; Rolls et al., 1993; Suzuki, 1999; Wood et al., 1999), it was expected that activation in MTL structures would differentiate among types of test displays.
Medial temporal lobe activations
As predicted, several MTL regions were more active for trials associated with correct behavioral responses than for those associated with incorrect behavioral responses. These differences in MTL activation were evident during the sample and test periods of the trials, but not during the early or the late delay. As shown in Figure 2, BOLD signal was higher for correct than for incorrect trials in both anterior and posterior regions of the left hippocampus during presentation of the sample stimulus. Inspection of trial-averaged time courses revealed that activity in the left anterior hippocampal region decreased after sample presentation [relative to activity during the intertrial interval (ITI)], but the decrease was of smaller magnitude for correct than for incorrect trials. Activation in the posterior hippocampus, in contrast, increased after sample presentation, and the increase was larger for correct than for incorrect trials. Like the hippocampal regions identified above, bilateral regions of the anterior parahippocampal gyrus, most likely corresponding to perirhinal cortex (Insausti et al., 1998), also exhibited differences in BOLD signal between correct and incorrect trials when the sample stimulus was presented. Activity in the left perirhinal cortex after sample presentation decreased relative to the ITI baseline, but less so for correct than for incorrect trials. Activity in the right perirhinal cortex was elevated relative to the ITI for correct trials, but not for incorrect trials.
During presentation of the test stimulus, BOLD signal was higher for correct than for incorrect trials in bilateral regions of the posterior hippocampus. As noted above, we predicted that activation elicited by the test stimulus should also differentiate among display types (i.e., match, mismatch position, mismatch swap, and incorrect). To further test this prediction, ROIs were defined in the left and right posterior hippocampal areas identified in the correct versus incorrect test stimulus contrast, and mean parameter estimates of test period activation were extracted for correctly identified match, mismatch-position, mismatch-swap, and incorrect trials, respectively. A one-way ANOVA performed on these parameter estimates revealed significant main effects of test type for both the left (F(3,45) = 6.96, p = 0.001) and right (F(3,45) = 6.68, p = 0.001) posterior hippocampus. Planned comparisons showed that activity in the left posterior hippocampal ROI was greater for match than for both mismatch-swap (t(15) = 2.32, p = 0.04) and mismatch-position displays (t(15) = 3.99, p = 0.001). Activity was also greater for match and mismatch-position displays than for test displays that elicited incorrect responses (t(15) = 5.55, p < 0.001 and t(15) = 2.79, p = 0.01, respectively). Analyses for the right posterior hippocampal ROI revealed that activity was greater for match than for mismatch-swap displays (t(15) = 2.16, p = 0.05) and that parameter estimates were numerically greater for match than for mismatch-position displays, although this difference was not statistically significant (t(15) = 1.11, p > 0.05). As in the left hippocampus, activity was greater for match and mismatch-position displays than for test displays that elicited incorrect responses (t(15) = 6.11, p < 0.001 and t(15) = 3.71, p = 0.01, respectively). 
There were no significant differences in activation between mismatch-swap and mismatch-position displays (all t(15) ≤ 1.47, p values > 0.05) or between mismatch-swap and incorrect trials in either ROI (all t(15) ≤ 1.45, p values > 0.05) (Fig. 3).
A mapwise analysis was also performed to test for differences in activity between correctly identified match and mismatch test displays (supplemental Table 1, available at www.jneurosci.org as supplemental material). Consistent with the results reported above, match displays elicited more activity than mismatch displays in left posterior hippocampus. A region of the right parahippocampal gyrus, most likely corresponding to perirhinal cortex (Insausti et al., 1998), was also identified in this contrast (Fig. 4). No MTL regions were more active for mismatch than for match displays.
Prefrontal and posterior parietal activations
As shown in Figure 5, regions in prefrontal and posterior parietal cortices also exhibited differences in BOLD signal changes for correct and incorrect trials. As in the MTL, these differences were apparent during the sample and the test periods, but not during the delay. During the sample period, right ventrolateral prefrontal cortex (i.e., VLPFC) exhibited less deactivation on correct than on incorrect trials, and the right inferior intraparietal sulcus (IPS) exhibited more activation for correct than incorrect trials (Fig. 5a).
Analyses of activity during processing of the test display revealed that activation in a relatively caudal region and another relatively rostral region of the right dorsolateral prefrontal cortex (i.e., DLPFC) was greater on correct than on incorrect trials. Additional analyses revealed that activation differed as a function of test type in both the rostral (F(3,45) = 8.54, p < 0.001) and caudal regions of DLPFC (F(3,45) = 4.54, p < 0.01). In the caudal region, however, there were no significant pairwise differences between match, mismatch-position, and mismatch-swap displays (all t(15) < 0.68; p values > 0.05). The significant main effect of test type seems to have been driven solely by differences in activity between correct match and mismatch-position trials versus trials on which incorrect responses were made (t(15) = 3.28, p < 0.05 and t(15) = 4.30, p < 0.01). In the rostral region, activation was greater for match displays than for mismatch-swap displays (t(15) = 3.08, p = 0.008) and greater for match and mismatch-position displays than for incorrect trials (t(15) = 6.97, p < 0.001 and t(15) = 3.41, p = 0.004, respectively); no other comparisons were significant (all t(15) ≤ 1.75, p values > 0.05) (see Fig. 5b). Unlike in the MTL, mapwise contrasts between match and mismatch displays revealed no suprathreshold differences in parietal or prefrontal regions.
Relationship between individual differences in performance and delay period activation
Some previous work has shown that delay period activation can be correlated with accuracy on short-term memory tasks (Pessoa et al., 2002; Sakai et al., 2002). However, in the present study, no brain regions (within or outside of the MTL) exhibited suprathreshold delay period activation in the accuracy contrast. This could not be attributed to an absence of delay period activity, however, because when the data from correct and incorrect trials were collapsed, persistent activity during the delay period was evident in a network of brain regions (e.g., bilateral frontal eye fields and posterior parietal cortex, along with supplementary eye fields and left dorsolateral prefrontal cortex) that are typically implicated in spatial short-term memory tasks (Curtis, 2006). These results are reported in the supplemental Results section (supplemental Fig. 3, Table 2, available at www.jneurosci.org as supplemental material).
Because there were substantial differences in behavioral performance across subjects, we conducted additional analyses to examine whether these differences might account for variability in activation during the delay period. For example, it is possible that delay period activity in the MTL or other brain regions was higher for good than for poor performers. To explore this possibility, we ran a mapwise regression analysis that evaluated the relationship between individual differences in accuracy and the activation difference between correct and incorrect trials during the short-term memory delay. Results of this analysis revealed that, during the early portion of the delay, activation in a region of the left anterior parahippocampal gyrus, most likely corresponding to the entorhinal cortex (Insausti et al., 1998), was highly correlated with individual performance (r² = 0.79, p < 0.001) (Fig. 6). No other brain areas within or outside of the MTL were identified in this contrast.
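At a single voxel or ROI, the relationship tested by this regression reduces to the squared Pearson correlation between each participant's accuracy and their correct-minus-incorrect activation difference. A minimal sketch of that computation follows. The accuracy and activation values are hypothetical numbers invented for illustration, and the function name is not drawn from any analysis package.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-participant values (illustrative only, not study data):
# proportion correct, and early-delay parameter estimate for
# correct minus incorrect trials.
accuracy = [0.45, 0.52, 0.58, 0.63, 0.70, 0.74, 0.81, 0.88]
activation_diff = [-0.20, -0.05, 0.02, 0.10, 0.08, 0.25, 0.31, 0.40]

r = pearson_r(accuracy, activation_diff)
r_squared = r ** 2   # proportion of variance shared by the two measures
print(f"r = {r:.2f}, r^2 = {r_squared:.2f}")
```

In a mapwise analysis the same computation is repeated at every voxel, which is why the thresholding and cluster-extent criteria described in the Methods still apply.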
Similar analyses were also performed for the sample, late delay, and test periods, but none of these analyses yielded any significant correlations of individual performance with activity differences between correct and incorrect trials in the MTL. Regions outside of the MTL that showed significant correlations are summarized in supplemental Table 3 (available at www.jneurosci.org as supplemental material).
Discussion
The current investigation tested whether hippocampal activity is related to successful formation and subsequent use of short-term relational memory representations. Using a novel task, in which accurate performance required retention of object-location relationships, we found that hippocampal activity predicted accuracy, and was sensitive to the match between the test stimulus and the retained representation of the sample. Furthermore, differences in entorhinal cortex activity between correct and incorrect trials during the early delay were strongly correlated with individual behavioral performance. These results complement and extend findings from previous neuropsychological investigations (Hannula et al., 2006; Olson et al., 2006; Hartley et al., 2007) by showing that the hippocampus may be critical for encoding and evaluation processes that contribute to relational memory, even across short delays.
Before proceeding, it is worth considering whether the reported differences in hippocampal activation might reflect differences in memory for items or generalized memory strength, rather than relational memory. For instance, failure to correctly identify the test display on incorrect trials could either reflect poor memory for interitem relationships, poor memory for the items themselves, or some combination thereof. However, item memory- or strength-based accounts cannot explain why hippocampal activity also differed across correctly identified test displays (i.e., match > mismatch position > mismatch swap). Because items in the sample display were always re-presented at test, mere encoding of item identity would be insufficient to accurately distinguish test displays in which object-location relationships were intact (i.e., match displays) from those in which one (i.e., mismatch position) or two (i.e., mismatch swap) items were displaced. Furthermore, use of test displays that were visually dissimilar (i.e., rotated) to corresponding sample displays discouraged holistic encoding strategies, as these could not support accurate performance.
Still, one might question whether the match-mismatch differences observed in the hippocampus were driven by response biases rather than relational memory. For example, if participants were more likely to respond “mismatch” when they were unsure, then “guess” trials would have been disproportionately allocated to the mismatch trial bin, possibly contributing to a spurious match > mismatch activation difference. However, it was determined that response bias was minimal, and if anything, there was a slight tendency toward a “match” bias. To the extent that this slight bias influenced the results, it would have led to reduced activation for match trials, and could not explain the current findings. Instead, the results of the match-mismatch analyses seem to parallel “match enhancement” effects reported in single-unit recording studies, in which neurons exhibit selective increases in firing rate after presentation of a target that matches an actively retained sample (Otto and Eichenbaum, 1992; Rolls et al., 1993; Deadwyler et al., 1996; Suzuki et al., 1997; Wood et al., 1999). Match enhancement effects have been proposed as a mechanism that may support short-term memory decisions (Suzuki, 1999).
In the absence of explicit memory demands, repetition of an item or a sequence of items can result in reduced MTL activity, relative to novel stimuli or unexpected sequences (Ranganath and Rainer, 2003; Kumaran and Maguire, 2006, 2007a). Such results are consistent with the view that the hippocampus acts as an associative novelty detector, and should be most responsive when there is a partial mismatch between previously stored associative memory representations and currently available information (Kumaran and Maguire, 2007b). According to this model, one might expect that hippocampal activity in the present experiment should be greatest during processing of mismatch displays, which contain the same items, but only partial preservation of interitem relationships. This was not the case though, and it is noteworthy that, in contrast to the current experiment, participants in previous experiments (Kumaran and Maguire, 2006, 2007a) were not required to actively retain a sample stimulus and compare it to a subsequently presented test stimulus (i.e., no explicit memory demands). Therefore, hippocampal function might be more broadly construed as critical for relational memory, with the relative degree of hippocampal activity in response to a relational match or mismatch dependent on specific task requirements. This possibility should be addressed in future work.
In addition to hippocampal involvement described above, sample activation was also greater for correct (vs incorrect) trials in perirhinal cortex. Some results have implicated this region in perceptual processing (Bussey et al., 2002; Lee et al., 2005a,b; Buckley and Gaffan, 2006) and encoding (Winters et al., 2006) of visual object representations (but see Levy et al., 2005). Here, perirhinal activation might have reflected the demand to retain object representations from a sample stimulus in service of relational memory binding that occurs in the hippocampus (Cohen et al., 1997; Davachi, 2006; Eichenbaum et al., 2007).
During the delay, MTL activation was not significantly elevated relative to baseline, nor were there differences in MTL activation as a function of accuracy. Therefore, the results of this experiment point to a role for the hippocampus in encoding and retrieval, but not necessarily maintenance, of object-location relationships. However, additional analyses showed early delay activation in entorhinal cortex that was correlated with individual differences in performance. One possibility is that encoding, or strengthening of the previously encoded representation, continued into the early delay to a greater extent for good than for poor performers. Alternatively, good performers may have been more adept at performing the viewpoint manipulation once the stimulus was removed from view. This idea is consistent with results from electrophysiological investigations, in which entorhinal neurons exhibited persistent increases in activity during maintenance of odors, visual objects, and spatial locations (Suzuki et al., 1997; Young et al., 1997), and with a lesion study, in which bilateral aspiration of entorhinal cortex in monkeys produced a selective impairment on tasks that required flexible manipulation of relational memory representations (Buckmaster et al., 2004).
Outside the MTL, inferior IPS and prefrontal regions were also more active for correct than for incorrect trials. Inferior IPS was sensitive to successful encoding of sample displays, a result that may reflect allocation of spatial attention to all four objects in the grid stimulus during encoding (cf., Xu and Chun, 2006). If this spatial attention mechanism failed, hippocampally mediated relational memory processing, and behavioral performance, would be expected to suffer. Right prefrontal regions also distinguished correct from incorrect trials during presentation of both the sample and the test displays. Previous findings have suggested that VLPFC and DLPFC may implement control processes that contribute to the processing of item and relational information, respectively, and have shown that this activation predicts subsequent long-term memory performance (Blumenfeld and Ranganath, 2007). The current work extends these findings to encoding and retrieval of relational information in short-term memory.
The present study revealed no evidence to support the idea that persistent delay-period activity in MTL, prefrontal, or parietal regions is sufficient to support accurate performance on this task. This may seem surprising, as correlations between short-term memory accuracy and delay-period activity in prefrontal and parietal regions have been reported previously (Pessoa et al., 2002; Sakai et al., 2002; Curtis et al., 2004). However, those previous studies investigated memory for individual items or simple item-location bindings that could be maintained in viewer-centered coordinates. Further, although previous studies of short-term memory have reported persistent hippocampal activation (Ranganath and D'Esposito, 2001; Schon et al., 2004; Ranganath et al., 2005; Nichols et al., 2006; Piekema et al., 2006; Axmacher et al., 2007), those investigations generally used novel, trial-unique visual stimuli, and no attempts were made to correlate persistent activity with accuracy.
At face value, the absence of delay-period activity in the accuracy contrast might indicate that the kind of relational information tested here is not maintained through persistent activity in any brain region. For example, it may be the case that short-term retention of relational memory representations is supported by transient changes in synaptic efficiency (Jonides et al., 2007), possibly in the hippocampus. This explanation is compatible with the fact that activity during the sample and test phases was reliably associated with successful performance. A second possibility is that fMRI is insensitive to the neural mechanism that supports active relational memory maintenance. For example, relational memory maintenance may be signaled by changes in the timing of neural oscillations (Jensen and Lisman, 2005; Jensen, 2006), a possibility that is supported by previous evidence from electrophysiological recordings in monkeys (Lee et al., 2005c) and humans (Raghavachari et al., 2001; Howard et al., 2003; Rizzuto et al., 2003; Mainy et al., 2007).
It is generally agreed that structures in the MTL play a critical role in forming long-term relational memory representations (Eichenbaum et al., 1994). The current results demonstrate that hippocampal activity may also reflect formation and use of these representations in service of short-term memory. Further research will be needed to specify the degree to which hippocampal (or MTL) recruitment depends on the processing or representational requirements of a memory task, rather than the duration of the retention interval interposed between study and test (Cabeza et al., 2002; Ranganath and Blumenfeld, 2005; Postle, 2006).
a. A group-average HRF was used because there were five subjects from whom the individual HRF could not be acquired (because of time constraints). Under those circumstances, it seemed best to use the same analysis strategy for all of the participants (i.e., convolving the data with an average of the individual HRFs obtained from the remaining 11 participants). Furthermore, the fits of the data to the model were carefully examined for each subject to ensure that there were no serious deviations of the data from the model consequent to the use of the group-average HRF.
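The averaging step described in this footnote can be illustrated with a minimal sketch. It is not the authors' analysis code. The function names, the toy HRF shape, the scaling factors, and the event onsets are all hypothetical, and the sketch assumes the common convention of convolving a stick function of event onsets with the HRF to build a model regressor.

```python
import numpy as np

def group_average_hrf(individual_hrfs):
    """Average a set of equally sampled individual HRFs (one per row)."""
    hrfs = np.asarray(individual_hrfs, dtype=float)
    return hrfs.mean(axis=0)

def predicted_response(onsets, hrf, n_scans):
    """Convolve a stick function of event onsets (in scans) with an HRF."""
    stick = np.zeros(n_scans)
    stick[np.asarray(onsets, dtype=int)] = 1.0
    # Truncate the full convolution to the length of the scanning run.
    return np.convolve(stick, hrf)[:n_scans]

# Toy data (hypothetical): three "subjects" whose HRFs differ only in amplitude.
t = np.arange(8)
toy_shape = t**2 * np.exp(-t)          # arbitrary illustrative shape, not a fitted HRF
individual = [toy_shape * s for s in (0.8, 1.0, 1.2)]

avg_hrf = group_average_hrf(individual)
regressor = predicted_response([2, 10], avg_hrf, n_scans=20)
```

Using one averaged HRF for every participant keeps the model identical across subjects, which is the rationale the footnote gives. The cost is that the fit for each subject then has to be checked separately.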
b. It is notable that some of the reported BOLD signal changes were negative-going, whereas others were positive-going. This pattern might suggest qualitative differences in patterns of activation across brain regions. However, these BOLD signal changes reflect differences between activity associated with a particular trial component (e.g., sample, test) and the ITI. Accumulating evidence suggests that meaningful cognitive activation may occur during unfilled ITIs, particularly in the MTL (Stark and Squire, 2001). In our study, it is unclear whether any significant cognitive activity occurred during the ITI, so the functional significance of negative (or positive) changes in activity relative to this baseline condition is difficult to interpret.
c. Examination of the behavioral data revealed that when participants made an error they were most likely to indicate that they were viewing a “match” display. Participants erroneously reported that they were viewing a match display on 47% of the incorrect trials (i.e., nearly half of the trials on which an error was made), whereas they said “position” or “swap” in error on just 31% and 21% of the incorrect trials, respectively. If it had any effect on reported MRI results, this bias would reduce memory strength for “match” responses, because this bin would contain a larger number of guesses than the mismatch bins.
This work was supported by National Institute of Mental Health Fellowship F32MH075513 (D.E.H.) and National Institutes of Health Grant MH68721 (C.R.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Mental Health or the National Institutes of Health. Many thanks to Aaron Heller, Linda Murray, Andy Haskins, Rob Blumenfeld, and Charlie Nuwer for their technical guidance and assistance with data collection.
- Correspondence should be addressed to Deborah Hannula, Center for Neuroscience, University of California, 1544 Newton Court, Davis, CA 95618.