It is debated whether subregions within the medial temporal lobe (MTL), in particular the hippocampus (HC) and perirhinal cortex (PrC), play domain-sensitive roles in learning. In the present study, two patients with differing degrees of MTL damage were first exposed to pairs of highly similar scenes, faces, and dot patterns and then asked to make repeated same/different decisions to preexposed and nonexposed (novel) pairs from the three categories (Experiment 1). We measured whether patients would show a benefit of prior exposure (preexposed > nonexposed) and whether repetition of nonexposed (and preexposed) pairs at test would benefit discrimination accuracy. Although selective HC damage impaired learning of scenes, but not faces and dot patterns, broader MTL damage involving the HC and PrC compromised discrimination learning of scenes and faces but left dot pattern learning unaffected. In Experiment 2, a similar task was run in healthy young participants in the MRI scanner. Functional region-of-interest analyses revealed that posterior HC and posterior parahippocampal gyrus showed greater activity during scene pattern learning, but not face and dot pattern learning, whereas PrC, anterior HC, and posterior fusiform gyrus were recruited during discrimination learning for faces, but not scenes and dot pattern learning. Critically, activity in posterior HC and PrC, but not in the other functional regions of interest, was modulated by accuracy (correct > incorrect within a preferred category). Therefore, both approaches revealed a key role for the HC and PrC in discrimination learning, which is consistent with representational accounts in which subregions in these MTL structures store complex spatial and object representations, respectively.
Although it is undisputed that medial temporal lobe (MTL) structures, including the hippocampus (HC) and perirhinal cortex (PrC), participate in memory, their exact role remains controversial (Burgess et al., 2001; Eichenbaum et al., 2007; Squire et al., 2007; Brown et al., 2010; Graham et al., 2010; Montaldi and Mayes, 2010; Ranganath, 2010). A current debate is whether these regions, or subareas within them, differ in their contribution to learning and memory for distinct categories of visual stimuli (Diana et al., 2008; Aly et al., 2010; Preston et al., 2010; Duarte et al., 2011; Watson et al., 2012).
For example, recent neuropsychological studies have revealed that selective bilateral HC damage impairs recognition memory (Bird et al., 2007), discrimination learning (Graham et al., 2006), and odd-one-out decisions (Lee et al., 2005a) for scene, but not face, stimuli. In contrast, larger MTL lesions that encompass both the HC and PrC result in poor long-term memory for scenes and faces (Taylor et al., 2007) and reduced discrimination accuracy for scenes, faces, and objects (Barense et al., 2005, 2007; Lee et al., 2005b). Critically, however, these domain-specific patterns are not present in all patients and there is disagreement over the exact anatomical locus of such deficits (Levy et al., 2005; Shrager et al., 2006; Kim et al., 2011). It has been suggested that concomitant involvement of domain-sensitive regions in parahippocampal cortex and/or fusiform gyrus, rather than damage to the HC and PrC per se, may underlie such stimulus-dependent functional dissociations (Squire et al., 2006; Suzuki, 2009, 2010; but see Baxter, 2009; Graham et al., 2010; Jeneson and Squire, 2012; Lee et al., 2012; Rudebeck et al., 2013).
We undertook complementary patient and fMRI experiments to address this issue, investigating whether the HC and PrC would play distinct domain-sensitive roles in a novel learning task in which performance on scenes and faces was compared directly alongside an equally difficult visual control (dot patterns). In Experiment 1, two amnesic patients (one with selective HC damage and another with damage including the HC and PrC) and 12 matched controls were preexposed to pairs of visually similar faces, scenes, and dots. Subsequently, they made repeated same/different judgements to both previously exposed and nonexposed pairs. This paradigm allowed us to determine whether patients could learn to discriminate faces, scenes, and/or dots as measured by an accuracy advantage for preexposed compared with nonexposed pairs (Analysis 1A) and/or by increasingly better discrimination success as same/different discriminations were repeated during the test (Analysis 1B). In Experiment 2, we aimed to elucidate the unique contributions of the HC and PrC, alongside the parahippocampal and fusiform areas, to perceptual learning. Young healthy participants performed a version of the task used in Experiment 1. A functional region-of-interest (fROI) approach was adopted, complemented by whole-brain analyses, to determine which brain regions showed a difference between preexposed versus nonexposed scene, face, and/or dot pairs (Analysis 2A) and how brain activity was modulated by decision accuracy (correct > incorrect, Analysis 2B).
Materials and Methods
Experiment 1: patients
Two patients with damage to the MTL (initially reported in Lee et al., 2005b, as patients HC3 and MTL3) and 12 healthy participants matched to the patients for age and education were included in Experiment 1. The two patients were selected for the study based on definitive evidence of circumscribed involvement of the MTL, neuropsychological confirmation of selective difficulties with episodic recall, and willingness to take part in our study (Barense et al., 2005; Lee et al., 2005b; Graham et al., 2006; Barense et al., 2007; Lee and Rudebeck, 2010a; Rudebeck et al., 2013). Qualitative and quantitative measures of the patients' brain damage, as well as detailed neuropsychology, have been published previously (Lee et al., 2005b; Lee and Rudebeck, 2010a). Patient HC3, a 50-year-old woman with 10 years of education, has selective bilateral HC involvement after an episode of carbon-monoxide-induced hypoxia. MTL3, a 64-year-old woman with 10 years of education, has a larger bilateral lesion to the MTL that includes damage to both the HC and PrC. Use of a standard functional localizer in both patients confirmed activation in parahippocampal place area (PPA), fusiform face area (FFA), and lateral occipital cortex, a profile consistent with the pattern of structural integrity evident from volumetric and connectivity analyses of the patients' structural MRI scans (Lee and Rudebeck, 2010a; Rudebeck et al., 2013; Figure 1).
On neuropsychological testing, the patients showed exceptionally poor episodic recall of both verbal and visual material. For example, both scored 4 of 50 on delayed recall of a prose passage (logical memory) and were similarly poor at reproducing the Rey-Osterrieth Complex Figure after a delay (HC3, 3 of 36; MTL3, 4.5 of 36) despite good initial drawings (HC3, 35 of 36; MTL3, 30.5 of 36). Recognition memory in HC3, as measured using the Warrington Recognition Memory Test, was within the normal range for faces and scenes, but not words, whereas MTL3 showed impairment in all three recognition memory tasks. This pattern is consistent with the patients' performance on other experiments in which we have investigated recognition memory performance across visual categories (Taylor et al., 2007). Other measures of cognition, including visual processing (as measured by the Visual Object and Space Perception Battery; see also the copy of the Rey-Osterrieth Figure described earlier) and problem solving, were preserved in both patients (Lee et al., 2005b), although MTL3 showed some difficulties with semantic memory as evidenced by a mild deficit on category comprehension and semantic association tasks (Lee and Rudebeck, 2010a).
Two groups of six neurologically healthy control participants (male and female) age and education matched to the patients were recruited from the Cardiff University School of Psychology Community Panel. The controls for patient HC3 had a mean age of 53.3 ± 2.9 years and 9.8 ± 1.0 years of education; matched controls for MTL3 were on average 62.3 ± 4.1 years of age with 10.0 ± 0.6 years of education. Because analyses of the visual discrimination data obtained from these two groups revealed no significant differences in accuracy or reaction time (RT) across any of the three experimental conditions (all F < 1), the two groups were combined into a single group for comparison with the patients (age, 57.8 ± 5.8 years; education, 9.9 ± 8.2 years). There was no significant difference between the patients and the larger control group in age or years of education (all t < 1.07, p > 0.30). Ethical approval was obtained from the Cambridge National Health Service Research Ethics Committee. All participants gave informed consent according to the Declaration of Helsinki (1991) regarding involvement in the experiment.
Portrait photographs (grayscale) of two pairs of men and two pairs of women with similarly shaped faces and visual features were taken from an online yearbook. From these, four morphed face pairs were created using the software package Morpheus 1.85 (ACD Systems; see Mundy et al., 2007, for detailed information about the procedure). In brief, a sequence of intermediate (blended) images was created from a pair of exemplars by anchoring key feature points such as nose, eyes, and mouth and changing the distance between these points (Fig. 2A shows example stimulus pairs). Two faces were then selected from each male and female morph continuum, one of which had 56.6% of the features of original face 1 and 43.3% of original face 2 and the other with 43.3% of face 1 and 56.6% of face 2.
Four 3D virtual reality, computer-generated rooms were created. A new item was then generated from each of these prototype room layouts ensuring that within the pair there were differences in the size, orientation, and/or location of three of the features of the room (e.g., a window, a staircase, and a wall cavity). In the example pair shown in Figure 2B, the two rooms differ in the location of the pillar on the left, the orientation of the right wall, and the angle of the center staircase. The rooms were created using a commercially available computer game (Deus Ex; Ion Storm) and a freeware software editor (Deus Ex Software Development Kit version 1112f).
A computer program written in Visual Basic was used to generate four pairs of confusable dot patterns (Fig. 2C). The program was constrained to create an initial random pattern of 11 dots of 0.5 cm radius. A second confusable pattern was made for each initial dot pattern by making random adjustments to the location of 3 dots in the original image within a range of 0.25 to 0.75 cm. All stimuli were 10.2 × 9.9 cm when presented on the computer screen.
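The dot-pattern procedure described above can be sketched in a few lines of Python. This is only an illustrative reconstruction, not the original Visual Basic program: the function name `make_dot_pair`, the reading of 10.2 × 9.9 cm as width × height, the uniformly random direction of each displacement, and the choice to keep every dot fully inside the image frame are all assumptions that the text does not specify.

```python
import math
import random


def make_dot_pair(n_dots=11, n_moved=3, min_shift=0.25, max_shift=0.75,
                  width=10.2, height=9.9, radius=0.5, seed=None):
    """Return (original, variant): two lists of (x, y) dot centres in cm."""
    rng = random.Random(seed)
    # Initial random pattern; each whole dot is kept inside the frame (assumption).
    original = [(rng.uniform(radius, width - radius),
                 rng.uniform(radius, height - radius))
                for _ in range(n_dots)]
    variant = list(original)
    # Displace a random subset of dots by 0.25-0.75 cm in a random direction.
    for i in rng.sample(range(n_dots), n_moved):
        x, y = original[i]
        while True:
            shift = rng.uniform(min_shift, max_shift)
            angle = rng.uniform(0.0, 2.0 * math.pi)
            nx = x + shift * math.cos(angle)
            ny = y + shift * math.sin(angle)
            # Redraw if the displaced dot would leave the frame.
            if radius <= nx <= width - radius and radius <= ny <= height - radius:
                variant[i] = (nx, ny)
                break
    return original, variant
```

Calling `make_dot_pair(seed=1)` returns two 11-dot patterns that differ in exactly three dot positions, which mirrors the confusable pairs described in the text.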
Stimuli were presented using Presentation (Neurobehavioural Systems) running on either a 17 inch laptop (patients) or an IBM-compatible desktop computer (controls), with the latter connected to a standard 17 inch LCD monitor. Stimuli were shown at a resolution of 1024 × 786 pixels. Participants were seated ∼60 cm from the computer screen. After providing informed consent, the first of three exposure-test cycles began with the following instructions appearing on the computer screen:
“You will now see a series of images; some will be very similar. Please pay close attention—the differences are very subtle. (Press the response button to begin).”
Once the response key was pressed, the participants were presented with an item for 2 s, followed by an empty black screen for 0.5 s (a single trial). They were not required to make any response to these items. As in Mundy et al. (2006), an intermixed presentation schedule was used. For example, within Face Pair 1 (FP1), the two morphed faces (FP1 and FP1′) were presented in an intermixed manner, one after the other (e.g., FP1, FP1′, FP1, FP1′…), until there had been 5 presentations of each exemplar (a total of 10 individual trials). The participant was then presented with the stimuli comprising FP2 in the same fashion. Participants then moved on to the test phase.
At the start of the test phase, participants received the following instructions on the computer screen:
“You will now see a second series of images; some will be new. The image will flash—please indicate whether you think the image has changed. Left button = yes, right button = no. (Press the response button to begin).”
During each test trial, participants saw one stimulus for 500 ms, followed by a 300 ms interstimulus interval (which was filled by a high-contrast mask) and then a second stimulus for 500 ms, which was followed by a 4 s response period. Two mouse keys were used to record the participants' “yes” and “no” responses. Subsequent trials proceeded automatically after the completion of the response period (Fig. 2D is a schematic of a test trial).
In each of the three separately run conditions (faces, scenes, and dots), there were 64 test trials consisting of 16 trials for each of the stimulus pairs seen in the exposure phase, and 16 trials for each of the two pairs not seen in the exposure phase. Half of the presentations of each item pair were “same” trials (e.g., either FP1 then FP1 or FP1′ then FP1′) and half were “different” trials (e.g., either FP1 then FP1′ or FP1′ then FP1). The order of same and different trials within a run was randomized, with the restriction that no more than two of each type of trial could occur in succession. Furthermore, the order of trials was randomized with the constraint that there must be eight trials from each condition (preexposed or nonexposed) in every 16 trials. After every 16 trials, a fixation cross was presented for 20 s to allow the participant to rest. At the completion of the test phase, participants were allowed to rest for 5 min before moving on to the next exposure-test cycle with a different type of stimulus. Patient HC3 was tested on dots, then faces, then scenes; her controls received the same sequence. Patient MTL3 was tested on dots, then scenes, then faces; her controls received the same sequence.
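The constraints on trial order can be illustrated with a short rejection-sampling sketch. The pair labels, the function names, and the decision to give every 16-trial block exactly four trials per pair (two "same", two "different") are assumptions made for illustration; the text specifies only the totals, the limit of two consecutive trials of one type, and the eight-per-condition rule for every 16 trials.

```python
import random

# Hypothetical labels: two preexposed and two nonexposed pairs per category.
PAIRS = [("P1", "preexposed"), ("P2", "preexposed"),
         ("N1", "nonexposed"), ("N2", "nonexposed")]


def max_run(labels):
    """Length of the longest run of identical consecutive labels."""
    longest = run = 0
    previous = object()  # sentinel that equals no label
    for label in labels:
        run = run + 1 if label == previous else 1
        longest = max(longest, run)
        previous = label
    return longest


def make_test_order(n_blocks=4, seed=None, max_tries=10000):
    """Return 64 (pair, condition, trial_type) tuples obeying both constraints."""
    rng = random.Random(seed)
    order = []
    for _ in range(n_blocks):
        # Four trials per pair per block: 8 preexposed and 8 nonexposed trials.
        block = [(pair, cond, kind)
                 for pair, cond in PAIRS
                 for kind in ("same", "same", "different", "different")]
        for _ in range(max_tries):
            rng.shuffle(block)
            # Include the last two trials of the previous block in the run check.
            tail = [trial[2] for trial in order[-2:]]
            if max_run(tail + [trial[2] for trial in block]) <= 2:
                break
        else:
            raise RuntimeError("no valid block order found")
        order.extend(block)
    return order
```

Shuffling block by block keeps the eight-per-condition rule true by construction, so only the run-length rule needs to be checked on each shuffle.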
The data were analyzed in two ways. In Analysis 1A, we investigated whether participants would show any benefit of prior exposure to stimuli by comparing discrimination performance (both accuracy and RT) for preexposed pairs compared with nonexposed pairs. Analysis 1B investigated whether the patients, compared with their controls, showed any evidence of learning across the nonexposed (and preexposed) pairs by looking for improvement in accuracy over four separate time blocks of the test phase (Block 1 to Block 4).
Experiment 2: fMRI in healthy participants
Sixteen right-handed healthy participants (10 male) were scanned. The ages of the participants ranged from 18 to 40 years (mean, 30) and all had normal or corrected-to-normal vision. All participants gave written informed consent for their participation in the study (according to the Declaration of Helsinki, 1991). This work received ethical approval from the Cardiff University School of Psychology Research Ethics Committee.
Twelve face, 12 scene, and 12 dot pattern pairs were created using the procedure described above (see Materials, Experiment 1).
The basic design of the fMRI experiment was similar to Experiment 1 in that participants were exposed to pairs of stimuli at study before undertaking a same/different discrimination task with previously seen, but also nonexposed pairs, at test. One difference, however, was the use of two different preexposure conditions (intermixed and blocked), a manipulation designed to investigate the impact of exposure schedule (Mundy et al., 2009). The full experimental procedure is described below (also see the schematic in Fig. 3), but our statistical analyses were restricted to the preexposed intermixed and nonexposed pairs only.
Stimuli were presented during scanning using Presentation software running on an IBM-compatible desktop computer connected to a digital projector (1024 × 786 pixels resolution). The latter projected onto a white screen situated behind the scanner bed, which participants viewed via an angled mirror placed directly above their eyes in the scanner. The on-screen dimensions of all images were identical to those in Experiment 1, with stimuli covering 15 × 12 degrees of visual angle (h × w).
Figure 3 shows the basic experimental design for one run (of two) for a single participant. Within a run, the study phase for one type of stimulus was always followed by the test phase for the same type of stimulus comprising preexposed pairs interspersed with nonexposed pairs from the same category. All stimulus categories appeared equally often in each serial position (presented first, second, or third) within each of the two runs (balanced across participants).
Two preexposure schedules, intermixed and blocked, were used during the study phases. The intermixed preexposure condition was similar to the study phase undertaken by the patients, in which the two items comprising a pair were presented alternately (e.g., FP1, FP1′, FP1, FP1′…) until each item in a pair had been viewed five times. The blocked preexposure involved five repetitions of one item from the pair before five presentations of the other item from the pair (FP1, FP1, FP1, FP1, FP1, FP1′…). In both of these conditions, the timing was the same as that in the patient study. Participants were not required to make any response during these preexposure conditions.
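The difference between the two schedules is easiest to see when the presentation order is written out. The sketch below is illustrative only; the function name and the item labels are hypothetical, while the per-trial duration of 2.5 s (2 s item plus 0.5 s blank screen) is taken from the patient study description.

```python
def exposure_schedule(pair, schedule="intermixed", n_presentations=5):
    """Return the presentation order for the two items of one stimulus pair."""
    first, second = pair
    if schedule == "intermixed":
        # Alternate the two items: A, B, A, B, ...
        return [first, second] * n_presentations
    if schedule == "blocked":
        # All presentations of one item, then all presentations of the other.
        return [first] * n_presentations + [second] * n_presentations
    raise ValueError("schedule must be 'intermixed' or 'blocked'")


def exposure_duration_s(order, item_s=2.0, blank_s=0.5):
    """Total exposure time for one pair, given the per-trial timings."""
    return len(order) * (item_s + blank_s)
```

Both schedules contain the same ten trials per pair and therefore the same total exposure time; only the ordering of the two items differs.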
During the discrimination test, participants were presented with preexposed pairs from the intermixed and blocked conditions and also nonexposed pairs of faces, scenes, or dots. They indicated whether these pairs were the same or not by pressing the relevant key of a button box held in the right hand. To ensure adequate jitter in trial timings, instead of the 4 s response window used for the patient study, there was a random intertrial interval of between 4.5 and 12 s sampled from a Poisson distribution. In the test phase, there were 64 preexposed and 32 nonexposed trials in each run, resulting in a total of 64 trials per stimulus type per condition (intermixed, blocked, and nonexposed) across the whole experiment. Trials from pairs seen previously during preexposure were randomly interspersed between nonexposed stimuli trials. Items from each visual category were presented in blocks.
Imaging was performed on a General Electric 3T HDx MRI system using an eight-channel receive-only head coil at the Cardiff University Brain Research Imaging Centre, School of Psychology, Cardiff University. For functional imaging, a T2*-weighted gradient-echo, echoplanar imaging (EPI) sequence with high-order shim (HOS) was used to image volumes with BOLD contrast. Fifty slices were collected per image volume covering the whole brain, prescribed 30 degrees inclined from the AC-PC plane (to maximize signal coverage in the MTL). Scanning parameters were as follows: TR/TE, 3000/35 ms; flip angle, 90 degrees; slice thickness, 2.8 mm (1 mm gap); acquisition matrix GE-EPI, 64 × 64; in-plane field of view, 22 cm; ASSET (acceleration factor), 2; and HOS. The HOS is a procedure that allows the scanner to (partially) correct for variations in the magnetic field that arise once a participant is placed in the scanner by adjusting shims inside the gradient coils according to a low-resolution magnetic field map. Additional high-resolution field maps were also acquired for every participant for the purpose of un-distorting the EPI datasets during image preprocessing. For anatomic localization, a structural scan was obtained for each participant using a T1-weighted sequence (3D FSPGR). Scanning parameters were as follows: TR/TE 7.9/3.0 ms; flip angle, 20 degrees; acquisition matrix, 256 × 256 × 176; field of view, 256 × 256 × 176 mm; isotropic resolution, 1 mm.
Data preprocessing and statistical analysis of fMRI data were performed using FEAT (fMRI Expert Analysis Tool) Version 5.63, part of the software library of the Oxford Centre for Functional MRI of the Brain (fMRIB) (www.fmrib.ox.ac.uk/fsl). The following prestatistics processing was applied: motion correction using MCFLIRT (Jenkinson et al., 2002); nonbrain removal using BET (Smith, 2002); spatial smoothing using a Gaussian kernel of FWHM 4 mm; mean-based intensity normalization of all volumes; high-pass temporal filtering (Gaussian-weighted least-squares straight line fitting, with σ = 20.0 s); and un-distorting the EPI data to correct for magnetic field distortions by means of individual field maps. Time-series statistical analysis was performed using FILM with local autocorrelation correction (Woolrich et al., 2001). Registration to high-resolution 3D anatomical T1 scans (per participant) and to a standard MNI template image (for group average) was performed using FLIRT (Jenkinson and Smith, 2001; Jenkinson et al., 2002). Coordinates reported here have been converted to Talairach and Tournoux (1988) convention, where appropriate, for ease of comparison with existing literature (Lacadie et al., 2008).
Data analysis: behavioral
The primary measure of performance was response accuracy (percentage of correct discriminations) averaged over both scanning runs for each stimulus type (dot patterns, faces, and scenes). RTs during test blocks were also examined to assess whether preexposed compared with nonexposed discriminations were facilitated.
Data analysis: fMRI
We focused on two analyses complementary to the patient study (Fig. 4). Both of these used an fROI approach to investigate how activity in key regions sensitive to faces and scenes was modulated by exposure history (Analysis 2A) and by discrimination accuracy (correct > incorrect, Analysis 2B). The latter was based on pairs presented in the nonexposed condition, but similar findings were evident when we analyzed the intermixed (and blocked) preexposed pairs that were also presented at test. The fROI analyses were complemented, where sensible, with whole-brain contrasts. The procedure for identifying the fROIs is described first, before specific details about the two analyses.
To mirror procedures used in the visual perception literature, an fROI localizer analysis was performed (Fig. 4, Steps 1 and 2). The following procedures were first performed on individual participant data and then pooled for group-level statistical analysis. To identify orthogonal fROIs for our analysis, the first, completely novel, test trial involving each stimulus pair was used. The first test trial for each face pair was contrasted with the first trial for each scene pair, giving voxel clusters particularly activated by faces. The opposite contrast (first trial of each scene pair vs first trial of each face pair) generated voxels particularly activated by scenes. All other further analyses were performed on data from the subsequent trials (n = 15 per stimulus pair, a total of 60 per category) with this first novel test trial removed so that localizer and test data were independent, thus avoiding the problem of circularity (Kriegeskorte et al., 2009).
The most significantly active voxel within each anatomical area of interest [i.e., posterior fusiform gyrus (PFG), which includes FFA; posterior parahippocampal gyrus (PostPG), which encompasses PPA; PrC; anterior HC (AntHC); posterior HC (PostHC)] was located in regions of cortex that corresponded well with previously reported anatomical locations and visible anatomy (Tables 1, 2).
Two fROIs were defined for each of our five anatomical areas of interest: one containing any voxels active in the face minus scene localizer contrast, and the other containing any voxels active in the scene minus face localizer contrast. Therefore, each fROI was defined as the set of contiguous voxels that were significantly activated within 12 mm in the anterior/posterior, superior/inferior, and medial/lateral direction of the peak anatomically constrained voxel in the contrast (Table 1, Table 2). To ensure a liberal inclusion criterion for identification of all domain-sensitive voxels involved in the task, a threshold of p < 0.05 (uncorrected) was used to isolate active voxels.
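The fROI definition can be pictured as region growing from the peak voxel. The following sketch is not the FSL-based procedure used in the study; it is a minimal illustration that assumes a dictionary of uncorrected p values keyed by voxel index, face-sharing (6-neighbour) contiguity, and a caller-supplied voxel size in mm, none of which are stated in the text. The function name `define_froi` is hypothetical.

```python
from collections import deque


def define_froi(p_values, peak, voxel_size_mm, max_dist_mm=12.0, alpha=0.05):
    """Grow an fROI from the peak voxel.

    p_values maps (i, j, k) voxel indices to uncorrected p values. A voxel is
    included if it is below alpha, lies within max_dist_mm of the peak along
    every axis, and is connected to the peak through other included voxels.
    """
    def eligible(voxel):
        if p_values.get(voxel, 1.0) >= alpha:
            return False
        return all(abs(voxel[axis] - peak[axis]) * voxel_size_mm[axis] <= max_dist_mm
                   for axis in range(3))

    if not eligible(peak):
        return set()
    roi = {peak}
    queue = deque([peak])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        i, j, k = queue.popleft()
        for di, dj, dk in steps:
            neighbour = (i + di, j + dj, k + dk)
            if neighbour not in roi and eligible(neighbour):
                roi.add(neighbour)
                queue.append(neighbour)
    return roi
```

With hypothetical 2 mm isotropic voxels, the 12 mm limit admits at most six voxels on either side of the peak along each axis, and a supra-threshold voxel separated from the peak by a sub-threshold voxel is excluded.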
Analysis 2A: effect of preexposure on learning.
fMRI time series data were submitted to a (random effects) general linear model, with one predictor that was convolved with a standard model of the hemodynamic response function for each event type/condition. The regressors were defined by the exposure history of each discrimination event (i.e., “intermixed dots,” “blocked dots,” “nonexposed dots,” “intermixed faces,” “blocked faces,” “nonexposed faces,” “intermixed scenes,” “blocked scenes,” and “nonexposed scenes”). The first nonexposed trial from each stimulus type was excluded from this analysis because it had been used to generate the independent fROI data; at this point, data from blocked preexposure conditions were also discarded. Multiple linear regression on the time courses resulted in one β-image for each event type per participant. These parameter estimates were used in a higher-level (group) FLAME analysis (fMRIB's Local Analysis of Mixed Effects; Beckmann et al., 2003; Woolrich et al., 2004).
The parameter estimates within the 10 ROIs identified from our localizer (face-sensitive and scene-sensitive populations of voxels within PrC, AntHC and PostHC, PFG and PostPG) were measured (using Featquery) for intermixed preexposed versus nonexposed faces, scenes and dots (Fig. 4, Steps 3 and 4). A whole-brain contrast between the intermixed preexposed and nonexposed items was also performed for each stimulus category. FEAT's group (Gaussianized) t-statistics were converted to z-statistics and thresholded using clusters determined by z > 3 and a (corrected) cluster significance threshold of p = 0.05 (Worsley et al., 1992).
Analysis 2B: learning of nonexposed pairs over repetition.
For Analysis 2B, we looked at the activity associated with behavioral performance on scene and face nonexposed pairs during their presentation in the test phase. Regressors (n = 32) were defined by the stimulus type of each discrimination event and the time point of occurrence (e.g., scene stimuli, first trial; scene stimuli, second trial (i.e., first repeat) … scene stimuli, sixteenth trial). Each event was further categorized according to behavioral outcome (correct or incorrect discrimination), resulting in four additional regressors (correct scenes, incorrect scenes, correct faces, and incorrect faces). Parameter estimates from the GLM were then combined in a higher-level (group) FLAME analysis (fMRIB's Local Analysis of Mixed Effects; Beckmann et al., 2003; Woolrich et al., 2004) that allowed group-level contrasts.
Data were then submitted to an fROI analysis using the same localizer coordinates used in Analysis 2A. To assess the effect of response accuracy within each of the 10 fROIs, the remainder of the discrimination trials (from the second to the sixteenth repeated trials) were separated according to correct versus incorrect discrimination response separately for face and scene trials (Fig. 4, Steps 5 and 6). An average of 40 trials per category per participant were classed as correct responses, with an average of 20 trials classed as incorrect.
For the purposes of the statistical analysis, the functional regions identified in the PostHC, AntHC, and PrC were grouped together as “MTL” regions. Although parahippocampal cortex is anatomically associated with the MTL (Witter, 2002) and necessary for some aspects of long-term memory (Diana et al., 2007, 2010), it is also critical for representing the spatial layout of visual environments (Epstein and Kanwisher, 1998). This perceptual role seems to be functionally different from that played by the HC in scene perception and memory (Epstein et al., 2007; Hartley et al., 2007; Epstein, 2008; Mundy et al., 2012) and more similar to other domain-sensitive areas located on the ventral surface of the temporal lobe (Schwarzlose et al., 2008). Therefore, the domain-sensitive parahippocampal and fusiform fROIs were grouped together (for statistical purposes) as “extrastriate” regions.
For completeness, we also report standard whole-brain accuracy analyses at the end of the Results section.
Patients: Analysis 1A (the effect of preexposure on learning)
Figure 5A shows the controls' mean discrimination accuracy for the three stimulus types (scenes, faces, and dots), represented as a percentage difference between performance on pairs of stimuli seen previously (exposed) compared with those not exposed to participants at test. The greater the difference between these two conditions, the larger the perceptual learning effect shown by the controls (and, by extension, patients). Figure 5A indicates an average improvement in discrimination accuracy between preexposed and nonexposed pairs of ∼12% in controls, a difference that was significant for all three stimulus types (smallest t(11) = 18.28, p < 0.01). Furthermore, the graph revealed that control performance was well matched across stimulus types, a conclusion supported by an ANOVA revealing no significant effect of stimulus category (F(2,22) = 1.7, p = 0.193).
Like controls, patient HC3 showed clear evidence of a benefit of preexposure on her discrimination decisions for dots and faces, with an accuracy difference of 17% between preexposed and nonexposed conditions. In contrast to the controls, however, she was unable to learn any scene discriminations, showing equivalent (chance) performance for both pairs of scenes seen previously and nonexposed scene pairs. Although MTL3 showed a small difference between the preexposed and nonexposed pairs of scenes (2%) and faces (4%), both of these were well outside the level of perceptual learning demonstrated in the controls, highlighting abnormal discrimination learning for these two stimulus types. MTL3 was not incapable of any learning, however, as she showed a level of perceptual learning similar to HC3 (and numerically greater than the controls) for dot patterns (16%). Crawford t test analyses (Crawford et al., 1998; Crawford and Garthwaite, 2002) confirmed that the patients' perceptual learning for dots was not significantly different from controls (all ts < 1), but that they had poor perceptual learning for scenes (HC3: t(11) = 5.3, p < 0.01; MTL3: t(11) = 4.4, p < 0.01), and, in the case of MTL3, deficient perceptual learning for faces (t(11) = 4.2, p < 0.01; HC3: t(11) = 1.7, p = 0.1).
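For readers unfamiliar with single-case statistics, the modified t test commonly attributed to Crawford and Howell (1998) compares one patient's score with a small control sample, treating the control mean and SD as sample estimates rather than population parameters. The sketch below assumes this is the form of the test applied here; the function name is hypothetical and any numbers passed to it in an example would be illustrative values, not data from the study.

```python
import math
import statistics


def crawford_howell_t(patient_score, control_scores):
    """Modified t test comparing a single case with a small control sample.

    Returns (t, degrees_of_freedom). The control SD is the sample SD (n - 1
    denominator), and the standard error is inflated by sqrt((n + 1) / n)
    because the patient is compared with a sample, not a population.
    """
    n = len(control_scores)
    control_mean = statistics.mean(control_scores)
    control_sd = statistics.stdev(control_scores)
    t = (patient_score - control_mean) / (control_sd * math.sqrt((n + 1) / n))
    return t, n - 1
```

With 12 controls the test has 11 degrees of freedom, matching the t(11) values reported above.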
Similar stimulus-dependent patterns (in patients) were also evident in the RTs (Fig. 5B). To analyze these data, we collapsed across preexposed and nonexposed trials because statistical analysis (ANOVA for controls; paired one-way t tests for patients) confirmed that there were no significant differences between RTs for preexposed versus nonexposed pairs in controls (all Fs < 1) and that any differences in the RTs obtained for preexposed and nonexposed stimuli in patients were not significantly greater than those seen in the individual controls (true of all conditions). Although RTs were well matched across stimulus type in the controls (F < 1), like the accuracy data, patient HC3 showed RTs that were similar to controls for dot patterns and faces, but took almost double the time of controls to respond to scene discriminations. Similarly, patient MTL3 showed strikingly longer RTs for scene discriminations; in addition, she was also much slower in her response to the face pair discriminations while responding as fast as controls to dot patterns. Crawford t test analysis confirmed that the RTs obtained for MTL3 in the face and scene conditions (faces: t(11) = 4.3, p < 0.01; scenes: t(11) = 5.5, p < 0.01) and in HC3 for scene discriminations (t(11) = 8.3, p < 0.01) were significantly different from those seen in controls. Critically, HC3 did not show a significantly different pattern in the face condition (t < 1), and neither patient was significantly slower than controls when responding to dots (all ts < 1).
Patients: Analysis 1B (learning of nonexposed pairs over repetition)
Complementing these analyses, we also investigated whether the patients showed any learning over repeated presentation of the nonexposed pairs in the discrimination test phase (Fig. 5C). ANOVA confirmed that the controls' discrimination performance to nonexposed stimulus pairs improved over repetition, but did not differ across stimulus type (i.e., there was a significant main effect of block, F(3,9) = 27.63, p < 0.01, but no significant effect of stimulus type or an interaction, F < 1). ANOVA also confirmed significant linear trends in the controls' learning across all stimulus types (F(1,11) = 51.63, p < 0.01). To compare the performance of the controls with the patients statistically, improvement was measured by calculating the gradient of the linear trend (m) in learning for each stimulus type as follows: m = (y − c)/x, where c is the y-axis intercept, x is the block number, and y is the percentage correct; the values obtained for m were 13.02 for dots (r² = 0.94), 12.5 for faces (r² = 0.94), and 11.77 for scenes (r² = 0.99). ANOVA confirmed that there were no differences in these learning profiles (F < 1). Although patient HC3 showed a similar learning profile to controls for dots and faces (Crawford t < 1), the gradient of the linear trend for nonexposed scene stimuli was significantly weaker than that of controls (t(11) = 4.0, p < 0.01). Patient MTL3 also showed similar performance to controls for dot patterns (t < 1), but her learning profiles for faces and scenes were significantly weaker than those seen in the controls (faces: t(11) = 3.6, p < 0.01; scenes: t(11) = 4.0, p < 0.01). It is reassuring to note that the same statistically significant patterns were evident over repetition of the preexposed stimuli during the discrimination test phase, although the performance differences between patients and controls were exacerbated by the effect of preexposure itself.
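The gradient described above is the slope of the line y = mx + c relating percentage correct to block number. As a minimal sketch, assuming the line is fitted by ordinary least squares (the fitting procedure is not stated in the text), the slope, intercept, and r² can be obtained as follows; the function name and the accuracy values are hypothetical.

```python
def linear_trend(blocks, accuracy):
    """Least-squares slope (m), intercept (c), and r^2 for y = m*x + c."""
    n = len(blocks)
    mean_x = sum(blocks) / n
    mean_y = sum(accuracy) / n
    sxx = sum((x - mean_x) ** 2 for x in blocks)
    syy = sum((y - mean_y) ** 2 for y in accuracy)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(blocks, accuracy))
    m = sxy / sxx                  # gradient: change in % correct per block
    c = mean_y - m * mean_x        # y-axis intercept
    r2 = sxy ** 2 / (sxx * syy)    # proportion of variance explained
    return m, c, r2

# Hypothetical percentage-correct scores over four test blocks.
m, c, r2 = linear_trend([1, 2, 3, 4], [50, 62, 76, 88])
print(f"m = {m:.2f}, c = {c:.2f}, r2 = {r2:.3f}")
```

A gradient computed this way for each patient could then be compared with the distribution of control gradients using a single-case t test.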
Analysis 2A: fMRI (the effect of preexposure on learning)
Table 3 shows the discrimination scores obtained for the six experimental conditions. ANOVA revealed a main effect of exposure condition (F(1,15) = 45.86, p < 0.01), but no overall effect of stimulus category (F < 1) and no interaction (F < 1).
Parameter estimates from preexposed (intermixed) and nonexposed face, scene, and dot trials were extracted from the 10 ROIs identified using the localizer. ANOVA revealed a significant four-way interaction of: fROI (PFG, PostPG, PrC, PostHC, AntHC) × stimulus-sensitive voxels (face-sensitive/scene-sensitive) × stimulus type (faces/scenes/dots) × exposure (preexposed/nonexposed) (F(8,120) = 15.44, p < 0.01).
Further statistical exploration, focusing separately on patterns of activity within face-sensitive voxels and scene-sensitive voxels in the five fROIs, revealed for face-sensitive voxels a significant three-way interaction between stimulus type, exposure, and ROI (F(4,60) = 25.01, p < 0.01; Fig. 6A,B). Face-sensitive voxels in PFG, AntHC, and PrC showed greater activity to preexposed face pairs than to nonexposed faces (PFG: t(15) = 3.41, p < 0.01; AntHC: t(15) = 2.99, p < 0.01; PrC: t(15) = 3.07, p < 0.01), but there was no modulation of exposure in these three regions for scene or dot pairs (t < 1). Face-sensitive voxels in PostPG and PostHC showed no significant differences between preexposed and nonexposed stimuli regardless of stimulus type (t < 1).
Turning to scene-sensitive voxels, ANOVA revealed a significant three-way interaction between stimulus type, exposure, and ROI (F(2,30) = 19.83, p < 0.01; Fig. 6C,D). This interaction reflected greater activation in scene-sensitive voxels in PostPG and PostHC for previously seen pairs of scenes compared with nonexposed pairs (PostPG: t(15) = 2.94, p < 0.01; PostHC: t(15) = 3.19, p < 0.01). Activity associated with preexposed and nonexposed face and dot pairs was not significantly different (t < 1). Similarly, scene-sensitive voxels in PFG, AntHC, and PrC showed no evidence of significantly greater activation for preexposed over nonexposed trials for all stimuli (t < 1).
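The follow-up comparisons of preexposed versus nonexposed trials within each fROI are reported with 15 degrees of freedom, which is what a within-participant comparison across 16 participants would give. The sketch below assumes these are paired t tests on per-participant parameter estimates (an assumption; the text does not name the test); the function name and values are hypothetical.

```python
import math

def paired_t(condition_a, condition_b):
    """Paired t statistic and degrees of freedom for matched samples."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample SD of the per-participant differences.
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical parameter estimates for four participants in one fROI.
preexposed = [1.2, 0.9, 1.5, 1.1]
nonexposed = [1.0, 0.8, 1.1, 1.0]
t_value, df = paired_t(preexposed, nonexposed)
print(f"t({df}) = {t_value:.2f}")
```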
Whole-brain analyses in which the preexposed and nonexposed conditions for faces, scenes, and dots were contrasted (separately by stimulus type) revealed similar findings. For faces, a significant region of BOLD signal change was observed in the lingual gyrus that extended into the inferior occipital gyrus and (temporal/occipital) fusiform gyrus. The extent of this activation likely included FFA. There was also significant activity in the AntHC that extended into the PrC bilaterally (L > R). Previously reported anatomical locations place the FFA bilaterally at −38, −46, −16; 41, −47, −17 (Table 1), which is close to the peak voxel coordinate in the significant clusters highlighted here: −36, −47, −15; 35, −43, −14. The (left) PrC has previously been identified at −26, −9, −18 (see MNI space −27, −7, −25, Lee et al., 2008), again, almost identical to the significant cluster obtained from our current analysis (−26, −9, −26).
In the scenes contrast, activation was evident in the PostPG (likely encompassing the PPA) and extending bilaterally into PostHC. The PPA has been previously reported at −23, −44, −9; 27, −40, −7 (Table 1) and the (left) PostHC at −23, −29, 0 (see MNI space −24, −29, −4, Lee et al., 2008); both of these locations are close to the clusters found here (PostPG: −21, −39, −6; 25, −37, −7; PostHC: −26, −33, −4; −28, −32, −7).
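To make "close to" concrete, the Euclidean distance between a previously reported location and the peak found here can be computed directly from the coordinates quoted in the text. This is only a rough sketch: it assumes that both sets of coordinates are expressed in millimetres in the same stereotaxic space, which the text does not confirm, and the dictionary and function names are hypothetical.

```python
import math

# (x, y, z) coordinates as quoted in the text for the FFA and PPA.
REPORTED = {
    "FFA left": (-38, -46, -16),
    "FFA right": (41, -47, -17),
    "PPA left": (-23, -44, -9),
    "PPA right": (27, -40, -7),
}
OBSERVED = {
    "FFA left": (-36, -47, -15),
    "FFA right": (35, -43, -14),
    "PPA left": (-21, -39, -6),
    "PPA right": (25, -37, -7),
}

def peak_distance(reported, observed):
    """Euclidean distance between two coordinate triplets."""
    return math.dist(reported, observed)

for region, reported in REPORTED.items():
    d = peak_distance(reported, OBSERVED[region])
    print(f"{region}: {d:.1f}")
# Under the same-space assumption, the distances are about
# 2.4, 7.8, 6.2, and 3.6, respectively.
```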
Perceptual learning of dot patterns revealed significant activation in the occipital pole extending into the medial inferior occipital gyrus and lingual gyrus, but no higher-order ventral visual or MTL areas. Because no parahippocampal, fusiform, HC, or PrC activation was found in the dots perceptual learning contrast (preexposed vs nonexposed), this condition was excluded from further analysis.
Analysis 2B: fMRI (learning of nonexposed pairs over repetition)
Participants showed no overall difference in their averaged discrimination accuracy between stimulus types for the nonexposed trials presented at test (mean faces accuracy: 66.25%, scenes accuracy: 68.55%, dots accuracy: 66.56%; F < 1). Similarly, there was no overall difference in RTs (mean faces RT: 1.40 s, scenes RT: 1.43 s, dots RT: 1.40 s; F < 1). There was also no significant difference in participants' use of “same” or “different” responses in any of the stimulus types (t < 1), suggesting no response bias. Furthermore, there were no significant differences in the accuracy evident on “same” compared with “different” responses for any stimulus type (largest t(15) = 1.73, p = 0.104).
Turning to the analysis of the data obtained from the 10 fROIs (face-sensitive and scene-sensitive populations of voxels within PrC, AntHC and PostHC, PFG and PostPG), we found a significant four-way interaction of area (extrastriate/MTL) × stimulus-sensitive voxels (face-sensitive/scene-sensitive) × stimulus type (faces/scenes) × response accuracy (correct/incorrect) (F(1,15) = 15.78, p < 0.01). It is worth noting that although this accuracy analysis, like that of the patients in Experiment 1, focused on the nonexposed trials, an equivalent analysis can be performed using the previously exposed intermixed stimuli, which also revealed a similar significant four-way interaction (F(1,15) = 12.22, p < 0.01).
Although extrastriate regions showed a main effect of stimulus type (F(1,15) = 18.35, p < 0.01) qualified by an interaction between stimulus-sensitive subregion (PFG/PostPG) and stimulus type (F(1,15) = 34.27, p < 0.01), there was no significant effect of discrimination accuracy and no three-way interaction (all Fs < 1; Fig. 7A, right). In contrast, the MTL showed a significant three-way interaction (F(2,30) = 18.94, p < 0.01; Fig. 7A, left). Face-sensitive voxels in PrC were associated with discrimination accuracy for faces (t(15) = 4.55, p < 0.01), but not scenes (t < 1), but scene-sensitive PrC voxels did not show any significant change in activity for either category of stimuli (t < 1.4). In PostHC, scene-sensitive voxels were associated with discrimination accuracy for scenes (t(15) = 3.43, p < 0.01), but not for faces (t < 1), whereas face-sensitive PostHC voxels were not involved in successful discrimination performance for either category (t < 1).
Voxels within the AntHC were not associated with accuracy for either faces or scenes (t < 1) and thus will not be considered further. Analyses of activity by accuracy for the nonpreferred category in each voxel population revealed no statistically significant effects (all Fs < 1; Fig. 7B).
Consistent with our fROI analysis, whole-brain analysis also revealed involvement of the PrC and PostHC in discrimination accuracy for nonexposed faces and scenes, respectively (Fig. 8). BOLD activity relating to accurate discrimination for faces alone and scenes alone was defined by contrasting correct versus incorrect nonexposed trials. This contrast was performed with a random-effects model and tested at an uncorrected threshold of p < 0.001. When this analysis was conducted with face stimuli (i.e., correct nonexposed faces vs incorrect nonexposed faces), it revealed activation centered on PrC (−26, −10, −25; 25, −12, −25), with no further areas of significant BOLD activation. A similar contrast of correct versus incorrect nonexposed scene trials revealed activation in PostHC (−25, −35, −3; 27, −34, −5), with no further areas of significant activity. Furthermore, equivalent patterns of domain-sensitive MTL activity were evident if preexposed stimuli were analyzed and when both nonexposed and preexposed were pooled together.
These findings were further complemented by a whole-brain comparison in which we investigated whether there were any brain areas showing a domain-general pattern as measured by a significant difference in activity for correct discriminations compared with incorrect discriminations across all trial types (i.e., correct scenes + correct faces vs incorrect scenes + incorrect faces). This contrast was performed with a random-effects model and tested at an uncorrected threshold of p < 0.001. No areas in the MTL, parahippocampal, or fusiform cortex showed a significant domain-general pattern of activation for correct compared with incorrect discriminations. There was, however, a single cluster of activation revealed in lingual gyrus, corresponding to early visual cortex. The results of this analysis do not change if dot trials are included (e.g., correct faces + correct scenes + correct dots minus incorrect faces + incorrect scenes + incorrect dots).
In Experiment 1, HC damage resulted in impaired scene, but not face or dot pattern, discrimination learning. Broader MTL involvement, including the HC and PrC, affected scene and face, but spared dot learning. These distinct patterns were evident on two measures: (1) a comparison of performance on preexposed versus nonexposed discriminations and (2) learning of nonexposed (and preexposed) discriminations over repetition at test. There was no hint that the patients' preserved learning was abnormal; both patients performed as well as controls for accuracy and their RTs were equivalent to controls when they showed good perceptual learning. These findings complement Graham et al. (2006), in which three patients with bilateral HC damage (including the patient reported here) showed slowed RTs (but normal accuracy) to scene categorization and learning. Our study extends the conclusions from that study, however, by demonstrating a clear impact on accuracy as well as RTs, revealing a PrC contribution to face perceptual learning, and showing normal dot pattern discrimination learning using an identical task. The latter finding is important: normal perceptual learning in amnesia is often demonstrated with a dot prototype learning paradigm (Knowlton and Squire, 1993; Kolodny, 1994; Squire and Knowlton, 1995). Our patients showed normal learning on this paradigm (Graham et al., 2006) and, as revealed here, for dot discrimination learning using a different experimental task. Because the patients did not show evidence of normal perceptual learning across all visual categories, however, our study reveals that the type of information to be acquired is a key factor in driving performance on perceptual learning tasks.
In Experiment 2, we obtained complementary evidence that the PostHC and PrC were involved in discrimination learning for scenes and faces, respectively. Activity within face-sensitive, but not scene-sensitive, voxels in PrC and scene-sensitive, but not face-sensitive, voxels in PostHC was modulated by discrimination accuracy (correct > incorrect) at test for both nonexposed and previously exposed pairs. Whole-brain analysis also revealed a similar domain-sensitive, accuracy-dependent pattern in the PostHC and PrC. In contrast, activity in the parahippocampal cortex and fusiform gyrus distinguished between preferred and nonpreferred categories (scenes vs faces), but was not modulated by discrimination accuracy (see also O'Neil et al., 2009, in which fusiform gyrus showed more limited accuracy effects compared with the PrC during recognition memory for face stimuli).
In our fMRI experiment, there was no difference in overall accuracy across the three stimulus conditions; participants started at the same baseline and showed the same degree of improvement in their learning of faces and scenes (and also dots). Therefore, differences in the difficulty of learning about faces and scenes cannot explain the fMRI findings, nor can they explain the results of the patient study in which performance was similarly matched. The fMRI results, therefore, imply that the PrC and PostHC subregions that we identified encode face and scene representations (respectively) that are useful in supporting successful discrimination between the highly similar face and scene pairs presented in our experiment. The lack of accuracy effects for nonpreferred categories in the PostHC and PrC fROIs further strengthens this contention.
The results reported here complement animal and human neuropsychological studies highlighting stimulus-sensitive deficits for complex objects and scenes after damage to the MTL (e.g., Buckley et al., 2001; Bussey et al., 2002; Lee et al., 2005a; Saksida et al., 2006; Barense et al., 2007; Bird et al., 2007; Taylor et al., 2007). However, not all focal amnesic patients show such patterns (Levy et al., 2005; Shrager et al., 2006), and there has been heated debate regarding the locus of these cognitive difficulties, including suggestions that some patients have involvement of fusiform and/or parahippocampal areas in addition to their HC and PrC damage (Squire et al., 2006; Jeneson and Squire, 2012). This view is inconsistent with data showing that the two amnesic patients described here show domain-sensitive responding in PPA for scenes, lateral occipital cortex for objects, and FFA for faces when scanned during a functional localizer task (Lee and Rudebeck, 2010a; Fig. 1). Our neuropsychological and neuroimaging results add weight to this finding, in particular the converging evidence that the HC and PrC were the critical contributors to successful discrimination learning. Therefore, it seems highly unlikely that the deficits observed here, and in our patients on similar tasks (Barense et al., 2005; Lee et al., 2005b), reflect fusiform and parahippocampal involvement. Instead, this developing body of evidence highlights that the requirement to process conjunctions of visual and/or spatial features appears to be critical in eliciting such impairments in patients (Graham et al., 2010). More explicitly, as argued by Barense et al. (2012), the PrC is necessary for storing unique object representations (with individual object features dependent upon more posterior regions within the brain; Mundy et al., 2012). 
In contrast, the HC stores the unique spatial layouts of these objects in an environment and may be required when there is repetition of object features, but also of the spatial locations of objects themselves. It remains to be determined whether the HC is also engaged by conjunctive spatial feature changes within an object in the same way that it processes conjunctive spatial layout changes within a scene containing multiple objects.
Our findings demonstrate that regions beyond visual cortex (Mukai et al., 2007) contribute to short-term discrimination learning, a finding not predicted by some human memory accounts (Diana et al., 2007; Squire et al., 2007; Brown et al., 2010; Montaldi and Mayes, 2010; Ranganath, 2010). It is also controversial whether the role of MTL regions goes beyond long-term memory to short-term memory (Ranganath and Blumenfeld, 2005; Hartley et al., 2007; Lee and Rudebeck, 2010b; Jeneson and Squire, 2012) and even perceptual processing (Lee et al., 2005b; Baxter, 2009; Suzuki, 2009; Barense et al., 2010a,b; Graham et al., 2010; Lee et al., 2012). The domain-sensitive impairments observed in our amnesic subjects are also seen in trial-unique oddity judgements that placed no explicit requirement on remembering stimuli across trials. Patients were presented with different views of the same item (e.g., face, object, or scene) alongside a completely different item and asked to indicate which item was the odd one out (Lee et al., 2005a; Barense et al., 2007). Selective damage to the HC affected scene oddity judgments, but not judgments on faces or objects, whereas larger lesions to the MTL, including both the HC and PrC, impaired object, face, and scene, but not color or size, oddity decisions (see also Lee et al., 2006b, for similar findings in dementia and Buckley et al., 2001, for equivalent impairments in nonhuman primates).
FMRI studies using variations of these oddity judgment tasks activate similar regions to those elicited by our visual discrimination paradigm (Lee et al., 2006a, 2008; Devlin and Price, 2007; O'Neil et al., 2009; Barense et al., 2010a, 2011), revealing complementary patterns of domain-sensitive responding in the HC and PrC across different tasks with varying degrees of mnemonic demand. It is also worth highlighting recent findings from fMRI studies in nonhuman primates revealing multiple temporal lobe brain regions that respond relatively selectively to discrete object categories, including an anterior face patch (Tsao et al., 2003; Pinsk et al., 2005; Rajimehr et al., 2009). The precise functional roles of these more anterior regions in animals have not yet been elucidated, but given the striking convergence between findings from human and nonhuman primate neuropsychological lesion studies (for review, see Saksida and Bussey, 2010), it is possible that the anterior face patch, if analogous to PrC in humans, may also include complex conjunctive face and/or object representations.
There is accruing evidence that anatomically separate domain-sensitive HC and PrC regions represent complex conjunctive stimuli necessary for multiple aspects of human memory, including—as demonstrated here—success on tasks that require learning to make perceptual discriminations between highly visually similar exemplars. Models that focus on a specific role for the HC in spatial information processing (Hassabis and Maguire, 2009; Bird et al., 2012), as well as accounts that place these findings in an evolutionary context (Murray and Wise, 2010), provide a potential framework within which to understand these domain-sensitive contributions. The challenge for these and related theories is to determine the following: (1) what types of representations are being stored within these domain-sensitive subareas and whether these are the only regions that drive such effects, (2) when these regions are necessary (or not) for learning and memory, and (3) how any domain-sensitive parts of the HC and PrC may be anatomically and functionally connected with areas involved in binding information across different modalities (Eichenbaum et al., 2007; Graham et al., 2010; Montaldi and Mayes, 2010; Ranganath, 2010). A further issue that requires resolution is the “anatomical” association of PHC with the MTL (Witter, 2002) in the context of a “functional” profile similar to other extrastriate areas (Schwarzlose et al., 2008; Mundy et al., 2012). Consideration of possible anatomical/functional dissociations between anterior and posterior areas of the PHC and HC might help to address this issue.
This work was supported by the Wales Institute of Cognitive Neuroscience and the BBSRC (Grant #BB/I007091/1). The Wales Institute of Cognitive Neuroscience was set up by a cross-institution grant from the Welsh Government to the Schools of Psychology at Cardiff, Bangor, and Swansea Universities. We thank our colleagues at the Cardiff University Brain Research Imaging Centre, particularly John Evans and Martin Stuart, for help with the scanning protocol and data collection, Andy Lee for providing Figure 1, and Chris Chambers, Andrew Lawrence, and Ed Wilding for comments on the manuscript.
Correspondence should be addressed to Prof. Kim Graham, School of Psychology, Cardiff University, Tower Building, Cardiff, CF10 3AT, United Kingdom.
This article is freely available online through the J Neurosci Author Open Choice option.