Abstract
Expertise can increase working memory (WM) performance, but the cognitive and neural mechanisms of these improvements remain unclear. Here, we used functional magnetic resonance imaging to assess the degree to which expertise acquisition is supported by tuning of occipitotemporal object representations and tuning of prefrontal and parietal networks that may support domain-specific WM skills. We trained subjects to become experts in a novel category of complex visual objects and examined brain activity while they performed a WM task with objects from the expert category and from an untrained category. Visual expertise training resulted in improved recognition of expert, compared with untrained objects, and this effect was eliminated in a behavioral experiment by stimulus inversion. These behavioral changes were accompanied by increased recruitment of bilateral dorsolateral prefrontal, posterior parietal, and occipitotemporal cortices during WM encoding and maintenance. Across subjects, behavioral measures of expertise reliably predicted increased activation during maintenance of expert objects in all three regions. These neural expertise effects could not be attributed to differences in low-level stimulus characteristics between the two categories, familiarity with features of expert-domain objects, or familiarity with the WM task. These results are consistent with the idea that visual expertise improves WM performance through tuning of occipitotemporal object representations and through development of lateral prefrontal and posterior parietal networks that mediate the application of domain-specific mnemonic skills.
Introduction
Working memory (WM) processes support the on-line maintenance and manipulation of information in the absence of external stimulation (Baddeley, 1986). Results from a number of studies suggest that WM is capacity limited (Miller, 1956; Luck and Vogel, 1997; Cowan, 2001; Irwin and Zelinsky, 2002), but expertise can effectively increase WM capacity for stimuli within the expert domain (Chase and Simon, 1973; Ericsson and Kintsch, 1995). For example, chess experts can encode and maintain complex configurations of pieces on a chess board (Chase and Simon, 1973), but their enhanced visual memory performance is specific to the domain of chess. Little is known about the neural mechanisms by which expertise might influence WM, but available evidence suggests at least two possibilities.
Neuroimaging research has focused on the effects of perceptual expertise acquisition in occipitotemporal cortex (OTC), showing that activation in at least two regions [the fusiform face area (FFA) and the lateral occipital complex (LOC)] is enhanced during perceptual processing of expert-domain objects (Gauthier et al., 1999, 2000; Grill-Spector et al., 2004; Xu, 2005). Thus, increased WM performance in experts might reflect tuning of posterior cortical areas with training, which in turn decreases the need for executive control during maintenance of task-relevant information. Accordingly, tuning of visual object representations in OTC during expertise acquisition might support more efficient and automated encoding and maintenance of expert-domain objects.
Neuroimaging studies have not investigated the effects of expertise acquisition in regions outside of OTC. Psychological models suggest that expertise should be associated with the development of executive memory skills that guide the encoding and maintenance of domain-specific information (Ericsson and Kintsch, 1995; Gobet, 1998). Dorsolateral prefrontal cortex (DLPFC) has been shown to play a role in directing efficient (Bor et al., 2003, 2004; Olesen et al., 2004; Bor and Owen, 2006) and successful (Pessoa et al., 2002; Sakai et al., 2002) WM encoding and maintenance. Additionally, activity in the intraparietal sulcus (IPS) has been shown to be modulated by the amount (Todd and Marois, 2004) and complexity (Xu and Chun, 2006) of visual information that is encoded and maintained in WM. Thus, we hypothesized that the development of memory skills might be associated with increased prefrontal and parietal recruitment.
Here, we used event-related functional magnetic resonance imaging (fMRI) to determine the extent to which the effects of expertise on WM reflect plasticity in OTC, DLPFC, and IPS. We trained participants for >10 h to acquire expertise with a category of abstract visual objects. After training, they were scanned while performing a WM task with stimuli from the expert category and stimuli from an untrained category (see Fig. 1A). Next, participants completed a behavioral experiment to further assess perceptual processing of stimuli from the expert and untrained categories (see Fig. 1B). Analyses of the fMRI data then examined the effects of expertise acquisition on activation in OTC, DLPFC, and IPS during WM encoding and maintenance.
Materials and Methods
Subjects.
Eleven right-handed subjects (three females and eight males) were recruited from the University of California Davis undergraduate student population. In addition to these subjects, 11 control subjects participated in one behavior-only testing session (for comparison of performance with the 11 experts on the last day of training). Each subject provided informed consent before participation in the expertise training and fMRI experiment.
Stimuli.
Two categories of novel stimuli were generated as structures of polygons in MATLAB (MathWorks, Natick, MA). Each category was constructed from the same set of polygons, but the categories differed in the orientation and configuration of these parts. Within each category, 800 individual exemplars were generated by varying a prototype object along 40 feature dimensions (e.g., the width of a polygon) along a normal distribution. Each category was divided into four families, with each family having a distinctively extreme set of features. Three of the four families were used during training, and the fourth was used during scanning. For each of the 800 stimuli, foil stimuli were created at three difficulty levels, which differed from their respective exemplars along 30, 20, and 10 dimensions, respectively. These stimuli were used during the discrimination training tasks and allowed for the task difficulty to be increased as subjects progressed through their expertise training.
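The exemplar and foil construction described above can be sketched as follows. This is an illustration only: the original stimuli were generated in MATLAB, and the function names, the unit standard deviation, and the size of the foil offset are assumptions rather than values reported here.

```python
import random

N_DIMS = 40  # feature dimensions per object (from the text)

def make_exemplar(prototype, rng, sd=1.0):
    """Jitter every feature of the prototype with normally distributed noise."""
    return [p + rng.gauss(0.0, sd) for p in prototype]

def make_foil(exemplar, n_changed, rng, offset=1.0):
    """Return a copy of the exemplar that differs along n_changed dimensions."""
    foil = list(exemplar)
    for i in rng.sample(range(len(exemplar)), n_changed):
        # Shift the chosen feature up or down by a fixed (hypothetical) offset.
        foil[i] += offset if rng.random() < 0.5 else -offset
    return foil

rng = random.Random(0)
prototype = [0.0] * N_DIMS          # placeholder prototype (assumption)
exemplar = make_exemplar(prototype, rng)
# Easy, medium, and hard foils differ along 30, 20, and 10 dimensions.
foils = {n: make_foil(exemplar, n, rng) for n in (30, 20, 10)}
```

Fewer changed dimensions give a foil that is closer to its exemplar, which is how discrimination difficulty could be raised across training sessions.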
Expertise training.
Each subject was trained to become an expert on one of the two categories of novel stimuli and was not exposed to the other category until the day of the scanning session. To minimize the possibility that expert-novice differences in brain activity could be attributed to physical differences between the stimuli in each category, we trained five subjects on one category and six on the other. Results were collapsed across the two groups in all described analyses. Before the scanning session, each subject participated in seven 90 min expertise training sessions over the course of 10 d. During each training session, subjects performed four tasks: simultaneous match-to-sample, delayed recognition, family placement, and family discrimination. These tasks are schematically depicted in a figure in the supplemental material (available at www.jneurosci.org). Over the course of training, each training task was made progressively more difficult, forcing subjects to develop skills to rapidly process objects from the training category. In the match-to-sample task, subjects were required to determine which of two objects matched a simultaneously presented sample object. In the delayed recognition task, subjects were required to hold a shape in memory during a brief delay (3–6 s) and then determine whether a second shape was identical. For these tasks, difficulty was increased over the course of training by increasing the similarity between targets and foils and progressively decreasing presentation times. During the first session, stimuli were presented for 5 s, and targets and foils differed along 30 feature dimensions (of 40 possible dimensions). During the last session, stimuli were presented for 1 s, and targets and foils differed along only 10 dimensions. In the family placement task, subjects were required to assign stimuli to a family (labeled Family 1, 2, and 3). 
During the family discrimination task, subjects were shown three objects, two of which were from the same family. Subjects were required to indicate which shape was the “odd” one (i.e., from the other family). In these categorization tasks, stimuli were presented for progressively shorter periods, and subjects were given immediate feedback after each trial. Training was complete when the subject had finished the 10.5 h of training. To assess the effectiveness of the training procedure, 11 control participants were tested in one session, using the tasks that experts completed in their final training session.
MRI session.
Immediately before MRI scanning, subjects were familiarized with six exemplars from the untrained category. Specifically, we had subjects perform 300 trials of a match-to-sample task with these objects as stimuli. These stimuli were subsequently used in the “familiar untrained” trials in the WM task performed in the scanner. During the MRI scanning session, subjects performed a WM task with novel stimuli from the expert category and novel and familiar stimuli from the untrained category (see Fig. 1). The expert objects were taken from an untrained family that the subject had not encountered during training, and each object presented during the test was unique. On each trial, a cue object was presented for 1.25 s, followed by a variable delay of 6.75–12.75 s. Next, a probe object was presented for 1.25 s, during which time the subject was to decide whether it matched the cue object. The intertrial interval was jittered from 8.75 to 14.75 s. Within each scanning run, expert and untrained trials were randomly intermixed to minimize attentional or motivational confounds. Across the session, subjects completed a total of 45 novel expert, 45 novel untrained, and 45 familiar untrained trials.
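The trial structure of the WM task can be sketched as a simple schedule generator. The cue, probe, delay, and intertrial ranges come from the text; the assumption that delays and intertrial intervals take values in 2 s steps (so that events stay aligned to the 2 s repetition time), the single-session shuffle, and all names are hypothetical.

```python
import random

CUE_S = 1.25
PROBE_S = 1.25
DELAYS_S = (6.75, 8.75, 10.75, 12.75)   # assumed 2 s steps within 6.75-12.75 s
ITIS_S = (8.75, 10.75, 12.75, 14.75)    # assumed 2 s steps within 8.75-14.75 s
CONDITIONS = ("novel expert", "novel untrained", "familiar untrained")

def build_schedule(trials_per_condition, rng):
    """Return a list of trials with onset times (s) for cue, delay, and probe."""
    order = [c for c in CONDITIONS for _ in range(trials_per_condition)]
    rng.shuffle(order)  # trial types are randomly intermixed
    schedule, t = [], 0.0
    for condition in order:
        delay = rng.choice(DELAYS_S)
        cue_on = t
        delay_on = cue_on + CUE_S
        probe_on = delay_on + delay
        schedule.append({"condition": condition, "cue": cue_on,
                         "delay": delay_on, "probe": probe_on,
                         "delay_length": delay})
        # The next trial starts after the probe and a jittered interval.
        t = probe_on + PROBE_S + rng.choice(ITIS_S)
    return schedule

schedule = build_schedule(45, random.Random(1))
```

For simplicity the sketch shuffles the whole session at once, whereas the text describes intermixing within each scanning run.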
After the completion of the WM task, subjects performed localizer tasks to identify the FFA (Puce et al., 1995; Kanwisher et al., 1997) and LOC (Malach et al., 1995; Kourtzi and Kanwisher, 2000; Grill-Spector et al., 2001). In the LOC localizer task, subjects passively viewed alternating blocks of intact and scrambled objects from the expert category (Murray and Wojciulik, 2004). In the FFA localizer task, subjects performed a one-back task while viewing face and scene stimuli (Ranganath et al., 2004a). After performing the localizer tasks, each subject performed a visuomotor response task that was used to derive a subject-specific hemodynamic response function (HRF) (Aguirre et al., 1998b; Handwerker et al., 2004).
Expertise evaluation session.
Ten of the 11 participants from the MRI study also completed a behavioral experiment that was designed to assess differences between processing of expert and untrained objects. The mean time between the MRI study and the subsequent behavioral experiment was 11.4 d, and the distribution was as follows: 1, 1, 1, 2, 3, 13, 14, 32, 33. As described later, the interval between testing sessions was unrelated to performance on the behavioral experiment. Subjects performed a simultaneous match-to-sample task (see Fig. 1). On each trial, a target object was presented on the top of the screen, and two test shapes were presented on the bottom. The array of stimuli was presented for 1 s, and subjects were given a total of 2 s to determine which of the bottom objects was identical to the target object. After the subject responded, there was a 1 s delay until the next trial. On each trial, stimuli were either upright or inverted objects from the expert or untrained class (80 trials in each condition).
Image acquisition and processing.
MRI data were collected on a 1.5T GE SIGNA scanner at the University of California Davis Imaging Research Center. Functional imaging was performed using a gradient echo echo-planar imaging sequence (repetition time, 2000 ms; echo time, 40 ms; field of view, 240 mm; 64 × 64 matrix), with each volume consisting of 24 contiguous 5 mm axial slices oriented parallel to the AC–PC (anterior commissure–posterior commissure) line. Coplanar and high-resolution T1-weighted images also were acquired in the same session. fMRI data preprocessing was performed with statistical parametric mapping (SPM99) software for all subjects. For map-wise statistical analyses, images were sinc-interpolated in time to correct for interslice timing differences in image acquisition, realigned using a six-parameter, rigid-body transformation algorithm, spatially normalized to the template from the International Consortium for Brain Mapping Project (Cocosco et al., 1997), resliced into 3.5 mm isotropic voxels, and spatially smoothed with an 8 mm full-width at half-maximum Gaussian filter. Analyses of data from the FFA and LOC regions of interest (ROIs) were performed on native-space data to maximize the ability to discriminate these areas from adjacent cortical areas (Kanwisher et al., 1997; Aguirre et al., 1998a). For these native-space ROI analyses, images were sinc-interpolated in time and spatially realigned, but no spatial normalization or smoothing was performed.
fMRI analysis.
As in previous studies (Courtney et al., 1997; Zarahn et al., 1997b; Postle et al., 2000; Rowe et al., 2000; Ranganath and D'Esposito, 2001; Munk et al., 2002; Sakai et al., 2002; Curtis et al., 2004; Ranganath et al., 2004a, 2005), activity changes associated with different trial components were deconvolved using multiple regression. In this approach, the time course of BOLD signal changes on any given WM trial is considered as a combination of cue, delay, and probe-related neural activity changes that are convolved with the HRF. The vectors of expected neural activity during the cue, delay, and probe phases for each delay-length are depicted in the supplemental material (available at www.jneurosci.org). Covariates modeling BOLD signal changes were constructed by convolving these vectors for each trial type with a subject-specific HRF estimated from responses in the central sulcus during the visuomotor response task (Aguirre et al., 1998b; Handwerker et al., 2004; Ranganath et al., 2004a). Because the length of the delay period varied from trial to trial, our design allowed us to efficiently deconvolve delay period activity from activity occurring during the cue and probe phases (Rowe et al., 2000; Sakai et al., 2002; Sakai and Passingham, 2003; Ranganath et al., 2005). Inspection of observed activity time courses confirmed that the model estimates accurately characterized the data.
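The construction of cue, delay, and probe covariates can be illustrated with a minimal sketch. Several elements are assumptions: the generic single-gamma HRF stands in for the subject-specific HRF, the expected neural activity is modeled as boxcars (the actual vectors are given only in the supplemental material), and the 0.25 s model resolution and all function names are hypothetical.

```python
import math

DT = 0.25   # time resolution of the model in seconds (assumed)
TR = 2.0    # repetition time in seconds (from the text)

def gamma_hrf(duration=24.0, shape=6.0, scale=1.0):
    """Generic single-gamma HRF; a stand-in for the subject-specific HRF."""
    n = int(duration / DT)
    h = [((i * DT) ** (shape - 1)) * math.exp(-(i * DT) / scale) for i in range(n)]
    total = sum(h)
    return [v / total for v in h]

def boxcar(onset, duration, n_samples):
    """Vector of expected neural activity: 1 during the event, 0 elsewhere."""
    start, stop = round(onset / DT), round((onset + duration) / DT)
    return [1.0 if start <= i < stop else 0.0 for i in range(n_samples)]

def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s:
            for j, k in enumerate(kernel):
                if i + j < len(out):
                    out[i + j] += s * k
    return out

def trial_covariates(cue_on, delay_len, n_scans, cue_s=1.25, probe_s=1.25):
    """Return cue, delay, and probe covariates sampled once per TR."""
    n = int(n_scans * TR / DT)
    hrf = gamma_hrf()
    phases = {
        "cue": boxcar(cue_on, cue_s, n),
        "delay": boxcar(cue_on + cue_s, delay_len, n),
        "probe": boxcar(cue_on + cue_s + delay_len, probe_s, n),
    }
    step = int(TR / DT)
    return {name: convolve(vec, hrf)[::step] for name, vec in phases.items()}

covs = trial_covariates(cue_on=0.0, delay_len=8.75, n_scans=20)
```

Because the delay length changes from trial to trial, the delay covariate is not a fixed time-shifted copy of the cue or probe covariate, which is what lets the regression separate the three components.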
Responses during each task phase were modeled separately for each trial type (novel expert, novel untrained, and familiar untrained). These covariates only modeled responses for trials that were associated with correct match/nonmatch decisions on the WM probe. Trials associated with incorrect WM decisions were modeled with separate nuisance covariates. Additional nuisance covariates modeled global signal changes that could not be accounted for by variables in the design matrix (Desjardins et al., 2001), trial-specific baseline shifts, and an intercept. Each regression analysis was performed using the modified general linear model (Worsley and Friston, 1995), in which the convolution matrix included a time-domain representation of the 1/f power structure and filters to remove frequencies above 0.25 Hz and below 0.02 Hz (Aguirre et al., 1997; Zarahn et al., 1997a).
Each regression analysis yielded parameter estimates indexing the fit of the covariates for each component (cue, delay, and probe) of each type of WM trial (novel expert, novel untrained, familiar untrained) to the observed data. The magnitude of each parameter estimate can be interpreted as an estimate of the BOLD response amplitude attributable to the corresponding trial component. After single-subject analyses, contrast images were created for each subject by computing the difference in parameter estimates between expert and untrained trials across the cue and delay periods. In this contrast, the parameters were weighted as follows: expert cue = +2, expert delay = +2, novel untrained cue = −1, novel untrained delay = −1, familiar untrained cue = −1, familiar untrained delay = −1. These contrast images were entered into a second-level, one-sample t test, in which the mean estimate across participants at each voxel was tested against zero. Significant regions of activation were identified using an uncorrected, one-tailed threshold of p < 0.001 and a minimum cluster size of 10 contiguous voxels. Thresholded statistical parametric maps were overlaid on T1-weighted images using MRIcro software (Rorden and Brett, 2000). Suprathreshold clusters of voxels in the left and right middle frontal (DLPFC) gyri, middle occipital (OTC) gyri, or IPS were used to define ROIs that were interrogated in subsequent analyses. The time course of activity on each trial was extracted from these ROIs, and the time courses were temporally realigned to cue and probe stimulus onsets before averaging. Mean parameter estimates were also extracted for each ROI and contrast of interest.
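The contrast weighting and second-level test can be sketched for a single voxel as follows. The weights are those given in the text; the function names and the example values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev

# Contrast weights from the text; they sum to zero.
WEIGHTS = {
    ("expert", "cue"): 2, ("expert", "delay"): 2,
    ("novel untrained", "cue"): -1, ("novel untrained", "delay"): -1,
    ("familiar untrained", "cue"): -1, ("familiar untrained", "delay"): -1,
}

def contrast_value(betas):
    """Weighted sum of one subject's parameter estimates at one voxel."""
    return sum(w * betas[key] for key, w in WEIGHTS.items())

def one_sample_t(values):
    """t statistic testing the mean of values against zero (df = n - 1)."""
    n = len(values)
    return mean(values) / (stdev(values) / math.sqrt(n))

# Hypothetical contrast values for five subjects at one voxel (not real data).
example = [0.8, 1.1, 0.3, 0.9, 0.6]
t_stat = one_sample_t(example)  # compared against a t distribution with 4 df
```

Weighting the expert condition by +2 and each untrained condition by −1 balances one expert condition against two untrained conditions, so the contrast is zero when all conditions evoke equal responses.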
In addition to ROIs defined from group analyses, we created individually defined ROIs for the FFA and LOC to test a priori hypotheses regarding expertise in visual WM. Each was defined by analyzing single-subject native space data acquired during a functional localizer task. The FFA was defined to include all contiguous suprathreshold voxels in the right midfusiform gyrus in the contrast between activity during viewing of blocks of faces and blocks of scenes. Using these criteria, we were able to define an FFA in 9 of the 11 participants. The LOC was defined to include all contiguous suprathreshold voxels in the lateral ventral occipital cortex in the contrast between viewing of blocks of intact objects and blocks of scrambled objects. LOC ROIs were identified for 10 of the 11 participants (because of technical difficulties, data for the LOC localizer scan were not available for one participant).
To assess the degree of overlap between the OTC ROI (defined in stereotactic space) and the LOC and FFA ROIs (defined in native space), we spatially normalized each subject's native-space ROIs. This allowed us to determine whether any voxels were included in the analysis of both the OTC region and the LOC or FFA region. This analysis revealed that no voxels overlapped between the FFA (which was in the right midfusiform gyrus for all subjects) and OTC (which was in the middle occipital gyrus) ROI. The lack of overlap between the OTC ROI and the FFA is consistent with previous findings showing that areas that show large expertise effects do not substantially overlap with fusiform areas that show a high degree of face-specificity (Rhodes et al., 2004). There was also no overlap between the LOC and OTC ROIs, with the exception of two subjects, for which there was a slight overlap in the left hemisphere. For one of the subjects, 2 of the 734 voxels in the LOC ROI (0.3%) overlapped with the OTC ROI. For the other participant, 5 of 867 LOC voxels (0.6%) overlapped with the OTC ROI. This overlap constituted 0.5 and 1.2% of the 406 voxels that comprise the OTC ROI for the two subjects, respectively. Although the location and extent of the functionally defined LOC ROI varied across subjects, in each case, it was situated caudal to the OTC ROI. Additionally, the center of mass of the LOC tended to be slightly ventral and lateral to the OTC ROI.
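The overlap percentages reported above are simple ratios and can be checked directly. The voxel counts are taken from the text; the variable names are hypothetical.

```python
def percent(part, whole):
    """Share of `whole` accounted for by `part`, in percent."""
    return 100.0 * part / whole

# (overlapping voxels, voxels in that subject's LOC ROI), from the text.
overlaps = [(2, 734), (5, 867)]
ROI_406_VOXELS = 406  # size of the 406-voxel ROI named in the text

for shared, loc_size in overlaps:
    share_of_loc = round(percent(shared, loc_size), 1)
    share_of_406 = round(percent(shared, ROI_406_VOXELS), 1)
    print(f"{shared} voxels: {share_of_loc}% of LOC, {share_of_406}% of 406-voxel ROI")
```

Dividing by each subject's LOC size gives 0.3 and 0.6%, and dividing by 406 gives 0.5 and 1.2%, matching the reported values.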
Results
Overview of experimental design
Before the MRI scan session, each subject was trained to become an expert with one of two categories of novel visual objects over seven 90 min training sessions (see Materials and Methods for details on training procedures). On the day of the scan session, subjects performed a task to familiarize themselves with six exemplar objects from the untrained object category. Next, they were scanned while performing a WM task that required maintenance of a complex object across a delay (Fig. 1). On each trial, the cue and probe stimuli were either novel trial-unique objects from the expert category, novel trial-unique objects from the untrained category, or familiar objects from the untrained category (i.e., taken from the set of six objects that were viewed during the prescan task).
The reasoning for including trials with novel and familiar objects from the untrained category was as follows: contrasting activity elicited by novel objects from the expert and untrained categories should reveal differences in brain activity related to expertise while holding constant the cognitive processes related to maintenance of novel, complex visual stimuli (Ranganath and D'Esposito, 2001). However, in the event that novel objects from the expert category might have seemed familiar, because of their similarity to the objects shown in the training phase, comparison of activation between expert and familiar untrained objects could control for such effects. With these three conditions, we were able to distinguish neural and behavioral effects specifically related to visual expertise from effects related to repetition of stimulus features, perceived novelty, or familiarity (Henson and Rugg, 2003; Ranganath and Rainer, 2003; Grill-Spector et al., 2006).
Behavioral results
Prescan training phase
To assess the effectiveness of the training procedure, we compared subjects' performance on the final training session with that of 11 control subjects who received no training. Accuracy on the individual delayed recognition task was significantly higher in the trained experts relative to novice controls (F(1,20) = 19.64; p < 0.0005), confirming the effectiveness of the expertise training procedure.
MRI scan session: WM task
Discriminability (d') on the WM task was high for both expert (M = 2.43; SD = 0.13) and untrained (M = 2.08; SD = 0.21) trials. Discriminability values were significantly higher for expert than for untrained trials (F(1,10) = 8.2; p = 0.017) (Fig. 2A).
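For reference, discriminability on a match/nonmatch task is conventionally computed from hit and false-alarm rates. The sketch below uses the standard signal-detection definition; the text does not report the exact formula or how extreme rates were handled, so the log-linear correction and the function name are assumptions.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate), with a log-linear correction."""
    # Adding 0.5 to each count keeps rates away from 0 and 1 (an assumed
    # correction; the text does not say how extreme rates were handled).
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

A value of zero indicates chance-level discrimination between matching and nonmatching probes, and larger values indicate better discrimination independent of response bias.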
Expertise evaluation experiment
To further compare perceptual processing of objects from the expert and novice categories, we conducted a follow-up behavioral experiment. In this experiment, we used a simultaneous matching-to-sample paradigm to test the effects of stimulus inversion on visual recognition performance for expert and untrained category objects (Fig. 1B). Stimulus inversion is thought to disrupt the ability to exploit expert knowledge of feature-configurations that distinguish exemplars from one another (Diamond and Carey, 1986). Previous studies of visual expertise [e.g., dogs (Diamond and Carey, 1986) or “Greebles” (Gauthier and Tarr, 1997)] have shown that inversion eliminates reaction time (RT) advantages in recognition of stimuli from the expert category. Accordingly, we predicted that performance on the matching task would be enhanced for upright expert objects relative to untrained objects, but that performance would not differ between inverted expert and untrained objects. As shown in Figure 2B, this prediction was confirmed.
A repeated-measures factorial ANOVA revealed significant differences in d' between inverted and upright stimuli (F(1,9) = 7.7; p = 0.021) and an interaction between inversion and expertise (F(1,9) = 8.564; p = 0.017). Follow-up analyses revealed that d' was significantly higher for upright expert stimuli than for inverted expert stimuli (t(9) = 3.793; p = 0.005), upright untrained stimuli (t(9) = 2.773; p = 0.03), and inverted untrained stimuli (t(9) = 2.562; p < 0.04). There were no significant differences between upright and inverted objects from the untrained category (t(9) = 1.02; p > 0.3).
Another repeated-measures factorial ANOVA was performed to test for differences in RT. This ANOVA, including the same two factors as above, revealed a marginal effect of orientation (F(1,9) = 4.39; p < 0.07) but not expertise (F(1,9) < 1), as well as a significant interaction between orientation and expertise (F(1,9) = 12.915; p < 0.01). Follow-up tests revealed that RTs were faster on trials with upright expert stimuli than on trials with inverted expert (t(9) = 3.01; p = 0.015), upright untrained (t(9) = 2.77; p = 0.022), and inverted untrained (t(9) = 2.562; p = 0.031) stimuli. No other differences were significant (all others, p > 0.3).
Because the time between the scanning session and the expertise evaluation session varied across subjects, we ran additional analyses to determine whether the delay between testing sessions was correlated with performance. Results revealed no correlation between the number of days between testing sessions and the behavioral expertise effect (expert upright − novice upright discriminability) on the postscan test (r = −0.02). The top five performers averaged 13 d between sessions, whereas the bottom five performers averaged 10 d. Thus, behavioral performance was not affected by time between the two sessions, indicating that the effects of expertise training remained robust across time.
Overall, results from this behavioral experiment demonstrated that participants were better at recognizing objects from the expert category, and that stimulus inversion eliminated the expert advantage in both discriminability and RT. These results confirm that our training procedure induced changes in processing specific to the trained category.
fMRI results
Map-wise analyses
We predicted that expertise should enhance the efficiency of WM encoding and maintenance processes. Accordingly, in our analyses of fMRI expertise effects, activation during the cue and delay periods was contrasted between trials with objects from the expert category and trials with objects from the untrained category. Our preliminary analyses revealed no significant activation differences between familiar and novel objects from the untrained category, and we therefore collapsed across these trial types in our contrasts between expert and untrained trials.
Map-wise contrasts of activity during the cue and delay periods of expert trials compared with untrained trials revealed significantly increased activation in bilateral dorsolateral prefrontal regions lying along the middle frontal gyrus (BA 9 and 46), in bilateral occipitotemporal regions in the middle occipital gyrus (BA 19/37), and in bilateral intraparietal sulcus (BA 7). These and other regions that showed differential activation are shown in Figure 3 and Table 1. No regions showed significantly increased activation for untrained stimuli, relative to expert stimuli. As noted in the supplemental material (available at www.jneurosci.org), the finding of increased (as opposed to decreased) activation related to expertise is consistent with results from other imaging studies of expertise and is not inconsistent with research on perceptual learning or repetition priming.
The results described above suggest that expertise is associated with enhanced recruitment of DLPFC, posterior OTC, and IPS during WM encoding and maintenance. We ran additional analyses to compare the relative magnitudes of neural expertise effects between these regions. In these analyses, clusters of suprathreshold voxels identified in the expert versus untrained contrast were used to define bilateral ROIs in DLPFC (BAs 9 and 46), OTC (BA 19/37), and IPS (BA 7). Trial-averaged time courses and parameter estimates indexing activation during the cue and delay phases are shown for each ROI in Figure 4. The parameter estimates were entered into a repeated-measures ANOVA with three factors: ROI (DLPFC, OTC, or IPS), expertise (expert, familiar untrained, or novel untrained), and time-period (cue or delay). Critically, in addition to the anticipated significant main effect of expertise (the contrast by which the ROIs were defined) and the main effect of time period (F(1,10) = 44.79; p < 0.0001), the ANOVA revealed a significant interaction between ROI and expertise (F(2,20) = 6.5; p < 0.001). As shown in Figure 5, the expertise effect was larger in DLPFC than in OTC during the cue (t(10) = 2.58; p < 0.05) and delay (t(10) = 3.25; p < 0.01) periods and larger in DLPFC than in IPS during the delay (t(10) = 2.344; p < 0.05) but not cue (t(10) < 1) period. Finally, the expertise effect was larger in IPS than OTC during the cue (t(10) = 3.78; p < 0.005) and delay (t(10) = 2.346; p < 0.05) periods. These analyses suggest that, during WM encoding and maintenance, the relative magnitudes of neural expertise effects were largest in DLPFC and weakest in OTC. It is unlikely that differences in dynamic range (i.e., ceiling effects) can completely account for the differences between the three regions during the delay period, because the magnitude of expert-related activation in OTC during the delay was well below the magnitude of activation during the cue period.
The above analyses demonstrated that activity in DLPFC, OTC, and IPS was substantially increased during maintenance of objects from the expert category compared with the untrained category, but there was some intersubject variability in these neural expertise effects. As noted previously, there was also substantial intersubject variability in the degree to which participants developed expertise as a result of training. Consequently, we predicted that intersubject variability in WM activation in these regions might reflect meaningful individual differences in expertise. To test this prediction, we calculated the difference in discriminability between upright expert and untrained objects on the expertise evaluation test for each subject and correlated this measure with parameter estimates for the expert − untrained contrast in each ROI. Behavioral expertise effects in the postscan behavioral experiment were significantly correlated with neural expertise effects during the delay period in OTC (r = 0.84; p < 0.005), DLPFC (r = 0.63; p < 0.05), and IPS (r = 0.64; p < 0.05). As shown in Figure 6, these correlations indicate that subjects who achieved a greater degree of expertise exhibited larger differences in delay period activity between expert and untrained stimuli. The differences between these correlations were not significant. There was also a significant negative correlation between the behavioral expertise effect and the neural expertise effect during the cue period in DLPFC (r = −0.82; p < 0.005) (see supplemental material for discussion of this finding, available at www.jneurosci.org). This result remained significant even when a possible outlier participant (Fig. 6) was excluded from the analysis (r = −0.67; p < 0.05). Trial-averaged time courses of activation were separately averaged for the top five and bottom five performers on the postscan discriminability task and are presented in Figure 7.
Consistent with the correlational analyses, these data show that experts tended to have reduced DLPFC activation during the cue period of the WM task but enhanced activation during the delay period of the task.
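The brain-behavior correlations described above relate one behavioral difference score per subject to one neural difference score per subject. A minimal sketch of that computation follows, assuming a Pearson correlation (the text reports r values but does not name the coefficient); the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))
```

Here `x` would hold each subject's behavioral expertise effect (upright expert minus upright untrained discriminability) and `y` the corresponding expert minus untrained parameter estimate in one ROI and task period. The conversion in `r_to_t` shows how the significance of a given r depends on the number of subjects contributing to the correlation.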
Native-space ROI analyses
Many previous studies of expertise and perceptual learning have examined activation in functionally defined ROIs identified on native-space data. If posterior cortical regions involved in expertise are relatively small and inconsistently localized, this approach might yield increased sensitivity to detect expertise effects in extrastriate cortex. We therefore analyzed native-space data and examined activity in two functionally defined ROIs (see Materials and Methods) that might be involved in mediating expertise-related WM enhancements: the FFA (Gauthier et al., 1999) and LOC (Grill-Spector et al., 2000). Trial-averaged time courses and neural expertise effects during WM trials are presented in Figure 8. Analysis of parameter estimates revealed greater activation on expert than on untrained trials during the cue period in the LOC (t(9) = 4.29; p < 0.005). A similar trend was apparent in the FFA (t(8) = 2.23; p < 0.06). Neither region showed an expertise effect during the delay period (both t values <1; NS). We additionally tested whether neural expertise effects in these regions were correlated with behavioral indices of expertise. These tests revealed no significant correlations between behavioral and neural expertise effects in the FFA or LOC.
Discussion
In the present study, we found that visual expertise training induced improved WM performance and processing changes (e.g., inversion effects) specific to objects from the trained category. These behavioral changes were accompanied by increased recruitment of DLPFC, IPS, and OTC during WM encoding and maintenance. Across subjects, behavioral measures of expertise reliably predicted neural expertise effects during the delay period in all three regions. These neural expertise effects could not be attributed to differences in low-level stimulus characteristics (because different subjects trained in different categories) or to familiarity with features of expert-domain objects (because encoding and maintenance-related activation was not increased during maintenance of familiar novice-domain objects). Our results are consistent with the idea that expertise training resulted in the development of domain-specific memory skills that facilitated WM encoding and maintenance processes.
Behavioral effects of expertise training
Although WM capacity may be limited to a relatively small number of “chunks” (Miller, 1956; Cowan, 2001), experts may be able to apply domain-specific skills to increase the informational content of each chunk (Ericsson and Kintsch, 1995; Gobet et al., 2001; Gobet and Clarkson, 2004). We cannot directly comment on whether this occurred in the present study, because we did not manipulate memory load or explicitly measure WM capacity changes with training. Nonetheless, several findings support the idea that expertise training resulted in improved perceptual and mnemonic processing of expert-domain objects. First, trained subjects performed significantly better on the delayed recognition task than did control subjects who were not trained. Second, on the WM task in the scanner, and on the postscan evaluation, discriminability was higher for expert stimuli than for untrained stimuli. Additionally, the expert advantage was eliminated by stimulus inversion, indicating that experts developed skills that exploit regularities in stimulus configuration (Diamond and Carey, 1986). Third, IPS activation was increased during maintenance of items from the expert category. Recent evidence suggests that IPS activation is related to WM capacity (Todd and Marois, 2004; Xu and Chun, 2006) and to the complexity of information that is maintained (Xu and Chun, 2006). Accordingly, increased IPS activation during expert WM maintenance might have reflected experts' ability to encode more detailed information about objects from the expert category (see supplemental material for additional discussion, available at www.jneurosci.org).
Of course, it is likely that training also induced the development of general task skills that help subjects discriminate complex, multifeatural visual objects even outside the expert category. Indeed, the performance difference between expert subjects and control subjects who did not undergo expertise training was larger than the difference in expert subjects between the expert and untrained conditions. Such generalized effects related to task learning are commonly reported in the perceptual expertise literature (e.g., Gauthier et al., 1997, 1999; Rossion et al., 2002). Critically, in this study, generalized task-learning effects could not account for activation differences between expert and untrained objects, because the two categories were contrasted within the same testing session.
Expertise-related activation in FFA and LOC
Many previous neuroimaging studies have focused on how visual expertise training influences activity in extrastriate areas (Gauthier et al., 1999, 2000; Gauthier, 2000; Grill-Spector et al., 2004; Rhodes et al., 2004). Based on the idea that most people are experts at subordinate-level face recognition, some have additionally suggested that face-selective responses in the FFA and the LOC may reflect processing related to visual expertise (Gauthier et al., 1999, 2000; Gauthier, 2000). Consistent with this view, some studies have reported that FFA and LOC activity is increased during processing of expert-domain objects (Gauthier et al., 1999, 2000). We also found that the FFA and LOC showed modest neural expertise effects during the cue period, although increases were not evident during the memory delay (Fig. 8).
Interestingly, expertise effects in these regions were not significantly correlated with postscan measures of expertise during any task period. Previous studies of bird and car experts have revealed positive correlations between behavioral expertise and FFA activation during tasks that involved judging the location of objects (Gauthier et al., 2000; Xu, 2005); however, no significant correlations were reported during tasks that involved processing of object identity (Gauthier et al., 2000; Grill-Spector et al., 2004). In another study of Lepidoptera experts (Rhodes et al., 2004), FFA activation was correlated with behavioral performance during recognition of objects from novice as well as expert categories. Collectively, these findings do not suggest a consistent relationship between FFA activation and individual differences in expertise.
Expertise-related activity in DLPFC, IPS, and OTC
In contrast to the relatively modest effects of expertise in individually defined FFA and LOC ROIs, we found robust expertise effects in the DLPFC, IPS, and OTC (see Table 1 for a complete list of regions). Activity in each of these regions during encoding and maintenance was modulated by expertise, and maintenance-related activation was also positively correlated with individual differences in expertise (see supplemental material, available at www.jneurosci.org, for discussion of correlations during the cue period). Although these findings do not invalidate reports of expertise-related activity in the FFA and LOC, they do suggest the need to consider the effects of expertise on other cortical areas that are usually not considered in imaging studies.
Our results are generally consistent with models proposing that visual WM maintenance is accomplished through top-down modulation of visual object representations in OTC by frontal and parietal cortices (Desimone, 1996; Miyashita and Hayashi, 2000; Ranganath and D'Esposito, 2005; Ranganath, 2006). The effects of expertise on these networks could be explained in at least three ways. One possibility is that expert WM advantages reflect “bottom-up” influences. According to this idea, tuning of OTC networks during expertise acquisition allows people to encode more information about expert-category objects, giving rise to greater recruitment of DLPFC and IPS during WM encoding and maintenance. A second possibility is that expert WM advantages are driven by “top-down” influences. For example, tuning of networks in DLPFC during expertise acquisition may allow experts to seek out domain-relevant feature configurations, thereby enhancing the ability to activate neural representations in OTC (Bar, 2003, 2004). This idea fits well with previous research showing that OTC activation is increased during the engagement of object-based attention (Kanwisher and Wojciulik, 2000; Gazzaley et al., 2005) and WM maintenance (Ranganath et al., 2004a,b), even when perceptual stimulation is controlled. A third possibility is that expert WM is supported by both top-down and bottom-up influences. This view accords with psychological theories suggesting that expertise results in acquisition of domain-specific skills that direct WM encoding and maintenance (Ericsson and Kintsch, 1995; Gobet, 1998), as well as pattern learning that facilitates passive perceptual processing (Chase and Simon, 1973; Ericsson and Kintsch, 1995; Gobet, 1998).
Our study, like other fMRI studies of perceptual expertise, cannot conclusively adjudicate between these accounts. Results from other imaging studies, however, are more consistent with a top-down or an interactive model, rather than a purely bottom-up model. For example, Olesen et al. (2004) reported improved spatial WM performance and increased DLPFC and IPS activation after 5 weeks of training in verbal and visuospatial WM tasks. These findings demonstrate that activation in DLPFC and IPS can accompany the development of abstract mnemonic skills even when there is no obvious perceptual learning. Additionally, Bor and colleagues have demonstrated that DLPFC and IPS activation increases when participants use “chunking” strategies to recode information during WM encoding (Bor et al., 2003, 2004; Bor and Owen, 2006). These findings demonstrate a role for the DLPFC and IPS in using previous knowledge (or expertise) to facilitate efficient WM encoding and maintenance.
Results from single-unit recording studies are also consistent with the idea that top-down and bottom-up influences play a role in expert visual WM. These studies have shown that selectivity of prefrontal and inferior temporal neurons is influenced by learning about objects (Rainer and Miller, 2000; Rainer et al., 2004) and categories of objects (Sigala and Logothetis, 2002; Freedman et al., 2003). For example, Freedman et al. (2003) showed that, after categorization training, prefrontal and inferior temporal neurons exhibited category-selective neural responses. However, prefrontal neurons exhibited stronger category tuning, whereas inferior temporal neurons were more sensitive to the visual features of each exemplar within the category. Additionally, prefrontal neurons showed persistent activity during the delay between each stimulus and the upcoming decision probe, whereas inferior temporal neurons tended to show phasic responses during stimulus presentation.
The evidence summarized above suggests a cortical division of labor with regard to visual expertise. Expertise acquisition may result in tuning of stimulus-specific neurons in OTC that project to prefrontal neurons. The convergence of these inputs in prefrontal cortex may result in the formation of frontoparietal networks that represent category information in a manner that can guide visual attention, WM encoding, and maintenance (Riesenhuber and Poggio, 1999). Thus, cognitive control processes implemented by the DLPFC and other regions may support the application of expert memory skills, as described in psychological theories of expert WM.
Footnotes
- This work was supported by National Institutes of Health Grants R01 MH067821 and PO1 NS40813. We thank Aaron Heller and Evan Katsuranis for assistance and Nancy Kanwisher, Scott Murray, Luiz Pessoa, Ewa Wojciulik, and two anonymous reviewers for helpful comments.
- Correspondence should be addressed to either of the following: Christopher Moore, Department of Psychology, Princeton University, Princeton, NJ 08540, cdm{at}princeton.edu; or Charan Ranganath, Center for Neuroscience, University of California at Davis, Davis, CA 95616, cranganath{at}ucdavis.edu