Flexible rule learning, a behavior with obvious adaptive value, is known to depend on an intact prefrontal cortex (PFC). One simple, yet powerful, form of such learning consists of forming arbitrary stimulus-response (S-R) associations. A variety of evidence from monkey and human studies suggests that the PFC plays an important role in both forming new S-R associations and in using learned rules to select the contextually appropriate response to a particular stimulus cue. Although monkey lesion studies more strongly implicate the ventrolateral PFC (vlPFC) in S-R learning, clinical data and neurophysiology studies have implicated both the vlPFC and the dorsolateral region (dlPFC) in associative rule learning. Previous human imaging studies of S-R learning tasks, however, have not demonstrated involvement of the dlPFC. This may be because of the design of previous imaging studies, which used few stimuli and used explicitly stated one-to-one S-R mapping rules that were usually practiced before scanning. Humans learn these rules very quickly, limiting the ability of imaging techniques to capture activity related to rule acquisition. To address these issues, we performed functional magnetic resonance imaging while subjects learned by trial and error to associate sets of abstract visual stimuli with arbitrary manual responses. Successful learning of this task required discernment of a categorical type of S-R rule in a block design expected to yield sustained rule representation. Our results show that distinct components of the dorsolateral, ventrolateral, and anterior PFC, lateral premotor cortex, supplementary motor area, and the striatum are involved in learning versus executing categorical S-R rules.
Humans possess a marked aptitude for learning arbitrary stimulus-response (S-R) associations, one hallmark of our cognitive flexibility. This ability allows us to choose from our entire behavioral repertoire in responding to stimuli we encounter and to retain knowledge of contextually optimal responses to particular stimuli. Successfully making S-R associations depends on keeping in mind contextual details relevant to stimulus perception or response choice, as well as salient previous experiences. The use of this ability is vastly expanded by our capacity for grouping stimuli that require the same response, a form of categorical learning. Such cognitive functions have been ascribed to the frontal lobes, especially the dorsolateral prefrontal cortex (dlPFC) (Fuster, 1997). Indeed, the ability to flexibly learn and apply such associative “rules” depends on intact prefrontal and premotor cortices in both humans (Petrides, 1985, 1990, 1997; Halsband and Freund, 1990), and monkeys (Halsband and Passingham, 1982; Petrides, 1982; Gaffan and Harrison, 1988, 1989; for review, see Murray et al., 2000). Electrophysiological recordings in the monkey PFC indicate that such associations are highly plastic, facilitating flexible behavioral change with context (Chen and Wise, 1995, 1996; Asaad et al., 1998, 2000). Mechanistically, the PFC is hypothesized to form representations of new S-R associations and to use existing representations to bias response selection, given a particular stimulus (Murray et al., 2000; Miller and Cohen, 2001).
Despite the fact that several of the clinical and non-human primate studies noted above implicate both dorsal and ventral (vlPFC) regions of the lateral PFC in S-R learning, previous human functional neuroimaging studies of such tasks have not found reliable dlPFC activation (Deiber et al., 1997; Toni and Passingham, 1999; Toni et al., 2001). However, in two of these studies (Deiber et al., 1997; Toni and Passingham, 1999), subjects learned and practiced the rules before scanning and, in the case of the study by Deiber et al., received no feedback during scanning. Thus, the subjects were not attempting to acquire the rules during scanning but to strengthen newly established associations. Toni et al. (2001) examined activity during a task identical to that in their original study (with minimal prescan practice), and although they identified a site in the dlPFC showing decreasing activity over the scan session only for the new rule condition, this effect could be attributed to novelty adaptation. Moreover, all previous imaging studies of S-R learning have been limited to simple one-to-one mapping of a handful of stimuli onto an equal number of possible manual responses. Humans learn these associations very quickly, limiting their use for examining rule acquisition.
To focus on the learning process, we designed the present functional magnetic resonance imaging (fMRI) study, in which subjects learned higher-order S-R rules. Abstract visual stimuli were grouped into sets by non-obvious principles that were difficult to verbalize. Subjects learned to distinguish sets in pairs and learned by trial and error to associate each categorical set with an arbitrarily assigned manual response. With this paradigm, we were able to separate rule learning from rule execution by having subjects learn some rules before the scan session and others during the scan session. Furthermore, we controlled for stimulus novelty effects in the learning condition by including an additional pair of sets that covertly lacked a higher-order grouping rule. Thus, our task identifies regions specifically participating in higher-order S-R rule acquisition and utilization.
Materials and Methods
Participants. Fourteen healthy right-handed subjects (six females; mean age, 27 years old; range, 19-35 years old) from the San Francisco Bay Area volunteered and were paid for their participation. All subjects gave written, informed consent before participation in the study, in accordance with the guidelines of the Committee for the Protection of Human Subjects at the University of California, Berkeley. Subjects were screened for medical, neurological, and psychiatric illness and for psychoactive medication.
Behavioral task. The two-fold aim of the behavioral task was to (1) identify neural circuits recruited for learning a demanding categorical S-R association task and (2) identify circuits engaged by executing previously learned rules of this type. Subjects performed an associative S-R task in which sets of visual stimuli mapped onto one of four manual response buttons. Individual abstract visual stimuli were presented for 750 ms (visual angle, ≈4°), and the subject had 1 s from the stimulus onset to make a manual button press response. After 1 s, a feedback screen appeared for 300 ms, indicating “correct,” “incorrect,” or “no response.” Incorrect and no-response trials were both counted as errors. Responding was monitored on-line, and all subjects were compliant, with minimal response omissions. Stimuli were presented serially in 20 s blocks, which were randomized with respect to condition (Fig. 1 A). Each block consisted of 15 stimuli selected randomly from a pair of stimulus sets (set size: 10 each, for a total of 20 possible stimuli per block).
Stimulus set construction. To ensure that the stimulus properties were as closely matched as possible, we designed stimulus sets that were identically sized and shaped and used identical subcomponents but that varied in the arrangement of these components across sets. Thus, sensory properties and motor actions were well conserved between the novel and familiar conditions. As can be seen in Figure 1 B, no set members shared any subcomponents, but rather for each stimulus set, a unique constellation of subcomponents varied together in color across set members. Grouping or categorizing the members of each stimulus set required detecting and remembering the unique constellation of covarying elements. Stimulus sets were constructed as follows. For each set, a randomly generated 8 × 8 matrix was constructed, and 10 individual stimuli were generated from this matrix using 10 different color maps of the same 10 isoluminant hues (Matlab 6.1; Mathworks, Natick, MA). Thus, each member of the set shared no commonly colored squares but rather preserved the relationship between identically colored squares. Two such sets are shown in Figure 1 B.
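The construction described above can be illustrated with a minimal Python sketch (the original stimuli were generated in Matlab). It assumes that the 10 color maps are cyclic shifts of the hue list, which is only one way to guarantee that no square keeps the same color across set members; the text does not say how the color maps were actually chosen. The function name `make_stimulus_set` and the use of integer hue indices in place of real colors are likewise illustrative.

```python
import random

N_HUES = 10    # number of isoluminant hues (from the text)
GRID = 8       # each set derives from one 8 x 8 matrix (from the text)
SET_SIZE = 10  # members per stimulus set (from the text)


def make_stimulus_set(seed=None):
    """Return the base label matrix and SET_SIZE grids of hue indices.

    Assumption: the color maps are cyclic shifts of the hue list. The paper
    does not specify how its color maps were selected.
    """
    rng = random.Random(seed)
    # Base matrix: every square receives a label in 0..N_HUES-1.
    base = [[rng.randrange(N_HUES) for _ in range(GRID)] for _ in range(GRID)]
    members = []
    for shift in range(SET_SIZE):
        # Color map for this member: label -> hue index, shifted cyclically.
        color_map = [(label + shift) % N_HUES for label in range(N_HUES)]
        members.append([[color_map[label] for label in row] for row in base])
    return base, members


base, members = make_stimulus_set(seed=0)
```

Under this assumption, any two members of a set differ in color at every square, while squares that share a label in the base matrix always share a color within a member. That shared constellation is the only cue that binds the set together.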
Task training. Subjects learned arbitrary S-R mappings for two block types, each of which consisted of a pair of stimulus sets, at least 1 d before the scan session, with all subjects reporting at least one full night of sleep between training and testing, ensuring adequate consolidation of learning (Stickgold et al., 2000; Maquet et al., 2003). We refer to these stimulus sets and the S-R rules acquired during training as “familiar” (fam). Each set was assigned randomly to one of four possible manual response buttons. The categorical rules were not stated explicitly and were instead learned by trial and error. Subjects were instructed to learn the correct response associated with each stimulus and were told that some of the stimuli were related by rules that, if acquired, would help them learn the associations more quickly.
Subjects were required to reach a performance criterion of 90% accuracy for both blocks; all subjects tested were able to do so. Training to criterion took an average of seven blocks of 40 trials each for these subjects, corresponding to a total training duration of ∼15 min. A few subjects took a large number of training blocks to reach criterion on the first block type (order was counter-balanced across subjects), but once a successful strategy was established, learning the associative rule for the second block type was always more rapid. All subjects were required to demonstrate retention (accuracy, ≥90%) of the learned rules immediately before the functional scanning session. Subjects practiced the previously learned sets in the magnet during shimming and anatomical scans, which lasted 15-20 min, immediately before functional scanning. All subjects demonstrated retention of the learned S-R rules, requiring a median of two practice blocks per block type to reach the 90% accuracy criterion.
Task conditions. During the scan session, there were three types of experimental blocks, each composed of two stimulus sets. fam blocks consisted of the two block types learned in the training session and practiced on the day of the scan. “Novel” (nov) blocks were identical in structure but consisted of two new stimulus sets each (two different novel blocks were included in each scan session). Although nov and fam blocks differed in terms of learning, they also differed in degree of stimulus novelty. Thus, detected differences in the blood oxygenation level-dependent (BOLD) signal between these two conditions could simply reflect familiarity differences (Knight, 1984; Asaad et al., 1998; Daffner et al., 2000). To control for this possibility, we included a condition with two stimulus sets composed of novel but unrelated stimuli. In these blocks, no rule bound the set members together, prohibiting subjects from acquiring a rule to facilitate accurate performance. Subjects were not warned in advance that such a set was included. “No rule” (NR) blocks consisted of 20 independent stimuli that were each generated from a different underlying matrix, with 10 arbitrarily assigned to one response button and the other 10 assigned to a different button. Within a given block type, individual members of the two sets were intermixed randomly. Each scanning run consisted of 18 blocks (for a total of ∼6 min), including three of each experimental block type and three fixation/baseline blocks, presented in random order. At the end of each run, the subject was provided with the overall accuracy for that run. Subjects were awarded increasing monetary bonuses for exceeding each of three percentage levels (60, 70, and 80%). This served to motivate accurate performance in the fam blocks, to spur rapid learning of the nov blocks, and to discourage giving up during the NR blocks.
Six runs of data were acquired from each subject, for a total of 36 min of functional scan time. To maximize the chances of detecting learning-related neural activity, we designed the task to extend the period of learning across several scanning runs. The behavioral results from the scanning session indicate that our design served the intended purpose. The majority of subjects participated in a second (later) experiment within the same scan session, the results of which will be reported elsewhere.
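As a quick consistency check on the timing figures given in the task description, the following Python sketch recomputes the block, run, and session durations from the stated parameters. The count of five experimental block types (two fam, two nov, one NR) is an inference from the text rather than a number the Methods state directly, and all variable names are illustrative.

```python
# Values stated in the Methods.
STIMULI_PER_BLOCK = 15
RESPONSE_WINDOW_MS = 1000  # time allowed from stimulus onset
FEEDBACK_MS = 300
NOMINAL_BLOCK_S = 20
REPEATS_PER_BLOCK_TYPE = 3
FIXATION_BLOCKS_PER_RUN = 3
RUNS = 6

# Assumption: five experimental block types (two fam, two nov, one NR).
EXPERIMENTAL_BLOCK_TYPES = 5

# One trial is the response window followed by the feedback screen.
trial_ms = RESPONSE_WINDOW_MS + FEEDBACK_MS
block_trial_time_s = STIMULI_PER_BLOCK * trial_ms / 1000

blocks_per_run = (EXPERIMENTAL_BLOCK_TYPES * REPEATS_PER_BLOCK_TYPE
                  + FIXATION_BLOCKS_PER_RUN)
run_min = blocks_per_run * NOMINAL_BLOCK_S / 60
session_min = RUNS * run_min

print(block_trial_time_s, blocks_per_run, run_min, session_min)
```

Under these assumptions, the trials fill 19.5 s of each nominal 20 s block, and the block count, run length, and total scan time agree with the reported 18 blocks, ∼6 min, and 36 min.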
Data acquisition. MRI was performed on a Varian/Inova (Palo Alto, CA) whole-body 4T scanner that was equipped with echo-planar imaging. For all experiments, a standard radiofrequency (RF) head coil was used, and cushioning comfortably restricted head motion. E-Prime software (PST, Pittsburgh, PA) controlled the stimulus display and recorded subject responses via a magnet-compatible fiber-optic keypad. An LCD projector (Epson, Long Beach, CA) projected stimuli onto a backlit projection screen (Stewart, Torrance, CA) within the magnet bore, which the subjects viewed via a mirror mounted within the head coil.
A gradient echo, echo-planar sequence (repetition time, 2200 ms; echo time, 22 ms; flip angle, 20°) was used to detect BOLD contrast. The sequence used two-shot interleaved k-space acquisition with a phase map correction to reduce Nyquist ghosts. Three-millimeter coronal slices were acquired to facilitate coverage of the ventral/orbital PFC. In-plane resolution was 3 × 3 mm, yielding isotropic voxels. Twenty-two slices with a 0.5 mm interslice gap were acquired (field of view, 19.2 cm2), which accommodated complete coverage of the frontal lobes. In most subjects, these parameters did not allow us to collect data posterior to the postcentral gyrus. Each run was preceded by 20 s of dummy gradient RF pulses to achieve steady-state tissue magnetization and to minimize startle-induced motion in the functional data. Coplanar T1-weighted anatomical images were acquired for each participant. In addition, an axial MP (magnetization-prepared)-FLASH high-resolution T1-weighted image was acquired for use in spatial normalization.
fMRI data processing. MRI data were processed off-line using the VoxBo analysis package (http://www.voxbo.org/download.html). First, images were reconstructed into Cartesian space and sinc interpolated in time to correct for differences in slice time acquisition (Aguirre et al., 1998). Next, we motion-corrected the data using a six-parameter, rigid-body, least-squares transformation algorithm (Friston et al., 1995) and a slicewise motion compensation that removed spatially coherent signal changes by applying a partial correlation method to each slice in time (Zarahn et al., 1997). Next, an empirically derived threshold was applied to remove extremely low-intensity voxels, a mask was applied to exclude regions located outside the brain, and finally, a Gaussian smoothing kernel with a full width at half maximum (FWHM) of 6 mm was applied.
fMRI data analyses. Statistical analyses were performed within the framework of the modified general linear model (GLM) (Worsley and Friston, 1995) and included a 1/f model of temporal autocorrelation, derived empirically for each subject (Zarahn et al., 1997). The model included a design matrix with covariates for each block type convolved with an empirically derived hemodynamic response function. A notch filter removed frequencies below 0.03 Hz and above the Nyquist frequency (0.227 Hz). For the brain-behavior and inter-region of interest (ROI) correlation analyses, a separate GLM analysis of the fMRI data was done using a model that specified a unique covariate for each block. Filtering and convolution were as for the main analysis.
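The Nyquist frequency quoted for the temporal filter follows directly from the repetition time given in the acquisition parameters (2200 ms). The short Python sketch below shows that arithmetic; the helper `is_retained` is a hypothetical illustration of the stated passband and is not part of VoxBo.

```python
TR_S = 2.2           # repetition time in seconds (2200 ms, from the text)
HIGH_PASS_HZ = 0.03  # lower cutoff stated in the text

# One volume is sampled every TR, so the sampling rate is 1/TR and the
# Nyquist frequency is half of that.
sampling_rate_hz = 1.0 / TR_S
nyquist_hz = sampling_rate_hz / 2.0


def is_retained(freq_hz):
    """Hypothetical helper: True if a frequency lies inside the passband."""
    return HIGH_PASS_HZ <= freq_hz <= nyquist_hz


print(round(nyquist_hz, 3))
```

With a 2.2 s repetition time, the value rounds to the 0.227 Hz reported in the text.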
ROI analyses. Within subjects, we defined functional ROIs (fROIs) within each of seven a priori defined anatomical regions, identified from the T1-weighted anatomical images obtained from each subject, with reference to standard brain atlases (p < 0.05; small volume corrected) (Duvernoy, 1991; Tzourio-Mazoyer et al., 2002). These ROIs were chosen based on previously published data and included the dlPFC [middle frontal gyrus (MFG)], vlPFC [inferior frontal gyrus (IFG)], fronto-polar cortex (FP), striatum, and three regions of the premotor cortex: medial [supplementary motor area (SMA)/pre-SMA], dorsolateral premotor area [dPM; dorsal aspect of the superior frontal gyrus (SFG), including portions of Brodmann areas (BA) 6 and 8], and ventrolateral premotor area [vPM; precentral gyrus (PCG), proximal to the caudal terminus of the inferior frontal sulcus (IFS)]. Within these fROIs we selected the peak activation cluster in a contrast of nov versus fam blocks to identify areas potentially involved in S-R learning. This yielded nov>fam voxels, putatively correlated with the acquisition of S-R rules, and fam>nov voxels, putatively correlated with the execution or application of learned rules. Robust nov>fam activations were found for all subjects in the right dlPFC, midline SMA/pre-SMA, left vPM, and right striatum. Robust fam>nov activations were found for all subjects in the left dPM and left FP ROIs. For each subject, we extracted the average response magnitude for each covariate from each of these six fROIs, and fam and nov block types were compared with a paired t test. By also comparing the nov and NR conditions, we were able to rule out simple stimulus novelty effects. Because learning evolved over the course of a scan session, we predicted that learning-related voxels would show a monotonic change in response magnitude, exclusively in the nov blocks, across runs.
Therefore, we compared the parameter estimates within nov>fam and fam>nov voxels between the first two runs and the last two runs for each block type. Average parameter estimates for each block were also extracted from these six fROIs from the second GLM analysis described above.
Brain-behavior correlation analyses. Correlation of activations between fROIs on a block-by-block basis was calculated as Pearson's correlation coefficients transformed to Z-values using Fisher's R-to-Z transform. Correlations between behavioral performance and fROI activity, on a block-by-block basis, were calculated using the transformed accuracy scores (see below) for each block. These correlations were determined across all blocks and separated by condition. Linear regression was used to determine whether these correlations predicted the ability to learn new S-R associations in this task, measured as overall accuracy on nov blocks.
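For readers unfamiliar with the transform, the Python sketch below computes a Pearson correlation and applies Fisher's R-to-Z transform, z = arctanh(r). The function names and the toy input values are illustrative only and are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher's R-to-Z transform; defined only for -1 < r < 1."""
    return math.atanh(r)


# Toy values for illustration only (not data from the study).
r = pearson_r([1, 2, 3, 4], [1, 3, 2, 4])
z = fisher_z(r)
```

The transform stretches values near ±1 so that the resulting scores are closer to normally distributed, which makes them better suited to the regression and t tests applied across subjects.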
Map-wise group analysis. Group statistical parametric maps were generated for the same contrasts of interest, using the results from the individual subject analyses. Each subject's response magnitude maps were normalized into the standard space of the Montreal Neurological Institute (MNI) using SPM99 (http://www.fil.ion.ucl.ac.uk/spm), resampled to a resolution of 2 mm2 and additionally smoothed to yield a total smoothing kernel of 10 mm FWHM, and entered into second-level t tests treating each subject as a random variable. Statistical significance was defined by meeting a height threshold of p < 0.005 (uncorrected; t >3.01), with a minimum cluster size of eight contiguous voxels. This relatively lenient threshold was chosen to allow direct comparison with previous studies. As can be seen in Table 1, all but a few activations actually meet the more stringent threshold of p < 0.001 (t > 3.85).
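The size of the additional smoothing step is not stated, but it can be estimated on the assumption that the 6 mm kernel applied during preprocessing and the second kernel combine as successive Gaussian convolutions, in which case their FWHMs add in quadrature. The Python sketch below works through that arithmetic; `additional_fwhm` is an illustrative name.

```python
import math


def additional_fwhm(initial_mm, total_mm):
    """FWHM of the extra Gaussian kernel needed to reach total_mm.

    Assumes successive Gaussian smoothing, so that
    total_mm**2 == initial_mm**2 + extra_mm**2.
    """
    return math.sqrt(total_mm ** 2 - initial_mm ** 2)


extra_mm = additional_fwhm(6.0, 10.0)
print(extra_mm)
```

Under this assumption, an 8 mm kernel applied after the initial 6 mm kernel would give the stated total of 10 mm FWHM.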
Analysis of behavioral data. Total accuracy was calculated for each block, condition, and experiment. For the purposes of parametric statistical comparisons and correlation analysis, accuracy data were transformed as arcsin(√accuracy).
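A minimal Python sketch of the arcsine square-root transform is given below. It assumes accuracy is expressed as a proportion between 0 and 1 and that the result is in radians; the function name is illustrative.

```python
import math


def arcsine_sqrt(accuracy):
    """Arcsine square-root transform of a proportion in [0, 1] (radians)."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be a proportion between 0 and 1")
    return math.asin(math.sqrt(accuracy))


# Example proportions only; these are not subject data.
transformed = [arcsine_sqrt(p) for p in (0.25, 0.5, 1.0)]
```

The transform is commonly applied to proportions because it stabilizes their variance, which would otherwise shrink as accuracy approaches 0 or 1.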
Results
Subjects demonstrated excellent accuracy during the fam blocks (mean ± SEM; 74 ± 4%). In contrast, subject accuracy in the nov blocks improved gradually over six runs (mean ± SEM; run 1, 36 ± 3%; run 6, 61 ± 5%) (Fig. 2), representing a significantly greater increase in accuracy than for fam blocks (t(13) = 8.34; p < 0.001). Average accuracy scores for the NR condition were below chance for all subjects (group mean ± SEM; 35 ± 3%) (Fig. 2). Two effective strategies are possible in the NR condition: (1) subjects could attempt to learn the correct response associated with each of the 20 stimuli, which represents a load well beyond normal working memory capacity; and (2) subjects could identify which two response buttons were associated with the two NR sets and consistently press one of these buttons for all stimuli, yielding chance performance (50%). By combining these two approaches, subjects could potentially perform above chance. Our results indicate that subjects were either using strategy 1 or unsuccessfully trying to establish associative rules for the NR sets. Debriefing of subjects after scanning and examination of behavioral responses confirmed this inference.
In these analyses, we tested specific ROIs for activations correlated with learning or executing rules. Anatomical ROIs were defined a priori on the basis of reviews of functional imaging and non-human primate electrophysiology literature (see Materials and Methods for a list of regions). fROIs were then defined as the peaks of contiguous activation clusters in a contrast of the nov rule versus fam rule conditions within these anatomical ROIs.
Novel rule learning versus familiar rule execution
We found nov-selective (nov>fam) activations consistently across subjects in four regions where the magnitude of BOLD responses to the nov blocks was significantly larger than to the fam blocks: the right dlPFC (MFG), the midline SMA, the left vPM cortex (PCG), and the right striatum. Representative examples are shown in Fig. 3A. To test whether these differences reflected selectivity for learning per se and not effects of differing degrees of uncertainty, stimulus novelty, or error feedback, we compared the average response magnitudes within the nov>fam fROIs between the nov and NR conditions. We consistently found greater activation for the nov blocks (Fig. 3B), supporting the conclusion that these regions are indeed engaged by the act of learning associative S-R rules. The significance levels were p < 2 × 10⁻⁴, p < 0.03, p < 8 × 10⁻⁵, and p < 0.01 for the MFG, SMA, PCG, and striatum fROIs, respectively.
Although nov>fam activations identified by this novel rule versus familiar rule contrast are putative sites of rule acquisition, fam>nov activations are predicted to be involved in recalling, maintaining, or implementing previously learned rules. We observed such activations, consistently across subjects, in two left frontal regions: the left FP and the dorsal premotor area (dPM) (SFG) (Fig. 3C). To rule out the possibility that differences in the level of certainty, stimulus familiarity, or positive feedback were driving this effect, we again compared the response magnitudes within these fROIs from the nov and NR conditions. We consistently observed relatively greater BOLD signal in the nov condition in the FP fROIs (p < 0.004) (Fig. 3D), supporting a contribution to rule representation or implementation in this brain area. However, the differences in the SFG between the nov and NR conditions showed only a trend toward a significant difference (p < 0.16) (Fig. 3D).
Nov>Fam temporal decay
One prediction regarding rule-learning-related activations is that as the rule is acquired, activity in these areas would decrease. Moreover, if the activity were strictly learning related, we would expect to see such time-dependent decreases only within the context of nov blocks. Supporting this hypothesis, activations in the MFG and the SMA fROIs show significant decreases from the early to late runs only for the nov condition (Fig. 4, top panels). However, activity in the PCG and striatum, although clearly selective for the learning condition, did not fit this profile, indicating that these regions likely play a qualitatively different role in the rule-learning process.
A second prediction regarding sites engaged in rule learning is that the BOLD responses in such areas would be inversely correlated with accuracy for the nov (i.e., learning) condition but not for the two nonlearning control conditions. In other words, we would expect to see activity decrease as accuracy improves. Again, we found such correlations in the right MFG, the SMA, and the left PCG (Fig. 5A). A weaker correlation between nov block accuracy and signal intensity was also seen in the striatum for some subjects, like the one shown in Figure 5A. There was considerable variability across subjects in the strength of these correlations, which led us to investigate whether a subject's BOLD-nov accuracy blockwise correlation predicted the ability to learn the novel rules. Indeed, there was a significant correlation between BOLD-nov accuracy Z-scores and overall accuracy in the nov conditions, an index of a subject's facility at learning the new rules, in the SMA, MFG, and PCG fROIs (Fig. 5B) (R² = 0.47, p < 0.007; R² = 0.59, p < 0.002; and R² = 0.34, p < 0.03, respectively). Using a median split to divide the subjects according to their nov accuracy scores, we also found significant differences between the high and low learners in their BOLD-nov accuracy Z-scores in the MFG, SMA, and PCG (unpaired t tests; p < 0.008, p < 0.02, and p < 0.03, respectively) (Fig. 5C). The striatum fROI did not show this pattern (data not shown). BOLD-accuracy correlations only distinguished between good and poor learners in the nov learning condition, not in the fam and NR control conditions (Fig. 5D). These robust, condition-specific correlations between BOLD signal and behavioral performance strengthen the conclusion that these brain areas are actively recruited for the purpose of learning S-R associative rules.
Moreover, these data are consistent with the hypothesis that these areas go “off-line” as successful learning is achieved, and not simply because of time-dependent effects unrelated to learning.
We also examined the correlations between these regions on a block-by-block basis and found strong blockwise correlations between many of the fROIs. For example, as shown for two subjects in Figure 6A (left panels), the activation sites in the right MFG and SMA covaried strongly during performance of this task. All four of the nov>fam fROIs demonstrated significant positive correlations (maximum p = 0.032). To determine whether the strength of correlation between these fROIs had any predictive value regarding subject performance, we used a median split to divide the subjects into high and low performers, based on total accuracy scores. We found that the correlations between two fROI pairs significantly distinguished between the high- and low-performing groups: SMA/MFG and SMA/striatum (Fig. 6B). Strong correlations of the SMA with both the MFG and the striatum predicted highly accurate performance of this task. The SMA/PCG correlation also showed a trend toward distinguishing these two groups, again with stronger correlations favoring more accurate performance, but it did not reach statistical significance (p = 0.11). Additional evidence that successful performance of this task depends strongly on coordination between the MFG and SMA is found in the regression of SMA/MFG correlation coefficients against total accuracy scores (Fig. 6C). Here, we found that the variations in the level of coactivation of these two fROIs across subjects could account for 63% of the variance in subject accuracy scores. Although not conclusive, these results suggest that the SMA is functionally engaged with the MFG and striatum when using or attempting to use S-R rules, either directly or via indirect connections or common inputs.
Map-wise random-effects analyses
To uncover activations within areas with signal/noise characteristics that prohibit consistent detection in single-subject analyses, we used a map-wise exploratory analysis of modulation of the BOLD response on the basis of S-R rule novelty. This approach both confirmed the involvement of the cortical regions identified in the individual subject ROI-based analyses detailed above and identified additional sites. Specifically, learning new S-R rules led to increased activity in the right MFG, the pre-SMA, the left PCG, and the right anterior striatum (a locus in the caudate) relative to executing known rules (Fig. 7). In addition, random-effects analysis uncovered additional consistent nov>fam activations in the left MFG, the right PCG, the left anterior insula, the right IFG (opercular portion), and the right IFS (Fig. 7, warm colors). These results are consistent with imaging and neurophysiological evidence that the vlPFC plays an important role in the formation of S-R associations. Meanwhile, using known S-R rules was confirmed to increase activity in the left SFG and FP relative to new rule learning. The map-wise analysis also uncovered two consistent fam>nov foci in the rostral anterior cingulate cortex (ACC), a more ventral site in the left insula, and the bilateral SFG. The peaks of the SFG activations fall at the border between the rostral premotor cortex (BA8B) and a portion of the dlPFC (BA9), which is more cytoarchitectonically similar to BA8 than the rest of the dlPFC (BA46 and BA9/46) (Petrides and Pandya, 1999). Assignment of this activation to the prefrontal or premotor cortex is thus ambiguous. However, based on (1) the consistent activations we observed in the dPM (BA6/8) of individual subjects, (2) the anatomical connectivity of BA8B, but not BA9, with inferior parietal areas, and (3) the previous implication of BA8B in visuospatial function, we tentatively interpret these peaks as rostralmost dPM. See Table 1 for peak coordinates and t values.
Discussion
Our study investigated the neural correlates of learning abstract S-R rules and applying such previously learned rules. Importantly, we found distinct, robust dlPFC activation sites during rule learning. These results help to reconcile discrepancies between data from neurophysiological studies (Hoshi et al., 1998; White and Wise, 1999; Asaad et al., 2000; Wallis et al., 2001) and clinical studies (Stuss et al., 2000), which clearly implicate the dlPFC in rule learning and utilization, and neuroimaging studies, which have not uncovered robust dlPFC rule or rule-learning activity (Deiber et al., 1997; Toni and Passingham, 1999; Toni et al., 2001; Bunge et al., 2003; Eliassen et al., 2003). Our task design, which required higher-order organization of many lower-order S-R rules and which incorporated sustained periods of higher-order rule or set maintenance, may have enabled this finding. Our single-subject analytical approach may have also allowed us to detect activations that were robust across subjects, but small and spatially variable, and less easily detected in map-wise group analyses.
Higher-order S-R rule learning
Across subjects, we consistently found significantly greater activity when subjects were learning a rule, relative to applying an identical type of fam rule, in the anterior striatum and three frontal regions: left vPM cortex, dorsomedial PM cortex (SMA/pre-SMA), and right dlPFC. We propose distinct, although likely interacting roles for each in the learning process. This hypothesis is strengthened twofold by our blockwise analyses of response magnitude and behavioral performance. Activity in these areas was modulated as a function of performance accuracy during learning blocks: as performance improved, activity decreased in these regions (Fig. 5). Moreover, the strength of these correlations in the dlPFC, SMA, and vPM predicted a subject's ability to learn novel sets; the best learners showed the strongest correlations (Fig. 5B). Dividing the subjects into “good” versus “poor” learners and comparing activity-accuracy correlation strengths suggests that engagement of these three regions during the initial problem-solving phase of rule learning and subsequent disengagement during the expertise phase represents successful learning of higher-order S-R rules.
An indication of interaction between these areas, rather than just coincident activation, comes from correlation analyses (Fig. 6). Within subjects, activity in the right MFG and in the right anterior striatum was highly correlated with activity in the SMA/pre-SMA. Also, the strength of these correlations was highly predictive of subject performance; the most accurate subjects showed the tightest correlations. The potential importance of the dlPFC and SMA/pre-SMA acting in concert is particularly compelling, because the strength of this correlation accounts for nearly two-thirds of the variability in subject accuracy. Although the time scale of these analyses is too crude to posit a neural circuit merely based on these data, recent tract-tracing studies in humans indicate that the SMA/pre-SMA is indeed anatomically connected to the dlPFC and vPM cortex, as well as the striatum (Lehéricy et al., 2004a,b).
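As a rough guide to the size of this effect, and assuming that "variability accounted for" refers to the coefficient of determination of a simple linear relationship across subjects, a value of nearly two-thirds corresponds to a correlation magnitude of at most about 0.82. The symbols $r$ (the across-subject correlation between dlPFC-SMA/pre-SMA coupling strength and accuracy) and $R^2$ are introduced here only for illustration; the exact statistic is not restated in this section.

```latex
R^2 = r^2 \approx \tfrac{2}{3}
\quad\Longrightarrow\quad
|r| \approx \sqrt{2/3} \approx 0.82
```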
What roles do these areas play, and what is a plausible functional circuit? The SMA has been implicated in internally guided response selection (Cunnington et al., 2002), maintaining contextually appropriate possible response sets (Rushworth et al., 2002) and representing visuomotor associations during sequence learning (Hikosaka et al., 1999; Sakai et al., 1999). Moreover, the medial premotor areas may serve as keystone sites in visuomotor learning, bridging perceptual learning and motor learning circuitry (Nakahara et al., 2001). In the learning condition of our task, subjects must initially guess at responses, but after accumulating information about contingencies, subjects are likely using a combination of implicit learning-driven, stimulus-cued responding and explicit hypothesis testing. The anterior striatum, implicated in reward-based learning and reported to demonstrate selective responses predicting immediate rewards (Tanaka et al., 2004), may monitor feedback during S-R rule learning. Single-unit recordings in the dPM and striatum of monkeys during conditional visuomotor learning indicate that these areas form a functional loop for acquiring and representing S-R rules (Hadj-Bouziane et al., 2003; Brasted and Wise, 2004). Interestingly, both groups have shown an increase in striatal firing as learning progresses, reminiscent of our imaging data (Fig. 4, bottom right). Although those studies investigated the lateral dPM and primarily the putamen, our results suggest that the medial dPM and caudate may be functioning in a similar manner via a parallel loop.
During S-R rule learning, the dlPFC may recall previous S-R outcomes, detect regularities among these, and use that information to determine an organizing rule to hold in mind. This is consistent with dlPFC-lesioned patients having difficulty maintaining set in the Wisconsin Card Sort Task (Stuss et al., 2000) and with PFC damage impairing flexible rule learning (Milner, 1963; Luria, 1969; Perrett, 1973; Petrides, 1985, 1997; Gaffan and Harrison, 1988, 1989; Vendrell et al., 1995; Dias et al., 1997; Murray et al., 2000). Monkey neurophysiology data demonstrating rule-selective activity in the dlPFC also support this idea (Hoshi et al., 1998; White and Wise, 1999; Asaad et al., 2000; Wallis et al., 2001; Wallis and Miller, 2003). A previous fMRI study of S-R learning in humans weakly implicated the dlPFC, based on temporal decay of activity during the scan session (Toni et al., 2001). However, stimulus novelty was a confound in the learning condition of that study, and stimuli in the learning and control conditions were not matched for perceptual complexity. Unlike subjects in previous rule-learning fMRI studies, our subjects learned to organize many individual S-R associations according to categorical or set-based rules. This may have increased working memory load, or the degree of manipulation (e.g., comparisons across several trials) in memory, both of which are expected to highly engage the dlPFC (Duncan and Owen, 2000; Levy and Goldman-Rakic, 2000; Postle et al., 2000; Rowe et al., 2000). Either possibility is consistent with our finding that activity in this area declines as subjects acquire the rule. We further speculate that the dlPFC may play a role in retrieving individual associations from the vlPFC, perhaps integrating information from primary association and medial premotor areas, which are not directly connected.
Previous neuroimaging studies of rule learning have consistently implicated the vlPFC (Deiber et al., 1997; Toni and Passingham, 1999; Toni et al., 2001; Eliassen et al., 2003). Moreover, vlPFC lesions severely impair S-R learning (Murray and Wise, 1997; Petrides, 1997; Parker and Gaffan, 1998; Bussey et al., 2002), and monkey neurophysiological data show association-selective neurons in the vlPFC (Asaad et al., 1998; White and Wise, 1999; Wallis et al., 2001). We did not find consistent vlPFC activations in the individual subject data; however, two vlPFC nov>fam sites did emerge in the map-wise group analysis, which is the type of analysis used in previous studies. Activation sites in the vlPFC may represent individual S-R outcome contingencies. Maintaining such representations in working memory may be crucial for learning in this task, but additional integration of these individual contingencies is required for effective learning to take place. Such function is likely the domain of the dlPFC and is probably a process that more strongly distinguishes between our nov and fam rule conditions. Moreover, our task bears a resemblance to category learning (for review, see Kéri, 2003; Ashby and Spiering, 2004). The type of category learning most studied with neuroimaging is based on the dot pattern prototype distortion task (Posner and Keele, 1968). Our task differs from these studies in a number of key ways. First, these studies have generally required category member/nonmember judgments (“A, not-A”), and have identified neural changes associated with the outcome of learning rather than the process of learning per se (Reber et al., 1998; Aizenstein et al., 2000). Vogels et al. (2002) used a design more similar to ours, requiring “A, B, or neither” judgments of stimuli; however, scanning occurred after judgment accuracy had asymptoted, again focusing on outcome. A study by Seger et al. (2000) used a different prototype distortion task and bears the closest resemblance to our task. In that study, subjects learned to distinguish two visual categories, but the design again included prescan training and did not include a condition in which learned category judgments were made. However, our present findings support the interpretation by Seger et al. (2000) that dlPFC activation plays a key role in visual category learning.
We propose that in our task, the SMA/pre-SMA represents possible S-R associations and interacts with the dlPFC, where information held in working memory may bias selection of one of the cued S-R pairs in the SMA. Two fMRI studies support this idea. The first detected dlPFC activity reflecting context-dependent preparation to make one of a set of possible responses (D'Esposito et al., 2000). The second implicated the lateral PFC in selecting premotor representations associating stimuli and responses and exerting top-down control over the premotor cortex to bias the selection of a motor response (Koechlin et al., 2003). As successful learning proceeds, S-R associations are established and transmitted to downstream targets such as the striatum and lateral premotor cortex, where they can be subsequently accessed and used without extensive evaluation. If true, one would expect to see activity in the SMA and dlPFC decline as learning proceeds, which is consistent with our data.
Using a learned S-R rule to guide response selection
Familiar rule blocks preferentially recruited distinctly different areas, which included the FP and SFG. The finding of fam>nov activity in the SFG, a premotor region, fits nicely with recent monkey neurophysiology data indicating that dPM robustly represents well-learned rules (Wallis and Miller, 2003). Map-wise analyses further implicated sites in the rostral anterior cingulate and left insula. The function of the FP is poorly understood, but it has been implicated in the processing of internal mental states (Christoff and Gabrieli, 2000). Within this framework, activation in our task would be attributed to the lower difficulty of the fam rule condition, leaving subjects with the “mental space” to think about things irrelevant to the task at hand. Because some of the other fam>nov areas are putatively engaged by monitoring the outcome of actions [ACC (Van Veen et al., 2001; Hadland et al., 2003)] or predicting reward [insula (Tanaka et al., 2004)], one could speculate that subjects were evaluating their performance in the task during the less demanding (fam) blocks. This is consistent with the subjects' postscan reports of being highly conscious of their performance, perhaps heightened by monetary performance bonuses. However, a more recent hypothesis regarding fronto-polar function suggested that it may be best understood as a region that coordinates multiple subgoals in the context of an overall goal (Ramnani and Owen, 2004), with the medial portion putatively engaged for partially anticipated, versus fully externally guided, tasks (Koechlin et al., 2000; Burgess et al., 2003). Our results may better fit this hypothesis, given that subjects held in mind the associative S-R rules for a given set while selecting the appropriate response for each trial. One explanation for the lower activity in the nov condition could be that the associative rules and set identification have not yet become fully established, and thus the neural representation is weaker or more disorganized.
This work was supported by National Institutes of Health Grants MH63901 and NS40813 (M.D.), the Wheeler Center for the Neurobiology of Addiction (C.A.B.), and the Henry H. Wheeler Jr Brain Imaging Center. We thank H. Alexander, S. Gibbs, B. Inglis, E. Schumacher, and E. Zuniga for valuable technical assistance.
Correspondence should be addressed to Dr. Charlotte A. Boettiger, Ernest Gallo Clinic and Research Center, University of California, San Francisco, 5858 Horton Street, Suite 200, Emeryville, CA 94608.
Copyright © 2005 Society for Neuroscience 0270-6474/05/252723-10$15.00/0