Evidence indicates the involvement of the rostral part of the dorsal premotor cortex (pre-PMd) in executive processes during working memory tasks. However, it remains unclear what the executive function of pre-PMd is in relation to that of the dorsolateral prefrontal cortex (DLPFC) and how these two areas interact. Using functional magnetic resonance imaging (fMRI), brain activity was examined during a delayed-encoding recognition task. Fifteen subjects had prelearned several four-code standard sequences and super sequences (SUPs) consisting of a train of two standard sequences to form “chunks” in long-term memory. During fMRI, subjects remembered eight-code encoding stimuli presented as an SUP or two unlinked standard sequences (2STs). A memory probe prompted the subjects to recognize codes across two chunks (ACROSS) or within a single chunk. A 2 × 2 factorial design was used to test two types of working memory manipulation: (1) a reductive operation selecting codes from chunks (“segmenting”) and (2) a synthetic operation converting unlinked codes into a sequence (“binding”). Response time data supported the behavioral effects of each operation. Event-related fMRI showed that the “segmenting operation” activated the DLPFC bilaterally, whereas the “binding operation” enhanced the left pre-PMd activity. Activity in the ventrolateral prefrontal cortex suggested its involvement in the retrieval of task-relevant information from long-term memory. Furthermore, effective connectivity analysis indicated that the left pre-PMd and ipsilateral DLPFC interacted specifically during the ACROSS recognition of 2STs, the condition that involved both operations. We propose specific neural substrates for working memory manipulation: the DLPFC for segmenting/attentional selection and the pre-PMd for binding/sequencing. The functional coupling between the DLPFC and pre-PMd appears to play a role in combining these distinct operations.
Accumulating evidence indicates that the brain structures previously regarded as “pure motor” areas in fact have cognitive functions (Ramnani, 2006). Along with the cerebellum and basal ganglia, the dorsal premotor cortex (PMd) may best exemplify this paradigm shift. For example, a series of experiments on mental-operation tasks have addressed the location of the human rostral PMd (PMdr/pre-PMd) and its function in nonmotor executive processes (Hanakawa et al., 2002). Consistently, many neuroimaging studies have demonstrated that working memory (WM) tasks activate pre-PMd (Picard and Strick, 2001). Although it is widely accepted that WM executive processes recruit “attentional-selection” function from the dorsolateral prefrontal cortex (DLPFC) (Miller, 1999), it remains unresolved what the specific operation of pre-PMd is. This question was partly answered in a recent neurophysiology experiment, in which monkeys memorized a sequence consisting of two stimuli, each pointing to a movement element (Ohbayashi et al., 2003). After a delay, they performed a sequential action in the original or reverse order based on another cue. Transient activation of pre-PMd neurons just before the memory–movement conversion suggested the role of pre-PMd in generating a new response sequence. Together with the nonmotor aspects of pre-PMd functionality, one of the executive roles of pre-PMd can be hypothesized as sequence generation in both cognitive and motor domains of behavior. In parallel with testing this idea, it is also important to clarify how pre-PMd interacts with DLPFC during cognitive manipulation.
The present functional magnetic resonance imaging (fMRI) study was aimed at clarifying the functions of the pre-PMd in relation to the functions of the DLPFC during WM manipulation. Most previous imaging studies failed to discern the functionality of these two regions because manipulation processes specific to each region were unknown. We solved this problem here by introducing experimental constraints on WM manipulation according to the hierarchical organization of memory “chunks” (see Fig. 1a). A chunk is an information structure in which elementary units are organized into a higher-order supergroup through learning (Miller, 1956; Ericcson et al., 1980). The hierarchical structure prevents the central executive from retrieving elements crossing a chunk boundary directly. To do so, subjects first located the chunks to which the elements belonged and then recited the entire chunks or further examined their contents (McLean and Gregg, 1967; Ericcson et al., 1980). These stepwise procedures should induce temporary reorganization of chunk representations within WM. We designed a “delayed-encoding recognition task” constrained by such chunk structures in long-term memory (LTM). The recognition process required two types of WM manipulation: (1) selecting task-relevant elements from chunks (“segmenting”) and (2) converting previously unrelated elements into a sequence (“binding”). It was hypothesized that the binding would involve pre-PMd through its sequence generation function (Ohbayashi et al., 2003) and the segmenting would activate DLPFC for attentional selection (Rowe et al., 2000). Brain activity was assessed to clarify the neuroanatomy of encoding and manipulation processes under the influence of memory chunks in LTM. Furthermore, a psychophysiological interaction (PPI) analysis was performed to examine effective connectivity between the pre-PMd and DLPFC during the segmenting plus binding operation.
Materials and Methods
Fifteen healthy volunteers (eight men and seven women; age range, 21–30 years) participated in the study. All of the volunteers were right-handed as assessed by the Edinburgh Inventory (Oldfield, 1971). Written informed consent was obtained from all of the subjects. The study protocol was approved by the local ethics committee.
Experimental design and behavioral tasks.
The present study was designed so that chunk structures within LTM influenced cognitive manipulation of WM items. Therefore, before the imaging experiment, the formation of hierarchical chunks was experimentally controlled through learning (McLean and Gregg, 1967). All of the subjects underwent a two-staged training session for a period of 2 months. They were trained for 30 min every day. At the first learning stage, the subjects were required to memorize six standard sequences for a 1 month period. The standard sequences were composed of two Arabic digits and two English alphabet letters arranged alternately (2h5f, 3y8r, 5d2i, 7c4a, 8j1p, and 9s4x), and the codes of the standard sequences were selected in a pseudorandom manner. One month later, their memorization of all of the standard sequences was confirmed by free immediate recitation. At the second learning stage, the subjects remembered three super sequences (SUPs) for another 1 month period. These SUPs were composed of two standard sequences paired in a specific order (5d2i2h5f, 7c4a9s4x, and 8j1p3y8r). The subjects were told that all of the sequences would be equally used in the coming fMRI experiment. The subjects were allowed to participate in fMRI experiments only after confirmation of immediate and perfect recitation of all of the standard sequences and SUPs. Based on previous evidence (McLean and Gregg, 1967), the six standard sequences were hypothesized to form distinct chunks, each consisting of four codes. Strong association among these four codes was the first prerequisite of the experiment. At the second learning stage, the codes of the paired standard sequences would acquire much stronger linkage than those of nonpaired sequences. The second prerequisite was the formation of hierarchically higher-order chunks, each embracing two tightly linked standard sequences.
During the fMRI experiment, each trial initially required subjects to encode a sequence of eight digitally recorded auditory stimuli (Fig. 1b). The stimuli were presented singly with 500 ms duration and 100 ms interstimulus intervals. These eight-code encoding stimuli corresponded to one of two structure conditions: an SUP or a train of two unpaired standard sequences (2STs) (Fig. 1c). In the 2ST condition, the encoding stimuli were created by connecting two standard sequences pseudorandomly, with the SUPs being excluded. Note that both types of recoding stimuli can be basically regarded as a train of two standard sequences. However, given the different levels of association imposed by the learning, encoding stimuli should be recoded differently across the structure conditions. An eight-code encoding stimulus would be recoded as two unlinked four-code chunks in the 2ST condition. In contrast, a stimulus would be recoded as two tightly linked four-code chunks, almost equivalent to a single eight-code chunk, in the SUP condition. It was predicted that encoding stimuli in the 2ST condition would more vigorously recruit the neural mechanisms of recoding than those in the SUP condition, although the total number of codes to remember was the same.
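The construction of the two kinds of encoding stimuli can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, a standard sequence is assumed never to be paired with itself, and only the learned order of each SUP is assumed to be excluded from the 2ST pool (the text does not say whether reversed pairings were also excluded). The timing helper assumes one 100 ms interval is counted per code, which reproduces the 4.8 s presentation epoch used in the image analysis.

```python
from itertools import permutations

# Prelearned four-code standard sequences and eight-code SUPs from the training stage.
STANDARD = ["2h5f", "3y8r", "5d2i", "7c4a", "8j1p", "9s4x"]
SUPS = ["5d2i2h5f", "7c4a9s4x", "8j1p3y8r"]

def two_st_stimuli(standard, sups):
    """Ordered pairs of distinct standard sequences that do not spell a learned SUP."""
    # Assumption: only the learned order is excluded; reversed pairings stay eligible.
    return [a + b for a, b in permutations(standard, 2) if a + b not in sups]

def presentation_ms(n_codes=8, duration_ms=500, interval_ms=100):
    """Total presentation time, counting one interval per code (an assumption)."""
    return n_codes * (duration_ms + interval_ms)

candidates = two_st_stimuli(STANDARD, SUPS)
print(len(candidates))    # 27 candidate 2ST stimuli under these assumptions
print(presentation_ms())  # 4800 ms
```

Both stimulus types are eight codes long, so any difference between conditions lies in the learned association between the two halves rather than in memory load.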
After a 3 s delay period for maintenance, a recognition stimulus containing a memory probe was visually presented at the center of view for 2 s (Fig. 1b). The probe tested recognition of the first four codes (i.e., first to fourth codes), the middle four codes (i.e., third to sixth codes), or the last four codes (i.e., fifth to eighth codes) of the encoding stimuli. The memory probes were shown in white on a black uniform background. They were flanked with four wild-card letters (×) shown in green to make the entire recognition stimulus eight characters long (2 × 5° visual angle). The subjects were asked to judge as quickly as possible whether the memory probe matched the corresponding part of the encoding stimuli. The probe for the first or last four codes required recognition of a standard-sequence structure within the encoding stimuli (WITHIN; recognition condition), whereas the probe for the middle four codes required recognition of a sequence bridging across two standard sequences (ACROSS; recognition condition) (Fig. 1c). In half of the trials, the memory probe matched the encoded stimuli (“match trials”), and in the rest it did not (“mismatch trials”). The matched and mismatched trials were assigned pseudorandomly. The mismatch probes were created by connecting two two-code segments (e.g., 2h and 8r) chosen semirandomly from two different standard sequences (Fig. 1b). In the mismatch trials, either half of the codes or none of the codes matched the encoded stimuli.
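The mapping from probe position to the probed part of the encoding stimulus, and from there to the recognition condition, can be written out explicitly. The sketch below is illustrative only; the names `PROBE_START`, `probed_segment`, `recognition_condition`, and `is_match` are hypothetical, and each code is assumed to be a single character.

```python
# Zero-based start index of the four-code window tested by each probe position.
PROBE_START = {"FIRST": 0, "MIDDLE": 2, "LAST": 4}

def probed_segment(stimulus, position):
    """Return the four codes of an eight-code stimulus that the probe tests."""
    start = PROBE_START[position]
    return stimulus[start:start + 4]

def recognition_condition(position):
    """MIDDLE probes bridge two standard sequences; FIRST and LAST stay within one."""
    return "ACROSS" if position == "MIDDLE" else "WITHIN"

def is_match(stimulus, position, probe):
    """True when the probe equals the corresponding part of the stimulus."""
    return probe == probed_segment(stimulus, position)

stimulus = "5d2i2h5f"  # one of the learned SUPs
for position in ("FIRST", "MIDDLE", "LAST"):
    print(position, probed_segment(stimulus, position), recognition_condition(position))
```

For this stimulus, the MIDDLE window takes the last two codes of 5d2i and the first two codes of 2h5f, which is the selection step that the ACROSS condition is meant to isolate.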
Two basic cognitive processes were defined according to the concept of manipulating hierarchically organized chunks in WM (Fig. 1a). The segmenting operation was defined as a reductive process selecting sequence parts from a larger chunked sequence or sequences. Contrarily, the binding operation was defined as a synthetic process converting previously unlinked sequences into a larger sequence. The recognition task was designed so that these two manipulation processes were separable on the basis of a 2 × 2 factorial design (Fig. 1c).
WITHIN recognition should simply require comparison of the memory probe with one of the sequences stored in the memory buffer because the recoding stimuli were basically composed of two standard sequences in both SUP and 2ST conditions. In contrast, ACROSS recognition in both structure conditions should require selection of relevant elements from larger sequences: the last two codes from the first standard sequence and also the first two codes from the second standard sequence. From another perspective, both types of recognition may involve attentional selection to a certain extent (Miller, 1999; Rowe et al., 2000; Rowe and Passingham, 2001). However, ACROSS recognition requires more demanding selection than WITHIN recognition because the central executive needs not only to identify the chunk labels but also to select chunk contents (McLean and Gregg, 1967; Ericcson et al., 1980). Therefore, the main effects of recognition (ACROSS minus WITHIN) were hypothesized to reflect cognitive costs for the segmenting operation.
In ACROSS recognition, there should be differences in the manipulation process after the segmenting between the two structure conditions. Specifically, in ACROSS recognition of the 2ST stimuli, the two two-code segments selected from two previously unlinked standard sequences should be newly bound to be synthesized as a sequence for comparison with the probe (Fig. 1c). There was less necessity of this process for the SUP stimuli than for the 2ST stimuli because the codes should already have formed a linked sequence to a considerable extent during second-stage learning. Hence, it was hypothesized that the 2ST-ACROSS recognition specifically involved the binding operation, which should be revealed as an interaction term in the 2 × 2 factorial design.
Imaging was performed on a 3T MRI scanner (Siemens Allegra, Erlangen, Germany). Subjects lay supine on the scanner bed, wore a headset, and had a button-response unit for the right hand. Visual stimuli were projected onto a screen from a liquid crystal display projector. The subjects were asked to push one of the two buttons with their right index finger in response to the recognition stimuli, depending on the judgment (match or mismatch). Stimulus delivery and response recording were controlled by Presentation software (Neurobehavioral Systems, Albany, CA) on a personal computer. The subjects practiced the actual behavioral paradigm for 5 min within the scanner before the imaging experiment.
Functional images were collected using 28 oblique slices covering the whole brain (slice thickness, 4 mm; interslice gap, 1 mm; in-plane matrix size, 64 × 64; field of view, 256 mm) with an echo planar imaging sequence [repetition time (TR), 2000 ms; echo time (TE), 40 ms; flip angle, 90°]. High-resolution three-dimensional T1-weighted images were obtained with magnetization-prepared rapid gradient echo images (TR, 2500 ms; TE, 4.38 ms; inversion time, 1100 ms; 1.0 mm³ cubic voxels; 256 axial slices).
Each fMRI run (312 scanning volumes) was repeated three times. Each run included 24 trials with a 15 s intertrial interval (72 total trials). For the encoding stimuli, the two structure conditions (2ST or SUP) were assigned pseudorandomly. In half of the trials, one of the three SUPs was presented as an encoding stimulus for the SUP condition (with 12 trials for each SUP). In the rest of the trials, two of the six standard sequences were semirandomly chosen for the 2ST condition so that subjects were exposed to each standard sequence at the same frequency across conditions (12 times each).
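The trial counts in this paragraph follow from simple arithmetic, which the short check below makes explicit. The variable names are hypothetical; the only inputs are the numbers of runs, trials, SUPs, and standard sequences given in the text.

```python
RUNS, TRIALS_PER_RUN = 3, 24
N_SUPS, N_STANDARD = 3, 6

total_trials = RUNS * TRIALS_PER_RUN       # all trials across the three runs
sup_trials = total_trials // 2             # half of the trials present an SUP
two_st_trials = total_trials - sup_trials  # the rest present two unpaired sequences

trials_per_sup = sup_trials // N_SUPS
# Each 2ST trial contains two standard sequences, shared evenly among the six.
exposures_per_standard = two_st_trials * 2 // N_STANDARD

print(total_trials, trials_per_sup, exposures_per_standard)  # 72 12 12
```

Because every standard sequence belongs to exactly one SUP, it is also heard 12 times in the SUP condition, so exposure is balanced across the two structure conditions.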
Behavioral data analysis.
Response times were examined by repeated-measures ANOVA (RM-ANOVA) with Greenhouse–Geisser correction for nonsphericity. Four behavioral conditions were modeled in a 2 × 2 factorial design with two levels each of the structure type (SUP and 2ST) and of the recognition type (WITHIN and ACROSS). The a priori prediction was that the 2ST condition, but not the SUP condition, would involve the binding operation during ACROSS recognition. This effect should be reflected as a “structure × recognition” interaction. In addition, we tested whether there was a significant main effect (or simple main effect) of recognition levels, which presumably reflected cognitive costs of the segmenting operation (ACROSS > WITHIN).
Three confirmation analyses were performed to examine whether the response times were affected by factors other than the experimental categories (structure or recognition). The first test was on the effects of memory probe position (“position effect”). Given that the encoding stimuli are maintained in WM as a sequence, retrieval of the last part of the sequence should take longer than that of the first part. If this was the case, the recognition effect might have been confounded by the position effect. The WITHIN condition included the effects of two probe positions (FIRST and LAST), whereas the ACROSS condition included only the middle probe position (MIDDLE). To remove potentially confounding effects of position, we performed a multiple regression analysis in which recognition and position were regarded as independent explanatory variables of response times. The recognition effect was treated as a categorical variable, whereas the position effect was presumed to be parametric.
The second test concerned whether the cases of memory probe (match or mismatch) would affect response times (“case effect”). It was possible that subjects responded more quickly when the memory probe was a prelearned sequence (case, match), which they could have anticipated, than when it was not (case, mismatch). For examining this effect, all of the experimental conditions were collapsed, and a paired t test was performed with two conditions of case (match or mismatch) as a within-subject variable. The case effect was also tested with an RM-ANOVA with the two levels of case and the four levels of experimental categories (SUP-WITHIN, SUP-ACROSS, 2ST-WITHIN, 2ST-ACROSS) as within-subject variables.
The third test was on the effects of learning during fMRI (“stage effect”). Theoretically, it was possible that learning was specifically occurring during the ACROSS recognition, because the subjects had not been trained previously to recognize the segment spreading over two sequences. Therefore, we checked whether reaction times changed differentially between the conditions as the trials progressed. The trials were split into three stages: 1–24 trials (stage 1), 25–48 trials (stage 2), and 49–72 trials (stage 3). A two-way RM-ANOVA was performed with the three levels of stage (stage 1, 2, or 3) and the two levels of recognition (ACROSS or WITHIN) as within-subject variables.
Image data analysis.
All fMRI data were preprocessed and analyzed using SPM2 software (Wellcome Department of Cognitive Neurology, University College London, London, UK; http://www.fil.ion.ucl.ac.uk). The first seven scans of each run were not processed to allow for T1 equilibrium effects. For preprocessing, the remaining functional images were realigned with respect to the first functional image and were corrected for slice acquisition timing in reference to the middle slice in each scan. The resulting volumes were spatially normalized to fit to an echo planar imaging template in MNI (Montreal Neurological Institute) space. Finally, all normalized images were spatially smoothed with an 8 mm full-width at half-maximum (FWHM) Gaussian kernel.
First, fMRI data were analyzed individually. Two types of events were modeled: recoding and recognition. The maintenance period was not modeled because short interevent intervals and the lack of temporal jittering did not allow us to discriminate maintenance-related activity from other activities. The two recoding conditions (SUP-REC and 2ST-REC) were modeled as epochs representing sustained activity during the presentation of eight-code auditory stimuli (4.8 s). Four separate covariates were modeled as events to represent four recognition types (SUP-WITHIN, SUP-ACROSS, 2ST-WITHIN, and 2ST-ACROSS). All covariates were convolved with a canonical hemodynamic response function before entering the design matrix. The data were high-pass filtered with a cutoff period of 52 s, and an autoregression model was used to remove serial correlations. Statistical parametric maps of t statistics were calculated on the basis of a general linear model for the specific contrasts as described below.
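How an epoch covariate becomes a predicted signal can be illustrated with a small, self-contained sketch. It is not the SPM2 implementation: the double-gamma shape used here (gamma shapes 6 and 16 with an undershoot ratio of 1/6), the 0.1 s construction grid, and the function names are assumptions made for the example. Only the 2 s TR and the 4.8 s epoch length come from the text.

```python
import math

TR = 2.0  # repetition time in seconds
DT = 0.1  # assumed fine time step used to build the regressor

def double_gamma_hrf(t):
    """Assumed double-gamma response at time t (s): a peak minus a small undershoot."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

def epoch_regressor(onsets, duration, n_scans):
    """Boxcar epochs convolved with the response function, sampled once per TR."""
    n_fine = int(round(n_scans * TR / DT))
    boxcar = [0.0] * n_fine
    for onset in onsets:
        start = int(round(onset / DT))
        stop = start + max(1, int(round(duration / DT)))
        for i in range(start, min(stop, n_fine)):
            boxcar[i] = 1.0
    kernel = [double_gamma_hrf(k * DT) for k in range(int(round(32.0 / DT)))]
    signal = [0.0] * n_fine
    for i, value in enumerate(boxcar):
        if value == 0.0:
            continue
        for k, h in enumerate(kernel):
            if i + k >= n_fine:
                break
            signal[i + k] += value * h * DT  # discrete convolution
    step = int(round(TR / DT))
    return [signal[s * step] for s in range(n_scans)]

recoding = epoch_regressor(onsets=[0.0], duration=4.8, n_scans=20)
peak_scan = recoding.index(max(recoding))
print(peak_scan * TR)  # time (s) of the largest sampled value after epoch onset
```

A recognition event could be modeled with the same function by passing a very short duration, which is the usual distinction between event and epoch covariates.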
A second-level random-effects model group analysis was subsequently performed. A contrast image representing estimated activity size was created from the first-level analysis of each contrast for each subject. A one-sample t test model was applied to the contrast images. Activities were considered significant if they passed a false discovery rate (FDR) threshold of p < 0.05 corrected for whole-brain voxels and also had a spatial extent of ≥20 voxels per cluster, unless otherwise mentioned. The FDR approach controls for the expected proportion of false positives among suprathreshold voxels. An FDR threshold is determined from the observed p value distribution and hence is adapted to the amount of signal within a given contrast (Genovese et al., 2002). The estimated final spatial resolution was 15.6 × 16.1 × 14.8 mm FWHM.
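The adaptive nature of an FDR threshold can be seen in a step-up procedure of the Benjamini–Hochberg type, on which the approach of Genovese et al. (2002) is based. The sketch below is a simplified illustration, not the SPM2 routine: the function name is hypothetical, the ten p values are invented for demonstration, and independence or positive dependence among tests is assumed.

```python
def fdr_threshold(p_values, q=0.05):
    """Largest p value satisfying p_(i) <= q * i / V, or None if no test survives."""
    ordered = sorted(p_values)
    n_tests = len(ordered)
    threshold = None
    for rank, p in enumerate(ordered, start=1):
        if p <= q * rank / n_tests:
            threshold = p  # keep the largest p value that passes its rank bound
    return threshold

# Invented p values for illustration only; they are not data from this study.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
print(fdr_threshold(toy_p))  # 0.008: the two smallest p values survive
```

With more true signal in a contrast, more small p values fall under their rank bounds and the resulting threshold becomes more lenient, which is the sense in which the threshold adapts to the data.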
A supplementary volume-of-interest (VOI) analysis was performed using Marsbar (http://marsbar.sourceforge.net/) to evaluate time-dependent MRI signal changes in the activated regions. The signal changes were individually computed from the first-level analysis by setting up a 10 mm radius spherical VOI at the statistical peak of activity. For visual inspection, the MRI signals were plotted after averaging the data across subjects for each condition. These data were also used to evaluate the effects of response times, as a generic measure of task difficulty, on brain activity in the regions of particular interest (left DLPFC and left pre-PMd). Based on a general linear model, a full model was initially established by including four variables: recognition, structure, interaction (structure × recognition), and response time. Using a stepwise reduction procedure, we determined which factors were meaningfully associated with brain activities (α = 0.05; f-to-remove = 4).
For the recoding process, it was hypothesized that an eight-code-long encoding stimulus in 2ST-REC and SUP-REC would be recoded as two unrelated four-code chunks and two tightly linked four-code chunks, respectively. This means that the 2ST-REC condition would impose a greater load on the recoding-related areas than the SUP-REC condition. Therefore, recoding effects were tested by subtracting brain activity in the SUP-REC condition from that in the 2ST-REC condition (null hypothesis, 2ST-REC − SUP-REC = 0).
For recognition-related activity, summary contrast images were created from the first-level analysis on the basis of a 2 × 2 factorial design. One of our main foci of interest was the structure × recognition interaction. We sought activity revealing ACROSS recognition effects greater in 2ST than in SUP [null hypothesis, (2ST-ACROSS − 2ST-WITHIN) − (SUP-ACROSS − SUP-WITHIN) = 0]. The resulting activation map was inclusively masked by the effect image of 2ST-ACROSS compared with the implicit baseline (uncorrected p < 0.05) to ensure detection of the regions exhibiting positive effects during 2ST-ACROSS recognition. This interaction analysis was first thresholded at p < 0.05 FDR corrected for the whole-brain voxels and then a small volume correction (SVC) was applied according to the preexisting hypothesis. Our hypothesis was that the binding operation would share a mechanism with conversion of WM items into a sequential motor action (Ohbayashi et al., 2003), considering nonmotor and motor functions of pre-PMd. To put it differently, the neuronal activity observed during memory–movement conversion possibly reflected a computation algorithm applicable to both motor and nonmotor sequencing. Overlapping of imagery- and execution-related activity in pre-PMd during sequential finger tapping (Hanakawa et al., 2003b) supports this hypothesis. Therefore, the binding operation was presumed to enhance activity in pre-PMd. A 10 mm radius spherical VOI was set up bilaterally in the lateral premotor cortex (premotor-VOI) by using the center coordinate of x = ±32 mm, y = 0 mm, and z = 56 mm obtained from a previous imaging study reporting PMd activity during an “N-back” task (Callicott et al., 1999). In the N-back task, the WM items should acquire a temporal relationship for serial updating, perhaps in the form of a sequential mental set (Braver et al., 1997). This process conceptually overlaps with the conversion of WM items into a sequence.
The significance level of the interaction contrast was set at a height threshold of p = 0.05 (t(14) > 3.24) family-wise error (FWE)-corrected for multiple comparison limited within the VOI.
Another point of interest was greater brain activity during the ACROSS conditions than during the WITHIN conditions (main effect of recognition). Our assumption was that the segmenting operation would be involved during the ACROSS recognition, but not so much during the WITHIN recognition, regardless of the structure type [null hypothesis, (2ST-ACROSS + SUP-ACROSS) − (2ST-WITHIN + SUP-WITHIN) = 0]. For the segmenting-related regions, we expected the involvement of the DLPFC on the basis of previous studies on attentional selection (Rowe et al., 2000; Rowe and Passingham, 2001).
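The recognition main effect and the structure × recognition interaction can both be expressed as weight vectors over the four recognition conditions. The sketch below is illustrative: the names are hypothetical, and for concreteness the weights are applied to the group mean response times reported in the behavioral results rather than to imaging data.

```python
# Fixed condition order for all weight vectors.
CONDITIONS = ["SUP-WITHIN", "SUP-ACROSS", "2ST-WITHIN", "2ST-ACROSS"]

# (2ST-ACROSS + SUP-ACROSS) - (2ST-WITHIN + SUP-WITHIN)
RECOGNITION = [-1, 1, -1, 1]
# (2ST-ACROSS - 2ST-WITHIN) - (SUP-ACROSS - SUP-WITHIN)
INTERACTION = [1, -1, -1, 1]

def contrast(weights, values):
    """Weighted sum of condition estimates."""
    return sum(w * v for w, v in zip(weights, values))

# Group mean response times (ms) in the order given by CONDITIONS.
mean_rt_ms = [986, 1105, 996, 1265]

print(contrast(RECOGNITION, mean_rt_ms))  # 388
print(contrast(INTERACTION, mean_rt_ms))  # 150
```

Both vectors sum to zero, so each contrast is insensitive to activity common to all four conditions, and the two vectors are orthogonal to each other, which is what allows segmenting and binding effects to be assessed separately.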
In the 2ST-ACROSS condition, the binding operation operates on the results of the segmenting operation. This led us to hypothesize that the 2ST-ACROSS condition might demand closer regional interactions across the relevant neural modules than the other conditions. In other words, although the regional effects of segmenting and binding were hypothesized to be reflected by DLPFC and pre-PMd activities, respectively, it was likely that serial operation of the two manipulation processes would require closer interaction of the two regions. To test this hypothesis, we performed a PPI analysis (Friston et al., 1997) and examined effective connectivity between the regions involved in the manipulation processes. The PPI refers to the interaction between the physiological activity of the brain and the psychological context, and tests whether the neural response in one brain region can be explained in terms of an interaction between input from a different region and experimental conditions. We were particularly interested in the effective connectivity between pre-PMd and DLPFC in the same hemisphere, because anatomical studies have clearly revealed reciprocal connections between them (Lu et al., 1994). To sample physiological covariates, a 10 mm radius spherical VOI was set up in each subject at the left pre-PMd activity peak where binding-related activity was identified (Table 1). The first principal component was computed from the left pre-PMd activity time series and was used for input functions. Condition-specific regressions were computed at every voxel to test the difference in regression slopes between the two ACROSS conditions. The resulting SPM demonstrated significant context-dependent dynamic changes in the contribution of left pre-PMd to other brain regions including the DLPFC. 
Based on the a priori hypothesis, inferences regarding significance were limited within a search volume of a 10 mm radius spherical VOI centered at the left DLPFC activity peak (Table 2) at a threshold of p = 0.05 (t(14) > 3.93), FWE corrected for multiple comparisons (SVC method).
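The logic of a PPI, a change in the regression slope between a seed region and a target region depending on the psychological context, can be illustrated with toy numbers. The sketch is deliberately simplified and is not the SPM procedure: it omits hemodynamic modeling and the principal-component extraction, the function names are hypothetical, and the time series are invented so that the slopes are easy to verify by hand.

```python
def mean(values):
    return sum(values) / len(values)

def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def ppi_regressor(seed, psych):
    """Element-wise product of the mean-centered seed signal and the context code."""
    m = mean(seed)
    return [(s - m) * p for s, p in zip(seed, psych)]

# Invented toy data: four samples per context, not measurements from this study.
seed = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]    # stands in for the pre-PMd signal
psych = [1, 1, 1, 1, -1, -1, -1, -1]                # +1 = 2ST-ACROSS, -1 = SUP-ACROSS
target = [2.0, 4.0, 6.0, 8.0, 0.5, 1.0, 1.5, 2.0]  # stands in for the DLPFC signal

slope_2st = slope(seed[:4], target[:4])
slope_sup = slope(seed[4:], target[4:])
print(slope_2st, slope_sup, slope_2st - slope_sup)
print(ppi_regressor(seed, psych))
```

In this toy case the target follows the seed more steeply in the 2ST-ACROSS context than in the SUP-ACROSS context; a reliable slope difference of this kind at a voxel is what the PPI contrast is designed to detect.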
Performance of the recognition task was very accurate in all of the four conditions (97.8, 95.7, 97.8, and 98.2% for SUP-WITHIN, SUP-ACROSS, 2ST-WITHIN, and 2ST-ACROSS, respectively). Accuracy did not differ significantly across the conditions (F(1,28) = 1.62, p = 0.21 for the recognition × structure interaction; F(1,28) = 0.76, p = 0.39 for the recognition main effect; F(1,28) = 1.18, p = 0.17 for the structure main effect). The latencies were 986 ± 48, 1105 ± 50, 996 ± 48, and 1265 ± 46 ms (means ± SEM) for SUP-WITHIN, SUP-ACROSS, 2ST-WITHIN, and 2ST-ACROSS, respectively (Fig. 2a). RM-ANOVA revealed a significant recognition × structure interaction (F(1,28) = 25.3; p < 0.001). Because the interaction term was significant, we proceeded to the analysis of the simple main effect. This analysis indicated significant differences in response times between the recognition conditions in both structure types (SUP-WITHIN vs SUP-ACROSS, F(1,28) = 18.5, p < 0.001; 2ST-WITHIN vs 2ST-ACROSS, F(1,28) = 111.2, p < 0.001). The simple main effect of structure type was only evident during ACROSS recognition (SUP-ACROSS vs 2ST-ACROSS, F(1,28) = 27.2, p < 0.001; SUP-WITHIN vs 2ST-WITHIN, F(1,28) = 0.4, p = 0.54).
The response time data were further analyzed to examine the effects of the position of the memory probe (three position levels) (Fig. 2b). The latencies were 931 ± 44, 1105 ± 50, 1060 ± 54, 955 ± 46, 1265 ± 46, and 1037 ± 54 ms (mean ± SEM) for SUP-FIRST, SUP-MIDDLE, SUP-LAST, 2ST-FIRST, 2ST-MIDDLE, and 2ST-LAST, respectively. The finding that the WITHIN recognition of LAST took longer than that of FIRST suggests confounding effects of position on response times. A multiple regression analysis was performed to test the significance of recognition effects, taking the position effect into account. The results clearly revealed significant effects of recognition type on reaction time (p < 0.001) after the position effect (p = 0.06) was removed.
The response times were not significantly affected by familiarity with, or expectancy for, the memory probe sequences (case effect). The latencies were 1115 ± 53 and 1095 ± 46 ms (mean ± SEM) for match and mismatch, respectively. A paired t test analysis indicated no significant differences in response times between the case conditions (t = 0.29; df = 28; p = 0.77). This result was confirmed by a different RM-ANOVA model testing the case effect (p > 0.05 in all four categories) (Fig. 2c). This result indicated that the mean difference or variance in response times was not explained by familiarity with the memory probes. It meant that the effect of expectancy for the probe stimuli on behavioral costs was minimal. Thus, the image data from the match and mismatch conditions were analyzed together.
The effect of learning during the imaging experiment (stage effect) was found to be minimal. Response times were 1042 ± 56, 997 ± 53, 968 ± 46, 1249 ± 41, 1171 ± 56, and 1144 ± 56 ms (mean ± SEM) for stage1-WITHIN, stage2-WITHIN, stage3-WITHIN, stage1-ACROSS, stage2-ACROSS, and stage3-ACROSS, respectively (Fig. 2d). There were neither significant stage × recognition interactions (F(2,42) = 0.29, p = 0.75) nor main effects of stage (F(2,42) = 0.92; p = 0.41). The main effects of recognition remained significant (F(1,28) = 92.5; p < 0.001).
In summary, these results showed that (1) ACROSS recognition was significantly influenced by structure (significant recognition × structure interaction), supporting the specific involvement of the binding operation during the 2ST-ACROSS condition; (2) ACROSS recognition was cognitively more demanding than WITHIN recognition (recognition effect), consistent with the involvement of the segmenting operation during ACROSS recognition; (3) the recognition effect remained significant after the confounding effects of position were removed; (4) the case effect was not evident, suggesting that the influence of expectancy for the probe sequence, or of conflict with that expectancy, was minimal; and (5) learning effects during the fMRI experiment were minimal, supporting the robustness of previous chunk formation and justifying treatment of all stages as a homogeneous condition.
Recoding-related brain activity
The 2ST-REC condition revealed greater activation than the SUP-REC, which is consistent with the hypothesis that 2ST-REC would impose a higher load on the recoding system. Significant activity was found in the ventrolateral prefrontal cortex (VLPFC), frontal opercular region, medial frontal gyrus, and inferior parietal lobule (IPL) in the left hemisphere (Fig. 3, Table 3). The frontal opercular activity was situated primarily in the posterior part of the inferior frontal gyrus. The medial frontal activity was located above the cingulate sulcus and rostral to the vertical anterior-commissural plane, thereby corresponding to the presupplementary motor area (pre-SMA) (Picard and Strick, 1996). In pre-PMd and DLPFC, the recoding-related activities were not significantly different between the recoding conditions, even using a threshold of uncorrected p < 0.01. The reverse contrast (SUP-REC minus 2ST-REC) revealed no significant activation, even when the threshold was lowered to uncorrected p < 0.01.
We sought activity revealing the main effect of structure or recognition on the whole brain (FDR < 0.05 corrected). There was no significant activity showing the main effects of structure in either direction, even at a liberal threshold of uncorrected p < 0.01. The effects of recognition (ACROSS > WITHIN) were found bilaterally in the DLPFC, VLPFC, and the superior parietal lobule (Fig. 4, Table 2). These neural substrates exhibiting the recognition effect, especially the DLPFC, were considered to represent the segmenting operation. The bilateral DLPFC regions were found along the inferior frontal sulci, whereas the VLPFC activities were mainly in the anterior part of the inferior frontal gyri on both sides. The segmenting-related left VLPFC was located slightly posterior and inferior to the recoding-related VLPFC activity (Fig. 3b, inset). The reverse effect of recognition (WITHIN > ACROSS) revealed no significant activation, even when the threshold was lowered to uncorrected p < 0.01.
Activity revealing the structure × recognition interaction was hypothesized to reflect the binding operation. No brain region exhibited the effects of interaction at the threshold corrected for the whole-brain voxels. With the use of the SVC method based on the a priori regional hypothesis (i.e., pre-PMd for the binding operation), however, significant activity was identified within the premotor-VOI (p = 0.02 FWE and p = 0.01 FDR corrected within the search volume) (Fig. 5, Table 1). This activity was located at the junction of the superior frontal sulcus (SFS) and the superior precentral sulcus. The cluster was anterior to the vertical anterior–commissural plane (Picard and Strick, 2001), which was used as a landmark for separation between pre-PMd and the caudal part of PMd. Based on these anatomical features, this activated region was consistent with pre-PMd (Picard and Strick, 2001), PMdr (Hanakawa et al., 2002), or the “SFS area” (Rowe et al., 2000). Note, however, that the possibility that the activity could belong to the posterior prefrontal cortex cannot be excluded because the activity was situated around the anterior border of Brodmann's area 6. To be consistent with the previous works and the preexisting hypothesis, we will call this regional activation “pre-PMd activity” and discuss it mostly as such hereafter.
No other region exhibited the structure × recognition interaction even with a liberal threshold of uncorrected p < 0.01. It was noteworthy that the interaction was not detected in the left DLPFC even with the application of the SVC (using the coordinates of the segmenting-related DLPFC). Based on these findings and our a priori hypothesis, the left pre-PMd activity was interpreted to reflect the binding-type manipulation of WM information involved most prominently in the 2ST-ACROSS condition.
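The two main effects and the interaction tested above correspond to three contrasts over the four recognition conditions of the 2 × 2 design. A minimal sketch of these contrast weights follows; the condition ordering, the labels SUP-WITHIN and 2ST-WITHIN, and the cell means are assumptions made for illustration and are not values from the study.

```python
# Condition order is an assumption made for this sketch.
CONDITIONS = ["SUP-WITHIN", "SUP-ACROSS", "2ST-WITHIN", "2ST-ACROSS"]

# One weight per condition, in the order given above.
CONTRASTS = {
    "structure (2ST > SUP)": [-1, -1, 1, 1],
    "recognition (ACROSS > WITHIN)": [-1, 1, -1, 1],
    # (2ST-ACROSS - 2ST-WITHIN) - (SUP-ACROSS - SUP-WITHIN)
    "structure x recognition": [1, -1, -1, 1],
}


def contrast_value(weights, cell_means):
    """Weighted sum of the condition means for one contrast."""
    return sum(w * cell_means[c] for w, c in zip(weights, CONDITIONS))


# Hypothetical cell means (arbitrary units), for illustration only.
cell_means = {
    "SUP-WITHIN": 1.0,
    "SUP-ACROSS": 1.5,
    "2ST-WITHIN": 1.0,
    "2ST-ACROSS": 2.5,
}

effects = {name: contrast_value(w, cell_means) for name, w in CONTRASTS.items()}
print(effects)
```

A positive interaction value in this sketch means that the ACROSS minus WITHIN difference is larger for 2ST than for SUP stimuli, which is the pattern attributed to the binding operation.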
Activity in the left pre-PMd and left DLPFC was reassessed by incorporating response times, a surrogate marker of task difficulty, into the statistical model (VOI analysis). In the left pre-PMd, only the interaction was significantly associated with this activity (standard coefficient, 0.38; p = 0.008), whereas response times showed only a weak, nonsignificant negative association with this activity (standard coefficient, −0.18; p = 0.28). In the left DLPFC, recognition was the only significant explanatory variable for the activity (standard coefficient, 0.38; p = 0.02), whereas the contribution of response times appeared to be negligible (standard coefficient, 0.03; p = 0.84). It thus seems unlikely that the condition-specific activities of the pre-PMd and DLPFC can be simply explained by nonspecific task difficulty effects.
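The standard coefficients reported for the VOI analysis can be read as regression weights obtained after every variable is converted to z scores. The sketch below illustrates that idea with ordinary least squares on standardized variables. It is not the authors' statistical model: the function names, the effect coding, and all data values are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize a list to zero mean and unit (population) variance."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def standardized_betas(predictors, outcome):
    """Least-squares coefficients after z-scoring all predictors and the outcome.

    No intercept is needed because every z-scored variable has zero mean.
    """
    names = list(predictors)
    zx = [zscore(predictors[k]) for k in names]
    zy = zscore(outcome)
    xtx = [[sum(p * q for p, q in zip(zi, zj)) for zj in zx] for zi in zx]
    xty = [sum(p * q for p, q in zip(zi, zy)) for zi in zx]
    return dict(zip(names, solve(xtx, xty)))


# Hypothetical observations (two per design cell), for illustration only.
structure = [-1, -1, -1, -1, 1, 1, 1, 1]        # -1 = SUP, +1 = 2ST
recognition = [-1, -1, 1, 1, -1, -1, 1, 1]      # -1 = WITHIN, +1 = ACROSS
interaction = [s * r for s, r in zip(structure, recognition)]
response_time = [0.9, 1.1, 1.2, 1.0, 1.0, 1.3, 1.6, 1.4]
activity = [0.5 + 0.4 * i for i in interaction]  # driven by the interaction only

betas = standardized_betas(
    {
        "structure": structure,
        "recognition": recognition,
        "interaction": interaction,
        "response_time": response_time,
    },
    activity,
)
print({k: round(v, 3) for k, v in betas.items()})
```

Because the toy activity is constructed from the interaction term alone, the interaction receives essentially the whole standardized weight and response time receives essentially none, which mirrors the logic of checking whether task difficulty explains the regional activity.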
To clarify the dynamics of neural activity underlying the two types of chunk manipulation, we investigated the effective connectivity between the binding-related left pre-PMd region and the segmenting-related left DLPFC. The PPI analysis (Fig. 6) revealed significantly stronger effective connectivity between the two regions during the 2ST-ACROSS condition than during the SUP-ACROSS condition (p = 0.01 FWE and p = 0.01 FDR corrected within the search volume). In the PPI analysis, no regions other than the left DLPFC were detected in the entire brain, even when the threshold was lowered to uncorrected p < 0.01. It was thus shown that functional interaction between the left pre-PMd and DLPFC was specifically increased during the 2ST-ACROSS condition, which required serial manipulation processing involving both binding and segmenting.
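The logic of the PPI analysis is that the relationship between the seed signal and the target signal changes with the task condition. A deliberately simplified sketch is given below. It builds the interaction regressor as the mean-centered seed signal multiplied by a condition code and compares the seed-to-target slope between conditions. Hemodynamic deconvolution and nuisance regressors, which fMRI packages typically handle, are omitted, and all names and values are hypothetical.

```python
from statistics import mean


def ppi_term(seed, psych):
    """Interaction regressor: mean-centered seed signal times the condition code."""
    m = mean(seed)
    return [(s - m) * p for s, p in zip(seed, psych)]


def slope(x, y):
    """Least-squares slope of y on x (simple regression)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical signals, for illustration only.
seed = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]    # stands in for left pre-PMd
target = [1.0, 2.0, 3.0, 4.0, 0.5, 1.0, 1.5, 2.0]  # stands in for left DLPFC
psych = [1, 1, 1, 1, -1, -1, -1, -1]               # +1 = 2ST-ACROSS, -1 = SUP-ACROSS

ppi = ppi_term(seed, psych)
coupling_2st = slope(seed[:4], target[:4])  # slope during 2ST-ACROSS samples
coupling_sup = slope(seed[4:], target[4:])  # slope during SUP-ACROSS samples
print(ppi, coupling_2st, coupling_sup)
```

In a full model the target signal would be regressed on the seed, the condition code, and the interaction regressor together, and a reliable weight on the interaction regressor would indicate condition-dependent coupling.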
This study for the first time clarified the specific executive functions of pre-PMd and its interaction with the DLPFC in humans. The pre-PMd and DLPFC were found particularly relevant to the binding and segmenting types of WM manipulation, respectively. Segmenting is similar to the attentional selection function in previous WM tasks (Rowe et al., 2000; Rowe and Passingham, 2001), whereas binding likely shares mechanisms with conversion of memory items into a response sequence (Ohbayashi et al., 2003). Furthermore, the pre-PMd and DLPFC interacted closely when both binding and segmenting were to be serially executed.
Previous WM studies have clarified neither the specific role of the pre-PMd in WM manipulation nor how the pre-PMd communicates with the DLPFC. The present WM task constrained by LTM was able to highlight the two fundamental executive processes and their respective substrates. The behavioral data strongly supported the existence of condition-specific manipulation processes: a segmenting operation during ACROSS recognition and a binding operation specific to 2ST-ACROSS recognition. On the other hand, the present fMRI procedure with fixed and short delay periods might not be ideal to distinguish between the encoding and recognition phases. However, neither the pre-PMd nor DLPFC showed condition-specific activity during the recoding phase. It was thus unlikely that the possible overflow from the recoding-related activity significantly affected the findings of recognition-related activity there. Likewise, the structure effect during the recognition phase was not evident anywhere in the brain. These findings suggested that the brain activity of interest, especially in the pre-PMd and DLPFC, was not seriously influenced by the possible overlap between the recoding-related and recognition-related activities.
The DLPFC activity was reported to increase with task difficulty and to plateau around WM capacity (Callicott et al., 1999). This suggested the possibility that the DLPFC might invoke participation of the left pre-PMd when task demands increased to the point at which the DLPFC alone could not meet the task requirements. By using response times as a conventional index of task difficulty, a multiple regression analysis was performed to reexamine categorical task effects on the pre-PMd and DLPFC activities. There were no significant effects of response times on the left DLPFC or left pre-PMd activity, making it less likely that neural activity overflowed from the DLPFC to the pre-PMd because of excessive general task demands.
The pre-PMd and the binding operation
The 2ST-ACROSS condition selectively enhanced left pre-PMd activity. Classically, PMd was considered a higher-order motor control region (Hoshi and Tanji, 2004). A recent physiology study in nonhuman primates (Ohbayashi et al., 2003) has proposed that the function of the PMdr, which is arguably analogous to human pre-PMd, may not be purely motor (Picard and Strick, 2001). Human imaging studies have also supported cognitive domains of the pre-PMd function.
The precise anatomical definition of the pre-PMd is still open in humans, because the border between the pre-PMd and the prefrontal cortex has not been established yet. Therefore, the nomenclature of the pre-PMd should be regarded as tentative. Given the considerable across-subject variability in the cytoarchitectonic border, further technical development will be needed to delineate the pre-PMd and the prefrontal cortex anterior to the pre-PMd in each individual's anatomical image. Functionally, it is also important to consider oculomotor factors because the pre-PMd and frontal eye field (FEF) are located nearby (Hanakawa et al., 2002; Koyama et al., 2004). From the location information alone, the possibility that the present pre-PMd activity overlapped with the FEF cannot be excluded. However, it seems unlikely that oculomotor factors produced the pre-PMd activity in the present setup. Subjects would have viewed the memory probe straight ahead during ACROSS recognition, although they might have needed to look slightly leftward or rightward during WITHIN recognition. Hence, more eye movements would have occurred during WITHIN recognition than during ACROSS recognition. Because the pre-PMd activity was highest in the 2ST-ACROSS condition, it is more reasonable to attribute the binding-related activity to the pre-PMd than to the FEF.
The binding operation is presumably related to conversion of the memory items into a sequence. Note that sequencing processes are included in many cognitive tasks that previously revealed pre-PMd activity (Mellet et al., 1996; Braver et al., 1997; Owen et al., 1998; Callicott et al., 1999; Hanakawa et al., 2002; Tanaka et al., 2005). For instance, pre-PMd is remarkably active during mental-operation tasks in which verbal or spatial mental representations are sequentially updated in response to sensory cues (Hanakawa et al., 2002, 2003a; Tanaka et al., 2005). Also, pre-PMd activity has been observed during the imagery as well as execution of sequential finger tapping (Sadato et al., 1996; Hanakawa et al., 2003b). In N-back tasks, every time a new stimulus appears, the central executive updates WM representations by discarding the oldest item in WM and putting the remaining and new items together. This process can be regarded as conversion of WM items and a new stimulus into a sequential mental set. Overall, the sequence-converting function of the pre-PMd best explains the binding-related activity in the pre-PMd, although the notion that this function is applicable to the purely cognitive domain still awaits future validation. We propose that, perhaps along with visuospatial information processing (Courtney et al., 1998) and stimulus-response linkage (Grafton et al., 1998), one of the fundamental roles of the pre-PMd might be the conversion of memory items into a sequence in both motor and cognitive domains.
The DLPFC and the segmenting operation
The segmenting operation appears to involve focusing attention on the task-relevant information elements held in WM. Activity in the segmenting-related areas (the bilateral DLPFC, VLPFC, and superior parietal lobule) was observed in previous studies in which subjects selected requisite items from memory for responses (Rowe et al., 2000; Rowe and Passingham, 2001). Response selection and the segmenting operation share the concept of attentional selection, putatively one of the core functions of the central executive (Rowe and Passingham, 2001). Attentional selection is required in both WITHIN and ACROSS recognition conditions, in that subjects must select a four-code sequence from an eight-code sequence maintained within the memory buffer. However, the ACROSS recognition would impose a higher load than the WITHIN recognition, because it should be difficult for the central executive to directly retrieve a segment crossing a chunk boundary (McLean and Gregg, 1967; Ericsson et al., 1980).
The DLPFC has been associated with high-level executive processes (Miller, 1999). The PPI analysis has provided new evidence that the left DLPFC is functionally coupled with the ipsilateral pre-PMd during WM manipulation in humans. This is consistent with the anatomical evidence from nonhuman primates that the PFC and pre-PMd are reciprocally interconnected in the same hemisphere (Barbas and Pandya, 1987; Luppino et al., 1993; Lu et al., 1994). Although neuroimaging studies revealed coactivation of the DLPFC and pre-PMd during various WM manipulation tasks (Braver et al., 1997; Owen et al., 1998; Callicott et al., 1999; Stern et al., 2000), it has remained unclear until now how these two regions functionally interact. It is possible that such functional coupling can actuate cognitive integration between different executive processes subserved by distinct regions.
The VLPFC and the role of LTM in WM manipulation
In the present study, manipulation processes should have induced reactivation of preexisting chunk representations in LTM (Ericsson and Kintsch, 1995; Baddeley, 2000). Such reactivation should have also been observed as recoding-related activity. Reference to LTM may have induced the VLPFC activity, which was observed during both recoding and manipulation processes. VLPFC activity was reported during a knowledge-based chunking task, which required subjects to retrieve task-relevant information from LTM (Bor et al., 2004). Similar VLPFC activity was observed during stimulus-response linkage tasks (Petrides, 2002; Prince et al., 2005; Hanakawa et al., 2006). These findings all support the role of the VLPFC in retrieving task-relevant knowledge from LTM.
The left opercular, IPL, and pre-SMA activities during recoding imply the recruitment of the phonological loop in recoding auditory stimuli and/or in registering the recoded information into the memory buffer. Pre-SMA is also involved in updating appropriate responses from ambiguous action sets including motor chunks (Rushworth et al., 2004). These findings suggest that the recoding system translates discrete encoding stimuli into grouped information (i.e., chunks) through reactivation of LTM. In turn, the segmenting operation should evoke a similar process of LTM reactivation, which is to retrieve chunk contents from LTM. Consistently, these segmenting-related areas overlap with the expertise-related activity in a recent study in which subjects were extensively trained on visual category recognition before fMRI (Moore et al., 2006). During encoding/maintenance, novel stimuli of the trained category more strongly activated bilateral DLPFC and VLPFC than those of untrained categories. These regions may be involved in the application of expert memory skills, which should share a similar mechanism with the LTM-assisted memory strategy studied here.
In summary, the present study has demonstrated not only specialized functions of the pre-PMd and DLPFC but also their interaction for cognitive manipulation. Methodologically, multilevel chunk structures provided useful behavioral constraints to dissect manipulation processes. The manipulation and recoding processes activated the VLPFC, reflecting the retrieval of knowledge from LTM. The pre-PMd and DLPFC may contribute to executive function through sequence generation and attentional selection, respectively, and the functional coupling between them seems to play a pivotal part in integrating these executive processes.
This work was supported in part by Grants-in-Aid for Scientific Research from the Ministry of Education, Science, Sports, Culture, and Technology, Japan on Priority Areas (Area Number 454/Project Number 18047013) and on Fundamental Research C (Project Number 17500210) to T.H., by a Grant-in-Aid for Scientific Research on Priority Areas System study on higher-order brain functions (18020014) from the Japan Society for the Promotion of Science, and by a grant from New Energy and Industrial Technology Development, Japan (51101244-0) to H.F. There are no conflicts of interest of any kind.
Correspondence should be addressed to Takashi Hanakawa, Department of Cortical Function Disorders, National Center of Neurology and Psychiatry, 4-1-1 Ogawahigashi, Kodaira, Tokyo 187-8502, Japan.