Abstract
Repeated experience with a visual stimulus can result in improved perception of the stimulus, i.e., perceptual learning. To understand the underlying neural mechanisms of this process, we used functional magnetic resonance imaging to track brain activations during the course of training on a contrast discrimination task. Based on their ability to improve on the task within a single scan session, subjects were separated into two groups: “learners” and “nonlearners.” As learning progressed, learners showed progressively reduced activation in both visual cortex, including Brodmann's areas 18 and 19 and the fusiform gyrus, and several cortical regions associated with the attentional network, namely, the intraparietal sulcus (IPS), frontal eye field (FEF), and supplementary eye field. Among learners, the decrease in brain activations in these regions was highly correlated with the magnitude of performance improvement. Unlike learners, nonlearners showed no changes in brain activations during training. Learners showed stronger activation than nonlearners during the initial period of training in all these brain regions, indicating that one could predict from the initial activation level who would learn and who would not. In addition, over the course of training, the functional connectivity between IPS and FEF in the right hemisphere with early visual areas was stronger for learners than nonlearners. We speculate that sharpened tuning of neuronal representations may cause reduced activation in visual cortex during perceptual learning and that attention may facilitate this process through an interaction of attention-related and visual cortical regions.
Introduction
Repeated experience with visual stimuli can result in perceptual learning, i.e., faster and improved detection and discrimination of those stimuli. Perceptual learning is an important ability because it enables one to respond more efficiently to visual information in the environment.
Determining the cortical areas of plasticity underlying perceptual learning is of great interest to many researchers. Previous psychophysical studies have often found that the effect of learning is specific to the trained retinotopic location (Shiu and Pashler, 1992; Fahle et al., 1995; Schoups et al., 1995), stimulus orientation (Ramachandran and Braddick, 1973; Fiorentini and Berardi, 1981; Ahissar and Hochstein, 1997), stimulus size (Ahissar and Hochstein, 1996), direction of motion (Ball and Sekuler, 1987; Watanabe et al., 2001), and the eye used during training (Karni and Sagi, 1991). The specificity of these learning effects suggests that the underlying neural plasticity occurs in low-level areas of visual cortex (Karni, 1996; Gilbert et al., 2001; Fahle, 2004; Ghose, 2004). Human imaging and monkey electrophysiological studies have accumulated more direct evidence for the involvement of early visual areas in perceptual learning. For example, several human imaging studies have found changes in activation in the primary visual cortex after training compared with before training on contrast discrimination (Schiltz et al., 1999), texture discrimination (Schwartz et al., 2002), contrast detection (Furmanski et al., 2004), and shape identification (Sigman et al., 2005) tasks. Single-unit recording studies of Yang and Maunsell (2004) and Raiguel et al. (2006) examined the activity of neurons in visual cortical area V4 and found sharpened orientation tuning curves after training. Similarly, in motion-sensitive areas MT (middle temporal visual area) and MST (medial superior temporal visual area), Vaina et al. (1998) used functional magnetic resonance imaging (fMRI) to demonstrate an expansion of the activation zone (accompanied by the disappearance of activation in other extrastriate visual areas), and Zohary et al. (1994) used single-unit recordings to show an increase in neural sensitivity after human and monkey subjects, respectively, were trained on a motion discrimination task.
Building on this previous work, we sought to relate changes in neural activity directly to changes in behavioral improvement during perceptual learning. To do so, we trained subjects on a visual perceptual task, monitored their performance over the course of training, and used fMRI to track concurrent dynamic changes in brain activations, which are presumed to reflect neural activity. We modified the contrast discrimination task paradigm developed by Fiorentini and Berardi (1981), because perceptual learning can occur in a short training session of ∼1 h, and hence brain activation data can be obtained during a single scan session. Additionally, we obtained whole-brain images so that we could measure changes not only within but also outside the visual cortex and then through correlation analyses, examine whether changes observed within and outside visual cortex were functionally coupled. Finally, because subjects in our study could be classified as “learners” and “nonlearners,” we could probe differences in brain activations between the two groups that might explain differences in their ability to learn.
Materials and Methods
To study mechanisms of perceptual learning, we obtained fMRI data throughout the training session as well as behavioral performance data on a contrast discrimination task. In doing so, we were able to correlate activations during learning with the corresponding subjects' behavioral improvement rather than compare brain activations before and after training, as done previously with positron emission tomography (PET) and fMRI (Schiltz et al., 1999; Schwartz et al., 2002; Furmanski et al., 2004). We equated subjects for rates of learning instead of the amount of training, thereby enabling us to show tight coupling between improvement and brain activation changes.
Subjects.
Twenty-two normal volunteers participated in the experiment (11 males, 11 females; ages 22–49 years). All subjects provided informed consent before the experiment, and none had participated in any perceptual learning experiment before the current study. All procedures were approved by the National Institute of Mental Health Institutional Review Board. All subjects had normal or corrected-to-normal vision. Before the experiment, we collected behavioral pilot data on the task from five subjects (one male, four females; ages 24 ± 2 years) outside the scanner. All of these pilot subjects showed improvement within 200 trials (averaged percentage correct improved from 75 to 94%), similar to the previous results of Fiorentini and Berardi (1981).
Task and stimuli.
Perceptual learning of certain visual tasks can take a long time (>10,000 trials) to complete (Fine and Jacobs, 2002). To acquire fMRI data of perceptual learning within a single scan session, we modified a paradigm developed by Fiorentini and Berardi (1981), in which performance typically improves over the course of ∼200 trials. Our experimental session consisted of two parts: (1) contrast discrimination threshold testing conducted during anatomical data acquisition and (2) the perceptual learning experiment. For both, the stimuli consisted of sine waves of spatial frequency f and 3f with relative spatial phase 0° (Fig. 1a). We conducted contrast discrimination threshold testing to determine, for each subject, the contrasts of third harmonic sine waves that were subsequently used in the perceptual learning experiment itself. The environment during the threshold testing was set as close to the subsequent perceptual learning experiment as possible.
Examples of luminance profile and stimulus pattern used in the current study. a, A stimulus consisted of two sinusoidal gratings: sine waves of spatial frequency f and 3f. b, Two trial types for a task trial, same and different. During same trials, the two successively presented stimuli were identical. During different trials, the contrast of the third harmonic component differed between stimulus 1 and stimulus 2; on half of these trials it was higher in stimulus 1 than in stimulus 2, and on the other half it was lower.
For the threshold testing, we used a multiple random staircase method (Cornsweet, 1962; Gracely et al., 1988). The procedure of threshold testing was as follows: Two complex grating stimuli, reference and test, consisted of sine waves of f and 3f. In the stimulus that was used as a reference, the contrast ratio of f to 3f was 1/0.33, at which we confirmed that both components were visible for all subjects. In the test stimulus, we began the staircase for the third harmonic at a contrast value 30% higher than the reference stimulus, so that both were visible initially. During threshold testing, the reference and test stimuli were presented successively, and the subject's task was to decide whether the two images were the same or different, namely, had different contrasts of the third harmonic sine wave. The contrast of the third harmonic in the test stimulus was reduced after a subject responded correctly three times that the two images were different, and it was increased after a subject failed to detect the difference, thus yielding a level of response of ∼70–75% correct. The step size in the staircase was 2% of the contrast of the first harmonic. After nine points of response reversal for each staircase, the threshold testing ended, and we averaged contrasts at the last seven reversal points from the staircases.
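The staircase rule described above can be sketched in code. The following is a minimal single-staircase illustration, not the multiple random staircase procedure used in the study. The function names, the noise-free `simulated_observer`, and its internal threshold of 0.38 are hypothetical. Contrasts are expressed relative to a first-harmonic contrast of 1, so the 2% step becomes 0.02, and the 30% starting offset is read as 1.3 × 0.33, which is one possible interpretation of the text.

```python
REFERENCE_3F = 0.33          # third-harmonic contrast of the reference (f contrast = 1)
STEP = 0.02                  # 2% of the first-harmonic contrast
START = REFERENCE_3F * 1.3   # assumed reading of "30% higher than the reference"


def run_staircase(is_correct, start=START, step=STEP, floor=REFERENCE_3F,
                  n_reversals=9, n_average=7, max_trials=10_000):
    """Three-correct-down / one-error-up staircase on the test contrast.

    is_correct(level) -> bool gives one trial's outcome at that contrast.
    Returns (threshold estimate, list of contrasts at the reversal points).
    """
    level = start
    streak = 0            # consecutive correct responses at the current level
    last_direction = 0    # -1 = last move was down, +1 = up, 0 = no move yet
    reversals = []
    for _ in range(max_trials):
        if is_correct(level):
            streak += 1
            if streak < 3:
                continue          # three correct responses are needed to step down
            direction = -1
        else:
            direction = +1
        streak = 0
        if last_direction != 0 and direction != last_direction:
            reversals.append(level)   # the direction of travel just flipped
            if len(reversals) == n_reversals:
                break
        last_direction = direction
        level = max(floor, level + direction * step)
    if len(reversals) < n_reversals:
        raise RuntimeError("staircase did not reach the requested number of reversals")
    tail = reversals[-n_average:]     # average only the last reversal points
    return sum(tail) / len(tail), reversals


def simulated_observer(level, true_threshold=0.38):
    """Hypothetical noise-free observer: correct whenever the contrast is high enough."""
    return level >= true_threshold


threshold, reversal_points = run_staircase(simulated_observer)
```

With this idealized observer the staircase descends from the starting contrast and then oscillates around the observer's threshold, so the estimate falls between the two contrast levels that bracket it. A real observer responds probabilistically, which is why several reversals are averaged.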
After threshold testing, we created two stimuli of complex grating patterns for the perceptual learning experiment using the contrast values measured in threshold testing: the contrast ratio of the first and third harmonic of one of the two stimuli was set at 1/0.33, and the third harmonic contrast of the other stimulus was set for each subject according to the subject's threshold value. During the perceptual learning experiment, there were two stimulus trial types: “same,” in which the two stimuli presented successively in a trial were identical, and “different,” in which the two stimuli presented successively in a trial differed in contrast of the third harmonic (Fig. 1b). The subject's task in the perceptual learning experiment was to detect a difference in contrast between the two successively presented complex grating patterns; subjects responded with a differential button press for same and different. Approximately half of the task trials were same and the other half were different. Figure 2a shows a schematic diagram of a task trial. Half of the subjects were tested with vertical gratings and the other half with horizontal gratings.
Time course of a trial. a, Task trial. Stimulus 1 and stimulus 2 were presented successively for 50 ms with an interstimulus interval of 600 ms. A response period followed the stimulus presentations and lasted for 1 s. If the subject's response during the response period was correct, the word “YES” was presented in green during a feedback period that followed the response period. If the subject's response was incorrect, the word “NO” was presented in red. If the subject did not respond during the response period, the words “NO RESPONSE” were presented in blue. b, Fixation trial. A black fixation spot was presented in the middle of the visual field, and the color of the spot changed to white after 800 ms. Subjects were instructed to press a button as soon as they detected the change. No feedback was provided for this task. The color of the fixation spot changed back to black at 1100 ms from the beginning of a trial. The next trial started as soon as the previous one concluded. ISI, Interstimulus interval. Diagonal arrows represent the flow of time during a task.
The stimuli were presented at the center of the screen and subtended 9° of visual angle; a small black fixation spot was presented at the center of the screen. Subjects were instructed to maintain fixation on the spot during the entire experiment. Feedback was given after each response to facilitate the learning process (Herzog and Fahle, 1997). Fixation trials were used as the baseline condition for the fMRI data analysis. During a fixation trial, the color of the fixation spot changed from black to white (Fig. 2b). On these trials, subjects were instructed to press a button as soon as they detected the color change of the fixation spot. The button press in the fixation trials was introduced as a control for subjects' motor responses. The order of same, different, and fixation trials was pseudorandomized.
Imaging methods.
We used a 3T GE Healthcare (Buckinghamshire, UK) scanner and GE Healthcare standard head coil to collect anatomical and fMRI data. Parameters for anatomical data acquisitions were as follows: magnetization-prepared rapid-acquisition gradient echo; echo time (TE), 2.9 ms; repetition time (TR), 7.5 ms; flip angle, 6°; 256 × 256 matrix; field of view (FOV), 240 × 240 mm; voxel size, 1 × 1 × 1.2 mm; 124 slices. Voxel size for fMRI data was 3.75 × 3.75 × 5 mm, with 24 slices in a volume (gradient-echo echo planar imaging sequence; TE, 40 ms; TR, 2300 ms; flip angle, 90°; 64 × 64 matrix; FOV, 240 × 240 mm). Functional brain imaging covered the whole cerebral cortex but not the cerebellum. The scanning room was darkened and subjects wore earplugs during the experiment to reduce distracting stimulation. The subject's head was stabilized with a vacuum-sealed bag, partially filled with Styrofoam beads. The scan session consisted of an initial seven runs plus another seven runs if the subject did not reach an 80% correct performance level within the initial seven runs. Each run had 84 trials consisting of ∼60% task trials and 40% fixation control trials. The trials were presented in a rapid event-related manner.
Behavioral data analysis.
The criterion of learning for the current study was an improvement in performance of at least 20% correct (e.g., from 75% correct to 95% correct) during the experiment. We chose this criterion to match the learning effects observed by Fiorentini and Berardi (1981). To track improvements during the course of training for each subject, we calculated a moving average of percentage correct responses in 100 trials, in which 10 new trials were added and the 10 oldest trials were dropped for the calculation of each new moving average. The percentage correct for the initial period of training was subtracted from that of the period in which a subject's behavior reached asymptote to determine whether the learning criterion was met. There were eight subjects who showed an improvement of 20% or more and hence met the learning criterion. These subjects were labeled as learners. Ten subjects did not show a significant improvement and were labeled as nonlearners. The four remaining subjects who showed some behavioral improvement, but less than the criterion level, were excluded from additional analysis.
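As an illustration of the moving average described above, the sketch below computes percentage correct in a 100-trial window advanced 10 trials at a time and then applies the 20% criterion. The function names are hypothetical, and taking the best window as a stand-in for the asymptotic period is an assumption, since the text does not state how the asymptote was identified.

```python
def moving_percent_correct(outcomes, window=100, step=10):
    """Percentage correct in a sliding window of `window` trials, advanced by `step`.

    outcomes: sequence of 1 (correct) / 0 (incorrect), one entry per task trial.
    """
    return [100.0 * sum(outcomes[start:start + window]) / window
            for start in range(0, len(outcomes) - window + 1, step)]


def improvement(curve):
    """Gain in percentage correct from the first window to the best window.

    Using the maximum as a proxy for the asymptotic period is an assumption.
    """
    return max(curve) - curve[0]


# Toy data: 100 incorrect trials followed by 20 correct ones.
curve = moving_percent_correct([0] * 100 + [1] * 20)
is_learner = improvement(curve) >= 20.0   # learning criterion from the text
```

Each new window adds the 10 newest trials and drops the 10 oldest, so consecutive points share 90 trials and the curve is smooth by construction.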
To test the significance of learning for each subject, we conducted a simple linear regression on individual percentage correct response data and obtained a learning curve. For the learners, we used only the data from the period before the subject's behavior reached asymptote (Fine and Jacobs, 2002). From the regression, we obtained the slope of the learning curve (amount of percentage correct improvement per trial). We used the slope as the measure of improvement for each subject, termed “learning index” in the current study.
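The learning index is the slope of an ordinary least-squares line through percentage correct as a function of trial number. A minimal sketch follows; the function name and the toy data are hypothetical, and the toy values were chosen only so that the slope is easy to verify by hand.

```python
def learning_index(percent_correct, trial_numbers=None):
    """Least-squares slope of percentage correct against trial number."""
    y = list(percent_correct)
    x = list(trial_numbers) if trial_numbers is not None else list(range(len(y)))
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx   # percentage-correct points gained per trial


# Toy learner: 55% correct at trial 0 rising linearly to 91% at trial 300.
slope = learning_index([55, 67, 79, 91], trial_numbers=[0, 100, 200, 300])
```

For learners, only the pre-asymptote portion of the curve would be passed in, as described above.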
To examine whether the subjects showed faster response times as a result of training, we also obtained reaction time (RT) data (from response period onset until the subject's response) during the experiment. We ran a mixed-design two-way ANOVA with time periods (see methods below) as the within-subjects factor and subject groups (learners and nonlearners) as the between-subjects factor to test whether there was a change in the RT over time and a difference in RT between learners and nonlearners.
Additional behavioral data were collected to provide supportive evidence for perceptual learning. In six subjects, we tested for specificity of the perceptual learning by rotating the orientation of the stimuli 90° from the learned orientation (e.g., from vertical to horizontal or vice versa) in a separate session, which was conducted ∼15 min after the original learning session. In two other subjects, we tested for perceptual learning retention of the original task 2 weeks after the first session in one subject and 4 weeks after in the other.
fMRI data analysis for learners.
Because the learners showed varying rates of learning during the scan session, rather than averaging the behavioral data into equal time periods across the group (see methods for the data analysis of the nonlearners below), we equated their performance into stages of improvement [0% improvement, ∼10% improvement, ∼20% improvement, and ∼20% improvement (plateau)] and then correlated these performance improvements with brain activations of corresponding periods. To define the four stages, first, we identified the point at which each improvement was achieved relative to the initial (0% improvement) period for each subject. For example, for an initial performance of 50% correct, 10% improvement is the point at which 60% correct performance was achieved. Then, for each stage, we created a window with the point as its center so that each window included the same number of correct task trials for the analyses. In this way, we could correlate brain activations with performance improvements across the group of learners.
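The first step of this staging, locating the point at which each level of improvement is reached, can be sketched as follows. The function name is hypothetical, the input is assumed to be a moving-average learning curve whose first entry is the initial (0% improvement) period, and the subsequent construction of windows with equal numbers of correct trials is not shown.

```python
def stage_points(curve, gains=(10.0, 20.0)):
    """Index of the first learning-curve point reaching each gain over the initial level.

    curve: moving-average percentage correct; curve[0] is the initial period.
    Returns {gain: index, or None if that gain is never reached}.
    """
    baseline = curve[0]
    points = {}
    for gain in gains:
        points[gain] = next(
            (i for i, value in enumerate(curve) if value >= baseline + gain), None)
    return points


# Example from the text: starting at 50% correct, 10% improvement means reaching 60%.
example = stage_points([50, 52, 57, 60, 66, 71, 70])
```

Here the 10% stage is centered on the fourth point of the curve (60% correct) and the 20% stage on the sixth (71% correct).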
We used the AFNI (for Analysis of Functional NeuroImages) software package (Cox, 1996) to analyze the fMRI data. All fMRI data were registered to corresponding anatomical data and smoothed using a Gaussian filter with a root-mean-square width of 4 mm. Signal intensity was divided by the mean signal value for the run and multiplied by 100 before analysis so that the data represented percentage signal change from the mean signal value. For each subject's data, we used a gamma function with a peak of one as a model for the hemodynamic response function. We used each subject's learning stages (0, ∼10, and ∼20% improvement, and plateau) as regressors (conditions) for regression analyses using the general linear model. We included only the correct task trials (correct reject in same trials and hit in different trials) for the learning conditions. Trend analyses were conducted on the group and individual data to identify the areas in which brain activations changed in a correlated manner with the four stages of behavioral improvement. The results of individuals were overlaid on each subject's flattened brain surface of the occipital lobe along with their retinotopic maps (see methods below). For the group data, we ran a one-way repeated-measures ANOVA (voxelwise), with learning stages (0, ∼10, and ∼20% improvement, and plateau) as fixed factors and subjects as random factors. Before analysis, the fMRI datasets were first transformed into Talairach space (Talairach and Tournoux, 1988), and a mask of the main task effect (task trials > fixation trials, p < 5 × 10⁻⁷) was then applied to the data. The p values of the results of the voxelwise multiple comparisons were corrected using a familywise correction (Forman et al., 1995) by setting the individual voxel significance level (p value) at 0.05 to obtain minimum cluster sizes for the corrected p value.
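Two of the preprocessing steps above lend themselves to a short sketch: scaling each run to a mean of 100, and a gamma-variate response normalized to a peak of one. This is an illustration in plain Python rather than AFNI code. The function names are hypothetical, and the shape parameters `p` and `q` are assumed values, since the text states only that the peak was one.

```python
import math


def to_percent_of_mean(run_signal):
    """Scale one run's time series so that its mean becomes 100."""
    mean = sum(run_signal) / len(run_signal)
    return [100.0 * value / mean for value in run_signal]


def gamma_hrf(t, p=8.6, q=0.547):
    """Gamma-variate response (t / (p*q))**p * exp(p - t/q); equals 1 at t = p*q.

    p and q are assumed shape parameters, not values taken from the text.
    """
    if t <= 0:
        return 0.0
    return (t / (p * q)) ** p * math.exp(p - t / q)


scaled = to_percent_of_mean([180.0, 200.0, 220.0])
peak_time = 8.6 * 0.547           # seconds after stimulus onset for these parameters
peak_value = gamma_hrf(peak_time)
```

After scaling, a value of 110 means 10% above the run mean, so regression coefficients fitted to a unit-peak response can be read directly as percentage signal change.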
The trend analysis after ANOVA revealed the areas in which brain activations were correlated with the four stages of behavioral improvement. We set these areas as regions of interest (ROIs) and calculated mean percentage blood oxygenation level-dependent (BOLD) signal change in the ROIs across the subjects for each learning stage for additional ROI analyses. Each region was defined anatomically in reference to the atlas of Talairach and Tournoux (1988), and we included voxels with p values of <0.05 from the individual trend analysis to calculate each subject's mean percentage BOLD signal within each ROI. BOLD signals from both hemispheres were combined for each ROI except for the frontal eye field (FEF), which showed significant correlation with behavioral improvement only in the right hemisphere. We conducted simple linear regressions between the learning indices (slope of the learning curves) and change of brain activations in the ROIs to determine whether the magnitude of improvement among learners correlated with changes in brain activation within the ROIs. The amount of change (decrease) in brain activations was obtained for each subject by dividing the total amount of change in activation during learning (stages before plateau) for each ROI by the number of task trials of the corresponding stages.
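The ROI regression can be illustrated with two small helpers: one that expresses the activation decrease per task trial, and one that returns the proportion of variance explained by a simple linear regression. Both function names are hypothetical. Measuring the total change as the first stage minus the last pre-plateau stage is an assumption about how it was computed, and the numbers are toy values, not study data.

```python
def decrease_per_trial(stage_signals, n_task_trials):
    """BOLD decrease across the pre-plateau stages divided by the trials they span.

    stage_signals: mean percentage BOLD signal per stage, plateau excluded.
    """
    return (stage_signals[0] - stage_signals[-1]) / n_task_trials


def r_squared(x, y):
    """Proportion of variance in y explained by a simple linear regression on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Toy values: signal falls from 0.9% to 0.5% over 320 task trials.
rate = decrease_per_trial([0.9, 0.7, 0.5], 320)
explained = r_squared([1, 2, 3, 4], [1, 3, 2, 4])
```

Dividing by the number of trials puts the activation change on the same per-trial footing as the learning index, so subjects who needed different amounts of training can be compared.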
Comparison between learners and nonlearners.
To examine the difference in the pattern of brain activation changes between the learners and nonlearners, we analyzed the nonlearners' brain activation during the course of training. We were unable to parse the nonlearners' data into four learning stages because nonlearners did not have learning stages, so we divided their data into five equal time periods (T1–T5) and examined the BOLD signal as a function of time during the course of training. Then, to compare the nonlearners' with the learners' data directly, we reanalyzed the learners' data in the same manner as the nonlearners' data.
Accordingly, we conducted regression analysis on all the individual fMRI data with the five time periods (T1–T5) as regressors (conditions). As in the previous analyses, only data from correct trials were included. We ran a two-way mixed-design ANOVA (voxelwise) on the group data in Talairach space with time periods (T1–T5) and subject groups (learners and nonlearners) as fixed factors and subjects as random factors nested in the subject groups. After the ANOVA, we contrasted brain activations between time periods both within and between learner and nonlearner groups to (1) confirm that the two different analyses of the learners' data gave consistent results and (2) compare the pattern of brain activation changes over the time course of training between learners and nonlearners. To focus on the activations in the ROIs set in the previous learners' data analysis, we also conducted repeated-measures one-way ANOVAs on the ROI data of each group with time periods (T1–T5) as within-subjects factors and independent samples t tests between the learners' and nonlearners' averaged BOLD signal in corresponding time periods. For the ROI analyses, we included voxels that showed the main task effect (task trials > fixation trials, p < 5 × 10⁻⁷), and BOLD signals from both hemispheres were combined to obtain the averaged BOLD signal for each ROI.
We further conducted simple linear regressions between the learning indices (slope of the learning curves) and averaged BOLD signal in each ROI during T1 for all subjects, learners and nonlearners, and separately among learners to examine whether the magnitude of performance improvement correlated with that of brain activations during T1.
Finally, we conducted a functional connectivity analysis (for the detailed procedures, see http://afni.nimh.nih.gov/sscc/gangc/SimCorrAna.html) to compare learners' and nonlearners' covariance of activations between the ROIs. We used spherical ROIs (radius of 10.5 mm) centered around the local maxima of the results of the group trend analysis (Table 1) as seed ROIs. Correlation coefficients between the time course of BOLD signal in a seed ROI and in each voxel in the brain were calculated for each subject. Then, we converted the correlation coefficients to z scores for group comparisons. We conducted independent samples t tests (voxelwise) between the learner and nonlearner groups. We also conducted t tests using averaged z scores in Brodmann's area (BA) 17, BA18, BA19, fusiform gyrus (FG), intraparietal sulcus (IPS), FEF, and supplementary eye field (SEF) in left and right hemispheres separately to compare the strength of connections among these ROIs between learners and nonlearners.
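The core of a seed-based connectivity analysis is a correlation between the seed time course and each voxel's time course, followed by a variance-stabilizing transform. The sketch below assumes that the conversion to z scores is the Fisher transform, z = atanh(r), which the text does not state explicitly. The function names and the four-point time courses are hypothetical toy data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long time courses."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher transform of a correlation coefficient (undefined for |r| = 1)."""
    return math.atanh(r)


seed = [1.0, 2.0, 3.0, 4.0]       # toy seed-ROI time course
voxel = [1.0, 3.0, 2.0, 4.0]      # toy voxel time course
r = pearson_r(seed, voxel)
z = fisher_z(r)
```

The transform matters for the group comparison because raw correlation coefficients are bounded and skewed near ±1, whereas the transformed values are closer to normally distributed and better suited to t tests.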
Talairach coordinates of local maxima of areas in which brain activation negatively correlated with behavioral improvement
Retinotopic mapping data.
We obtained retinotopic mapping data for five subjects in the learners group to identify early visual areas V1–V4 for each subject. We were unable to collect retinotopic mapping data from the three remaining subjects in this group, because they had relocated at the time of this scanning. We used a 3T GE scanner with a 16-channel head coil, and imaging parameters for the fMRI data were as follows: TE, 40 ms; TR, 2000 ms; voxel size, 1.5 × 1.5 × 2.5 mm; SENSE rate 2 (Pruessmann et al., 1999; de Zwart et al., 2002); and 22 oblique slices covering the occipital lobe. Anatomical data were acquired for the same areas as the fMRI data. The conventional phase-encoded method was applied with a rotating wedge (Engel et al., 1994, 1997; Sereno et al., 1995; Tootell et al., 1997). Geometric distortion in fMRI images was corrected using the B0 field map (Jezzard and Balaban, 1995). Activation patterns were superimposed onto the individual's flattened surface created using the FreeSurfer software package (Dale et al., 1999; Fischl et al., 1999, 2001; Segonne et al., 2004), and we defined boundaries of the early visual areas (Sereno et al., 1995; Tootell et al., 1996) using the SUMA (for Surface Mapping with AFNI) software package (Cox, 1996; Argall et al., 2006; Saad et al., 2006).
Results
Behavior
The criterion of learning for the current study was defined as at least 20% correct improvement (e.g., from 75% correct to 95% correct) during training. Although previous results (Fiorentini and Berardi, 1981) and our pilot data showed performance improvements of ∼20% correct within 200 trials for the current task paradigm, more than half of the subjects did not show a comparable amount of learning in the scanner. This was a serendipitous finding, however, because it enabled us to separate subjects into learners and nonlearners and compare their fMRI data. Among those subjects who did learn, an average of ∼300 task trials were needed to achieve 20% correct improvement. Therefore, perceptual learning during the fMRI experimental session was more difficult than outside the scanner. Additionally, in the scanner, the subjects started at an average level of performance of only ∼60% correct, again indicating that performing the task in the scanner was more difficult. The difference in the initial performance level between learners (56%) and nonlearners (67%) was not significant (p > 0.05).
Eight subjects among 22 met the learning criterion. Percentage correct improvements for these learners averaged 27%. Ten subjects, termed nonlearners, who did not meet the criterion showed an average of 5% correct improvement over the course of training. Figure 3 shows the average learning index (slope of the learning curve) for learners (mean ± SEM, 0.12 ± 0.019) and nonlearners (mean ± SEM, 0.01 ± 0.007). The mean learning index of learners was significantly larger than that of nonlearners (independent samples t test, t(16) = 5.9; p = 2.0 × 10⁻⁵). Four remaining subjects showed a significant (p < 0.05) albeit small improvement with training, averaging only 11%.
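The reported group comparison can be checked approximately from the summary statistics alone. The sketch below recomputes a pooled-variance independent-samples t statistic from the means, SEMs, and group sizes given above. The function name is hypothetical, the pooled (equal-variance) form is assumed because the reported degrees of freedom are 16, and the inputs are the rounded published values, so only approximate agreement should be expected.

```python
import math


def pooled_t_from_summary(mean1, sem1, n1, mean2, sem2, n2):
    """Pooled-variance independent-samples t statistic from means, SEMs, and sizes."""
    var1 = sem1 ** 2 * n1          # SEM = SD / sqrt(n), so variance = SEM**2 * n
    var2 = sem2 ** 2 * n2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# Learners: 0.12 ± 0.019 (n = 8); nonlearners: 0.01 ± 0.007 (n = 10).
t_value, df = pooled_t_from_summary(0.12, 0.019, 8, 0.01, 0.007, 10)
```

With these inputs the statistic comes out at roughly 5.9 on 16 degrees of freedom, in line with the value reported in the text.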
Averaged slope of the learning curves (black line) for learners and nonlearners. A slope of the learning curve for each individual subject was obtained by fitting a linear function to percentage correct data over trials. Shaded areas depict SEM. The learners' mean slope was significantly larger than the nonlearners'. *, Independent samples t test, t(16) = 5.9, p = 2.0 × 10⁻⁵.
Averaged RT during the experiment was 634 ms for learners and 683 ms for nonlearners. We used the datasets that were divided into five time periods (T1–T5) for the analysis of RT data. A five (time periods) × two (subject groups) mixed-design ANOVA revealed that the main effects for time periods and subject groups were not significant (F(1,14) = 2.158, p = 0.16; F(1,14) = 0.981, p = 0.34, respectively). Thus, there was no change in the RT from T1 through T5, and no overall differences in the RT of learners compared with nonlearners. The time periods × subject groups interaction was also not significant (F(1,14) = 0.460; p = 0.51), indicating that the RT did not change over time for either group. Thus, although accuracy improved significantly over time for learners, their RT did not. Nonlearners showed no change in either accuracy or RT during training.
Of the six subjects tested for perceptual learning specificity, in which the sine wave gratings were rotated by 90° from their original orientation, all showed a significant drop in performance (averaging 14.9% drop after the orientation change; t = 2.79; p = 0.04), which was similar to the drop of ∼13.5% reported by Fiorentini and Berardi (1981). The two subjects tested for perceptual learning retention 2 and 4 weeks after the first session showed performance increases of 12.8 and 10.3%, respectively.
fMRI (learners)
The datasets that were grouped based on each learner's learning stages (0, ∼10, and ∼20% improvement, and plateau) were used for the analyses of their fMRI data. The average cumulative numbers of task trials at the end of each stage were 100 trials (0%), 190 trials (∼10%), 320 trials (∼20%), and 440 trials (plateau).
A one-way repeated-measures ANOVA (four learning stages; voxelwise) on the learners' group data revealed that the main effect of learning stage was significant bilaterally in several visual areas, including BA18, BA19, and FG (p < 0.001, familywise corrected). Hence, there was a change of activation at least from one learning stage to another in these areas. The results of trend analysis after ANOVA showed that BOLD signal in these visual areas was negatively correlated with behavioral improvement (p < 0.001, familywise corrected). Blue lines in Figure 4a show changes in averaged activation in the visual areas over the course of behavioral improvement, and Figure 4, b, d, and e, shows clusters, plotted onto inflated views of the brain, in the visual areas that negatively correlated with learning.
Brain areas showing activations correlated with learning. The results of trend analysis on learners' group data. a, Averaged BOLD signal in ROIs are shown as a function of performance improvement: visual areas in blue and attention-related areas in green. b–e, The clustered areas that survived familywise correction for multiple comparisons. The results are superimposed onto a single subject's inflated brain surface that was aligned to the group fMRI data in Talairach space. The colors were coded to represent p values in each voxel. b, Ventral view showing BA18, BA19, and FG. c, Dorsal view showing IPS, FEF, SEF, and dorsolateral PFC (DLPFC). d, Posterior view showing BA18, BA19, and IPS. e, Medial view showing BA18 and SEF. L, Left; R, right.
To localize these activations within visual cortex more precisely, we obtained retinotopic maps of five learners and overlaid them on the results of the trend analysis on individual subject data. Figure 5 shows the results of two representative subjects. On the left is shown the result of a contrast between the discrimination task and fixation. Stronger activations for the task condition were seen in visual areas V1–V4 and beyond. On the right are shown the smaller extents of those areas that were negatively correlated with behavioral improvement (p < 0.05, uncorrected), including areas as early as V1 but mainly V3–V4 and beyond. These results were similar to those in the other three subjects for whom we had retinotopic maps.
Activations in early visual areas that responded to task versus fixation (left) and were correlated with learning (right) for two individual subjects. The results were superimposed onto each subject's flattened surface of the occipital lobe along with their retinotopic maps. The white lines represent boundaries between visual areas. The colors were coded to represent p values in each voxel. The results show higher activation during task than fixation trials and decreased activation as behavior improved. Activation changes were seen as early as V1 but mainly in V3–V4 and beyond. L, Left; R, right.
The results of the ANOVA (voxelwise) group data analysis also revealed a main effect of learning stages in areas of the attentional network, specifically, IPS, FEF, and SEF. A subsequent trend analysis revealed that the activations in these attention-related areas were negatively correlated with behavioral improvement (p < 0.001, familywise corrected). All the areas negatively correlated with learning were bilateral, except FEF (the right hemisphere only). Green lines in Figure 4a show average BOLD signal in the attention-related areas over the course of behavioral improvement, and Figure 4c–e shows the highly significant clusters in the attention-related areas that negatively correlated with learning. The pattern of activation changes over time during learning in these attention-related areas was remarkably similar to what we observed in the visual areas (Fig. 4a). All areas that were significantly correlated with behavioral improvement are listed in Table 1. The correlation was negative in all the areas, and no area showed significant increases in BOLD signal during learning.
In the previous analysis, we found that the activations in visual areas BA18, BA19, and FG and in attention-related areas IPS, FEF, and SEF were negatively correlated with the behavioral improvement, i.e., BOLD responses decreased as the learners improved. To examine whether the magnitudes of improvement and decrease in brain activations were correlated with each other, we conducted simple linear regressions on the learning indices (slope of learning curves) and amount of decrease in brain activation during learning in those visual and attention-related areas that showed negative correlation with learning. Figure 6 illustrates that the amount of decrease in brain activations in each ROI accounted for an average of 78% of the variance in the learning improvement. The results were highly significant (F values ranged from 10.658 to 64.375, and p values ranged from 0.02 to 0.0002). Thus, the bigger the decrease of brain activation in these areas, the larger the subject's improvement per unit time. There was no difference in the correlation coefficients between any pair of the ROIs (Steiger, 1980), indicating that all the predictor variables (amount of decrease of BOLD response in the six ROIs) correlated equally with the criterion variable (learning index).
Correlation between the magnitudes of learning and decreases in brain activation in six ROIs. Learning index (slope of the learning curve) was used as a measure of learning. The results for the ROIs in visual cortex (a) and for the ROIs in attention-related areas (b). Each data point for an ROI represents one individual subject. The correlations were all significant, simple linear regression; *p < 0.05, **p < 0.01, ***p < 0.001.
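The regression described above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of one variable on another and reports the proportion of variance explained (R²) together with the F statistic for a single predictor, F = R²/(1 − R²) × (n − 2). The function name and all data values are hypothetical and are not taken from the study; the same function could also be applied to accuracy over time to obtain a slope-based learning index.

```python
from statistics import mean

def simple_linear_regression(x, y):
    """Ordinary least-squares fit of y on x for one predictor.

    Returns (slope, intercept, r_squared, f_statistic).
    Note: a perfect fit (r_squared == 1) makes F undefined and would
    raise ZeroDivisionError here.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    # F statistic with 1 and n - 2 degrees of freedom
    f_statistic = r_squared / (1.0 - r_squared) * (n - 2)
    return slope, intercept, r_squared, f_statistic

# Hypothetical per-subject values, for illustration only:
# decrease in BOLD response in one ROI and the learning index.
bold_decrease = [0.10, 0.25, 0.30, 0.45, 0.60, 0.70]
learning_index = [0.8, 1.9, 2.1, 3.2, 3.9, 5.1]

slope, intercept, r_squared, f_statistic = simple_linear_regression(
    bold_decrease, learning_index
)
print(f"slope={slope:.3f} R^2={r_squared:.3f} F(1,{len(bold_decrease) - 2})={f_statistic:.2f}")
```

An R² of 0.78 in this framing would mean that 78% of the between-subject variance in the learning index is accounted for by the activation decrease in that ROI.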
fMRI (learners and nonlearners)
The datasets that were divided into five time periods (T1–T5) were used to compare learners' and nonlearners' fMRI data, because nonlearners did not have learning stages. Thus, to directly compare the two groups, we had to analyze the learners' data in the same way as the nonlearners' data. First, we compared the pattern of activation during T1–T5 for the two groups. A two-way mixed-design ANOVA (voxelwise) on the group data revealed that the main effects of time (F = 3.630) and subject group (F = 8.518) were significant (p < 0.01, uncorrected) in BA17, BA18, BA19, FG, IPS, FEF, and SEF. A time × subject group interaction was also significant in BA18, BA19, FG, IPS, FEF, and SEF (F = 3.630; p < 0.01, uncorrected). When we examined the visual and attention-related areas more closely by repeated-measures one-way ANOVAs (ROI analysis) on each group's data, we found that the main effect of time in all the six ROIs was significant for learners (p values ranged from 0.03 to 0.002) but not for nonlearners (p values ranged from 0.07 to 0.27). These results indicate that brain activations of learners changed in these areas over the time course of training, whereas brain activations of nonlearners did not. Figure 7, a and d, shows the activation changes of the two groups over time for each ROI. After the one-way ANOVA on the ROI data, contrasts between times in each ROI revealed that averaged brain activations of learners significantly decreased from T1 to T3 in BA19, FG, IPS, and SEF, and from T1 to T4 in BA18 and FEF. Voxelwise comparisons of activations between T1 and T4 confirmed that there were changes in activations in both visual and attention-related areas for learners (Fig. 7b, left). Clusters of significant decreases of activation were found in BA18, BA19, FG, and SEF (p values ranged from 0.04 to 0.001, familywise corrected); clusters of decreased activation were also found in IPS and FEF, but the level did not reach significance at a corrected p value of <0.05. 
These results for learners were consistent with those of the previous correlation analyses for learners (Fig. 4), although the clusters of differential activation between T1 and T4 were less significant after familywise correction than those of the correlation with learning. The same contrast between T1 and T4 for the nonlearners showed only one significant cluster (Fig. 7b, right), located in the right IPS (p < 0.01, familywise corrected), in which activation was higher in T4 than in T1, unlike the learners, in whom IPS activation was lower in T4 than in T1. Unlike learners, nonlearners did not show reduced brain activations in either early visual or attention-related areas. For the nonlearners, with the exception of the right IPS, no other brain region showed an increase or decrease in activation during training.
Comparisons of learners' and nonlearners' group data. a, d, Averaged brain activations in the ROIs were plotted over the time course of training. The results are shown for the ROIs in visual cortex, BA18, BA19, and FG, and attention-related areas, IPS, FEF, and SEF. The error bars represent SEM. b, c, The results of voxelwise comparisons are superimposed on a surface model of the N27 (Holmes et al., 1998) in Talairach space. The top row shows the ventral view, and the bottom row shows the dorsal view of the brain. b, Contrasts of BOLD response between T1 and T4. The results of learners are shown on the left and nonlearners on the right. c, Contrasts of BOLD response during T1 between learners and nonlearners. The colors were coded to represent p values in each voxel: blue, negative t value; red, positive t value. L, Left; R, right.
The direct comparisons (voxelwise) of brain activations during T1 between the two groups revealed that the learners' BOLD response was significantly higher than the nonlearners' in BA18, BA19, FG, IPS, FEF, and SEF (Fig. 7c). Independent samples t tests on the ROI data also showed that the learners' averaged brain activations were significantly higher than the nonlearners' during T1 in all six ROIs (p values ranged from 0.04 to 9.0 × 10⁻⁷) (Fig. 7a,d). In addition, in IPS and FEF, the learners' averaged brain activation was significantly higher than the nonlearners' during T2–T5 (p values ranged from 0.03 to 0.002) (Fig. 7d).
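As a rough illustration of the ROI-level group comparison, the sketch below computes an independent samples t statistic. It assumes the pooled-variance (Student) form, which the text does not specify; the function name and the activation values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def independent_samples_t(group_a, group_b):
    """Pooled-variance (Student) t statistic for two independent groups.

    Returns (t, degrees_of_freedom). Assumes equal population variances;
    this is an assumption of the sketch, not a detail given in the text.
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Pooled estimate of the common variance (sample variances use n - 1)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df

# Hypothetical T1 activation (percent signal change) in one ROI.
learners_t1 = [0.9, 1.1, 1.0, 1.2]
nonlearners_t1 = [0.5, 0.6, 0.4, 0.7]

t, df = independent_samples_t(learners_t1, nonlearners_t1)
print(f"t({df}) = {t:.3f}")
```

A positive t here corresponds to higher activation in the first group; the p value would then be read from the t distribution with the returned degrees of freedom.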
Based on these results showing a difference in BOLD response between the groups during T1, we speculated that the initial time period of training was a critical period for subsequent learning. Accordingly, we further examined the relationship between the magnitudes of learning and brain activations in the ROIs during T1. Figure 8 shows the results of simple linear regressions revealing that the brain activation during T1 in BA18, BA19, FG, IPS, FEF, and SEF accounted for an average of 52% of the intersubject variability in the slope of the learning curves across all the subjects, a highly significant result (F values ranged from 8.805 to 34.010, and p values ranged from 0.009 to 3.0 × 10⁻⁵). The activations during T1 within the six ROIs demonstrated significant positive correlation with the behavioral improvement. These results indicate that one could predict whether or not a subject would learn based on the magnitude of brain activation in these areas during the initial period of training. When we separated the subjects into the groups of learners and nonlearners, we found that, for the learners, the stronger the brain activation in BA19, FG, IPS, and SEF, the larger the improvement; brain activation during T1 in these regions accounted for an average of 64% of the intersubject variability. For the nonlearners, there was no significant correlation between the brain activation during T1 and the magnitude of learning in any of the ROIs. Thus, not only could one predict whether learning would occur, one could also predict how well one would learn based on the brain activations during T1 in BA19, FG, IPS, and SEF.
Correlation between brain activation during T1 and magnitude of learning. Learning index (slope of the learning curve) was used as a measure of learning. Each data point for an ROI represents one individual subject. Simple linear regression; *p < 0.05, **p < 0.01, ***p < 0.001.
All the results thus far have shown a remarkable similarity between changes in visual and attention-related areas during perceptual learning. To assess the coupling in activations among these areas, we first explored the functional connectivity among the regions and compared learners with nonlearners using an ROI analysis. Figure 9a shows the connections between ROIs that were significantly stronger for the learners than for the nonlearners (red lines) and vice versa (blue lines). Independent samples t tests on averaged z scores in each ROI revealed that learners had stronger connections than nonlearners (p values ranged from 0.01 to 0.04) between the right IPS and FEF (attention-related areas) and the left BA18 and BA19. Learners also showed stronger connections between the right and left BA18 and between the right BA19 and the left BA17. Nonlearners, in contrast, showed stronger connections among visual areas, between the left BA19 and bilateral BA18 and bilateral FG. We then examined the results of voxelwise analyses using independent samples t tests. The results were qualitatively similar to those of the ROI analysis (Fig. 9b,c), except that we saw additional functional connectivity. In particular, connections between the right IPS and BA18 and BA19 in the right hemisphere were stronger for the learners than nonlearners (Fig. 9d). Thus, the learners, relative to nonlearners, showed greater functional connectivity between attention-related areas in the right hemisphere and visual processing areas in the right and left hemispheres.
Comparisons of learners' and nonlearners' functional connectivity among visual and attention-related areas. a, A schematic diagram showing the results of t tests (ROIs) between learners and nonlearners. Red and blue lines represent functionally stronger connections between areas for the learners and nonlearners, respectively. Visual areas are in rectangles, and attention-related areas are in ovals. b–d, Results of t tests (voxelwise) between learners' and nonlearners' z scores of connectivity. The results are superimposed onto a single subject's brain volume. The colors were coded to represent p values in each voxel; red represents correlations for learners > nonlearners, and blue represents correlations for nonlearners > learners. b, An axial view (z = −5) showing stronger connections between the left BA19 (seed ROI) and bilateral FG for nonlearners. c, An axial view (z = 44) showing stronger connections between the left BA19 (seed ROI) and IPS/FEF in the right hemisphere for learners. d, A coronal view (y = −86) showing stronger connections between the right IPS (seed ROI) and bilateral BA18/BA19 for learners. L, Left; R, right.
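One common way to obtain connectivity z scores of the kind compared above is to correlate the time series of a seed ROI with that of a target ROI and apply Fisher's r-to-z transform, so that values can be averaged and compared across subjects. The sketch below assumes that approach; this section does not restate the exact procedure used, and the function names and time series are hypothetical.

```python
from math import atanh, sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length time series."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

def fisher_z(r):
    """Fisher r-to-z transform; undefined (raises ValueError) for |r| == 1."""
    return atanh(r)

# Hypothetical BOLD time series for a seed ROI and a target ROI in one subject.
seed_roi = [0.2, 0.5, 0.1, 0.7, 0.4, 0.9, 0.3, 0.6]
target_roi = [0.1, 0.6, 0.2, 0.5, 0.5, 0.8, 0.2, 0.7]

r = pearson_r(seed_roi, target_roi)
z = fisher_z(r)
print(f"r = {r:.3f}, Fisher z = {z:.3f}")
```

Per-subject z values computed this way for each ROI pair could then be compared between groups with an independent samples t test.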
Discussion
The current study examined the correlation between changes in neural activations measured with fMRI and changes in behavioral improvement as subjects trained on a contrast discrimination task. Changes in brain activation were tracked not only within but also outside of the visual cortex to examine their functional coupling. Differences in brain activations between learners and nonlearners were probed to explore possible mechanisms that might determine whether or not a subject would learn. We found the following: (1) among learners, brain activations both in visual areas BA18, BA19, and FG and in attention-related areas IPS, FEF, and SEF decreased as learners improved in accuracy, and the decrease in BOLD response in those areas correlated with the magnitude of perceptual learning; (2) unlike learners, nonlearners did not show changes in brain activations during training; (3) the degree of activation in visual and attention-related areas during the early period of training predicted whether a subject would learn or not, and, among learners, how well one would learn; and (4) over the course of training, functional connectivity between attention-related areas (IPS and FEF of the right hemisphere) and visual areas (BA18 and BA19 of both hemispheres) was stronger for learners than nonlearners.
The results provide compelling evidence that the areas we identified play an important role in perceptual learning. Additionally, the specificity of the learning and its stable retention over time provide supportive evidence that the observed behavioral improvements were indeed perceptual learning. Finally, because we familiarized our subjects with the task during discrimination threshold testing, it is unlikely that the learning effects reflected procedural learning of a strategy to perform the task (Karni and Bertini, 1997; Hawkey et al., 2004).
Changes in visual cortex activation during perceptual learning
Within visual cortex, we found that activations decreased in BA18, BA19, and FG as subjects learned. The results, when superimposed on retinotopic maps of individual subjects, revealed that the reduced activations were seen as early as V1 but mainly in V3–V4 and beyond, all of which contain populations of neurons that are sensitive to stimulus contrast (Boynton et al., 1999; Reynolds et al., 2000; Gardner et al., 2005; Lu and Roe, 2007). Previous studies reported perceptual learning effects in early visual areas, including V1 and V4, when subjects were trained on fine discriminations of simple visual features, such as orientation and texture (Schiltz et al., 1999; Schoups et al., 2001; Schwartz et al., 2002; Furmanski et al., 2004; Yang and Maunsell, 2004; Sigman et al., 2005; Raiguel et al., 2006), and in motion-sensitive areas MT and MST, when subjects were trained on motion discrimination tasks (Zohary et al., 1994; Vaina et al., 1998). These results, together with ours, suggest that learning occurs in visual areas in which the task-relevant visual information is processed.
Although the current study and a PET study by Schiltz et al. (1999) found reduced activations in visual cortex with learning, other fMRI studies have reported the opposite effect, i.e., enhanced activations (Schwartz et al., 2002; Furmanski et al., 2004; Sigman et al., 2005; Turk-Browne et al., 2007). It remains a puzzle, however, as to which conditions lead to reduced or enhanced activations as a result of perceptual learning. It should be noted that brain imaging studies of motor skill learning have also reported both types of effect (Karni et al., 1995; Doyon et al., 2002; Wu et al., 2004), and no hypothesis has yet provided a sufficient explanation for this unresolved issue.
One possible explanation for reduced activations is that they reflect a sharpened tuning of neuronal populations representing the trained stimulus features. It has been suggested that a reduced response to repeated objects, i.e., repetition suppression, coupled with improved performance for the repeated stimuli may be attributable to sharpened tuning of the neuronal representations that are most sensitive to the repeatedly presented stimulus feature (Desimone, 1996; Wiggs and Martin, 1998). During learning in the current study, neurons that are not tuned to given contrasts might have dropped out of the pool of responsive cells. It therefore may be that the reduced activation in our study reflects a sharpened tuning of neuronal representations, resulting in the learners' improvements in accuracy.
The results from several physiological recording studies in monkeys support our hypothesis of sharpened tuning of neuronal representations as a result of perceptual learning. Schoups et al. (2001) found that training on an orientation discrimination task causes an increase in the slope of the tuning curve of V1 neurons. In addition, Yang and Maunsell (2004) found that the average bandwidth of the orientation tuning curve becomes narrower for the trained population of neurons in V4, especially for neurons whose preferred orientation was close to the trained orientation, a finding confirmed by Raiguel et al. (2006). Moreover, in the current study, subjects who showed larger behavioral improvement also showed a greater reduction of brain activations in visual cortex. This finding suggests that a reduction in the number of activated neurons was associated with better performance. According to this argument, the remaining population of activated neurons had sharpened tuning curves for the trained stimulus contrasts, allowing them to be distinguished more efficiently.
Changes in attention-related areas during perceptual learning
We found that BOLD signals in IPS, FEF, and SEF, cortical areas associated with modulating attentional signals (Corbetta et al., 1998; Kastner et al., 1999; Hopfinger et al., 2000; Giesbrecht et al., 2003; Woldorff et al., 2004), were also negatively correlated with behavioral improvement. This result suggests an involvement of attention in perceptual learning in the current study, as has been demonstrated by several previous psychophysical studies. For example, Shiu and Pashler (1992) found that subjects do not improve at an orientation discrimination task if their attention is directed to another feature of the stimulus (e.g., brightness). Similarly, Ahissar and Hochstein (2002) found that subjects' learning to detect a target within a horizontally or vertically elongated array does not transfer to a feature of the array they do not attend, namely, the orientation of the array. Although Watanabe and his colleagues (Watanabe et al., 2001, 2002; Seitz and Watanabe, 2003) have shown that it is not necessary for attention to be directed to a visual feature for that feature to be learned, they acknowledge that attention plays an important role in perceptual learning (Seitz and Watanabe, 2005).
Additionally, we found that the learners had initially higher brain activation in attention-related areas than the nonlearners. This result suggests that subsequent learning depended on enhanced attention to the task initially, and the reduction of activation in attention-related areas during learning may represent a reduced requirement for attention as the task became easier (Vaina et al., 1998). This may also have contributed to the reduced activations within visual cortex (Kastner et al., 1998). However, we cannot assess this possibility because we did not directly manipulate or monitor attention in our study.
Supportive evidence for the involvement of attention in perceptual learning in the current study was the finding that the functional connectivity between attention-related areas IPS and FEF in the right hemisphere and early visual areas BA18 and BA19 was stronger for the learners than for the nonlearners, whereas nonlearners showed stronger connectivity among visual areas than learners. The results may indicate that bottom-up information was insufficient for perceptual learning in the current study; that is, it required enhanced top-down attentional signals. Our data on functional connectivity may be understood in the context of a rich neuropsychological literature. Hemispatial neglect, i.e., inattention to the contralesional hemifield, is seen most often in patients with right parietal lesions (Mesulam, 1981, 1999). This has suggested that the right hemisphere attends to both the ipsilateral (right) and contralateral (left) visual hemifields, whereas the left hemisphere attends only to the contralateral (right) hemifield (Heilman and Van Den Abell, 1980; Corbetta et al., 1993; Nobre et al., 1997). The greater functional connectivity we found between the right IPS and early visual areas suggests that, for learners, there was enhanced integration of information processing between attention-related regions and early visual areas.
An early prediction of who will learn
In the current study, the subjects were classified into two groups, learners and nonlearners, based on their performance improvements during training. The nonlearners had been given essentially the same amount of training as the learners, and yet they did not show significant improvement in performance. One of the prominent differences in brain activations between the learners and nonlearners was that the learners relative to the nonlearners showed higher brain activations in both visual and attention-related areas during the initial period of training. Furthermore, our results showed high correlations between the individual subject's amount of improvement and their initial BOLD response in both visual and attention-related areas, indicating that one could predict the likelihood that a given subject would learn based on the degree of initial activations in these areas. We thus speculate that selective attention to the stimuli during the initial period of training was critical for the subsequent learning.
Footnotes
- This work was supported by the Intramural Research Program of the National Institutes of Health–National Institute of Mental Health. We thank Dr. Jeff H. Duyn for providing essential support for our study; Dr. Patrick Bellgowan for valuable discussion; Olivia Wu, Dr. Kathleen Hansen, and James Cheh for comments on this manuscript; and Drs. Ziad Saad, Rick Reynolds, and Gang Chen for suggestions on fMRI data analysis.
- Correspondence should be addressed to Ikuko Mukai, Laboratory of Brain and Cognition, National Institute of Mental Health–National Institutes of Health, Building 10, Room 4C104, 10 Center Drive, Bethesda, MD 20892-1366. mukaii{at}mail.nih.gov