Abstract
Previous studies have shown that perceptual learning can substantially alter the response properties of neurons in the primary somatosensory and auditory cortices. Although psychophysical studies suggest that perceptual learning induces similar changes in primary visual cortex (V1), studies that have measured the response properties of individual neurons have failed to find effects of the size described for the other sensory systems. We have examined the effect of learning on neuronal response properties in a visual area that lies at a later stage of cortical processing, area V4. Adult macaque monkeys were trained extensively on orientation discrimination at a specific retinal location using a narrow range of orientations. During the course of training, the subjects achieved substantial improvement in orientation discrimination that was primarily restricted to the trained location. After training, neurons in V4 with receptive fields overlapping the trained location had stronger responses and narrower orientation tuning curves than neurons with receptive fields in the opposite, untrained hemifield. The changes were most prominent for neurons that preferred orientations close to the trained range of orientations. These results provide the first demonstration of perceptual learning modifying basic neuronal response properties at an intermediate level of visual cortex and give insights into the distribution of plasticity across adult visual cortex.
Introduction
Perceptual learning involves improvements in sensory abilities induced by extensive training (for review, see Goldstone, 1998). In the somatosensory and auditory systems, perceptual learning has been shown to be associated with changes in the response properties of cortical neurons that are consistent with the behavioral improvement (Jenkins et al., 1990; Recanzone et al., 1992a,b, 1993). These neurophysiological results, together with those from psychophysical experiments showing that perceptual learning in the visual system can be highly specific for certain stimulus configurations (Sagi and Tanne, 1994; Karni and Bertini, 1997), lead to the expectation that visual perceptual learning will involve changes in the responses of neurons in early visual cortex (Gilbert et al., 2001).
Nevertheless, recent experiments that have examined the effects of perceptual learning on the response properties of neurons in the early stages of visual cortex have failed to find strong effects. Ghose et al. (2002) found that training monkeys extensively on an orientation discrimination task did not change the selectivity or responsiveness of neurons in V1 or V2 in ways that could account for perceptual learning. Using similar training, Schoups et al. (2001) found only modest effects, restricted to a subtle change in the slopes of orientation tuning curves in V1. Neither study found reliable changes in the distribution of orientation preferences or the sharpness of orientation tuning. Crist et al. (2001) showed that extensive training on a bisection discrimination task could affect how much V1 neurons were influenced by stimuli outside their receptive fields, but again basic receptive field properties were unchanged.
In each of these studies, animals were well trained and showed substantial improvement in sensory abilities over the course of training. Although it is possible that these studies overlooked properties that were changed by the training, it is striking that none revealed effects that were close to the size of those reported for the somatosensory or auditory systems. One possible explanation for this discrepancy is that the plasticity of neuronal response properties associated with perceptual learning is mediated primarily in later stages of the macaque visual cortex.
Here, we describe experiments that examined plasticity in a later stage of visual cortex, area V4. V4 lies in the ventral pathway of visual processing, several stages removed from V1. Nevertheless, it is early enough in cortical processing that most of its neurons respond robustly to relatively simple stimuli, making it practical to use the same stimuli and tasks that have been used to examine plasticity in V1 and V2. These properties make it an ideal locus for examining the relative strength of the effects of perceptual learning in later stages of extrastriate cortex. We report here that perceptual learning can cause overt changes in the orientation tuning of V4 neurons. Some of these data have appeared in abstract form (Yang et al., 2001).
Materials and Methods
Behavioral tasks. We trained three rhesus monkeys (Macaca mulatta) to do an orientation match-to-sample task (Fig. 1) for juice rewards. All stimulus generation and behavioral monitoring were under computer control. The animal had to fixate a central fixation point throughout each trial. Five hundred milliseconds after the animal acquired fixation and pressed a lever, a sample stimulus appeared for 500 msec, followed by a 500-550 msec delay during which only the fixation point was on the screen. A second stimulus then appeared, and the animal had to use a lever to report whether the orientations of the sample and test stimuli were the same. On each trial, the animal had to release the lever within 600 msec of the appearance of the test stimulus or continue to depress it throughout that period, depending on whether the stimuli matched.
Orientation match-to-sample task. Monkeys held their gaze on a central fixation point while two stimuli were presented sequentially, separated by a brief delay. The stimuli were temporally counterphasing Gabors (σ = 0.5°) that were oriented close to 45°. They were centered at 1.5° azimuth and -2.6° elevation (3.0° eccentricity and -60° polar angle). Monkeys used a lever to report whether the orientations of two stimuli were the same. Monkey 1 was trained to release the lever if the sample and test stimuli were the same but to continue to hold when they differed. Monkeys 2 and 3 released the lever for nonmatching orientation and continued to hold for matches. To ensure that the monkeys did the task using only orientation, in each trial the two stimuli had different spatial frequencies (1 cycle/degree and 4 cycles/degree). The sample and test stimuli here differ by 3°, which was close to the threshold for monkeys 2 and 3.
The stimuli were temporally counterphasing Gabors (σ = 0.5°; 4 Hz sinusoidal contrast modulation; peak contrast, ∼100%). The stimuli appeared on a background of uniform gray that had the same mean luminance as the Gabors. After the earliest stage of training, the stimuli were always presented in the same retinal location (3° eccentric, 1.5° azimuth right, and -2.6° elevation), and the orientation of the Gabors was slightly offset (clockwise or counterclockwise) from 45°. The size of the orientation offset depended on the animal's performance (see below). Matching stimuli had orientation offsets in the same direction, whereas nonmatching stimuli did not. The offset of the sample stimulus from 45° was selected randomly on each trial to be clockwise or counterclockwise. The spatial frequency of the Gabors was one of two values (1 cycle/degree and 4 cycles/degree), with the sample and the test stimuli in each trial always assuming different values to discourage the monkey from using cues other than orientation to perform the task, such as a luminance change at a certain screen area or a potential rotation illusion to detect an orientation change.
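The stimulus description above can be illustrated with a minimal sketch of a counterphasing Gabor. The function names, the cosine carrier phase, and the convention that the carrier varies along the direction given by the orientation angle are assumptions made for illustration; only σ, the 4 Hz modulation, the peak contrast, and the spatial frequencies come from the text.

```python
import math

def gabor_contrast(x, y, t, orientation_deg=45.0, sf_cpd=1.0,
                   sigma_deg=0.5, tf_hz=4.0, peak_contrast=1.0):
    """Signed contrast at (x, y) degrees from the Gabor center at time t (s)."""
    theta = math.radians(orientation_deg)
    # Position along the axis on which the carrier varies (assumed convention).
    u = x * math.cos(theta) + y * math.sin(theta)
    # Circular Gaussian envelope with standard deviation sigma_deg.
    envelope = math.exp(-(x * x + y * y) / (2.0 * sigma_deg ** 2))
    # Sinusoidal carrier in cosine phase (assumed phase).
    carrier = math.cos(2.0 * math.pi * sf_cpd * u)
    # Sinusoidal counterphase modulation of contrast over time.
    temporal = math.sin(2.0 * math.pi * tf_hz * t)
    return peak_contrast * envelope * carrier * temporal

def luminance(x, y, t, mean_lum=1.0, **kwargs):
    """Luminance relative to a gray background of luminance mean_lum."""
    return mean_lum * (1.0 + gabor_contrast(x, y, t, **kwargs))
```

Because the modulation is symmetric around zero contrast, the time-averaged luminance of this sketch equals the background luminance, in line with the statement that the Gabors and the background shared the same mean luminance.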
Each monkey underwent a brief initial training with vertical and horizontal orientations and identical spatial frequencies to learn the matching task. During this phase, the stimuli were centered on the display, and there was no fixation control. Once the basic task had been learned, we implanted a headpost and a scleral search coil, enforced fixation, and used eccentric stimuli for all subsequent training. Animals were required to hold their gaze within 0.75° of the fixation point. A typical training session lasted from 2 to 4 hr, during which the monkeys performed 1000-2000 correct trials. The orientation difference between nonmatching sample and test stimuli was reduced whenever the animal performed the task at over 80% correct for at least 200 trials. The orientation difference was adjusted manually during the training of monkey 1. For monkeys 2 and 3, the computer monitored performance and adjusted the difficulty automatically. In addition, we added distractor stimuli when training monkey 1. These distractors were temporally counterphased Gabors that appeared simultaneously with the sample and test stimuli. A total of 18 Gabors were presented on the screen, including the stimulus at the trained location (which was the only behaviorally relevant stimulus) and 17 others. The orientations and spatial frequencies of the distractors were varied from trial to trial. The 18 stimuli were arranged in three concentric rings at eccentricities of 1.5°, 3°, and 6°, each containing six stimuli, with the size of the Gabors scaled with distance from the fixation point (σ of 0.25°, 0.5°, and 1.0°).
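As a rough illustration of the distractor display used for monkey 1, the sketch below generates 18 Gabor positions on three rings. The eccentricities and σ values are taken from the text; the equal 60° spacing and the starting angle of 0° are assumptions, chosen so that one position on the 3° ring coincides with the trained location.

```python
import math

# (eccentricity in degrees, Gabor sigma in degrees), as given in the text.
RINGS = [(1.5, 0.25), (3.0, 0.5), (6.0, 1.0)]

def ring_layout(per_ring=6, start_angle_deg=0.0):
    """Return a list of (x, y, sigma) tuples, one per Gabor."""
    positions = []
    for ecc, sigma in RINGS:
        for k in range(per_ring):
            # Equal angular spacing around the fixation point (assumed).
            angle = math.radians(start_angle_deg + k * 360.0 / per_ring)
            positions.append((ecc * math.cos(angle), ecc * math.sin(angle), sigma))
    return positions

layout = ring_layout()
```

With these assumptions, the position at 300° (equivalently -60°) on the middle ring falls at about (1.5°, -2.6°), which matches the azimuth and elevation reported for the trained location.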
We trained each monkey to do an additional simple match-to-sample task using obliquely oriented lines that were presented at fixation (length, 0.28°; eccentricity, 0.14°) with the same timing that was used for the primary task. As with the primary task, monkeys used a lever to report whether the orientations of the sample and test lines were the same. The orientations of the sample and test lines were always either the same or orthogonal. During recording sessions, the central matching task was used to direct the animal's attention away from the peripheral stimuli that were used to measure orientation and spatial frequency tuning, to reduce the chance that these stimuli would affect the animal's training or that attention to orientations close to the trained orientation would distort response functions. The monkeys' performance at this central fixation task was over 90% correct.
Recording techniques. When the animal's performance had asymptoted (after 100,000-150,000 trials) (Fig. 2 A), recordings were made from V4 in both cerebral hemispheres. The hemisphere contralateral to the trained location provided neurons with receptive fields overlapping the trained location. The hemisphere ipsilateral to the trained location provided neurons with receptive fields that overlapped a location in the opposite hemifield that was mirror-symmetric to the trained location. We refer to this mirror-symmetric position as the untrained location. Neurons with receptive fields overlapping the untrained location provided a control sample against which the effect of training could be compared.
Behavioral performance. A, Improvement in orientation difference threshold. Each symbol represents data from a different subject. Each set of data was fit with an exponential, and the parameters for each exponential are given in the key. B, Spatial specificity of training. Performance thresholds were measured for monkeys 2 and 3 at different locations using a staircase procedure. Thresholds rose rapidly for even small stimulus offsets, demonstrating spatial specificity of the training. Each set of data was fit with a line that was constrained to intercept the y-axis at the animal's threshold at the trained location.
For each hemisphere, a recording chamber was implanted over V4, and a craniotomy was made to allow access to transdural Pt-Ir microelectrodes (impedance, ∼1 MΩ at 1 kHz). The electrodes were advanced by a hydraulic microdrive mounted over the chamber before daily recording sessions. Spikes from individual neurons were isolated using a window discriminator, and the time of their occurrence was recorded with a resolution of 1 msec. When searching for neurons in V4, we presented Gabors with different sizes, spatial frequencies, and orientations at a rate of 25 per second at the trained (or untrained) location while animals were performing the central match-to-sample task. In our previous experience, these rapidly flashed stimuli were effective for generating responses from neurons with different orientation, spatial frequency, and size preferences. Data were recorded from all neurons that responded to these stimuli.
Once the spikes of a neuron were isolated, we measured its orientation and spatial frequency tuning. Responses to eight orientations (22.5° intervals) and five spatial frequencies (0.5, 1, 2, 4, and 8 cycle/degree) were recorded. Different stimuli were interleaved randomly, and at least eight repetitions of each stimulus were presented. The stimuli for these measurements were Gabors of the same size (σ = 0.5°) as the training stimulus, centered on either the trained or the untrained location. Monkeys performed the central match-to-sample task while these measurements were made. Approximately 100 neurons were recorded from each hemisphere in the course of ∼20 electrode penetrations. We recorded both trained and untrained samples in monkeys 1 and 2 and a trained sample in monkey 3.
In monkeys 1 and 3, we made additional recordings to determine whether neuronal responses differed when the animal performed the trained task. For these recordings, we interleaved trials of the central and peripheral match-to-sample tasks, with the peripheral match-to-sample stimuli present in both cases.
Specificity of behavioral training. After all recordings were completed, we tested the specificity of the training by measuring behavioral performance on the matching task using stimuli presented with different orientations and in different locations. The threshold for each condition was obtained using a staircase procedure that converged at 79% correct. For monkey 1, which was the subject that performed the task with distractor stimuli, the threshold was measured with and without distractors. This animal showed no difference between the two conditions; therefore, the data were combined in Results.
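The text does not state which staircase rule was used. One common rule that converges near 79% correct is the 3-down/1-up staircase, which settles where the probability of three consecutive correct responses is 0.5. The sketch below assumes that rule; the function names and the fixed step size are hypothetical.

```python
def staircase_convergence(n_down):
    """Proportion correct at which an n-down/1-up staircase converges.

    The staircase is stable where p ** n_down == 0.5.
    """
    return 0.5 ** (1.0 / n_down)

def update(level, streak, correct, step, n_down=3):
    """Return (new_level, new_streak) after one trial.

    level is the orientation difference in degrees; streak counts
    consecutive correct responses at the current level.
    """
    if correct:
        streak += 1
        if streak == n_down:
            # n_down correct in a row: make the task harder.
            return max(level - step, 0.0), 0
        return level, streak
    # Any error: make the task easier and reset the streak.
    return level + step, 0
```

For `n_down=3`, `staircase_convergence` gives approximately 0.794, consistent with the 79% figure quoted above.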
Data analysis. For each neuron, responses to different orientations were fit with a wrapped Gaussian using a downhill simplex method (Press et al., 1988) (Eq. 1).
Four parameters were obtained from this fit: preferred orientation θpref, bandwidth σ, tuning amplitude a, and baseline firing rate b. The neuronal response variance was determined by a least squares fit of a proportionality constant to the spike counts and spike count variance from responses to each of the orientations. To assess the ability of individual neurons to detect small differences in orientation, we used a d' measure (Green and Swets, 1966). d' was based on the best-fitting functions for orientation tuning and variance. For each neuron, d' was the slope of the orientation tuning curve at a given orientation, r′(θ), divided by the predicted variance of the response to the stimulus at that orientation (Eq. 2). We also defined discriminability as the inverse of d', which corresponds to the smallest orientation change that would permit 75% performance in a two-alternative forced-choice task.
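The displayed equations for the tuning function and d' are not available in this text, so the following is only a sketch of one conventional form: a Gaussian summed over 180° wraps, with a numerical slope. The exact functional form, the truncation of the sum, the use of the slope magnitude, and the `use_sd` option are assumptions. By default the denominator follows the wording above (predicted variance, k times the mean response); `use_sd=True` switches to its square root, the more usual signal-detection normalization. The simplex fitting step itself is not shown.

```python
import math

def wrapped_gaussian(theta, pref, sigma, amp, base, period=180.0, n_wraps=3):
    """Tuning curve value at orientation theta (degrees)."""
    total = 0.0
    for n in range(-n_wraps, n_wraps + 1):
        # Sum Gaussian copies shifted by multiples of the orientation period.
        d = theta - pref + n * period
        total += math.exp(-d * d / (2.0 * sigma * sigma))
    return base + amp * total

def tuning_slope(theta, pref, sigma, amp, base, h=1e-3):
    """Central-difference estimate of r'(theta), per degree."""
    hi = wrapped_gaussian(theta + h, pref, sigma, amp, base)
    lo = wrapped_gaussian(theta - h, pref, sigma, amp, base)
    return (hi - lo) / (2.0 * h)

def d_prime(theta, pref, sigma, amp, base, k, use_sd=False):
    """Slope magnitude divided by a noise term derived from k * mean response."""
    rate = wrapped_gaussian(theta, pref, sigma, amp, base)
    variance = k * rate  # variance assumed proportional to the mean response
    noise = math.sqrt(variance) if use_sd else variance
    return abs(tuning_slope(theta, pref, sigma, amp, base)) / noise
```

In this sketch d' is zero at the preferred orientation, where the tuning curve is flat, and largest on the flanks, which is why the best value across orientations is used to characterize each cell.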
To compare neuronal performance to behavioral performance, we used a similar method to calculate the d' of the neuron for discriminating a 6° orientation difference centered on 45°, which was comparable with the threshold of our worst-performing monkey. To estimate the neuronal performance of a population of n neurons, we used a bootstrap method. A sample was created by selecting n neurons randomly from the recorded population without replacement. We then calculated the d' of the sample population (Eq. 3). We repeated the same procedure 1000 times and used the mean of the sample populations as an estimate of the population d'.
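The resampling procedure can be sketched as follows. The pooling rule used here, the square root of the summed squared single-neuron d' values (appropriate for independent neurons), is an assumption, because the population equation is not available in this text. The function names and the fixed random seed are also illustrative.

```python
import math
import random

def pooled_d_prime(d_primes):
    """Combine single-neuron d' values, assuming independent neurons."""
    return math.sqrt(sum(d * d for d in d_primes))

def population_d_prime(all_d_primes, n, n_resamples=1000, rng=None):
    """Mean pooled d' over repeated random subsets of n neurons."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n_resamples):
        # Draw n distinct neurons (sampling without replacement, as in the text).
        sample = rng.sample(all_d_primes, n)
        total += pooled_d_prime(sample)
    return total / n_resamples
```

Evaluating `population_d_prime` for increasing `n` gives a curve of population d' against population size, from which the number of neurons needed to reach a criterion performance can be read off.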
Results
Behavioral performance
Three monkeys received extended training on the orientation match-to-sample task. Their thresholds for detecting an orientation change improved rapidly initially and approached an asymptote after ∼6 months and many thousands of trials. Monkey 1 reached a threshold near 5° (Fig. 2A). The training of the other two animals differed in that a computer-controlled staircase method kept the orientation change near threshold throughout training. These other animals improved more quickly and achieved thresholds that were near 2°. We do not think using distractors when training monkey 1 was the cause of the performance difference. We previously trained another monkey without distractors and without a computer-controlled staircase, and its behavior and neurophysiology were comparable with those of monkey 1 (Ghose et al., 2002). Each animal achieved performance that was far superior to that of naive human subjects, who typically have thresholds between 15 and 20°.
We measured the effects of stimulus location on the monkeys' performance. We tested the behavioral threshold for monkey 1 for stimuli presented in a visual hemifield opposite to the trained location. The threshold there was 11.0°, about twice that in the trained location. The threshold for monkey 1 appeared to be uniform within the trained quadrant (Ghose et al., 2001). The behavioral testing of monkeys 2 and 3 was more extensive than that for monkey 1. Within the trained quadrant, their performance declined with distance from the trained location (Fig. 2B), with thresholds being about twice as high when the stimulus was offset by as little as 2°, an offset comparable with the size of typical V4 receptive fields at this eccentricity (Desimone and Ungerleider, 1986). Overall, we found the learning specific to the trained location in all monkeys. The lack of location specificity within the trained quadrant in monkey 1 may be attributable to the fact that it did not achieve the same performance level as the other two monkeys.
The training was also specific to orientation. The monkeys' thresholds increased for stimuli centered on orientations other than 45°. However, there is evidence that the thresholds measured at other orientations did not reflect the monkeys' true ability to discriminate orientation differences. Thus, we were unable to get accurate assessments of the effect of stimulus orientation (for more detailed results and discussion, see Ghose et al., 2001).
The orientation-change thresholds were not affected by small adjustments of the spatial frequencies of the stimuli (p ≫ 0.05; spatial frequencies from 1 to 6 cycles/degree, fixed or varying between sample and test). The threshold was also unchanged when we altered the temporal frequency (0-8 Hz) or spatial or temporal phases of the Gabor stimuli. These results confirm that the monkeys were performing the task based on orientation.
Effects of training on the orientation tuning of neurons
We recorded the responses of 524 individual V4 neurons that had receptive fields overlapping the trained location. These neurons comprise a group we refer to as the trained population. Four hundred thirty-eight neurons recorded in the other cerebral hemisphere that had receptive fields in a location mirror symmetric across the vertical meridian served for comparison (the untrained population). To ensure that the appropriate retinal location was sampled, responses from the trained population were recorded using Gabors that were the same size as the training stimuli and centered on the trained location, rather than adjusted to the receptive field location and size of the cell. The control populations were similarly tested using Gabors centered on the mirror-symmetric location.
The neurons that we recorded were well driven by these stimuli. Figure 3 shows the distributions of neuronal responses to their preferred Gabor orientation and spatial frequency. The median responses for the trained and control populations were 26.6 and 24.7 spikes/sec, which compares favorably with responses recorded from V4 using Gabors of optimal size and position (McAdams and Maunsell, 1999).
Strength of neuronal responses. The strongest response for each neuron was taken from the eight orientations and five spatial frequencies that were tested. The median responses are shown in dashed lines (26.6 spikes/sec for the trained population and 24.7 spikes/sec for the control population). Neurons in both populations were adequately driven by the stimuli centered on the trained and untrained locations.
Responses to eight different orientations were recorded from each neuron and fitted with a wrapped Gaussian function (Fig. 4). The fits yielded four parameters: preferred orientation, tuning bandwidth, tuning amplitude, and baseline response. Additionally, we estimated the variance of the response of each cell by finding the proportionality constant that best predicted the variance of responses to each orientation based on the mean response. Finally, we calculated the maximum d' of a neuron across all orientations based on the fitted orientation tuning curve and estimated response variance (see Materials and Methods).
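The variance estimate described above can be sketched as a least-squares fit of a single proportionality constant k in the model variance = k × mean. Fitting an unweighted line through the origin is an assumption; the text does not specify any weighting. The function name is hypothetical.

```python
def fit_variance_constant(means, variances):
    """Least-squares k minimizing sum((v - k * m) ** 2) over orientations."""
    num = sum(m * v for m, v in zip(means, variances))
    den = sum(m * m for m in means)
    if den == 0:
        raise ValueError("all mean responses are zero")
    return num / den
```

Given k, the predicted variance at any orientation is k times the fitted mean response there, which is the quantity that enters the d' calculation.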
Determination of orientation tuning. The average rates of firing of one neuron in response to different orientations are plotted (8-16 repetitions of each orientation). Error bars are ± 1 SEM. The solid curve is the best-fitting wrapped Gaussian function. Peristimulus time histograms of the response to four orientations are shown in the inset. Dark portions of the histograms represent the 500 msec when the stimulus was present.
The overall effect of training on orientation tuning curves in V4 is shown in Figure 5, which combines data from the neurons recorded from monkeys 1 and 2. Plots in the first four columns show distributions for each of the parameters from the orientation tuning functions, with separate distributions for the trained (top row) and control (bottom row) populations.
Orientation tuning of the trained and untrained populations. Each column contains distributions for one orientation tuning parameter for trained (black) and untrained (gray) populations. The dashed lines represent median values. p values are shown where there is a significant difference between the trained and the control populations. Training reduced the width and increased the height of orientation tuning curves. These effects combined to significantly improve discriminability.
Although the trained and control populations did not differ dramatically for any value, some of the differences were statistically significant (Table 1). There was a bias in the distribution of preferred orientations in the trained population (Rayleigh test; p < 0.05). Unexpectedly, the number of neurons with preferred orientations near the trained orientation was significantly lower than the average (V test; p < 0.05). For individual monkeys, only monkey 1 showed a significant bias. No bias existed in the control population.
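A test of this kind for a bias in preferred orientations can be sketched as follows. Orientation is axial (period 180°), so angles are doubled before computing the mean resultant length. The p value uses the first-order large-sample approximation exp(-Z); both choices are assumptions about how such a test is commonly applied, not details taken from the text.

```python
import math

def rayleigh_axial(orientations_deg):
    """Return (mean resultant length, Z statistic, approximate p value)."""
    n = len(orientations_deg)
    # Double the angles so that 0 and 180 degrees map to the same direction.
    c = sum(math.cos(math.radians(2.0 * o)) for o in orientations_deg)
    s = sum(math.sin(math.radians(2.0 * o)) for o in orientations_deg)
    r_bar = math.hypot(c, s) / n
    z = n * r_bar ** 2
    p = math.exp(-z)  # first-order approximation, reasonable for large n
    return r_bar, z, p
```

A uniform distribution of preferences gives a resultant length near zero and a p value near 1, whereas clustering around any single orientation drives the p value toward zero.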
Tuning parameters of V4 neurons
The average bandwidth of orientation tuning was 13% narrower (4.5°) for the trained population. Both animals showed this trend, but it reached statistical significance only for the combined data. Although we did not collect control data from monkey 3, the average bandwidth for the trained population in this subject was similar to that for monkeys 1 and 2 (31.0 ± 2.1° vs 32.5 ± 1.2°).
The amplitude of orientation tuning curves in V4 (i.e., the difference between responses to the preferred and null orientations) also increased with training. The average amplitude increased by 14% (2.1 spikes/sec) after orientation discrimination training. This difference reached statistical significance for each animal individually. The baseline firing (i.e., response to the null orientation) did not show consistent changes. There was a significant decrease with training for monkey 1 but a slight increase for monkey 2, leading to no significant effect for the combined data.
To assess the ability of a neuron to discriminate different orientations, we calculated its d' and discriminability. Because these values are functions of orientation, we took the best performance (smallest discriminability and largest d') of each cell across all orientations. The final columns in Figure 5 and Table 1 show that training increased d' by 24% (0.009), corresponding to improving discriminability by 33% (4.2°). This effect was significant for each animal individually.
The values in Figure 5 and Table 1 are based on fitted orientation tuning functions. An alternative approach to assessing the effects of training on orientation tuning that does not depend on fitted functions is to construct population tuning curves. Figure 6 shows such curves for the trained and untrained populations. The curves were constructed by normalizing the responses of each neuron to its response to its preferred orientation, shifting the data left or right to bring the preferred orientation to the center of the x-axis, and then averaging responses at each orientation across neurons. Then, we fitted Gaussian curves to each set of average responses (trained and untrained). Consistent with expectations from the distributions of parameters from fits to individual neuron responses, the average orientation tuning for the trained population was substantially narrower, and the amplitude of the tuning function was larger. The bandwidth of the Gaussian for the trained population was 24.3°, whereas that for the control population was 30.0°. The amplitude of the tuning curve for the trained population was 0.51, whereas that for the untrained population was 0.41.
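The construction of a population tuning curve can be sketched as below. The sketch assumes that each neuron's responses are stored as a list over equally spaced orientations spanning 180°, and it takes the tested orientation with the largest response as the preferred orientation; both are simplifying assumptions, and the function names are hypothetical.

```python
def align_and_normalize(responses):
    """Circularly shift responses so the peak sits at the center, scaled to 1."""
    n = len(responses)
    peak = max(range(n), key=lambda i: responses[i])
    center = n // 2
    shift = center - peak
    # Orientation is circular, so a wrap-around shift preserves the curve.
    return [responses[(i - shift) % n] / responses[peak] for i in range(n)]

def population_curve(neurons):
    """Average the aligned, normalized curves across neurons."""
    aligned = [align_and_normalize(r) for r in neurons]
    n = len(aligned[0])
    return [sum(a[i] for a in aligned) / len(aligned) for i in range(n)]
```

The averaged curve has a value of 1 at its center by construction, so differences between populations appear in how quickly the curve falls off on either side of the peak.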
Population orientation tuning curves aligned to the preferred orientation of each cell. The tuning curve of each cell was normalized to its maximum response, and its preferred orientation was assigned a value of 0°. Black points are the average for the trained population; gray points are the average for the untrained population. The average points for each population have been fit with a Gaussian function. The trained population has sharper tuning (σ = 24.3° vs σ = 30.0°). The error bars are the SEM, which are smaller than the symbols for some points.
Orientation tuning as a function of preferred orientation
Training did not significantly increase the number of neurons that preferred orientations close to the trained orientation. Nevertheless, we were interested in seeing whether its effects on orientation tuning bandwidth and amplitude were preferentially distributed among those neurons that responded best to the trained orientation.
The left column in Figure 7 shows the average orientation tuning bandwidth, amplitude, and maximum d' as a function of the preferred orientation of the neurons. Values from trained (black) and untrained (gray) populations are plotted separately. The vertical offsets between the two lines in each pair reflect the overall changes between the trained and control populations that were described above. Additionally, there is a tendency for the differences between the trained and control values to be larger for neurons that preferred orientations close to the trained orientation (solid lines), compared with cells that preferred orientations far away from the trained orientation (dashed lines). We examined this further by separately pooling the data from neurons with preferred orientations in the two ranges (right column) and testing the effects of training and preferred orientation using a two-way ANOVA (Table 2).
Orientation tuning parameters as a function of the orientation preference of each cell. In the left column, values for tuning bandwidth, amplitude, and maximum d′ are plotted as functions of the orientation preference of the neurons from which they were obtained. Black lines indicate the trained population, and the gray lines indicate the untrained population. Solid lines indicate the half of the orientation range closest to the trained orientation (45°), which is at the center of the x-axis. Error bars are SEM and are plotted on only one side of each line. Bar plots on the right pool the values from the orientations closest to and furthest from the trained orientation separately for the trained and untrained populations. p values are shown when there is a significant difference between groups. See also Table 2. The effects of training on bandwidth and maximum d′ were most pronounced among neurons with orientation preferences close to the trained orientation.
Two-way ANOVA for the Gaussian fitting parameters
Within the trained population, cells preferring orientations near the trained orientation had significantly narrower tuning curves and higher maximum d' than neurons preferring other orientations. The change in tuning curve amplitude was not significantly different between the two groups. As expected, no significant differences related to orientation preference were seen in the untrained population. Thus, training showed some specificity not only for neurons with receptive fields in the trained location but also for those neurons that preferred orientations close to the trained orientation.
Spatial frequency tuning
From the earliest stages of training, the animals performed the match-to-sample task using two specific spatial frequencies (1.0 and 4.0 cycles/degree). Because spatial frequency was irrelevant for this task, it provides a check for the effects of extensive exposure to particular stimuli. Responses to a range of spatial frequencies were collected from each neuron, and these data were used to construct spatial frequency tuning functions (see Materials and Methods). The spatial frequency tuning functions for the trained and control populations are compared in Figure 8. Figure 8A shows the distribution of preferred spatial frequencies. The distribution of spatial frequency preference did not significantly differ between the two groups (paired t test; p > 0.9), nor was there any obvious increase in the number of neurons preferring the two spatial frequencies used in the task. Similarly, there were no significant differences between the average bandwidths or amplitudes of the spatial frequency tuning curves for the trained and control populations (Fig. 8B,C). Thus, the effects of training seem to be specific to the stimulus parameter that was relevant to the performance of the task.
Spatial frequency tuning. A, The distribution of preferred spatial frequency for trained and untrained neurons. B, Comparison of tuning bandwidths of neurons from the trained and untrained populations. C, Comparison of tuning amplitude of neurons from the trained and untrained populations. The error bars are 95% confidence intervals. Training on the orientation discrimination task did not produce detectable changes in spatial frequency tuning.
Effects of behavioral state
The analyses presented above were based on recordings done while the monkeys were not performing the orientation discrimination task (see Materials and Methods). Responses were collected this way so that we could freely alter stimulus parameters without affecting training, and to reduce the possibility that the animal might attend preferentially to stimuli oriented close to the trained orientation, leading to distortions of the tuning curves.
We wanted to know whether neurons would respond differently when monkeys were engaged in the orientation match-to-sample task. Because we could not obtain reliable tuning curves when the monkeys were performing the discrimination task, the effects of performing the task were examined using the task stimuli (i.e., orientations close to 45°). For two animals (monkeys 1 and 3), we interleaved blocks of trials in which the animals did the orientation discrimination task with blocks of trials in which they worked on a simple match-to-sample task at a central fixation point while the same stimuli were presented at the trained location (see Materials and Methods).
Neurons responded more strongly when the monkeys performed the task using the receptive field stimuli. Each point in Figure 9A indicates the average response of one neuron to a given stimulus in the two behavioral conditions. The vertical axis represents responses while the animal was doing the task using the receptive field stimuli, and the horizontal axis represents responses while the animal was doing the central matching task. Responses were stronger when the animal performed the match-to-sample task (mean responses: monkey 1, 20.6 ± 2.0 vs 18.4 ± 1.9 spikes/sec, +12%; monkey 3, 23.7 ± 2.0 vs 17.8 ± 1.6 spikes/sec, +33%). Monkey 3 may have shown a larger effect of attention because it performed the task with a smaller orientation change (2 vs 6° for monkey 1). Consistent with previous results from V4 (McAdams and Maunsell, 1999), performing the task caused no significant change in the variance of neuronal responses, as measured by the Fano factor (mean ratio between tasks: monkey 1, 2.9 ± 0.2 vs 2.8 ± 0.2 spikes/sec; monkey 3, 1.7 ± 0.1 vs 1.6 ± 0.1 spikes/sec). A small increase in neuronal responses when animals were attending to the stimulus in the receptive field is consistent with many previous studies (Haenny et al., 1988; Motter, 1994a,b).
Effect of behavioral state on neuronal responses. The left column shows the data from monkey 1, and the right column shows data from monkey 3. The x value of each point corresponds to data collected while the animal was ignoring the stimulus in the receptive field; the y value corresponds to data collected while the animal was using the stimulus to do the match-to-sample task. The dashed lines are the diagonals, and the dotted lines are the fitted lines. A, Average rate of firing in response to stimuli under the two behavioral conditions. B, Fano factor. The Fano factor is the ratio of the variance of the spike counts to the mean spike count across stimulus presentations. It provides a measure of the reliability of the response of a neuron. Attention to the stimuli increased the strength of responses slightly but did not change the Fano factor appreciably.
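The Fano factor defined in the caption above can be computed with a few lines of code. The following is a minimal sketch, not the authors' analysis code: the function name `fano_factor`, the use of the unbiased sample variance (`ddof=1`), and the spike counts are all assumptions made for illustration.

```python
import numpy as np

def fano_factor(spike_counts):
    """Variance of the spike counts divided by their mean, across presentations.

    Uses the unbiased sample variance (ddof=1); this choice is an assumption,
    not something stated in the text.
    """
    counts = np.asarray(spike_counts, dtype=float)
    mean = counts.mean()
    if mean == 0:
        return float("nan")  # the ratio is undefined for a silent neuron
    return counts.var(ddof=1) / mean

# Hypothetical spike counts from repeated presentations of one stimulus
example_counts = [4, 6, 5, 7, 3, 5]
example_fano = fano_factor(example_counts)
```

Comparing this value for the same neuron and stimulus across the two behavioral conditions gives one point in a plot such as panel B.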
V4 neuronal population performance and comparison to V1/V2
Although the changes in orientation tuning properties of V4 neurons were significant, it was unclear whether such changes could account for the observed behavioral improvement. Based on our sample populations, we calculated the performance of the V4 neurons for discriminating orientations close to 45° (Fig. 10) (see Materials and Methods). When we pooled neurons from the whole V4 population, we did not observe a significant difference between the trained population and the untrained population: 49 trained neurons or 50 control neurons were required to reach 79% performance. However, when we pooled only neurons with preferred orientations close to 45°, the trained population performed significantly better than the untrained population. To reach the same behavioral performance, only 25 trained neurons were required, whereas 34 control neurons were needed. This result is consistent with training effects that were largest among the neurons that were most relevant to the task (Fig. 7).
Population performance. The neuronal population d′ is plotted as a function of population size. The d′ is based on discriminating a 6° orientation difference centered on 45° (see Materials and Methods). Plots in the left column are based on all the neurons recorded in each population. Plots on the right are based on only the neurons with preferred orientations within 45° of the trained orientation. The solid lines are trained populations, and the dashed lines are the untrained populations. The horizontal dotted line is the d′ corresponding to 79% neuronal performance. The number of neurons required for each population to achieve this performance is indicated for each line.
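The idea of reading off the number of neurons needed to reach a criterion d′ can be sketched as follows. This is an illustration under stated assumptions, not the procedure from Materials and Methods: it assumes independent neurons, so that the pooled d′ is the square root of the sum of squared single-neuron d′ values, and it averages over random subsets of each size. The function name `neurons_needed`, its parameters, and all numerical values are hypothetical.

```python
import numpy as np

def neurons_needed(single_neuron_dprimes, criterion_dprime, n_draws=1000, seed=0):
    """Smallest population size whose mean pooled d' reaches the criterion.

    Assumes independent neurons, so pooled d' = sqrt(sum of squared d').
    Returns None if even the full sample falls short of the criterion.
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(single_neuron_dprimes, dtype=float)
    for n in range(1, d.size + 1):
        # Average the pooled d' over random subsets of size n (no replacement)
        pooled = [
            np.sqrt(np.sum(rng.choice(d, size=n, replace=False) ** 2))
            for _ in range(n_draws)
        ]
        if np.mean(pooled) >= criterion_dprime:
            return n
    return None

# Hypothetical populations: every neuron contributes the same small d'
trained_like = [0.3] * 40
control_like = [0.2] * 40
n_trained = neurons_needed(trained_like, criterion_dprime=0.95, n_draws=20)
n_control = neurons_needed(control_like, criterion_dprime=0.95, n_draws=20)
```

In this toy case the population with the larger single-neuron d′ reaches the criterion with fewer neurons, which is the kind of comparison the figure makes between trained and untrained populations.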
Our laboratory has previously reported that this same orientation discrimination task did not produce detectable changes in either V1 or V2 (Ghose et al., 2002). One of the subjects in this study (monkey 1) was also a subject in the previous experiment. The top two rows in Figure 10 show a reanalysis of the V1 and V2 data from that study. Although training did not affect neurons in V1 and V2 appreciably, the performance of those neurons is superior to that of neurons in V4, even after training. This difference may arise from differences in the way the measurements were made in the two studies. For V1 and V2, responses were measured using a stimulus whose size and position were optimized for each neuron. For V4, we always used stimuli that were located at the center of the training location and that had the same size as the training stimuli. This allowed us to record neurons more rapidly and provided a good estimate of neuronal responses during the trained task. It is possible that testing V4 neurons with stimuli that were not optimized produced an offset between the values for V1/V2 and those for V4. In contrast, we do not believe that optimizing the stimuli for V1/V2 recording obscured effects of training in those areas. Most neurons in V1 and V2 had receptive fields centered within 0.5° of the center of the trained location, and there was no correlation between the distance of the receptive field center from the trained location and any measured response parameter (Ghose et al., 2002).
Discussion
Plasticity in V4
We have found that extended training on an orientation discrimination task changes the response properties of V4 neurons that have receptive fields overlapping the trained visual field location. Training makes responses stronger and narrows orientation tuning. Correspondingly, neurons can signal orientation differences more reliably.
Although the changes were modest, they distinguish V4 from areas V1 and V2, which have been investigated in several previous studies. Our laboratory previously reported (Ghose et al., 2002) that this same orientation discrimination task did not produce detectable changes in either V1 or V2 (Fig. 10). One of the subjects in the current study (monkey 1) was also a subject in the previous experiment. A similar investigation by Schoups et al. (2001) also found that training did not change the sharpness of orientation tuning in V1. They found an increase in the average absolute value of the slope of normalized orientation tuning functions at the trained orientation for neurons with a preferred orientation 12-20° from the trained orientation. However, slope measurements do not take response variability into account and, thus, do not fully capture the ability of a neuron to discriminate orientations. Our study (Ghose et al., 2002) did not find training to affect either orientation tuning slope or d′ in either V1 or V2.
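The distinction between tuning slope and d′ can be made concrete with a small numerical sketch. It uses a common definition of d′, the difference between mean responses divided by the square root of the average variance; whether this matches the exact definition in the cited studies is an assumption, and the response values are hypothetical.

```python
import math

def dprime(mean_a, mean_b, var_a, var_b):
    """Difference in mean response divided by the root of the average variance."""
    pooled_sd = math.sqrt((var_a + var_b) / 2.0)
    return abs(mean_a - mean_b) / pooled_sd

# Two hypothetical neurons with the same tuning slope: the mean response
# goes from 10 to 12 spikes between the two orientations in both cases.
quiet = dprime(10.0, 12.0, 4.0, 4.0)    # low response variability
noisy = dprime(10.0, 12.0, 16.0, 16.0)  # same slope, higher variability
```

Here the two neurons have identical slopes, yet the less variable neuron has twice the d′, which is why a change in slope alone does not establish a change in discriminability.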
Another experiment examined neuronal responses after training monkeys extensively on a bisection task (Crist et al., 2001). That training similarly produced no detectable changes in orientation tuning, receptive field size, or cortical magnification in V1. However, this training led to a greater modulation of responses to a stimulus inside the receptive field by a second stimulus outside the receptive field. This effect was only seen when the animal performed the task and, therefore, attended to the stimuli. Collectively, these single-unit studies from macaque monkeys suggest that training affects the responses of neurons in V1 and V2 but that its effects on basic neuronal tuning properties are modest compared with those described here for V4.
The mechanism by which orientation tuning properties changed in V4 neurons is not clear. Although it is possible that such changes arise from changes in intrinsic circuitry within V4, another possibility is that they depend on changes in connections between V4 neurons and ascending inputs from early visual areas. Dosher and Lu (1998) proposed a model in which perceptual learning mainly serves to alter the connections between neuronal outputs and a learned categorization structure. This model was shown to improve performance without requiring finer tuning of individual neurons. It is consistent with an improvement of neuronal performance occurring in V4 and not in V1.
Relationship to psychophysical studies of perceptual learning
The existence of greater neuronal plasticity in a later cortical area may seem at odds with human psychophysical studies showing that perceptual learning is highly specific for stimulus parameters such as location and orientation. For example, Ball and Sekuler (1987) found that training of motion discrimination led to an improvement in performance that was confined to a region covering only a few degrees around the trained location. Failure of training to transfer to sites more than a few degrees from the trained location has been shown for texture (Karni and Sagi, 1991) and orientation discrimination (Schoups et al., 1995) tasks. This last study additionally showed that improvement at the trained location was specific to the trained orientation. It has been suggested that the spatial and orientation specificity of perceptual learning argue for neuronal plasticity in V1, which contains the cortical representation with the most orderly topography, the smallest receptive fields, and the sharpest orientation tuning (Gilbert et al., 2001).
Nevertheless, the psychophysical data are far from conclusive about involvement of early visual cortex. There is little reason to believe that neurons in later stages of visual cortex could not support highly localized learning. Although average receptive field sizes are larger in later stages, there is a considerable range in every area. For example, receptive field sizes of V4 neurons overlap with those of V1 neurons (Desimone and Schein, 1987), and recent results suggest that even receptive fields in inferotemporal cortex (IT) can be small when animals are trained with precisely localized stimuli (DiCarlo and Maunsell, 2003). Similarly, orientation selectivity is not unique to V1. Although average orientation tuning is broader in later stages, broader tuning can, in some situations, provide better discrimination when signals from many neurons are combined (Pouget et al., 1999; Zhang and Sejnowski, 1999). Overall, there are no compelling reasons to exclude the possibility that changes in later stages of visual cortex contribute to, or dominate, perceptual learning.
Distribution of plasticity across visual cortex
The different levels of plasticity that have been seen between V4 and V1/V2 raise the question of the relative plasticity in different levels of cortical processing. One possibility is that V1 comprises a stable representation in adults and plasticity increases progressively in subsequent cortical levels. Consistent with this, several observations suggest that neurons in IT, at the highest level of the ventral stream, can be influenced substantially by visual experience in adult life. Kobatake et al. (1998) found that training on a visual recognition task with a particular set of stimuli led to a greater proportion of IT neurons that were responsive to those stimuli. Similarly, Sigala and Logothetis (2002) showed that training monkeys to categorize visual stimuli based on different features produces an enhanced neuronal representation of the diagnostic features relative to the nondiagnostic ones in the responses of IT neurons. Sakai and Miyashita (1994) found that IT neurons preferred the Fourier descriptor stimuli that were used in training. Other studies that have trained monkeys to distinguish novel visual stimuli have also typically found moderately large proportions of IT neurons that respond selectively to those stimuli (Sato et al., 1980; Logothetis et al., 1995; DiCarlo and Maunsell, 2000). It is likely that these selectivities developed over the course of training. Finally, Jagadeesh et al. (2001) showed that brief periods of training with novel stimuli could improve the ability of individual IT neurons to distinguish between those stimuli. Collectively, these results point to a high degree of plasticity in IT, consistent with the idea that plasticity increases in successive levels of processing in visual cortex.
The potential for high plasticity in IT raises the question of whether our orientation discrimination training would produce pronounced effects there. There are reasons to believe it may not. In particular, although IT is important for discriminating different patterns and forms, lesions of this region do not affect discrimination of a particular object presented in different orientations (Gross, 1978; Holmes and Gross, 1984; Vogels et al., 1997). Vogels and Orban (1994) examined the orientation tuning of IT neurons after orientation discrimination training and found it unchanged. Thus, although IT may be susceptible to experience, its plasticity may be specifically related to those stimulus parameters for which it has the greatest selectivity, such as shape and texture.
In addition to increasing plasticity at higher cortical levels, or as an alternative to it, the distribution of cortical plasticity may depend on which cortical areas contain neurons with response properties best matched to the task. For example, whereas exposure to complex forms might produce the greatest changes in IT, extensive practice on the discrimination of a simple visual attribute might produce the greatest changes in earlier levels, in which more neurons are highly selective to that attribute. It is unlikely that any detectable plasticity would occur in areas where neurons have no appreciable and consistent selectivity for the relevant stimulus dimension.
Finally, it is possible that neurons in all cortical areas have similar levels of plasticity. Neurons in V1 may be as susceptible to training as those in V4, but if V1 neurons are more strongly driven by visual scenes throughout the rest of the day, the specific effects of task training may be effectively overwritten in V1. Training with different stimuli and with controlled experience outside the training period should be able to distinguish between the possibilities for the distribution of plasticity across visual cortex.
Although our discussion has been limited to areas of the ventral pathway, the effects of training on this orientation matching task might also be found in other visual areas. For example, the caudal intraparietal area in the dorsal pathway was found to contain neurons that responded selectively to surface orientation (Taira et al., 2000; Tsutsui et al., 2001).
Comparison between the visual system and other sensory systems
The changes observed in the visual system are modest compared with those described in other sensory systems. Major topographical reorganization has been described in somatosensory and auditory systems after perceptual learning but has not yet been seen in the visual system. For example, Recanzone et al. (1992b) reported that the cortical representation of a trained skin region was 1.5-3 times larger than corresponding regions on untrained fingers. Similarly, training monkeys to do a sound frequency discrimination task increased the representation of the trained frequency in A1 (Recanzone et al., 1993). Studies in the early stages of visual cortex did not find changes even close to that scale. In our study, because of the irregular visuotopy of V4, we were not able to measure the topographical changes, but changes in topography have not been seen in V1 or V2 (Ghose et al., 2002; Crist et al., 2001). In addition to changes in topography, pronounced changes in receptive field size, tuning bandwidth, and temporal response properties have been described in primary auditory and somatosensory cortex (Recanzone et al., 1992a,b, 1993; Kilgard and Merzenich, 1998). No comparable changes have been observed in V1 either.
There are several possible sources for this difference. The visual system might be less plastic than the other systems. Selective somatosensory deafferentation leads to extensive reorganization in both the thalamus and cortex (Garraghty and Kaas, 1991; Nicolelis et al., 1991; Pettit and Schwark, 1993; Faggin et al., 1997; Jones, 2000). In contrast, localized retinal lesions produce dramatic changes in visual topography and receptive field size in V1 (Heinen and Skavenski, 1991; Gilbert and Wiesel, 1992; Chino et al., 1995) but few changes in the LGN (Gilbert and Wiesel, 1992; Darian-Smith and Gilbert, 1995). Stimulus differences may also explain why pronounced effects are not seen in primary visual cortex. The auditory and somatosensory perceptual tasks did not depend on response properties that were unique to cortical neurons. For example, training on tactile or auditory frequency discrimination (Recanzone et al., 1992a,b, 1993) involves stimulus parameters that are represented selectively by sensory receptors. In comparison, orientation-tuned cells are first found at the level of the cortex. It is possible that the sensory stages that first embody a response property are relatively immutable to changes in selectivity for that property. Finally, methodological or species differences are possible. Most of the auditory and somatosensory studies were performed using multiunit recording in anesthetized animals, whereas the current data are from single units in behaving animals. Many of those studies were performed in species such as owl monkeys or rats, and this might contribute to differences. In particular, macaque visual cortex contains vastly more neurons than the other sensory systems in which pronounced reorganization has been described. It is possible that primary cortex in the other systems represents a substantially larger range of cortical processing in those other systems than it does in the macaque visual cortex. 
Thus, learning-related plasticity might be more widely distributed in the visual system across different areas. All these factors could contribute to the observed difference in plasticity between the visual system and other sensory systems. Resolving which of these factors are most important will be an important step for understanding the neuronal mechanisms that support perceptual learning.
Footnotes
This work was supported by National Institutes of Health Grant R01 EY05911. J.M. is an Investigator with the Howard Hughes Medical Institute. We thank William H. Bosking, Yuzo M. Chino, Daniel J. Felleman, and Geoffrey M. Ghose for helpful discussions and critical comments on previous versions of this manuscript. We also thank Dennis J. Murray and Tori Williford for excellent technical assistance during all phases of this project.
Correspondence should be addressed to Dr. Tianming Yang, Department of Physiology and Biophysics, University of Washington, Seattle, WA 98195. E-mail: tyang{at}shadlen.org.
Copyright © 2004 Society for Neuroscience 0270-6474/04/241617-10$15.00/0