Visual perceptual learning models, as constrained by orientation and location specificities, propose that learning either reflects changes in V1 neuronal tuning or reweighting specific V1 inputs in either the visual cortex or higher areas. Here we demonstrate that, with a training-plus-exposure procedure, in which observers are trained at one orientation and either simultaneously or subsequently passively exposed to a second transfer orientation, perceptual learning can completely transfer to the second orientation in tasks known to be orientation-specific. However, transfer fails if exposure precedes the training. These results challenge the existing specific perceptual learning models by suggesting a more general perceptual learning process. We propose a rule-based learning model to explain perceptual learning and its specificity and transfer. In this model, a decision unit in high-level brain areas learns the rules of reweighting the V1 inputs through training. However, these rules cannot be applied to a new orientation/location because the decision unit cannot functionally connect to the new V1 inputs that are unattended or even suppressed after training at a different orientation/location, which leads to specificity. Repeated orientation exposure or location training reactivates these inputs to establish the functional connections and enable the transfer of learning.
Visual perceptual learning is known to be orientation and retinal location specific (Karni and Sagi, 1991; Fahle, 1994; Ahissar and Hochstein, 1997), placing strong constraints on many perceptual learning models by restricting the site of learning to the retinotopic and orientation-selective visual cortex. Various models propose that perceptual learning may result from training-induced modifications of recurrent horizontal connections in V1 that lead to sharpened neuronal tuning (Adini et al., 2002; Teich and Qian, 2003; Zhaoping et al., 2003) or from improved readout of V1 signals through response reweighting within the visual cortex (Poggio et al., 1992; Dosher and Lu, 1998). Alternatively, Mollon and Danilova (1996) hypothesized that learning occurs at a central site, but what is learned is the receptor arrangement along a certain orientation or the local retinal image properties, which still predicts orientation and location specificity of learning. The concept of central learning gains support from new single-unit evidence that motion-direction learning correlates with neural activity in a nonsensory decision area, the lateral intraparietal area (LIP), rather than in the motion-selective middle temporal area (MT) (Law and Gold, 2008). This finding has been modeled as a high-level decision unit refining its functional connections to sensory neurons responding to a specific motion direction through response reweighting, which again constrains the modeled learning to be motion-direction specific (Law and Gold, 2009).
However, orientation and location specificities may not be intrinsic properties of perceptual learning. Recently, we demonstrated that location specificity can be abolished by a feature and location double-training procedure (Xiao et al., 2008). For example, location-specific contrast discrimination learning can be rendered completely transferrable to a new location if the new location is trained with an irrelevant orientation-discrimination task. Although these results support the central learning hypothesis (Mollon and Danilova, 1996), the complete location transfer of learning indicates that learning does not simply reflect changes in functional connections between the decision unit and specific sensory neurons. Rather, some more general learning process must have been activated that can be applied to sensory inputs from a different neuronal population at a different retinal location.
However, a central and general learning process may also predict orientation nonspecificity in perceptual learning, which is in contradiction to the extant data. To resolve this contradiction, in this study we designed a training-plus-exposure (TPE) procedure to demonstrate complete transfer of learning across orientations. Based on the new evidence, we propose a rule-based learning model to explain perceptual learning and its specificity and transfer.
Materials and Methods
Observers and apparatus.
Fifty-seven observers with normal or corrected-to-normal vision participated in this study. All were new to psychophysical experiments and unaware of the purposes of the study. This research was approved by the Beijing Normal University Institutional Review Board, and informed consent was obtained from each observer.
The stimuli were generated by a PC-based WinVis program (Neurometrics Institute). Gabor stimuli were presented on a 21-inch Sony G520 color monitor (1024 × 768 pixels; 0.37 × 0.37 mm per pixel; 120 Hz frame rate; 50 cd/m² mean luminance), and the bar array stimuli were presented on a 21-inch Dell P1130 color monitor (1024 × 768 pixels; 0.37 × 0.37 mm per pixel; 150 Hz frame rate; 41 cd/m² mean luminance). Luminance of the monitors was linearized by an 8-bit look-up table. A chin-and-head rest helped stabilize the head of the observer. Experiments were run in a dimly lit room. Viewing was binocular.
The Gabor stimuli (Gaussian windowed sinusoidal gratings, with spatial frequency = 6 cycles per degree, SD = 0.17°, contrast = 0.47, and phase randomized for every presentation in the orientation discrimination task) (Fig. 1a), presented on a mean luminance background, were used in orientation and contrast discrimination tasks (Figs. 1, 2). The stimulus was viewed at a distance of 4 m through a circular opening (diameter = 17°) of a black piece of cardboard that covered the entire monitor screen. This control prevented observers from using external references like monitor edges to determine the orientations of the stimuli.
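The Gabor description above can be written out as a luminance function. The following is only a minimal sketch, not the WinVis implementation: the function name, the orientation convention (angle of the modulation axis), and the cosine carrier are assumptions made for illustration, while the spatial frequency, SD, contrast, and mean luminance are the values stated in the text.

```python
import math
import random


def gabor_luminance(x_deg, y_deg, orientation_deg, phase_rad,
                    sf_cpd=6.0, sigma_deg=0.17, contrast=0.47,
                    mean_lum=50.0):
    """Luminance (cd/m^2) of a Gabor at (x, y), in degrees of visual angle.

    Hypothetical helper: the orientation convention used here is an
    assumption, not taken from the paper.
    """
    theta = math.radians(orientation_deg)
    # Position along the axis of luminance modulation.
    u = x_deg * math.cos(theta) + y_deg * math.sin(theta)
    # Circular Gaussian window with the stated SD.
    envelope = math.exp(-(x_deg ** 2 + y_deg ** 2) / (2.0 * sigma_deg ** 2))
    # Sinusoidal carrier at the stated spatial frequency.
    carrier = math.cos(2.0 * math.pi * sf_cpd * u + phase_rad)
    return mean_lum * (1.0 + contrast * envelope * carrier)


def random_phase():
    """Draw a new carrier phase, as done for every presentation."""
    return random.uniform(0.0, 2.0 * math.pi)
```

With these parameters the luminance stays within mean_lum × (1 ± 0.47) and falls back to the mean background a few SDs away from the stimulus center.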
The bar arrays (Fig. 3a) used in the feature detection task were similar to those used by Ahissar and Hochstein (1997). Specifically, the 7 × 7 array of white bars (22.2 × 1.3 arcmin each) with an interbar distance of 42.5 ± 3.9 arcmin was presented on a black monitor screen and was viewed at a distance of 2 m. The target was an oddly oriented bar placed at either the second or the sixth bar location of the middle row of the array (Fig. 3a, red circles), which differed from other uniformly oriented background bars by 16°. The stimulus array was followed, at various stimulus onset asynchronies (SOAs), by a mask that was also a 7 × 7 array, with each element containing one pair of white bars oriented at the target and background orientations, and the other pair rotated by 90°.
Contrast- and orientation-discrimination thresholds were measured with a temporal two-alternative forced choice (2AFC) staircase procedure. In each trial, the reference and test (reference + Δcontrast or Δorientation) were separately presented in two stimulus intervals (92 ms each) in a random order separated by a 600 ms interstimulus interval. The observers judged in which stimulus interval the stimulus had a more clockwise orientation (orientation discrimination) or a higher contrast (contrast discrimination). The fixation cross was flashed for 200 ms and disappeared 200 ms before the onset of the first stimulus interval. The feature detection thresholds were measured with a single interval yes/no staircase procedure. Each trial started with a fixation, then the observer pressed the “ready” key and the stimulus was presented for 1 frame (6.7 ms), which was followed by a mask with a variable SOA. The observers judged whether the stimulus array contained an odd element (50% trials). Auditory feedback was given on incorrect responses in all tasks.
The staircases followed a three-down, one-up rule, which converges on the 79.4% correct level. The step size of the staircases was 0.05 log units. Each staircase consisted of four preliminary reversals and six experimental reversals. The geometric mean of the experimental reversals was taken as the threshold for each staircase run.
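The staircase rule can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the code used in the experiments: the function name, the `respond` callback, and the toy observer are hypothetical, and the same 0.05 log-unit step is assumed for preliminary and experimental reversals. The 79.4% figure follows from the rule itself: the staircase settles where three consecutive correct responses are as likely as not, i.e., p³ = 0.5.

```python
import math

# Three-down, one-up equilibrium: p**3 = 0.5, so p is about 0.794.
CONVERGENCE_LEVEL = 0.5 ** (1.0 / 3.0)


def run_staircase(respond, start_level, step_log=0.05,
                  n_preliminary=4, n_experimental=6):
    """Run one three-down, one-up staircase and return its threshold.

    respond(level) -> True for a correct response (hypothetical callback).
    """
    level = start_level
    correct_streak = 0
    last_direction = 0  # -1 = level decreased, +1 = increased, 0 = no move yet
    reversals = []
    while len(reversals) < n_preliminary + n_experimental:
        if respond(level):
            correct_streak += 1
            if correct_streak < 3:
                continue  # wait for three consecutive correct responses
            correct_streak = 0
            direction = -1  # three correct: make the task harder
        else:
            correct_streak = 0
            direction = +1  # one error: make the task easier
        if last_direction != 0 and direction != last_direction:
            reversals.append(level)  # the staircase changed direction here
        last_direction = direction
        level *= 10.0 ** (direction * step_log)
    # Threshold = geometric mean of the experimental reversals only.
    experimental = reversals[n_preliminary:]
    log_mean = sum(math.log10(v) for v in experimental) / len(experimental)
    return 10.0 ** log_mean


def toy_observer(level):
    """Deterministic stand-in observer: correct whenever level >= 1.0."""
    return level >= 1.0


threshold = run_staircase(toy_observer, start_level=2.0)
```

With the deterministic toy observer the staircase simply oscillates around a level of 1.0. A real observer responds probabilistically, so `respond` would instead sample from a psychometric function.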
Orientation specificity and transfer in orientation learning
We first replicated orientation specificity in a foveal orientation discrimination task. Six observers in the initial training phase practiced orientation discrimination (i.e., “Which interval contains a more clockwise orientation stimulus in a two-interval trial?”) for a Gabor stimulus (Fig. 1a) at one orientation (36° or 126°, denoted as Δori_ori1, meaning orientation discrimination at orientation 1). After seven 2 h sessions of practice on different days, significant learning was evident [mean percent improvement (MPI) = 32.3 ± 3.2%, p < 0.001, one-tailed paired t test] (Fig. 1b, left), but orientation discrimination for the same Gabor was not significantly improved at an untrained orthogonal orientation (Δori_ori2, MPI = 6.9 ± 3.8%, p = 0.067).
Based on our previous work (Xiao et al., 2008; Zhang et al., 2010), we hypothesized that learning takes place at a high-level decision stage beyond the retinotopic and orientation-selective visual areas. We reasoned that the orientation specificity might result because the high-level decision unit, which has learned at the trained orientation, cannot functionally connect to V1 inputs representing the untrained transfer orientation to transfer learning. These new V1 inputs are unattended and likely suppressed by the decision unit after attention has been directed to the trained orientation (Treue, 2001; Vidnyánszky and Sohn, 2005; Gál et al., 2009). We further hypothesized that the unattended and suppressed V1 inputs could be reactivated by repeated exposure of the transfer orientation, so that the decision unit and new V1 inputs can be functionally connected to enable learning transfer.
To test this hypothesis, we designed a TPE procedure in which an observer is trained at one orientation and simultaneously or at a later time passively exposed to the transfer orientation while performing an irrelevant task. Specifically, following a successive TPE procedure, several weeks later the same six observers started the second exposure phase in which they were exposed to orientations around the transfer orientation (ori2 ± 10°) in a contrast discrimination task (Δcon_ori2). The purpose of having the observers perform demanding near-threshold contrast discrimination around ori2 was to control attention (Ahissar and Hochstein, 1993), i.e., to divert attention away from the stimulus orientation. As our control condition (Fig. 1d) shows, contrast discrimination training alone had little impact on orientation discrimination performance, so that any potential improvement of orientation discrimination at ori2 after the exposure phase was not learned at ori2, but transferred from ori1. The ±10° orientation jitter was used to stimulate neurons responding to the reference orientation and to-be-discriminated orientations. After training, contrast discrimination performance was significantly improved (Δcon_ori2; MPI = 25.3 ± 4.3%, p < 0.001) (Fig. 1b, right). Importantly, orientation discrimination for the untrained ori2 (Δori_ori2) was also improved (MPI = 21.8 ± 5.8%, p = 0.007). The overall MPI of Δori_ori2 after the TPE procedure was 28.0 ± 3.8% (p < 0.001), not significantly different from that of trained Δori_ori1 (p = 0.36), suggesting complete transfer of orientation learning from ori1 to ori2. The successive TPE results are summarized in the left section of Figure 1e.
We used a transfer index (TI) to compare the transfer of learning among different training conditions: TI = MPI_transfer/MPI_trained, where MPI_transfer and MPI_trained are the mean percent improvements at the transfer and trained conditions, respectively. TI = 1 would indicate complete transfer and TI = 0 would indicate zero transfer. For the above TPE procedure, TI was 0.19 after phase I training, which increased significantly to 0.90 after phase II orientation exposure (p = 0.013).
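The index is a simple ratio, sketched below. The function names are hypothetical, and the percent-improvement formula, (pre − post)/pre × 100, is an assumption: it is the conventional definition, but it is not spelled out in the text.

```python
def percent_improvement(pre_threshold, post_threshold):
    """Percent threshold reduction from pretest to posttest (assumed formula)."""
    return 100.0 * (pre_threshold - post_threshold) / pre_threshold


def mean_percent_improvement(pre_thresholds, post_thresholds):
    """MPI: percent improvement averaged across observers."""
    values = [percent_improvement(pre, post)
              for pre, post in zip(pre_thresholds, post_thresholds)]
    return sum(values) / len(values)


def transfer_index(mpi_transfer, mpi_trained):
    """TI = MPI at the transfer condition / MPI at the trained condition."""
    return mpi_transfer / mpi_trained


# Group MPIs reported for the successive TPE procedure.
ti_phase1 = transfer_index(6.9, 32.3)    # after phase I training
ti_phase2 = transfer_index(28.0, 32.3)   # after phase II exposure
```

Applied to the group MPIs, the ratio gives about 0.21 and 0.87, close to but not identical with the reported 0.19 and 0.90. One possible reason, which the text does not confirm, is that the reported values were computed per observer and then averaged.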
To confirm the above TPE results, six new observers completed a simultaneous TPE procedure in which they practiced Δori_ori1 and Δcon_ori2 simultaneously in alternating blocks of trials (staircases). During Δcon_ori2 training, they were exposed to orientations around the transfer orientation (ori2 ± 10°). This TPE procedure significantly improved performance at Δori_ori1 (MPI = 37.2 ± 8.8%, p = 0.004) (Fig. 1c,e) and Δcon_ori2 (MPI = 17.7 ± 6.4%, p = 0.020). Again, orientation discrimination at untrained ori2 (Δori_ori2) was also improved (MPI = 31.6 ± 6.3%, p = 0.002), which was not significantly different from the improvement at the trained Δori_ori1 (p = 0.18), again suggesting a complete transfer of orientation learning (TI = 1.16). These simultaneous TPE results are summarized in the middle section of Figure 1e.
A control experiment excluded the possibility that improved orientation discrimination at Δori_ori2 resulted from contrast training around ori2 alone. Contrast discrimination training around ori2 (Δcon_ori2; with ± 10° jittering of the Gabor orientation) significantly improved contrast performance in six new observers (MPI = 26.9 ± 2.1%, p < 0.001) (Fig. 1d,e), but this contrast learning had no significant impact on orientation discrimination at the same orientation (Δori_ori2; MPI = 7.7 ± 5.8%, p = 0.083).
Interestingly, a reversed TPE procedure (i.e., ori1 training after, rather than before, ori2 exposure) did not generate the same transfer of learning. The same observers in the control experiment continued to practice orientation discrimination at ori1 for five more sessions (Δori_ori1; MPI = 22.2 ± 5.1%, p = 0.004) (Fig. 1d,e). However, this time orientation learning did not transfer much to the untrained Δori_ori2 (MPI = 5.0 ± 2.6%, p = 0.053) (Fig. 1d,e). These reversed TPE results are summarized in the right section of Figure 1e. We will return to this interesting result in the Discussion.
Remarkably, after this reversed TPE procedure, untrained contrast discrimination at ori1 was also improved (Δcon_ori1; MPI = 22.6 ± 3.6%, p = 0.001) (Fig. 1d, solid green circle with black outline) as much as learning at trained Δcon_ori2 (p = 0.13), assuming that the pretest contrast discrimination thresholds were similar at the two orientations. Here the reversed TPE procedure became the regular successive TPE procedure regarding contrast learning. This transfer seems to point to the generality of the TPE training effects. However, it is not clear whether the transfer of learning to Δcon_ori1 resulted from the successive TPE training first at Δcon_ori2 and then at Δori_ori1, because we do not know how much learning at trained Δcon_ori2 alone had transferred to Δcon_ori1. This issue of generality is addressed in Figure 2.
We also examined whether the transfer of learning enabled by the TPE procedure was specific to the transfer orientation. Five new observers were trained with the same simultaneous TPE procedure and changes in orientation discrimination thresholds were measured at 0°, 15°, 30°, 45°, and 60° away from the transfer orientation (36° or 126°) where contrast discrimination was trained. To reduce the potential impact of measuring thresholds at one orientation on the performance of neighboring orientations, the neighboring orientations were placed either clockwise or anticlockwise from ori2. As a comparison, another baseline group of five new observers were trained with orientation discrimination only and changes of orientation performance were measured at similar orientation deviations from the transfer orientation. We found that the transfer of learning following the TPE procedure was strongest at the exposed transfer orientation (Fig. 1f). Beyond a 30° deviation from the transfer orientation, there were no significant performance differences between the baseline and TPE groups.
Orientation specificity and transfer in contrast learning
To better test the generality of our results, we used the TPE procedure in a new contrast-discrimination learning task to demonstrate complete transfer of contrast learning across orientations (Fig. 2). The baseline group of six observers practiced contrast discrimination for a vertical or horizontal Gabor (i.e., “Which interval contained a higher contrast stimulus?” in a two-interval trial) in six to seven 2 h sessions on different days, which produced significant learning at the trained orientation (Δcon_ori1; MPI = 29.1 ± 4.1%, p < 0.001) (Fig. 2b,e). The contrast learning partially transferred to the untrained orthogonal orientation (Δcon_ori2; MPI = 11.6 ± 3.0%, p = 0.006), but the change of performance at Δcon_ori2 was less than that at trained Δcon_ori1 (p = 0.020).
A TPE group of six new observers practiced contrast discrimination at ori1 (Δcon_ori1), as well as orientation discrimination at ori2 (Δori_ori2, the exposure condition) in alternating staircases. Here the orientation discrimination task allowed orientation exposure at ori2, but the stimulus contrast was irrelevant and unattended as attention was diverted to the orientation judgment. Again, this measure was to ensure that any improved contrast discrimination at ori2 was not learned at ori2, but transferred from ori1. The Gabor contrast in the orientation discrimination task was jittered from trial to trial (from 30% to 67%, ±1 octave from the 0.47 reference contrast) to activate neuronal responses to the reference contrast and the nearby to-be-discriminated contrasts at ori2. After this TPE procedure, both contrast discrimination at ori1 (Δcon_ori1) and orientation discrimination at ori2 (Δori_ori2) were significantly improved (MPI = 26.5 ± 7.5% and 30.3 ± 4.9%, p = 0.008 and p = 0.001, respectively) (Fig. 2c,e). So was contrast discrimination at ori2 (Δcon_ori2) by an equivalent amount (MPI = 31.8 ± 4.9%, p < 0.001), suggesting that the TPE procedure enabled complete transfer of contrast learning across orientations.
A control experiment ruled out the possibility that orientation training (with contrast jittering) at the transfer orientation (Δori_ori2; MPI = 30.5 ± 4.2%, p = 0.001) (Fig. 2d,e) alone improved contrast discrimination at the same orientation (Δcon_ori2; MPI = 9.4 ± 8.0%, p = 0.15). A comparison of the transfer index indicated that the transfer of learning with the TPE condition (TI = 1.70) (Fig. 2c) was significantly more than the transfer with the baseline (TI = 0.49) (Fig. 2b) and control (TI = 0.25) (Fig. 2d) conditions (p = 0.010, one-way ANOVA).
Orientation specificity and transfer in feature detection learning
Next, we studied whether the TPE procedure could override orientation specificity in a feature-detection learning task originally used by Ahissar and Hochstein (1997, 2004). In this task, observers learn to detect an oddly oriented target bar from other uniformly oriented background bars (Fig. 3a, far left), all of which are flashed simultaneously and briefly, and are then followed by a mask (Fig. 3a, third from left) at various SOAs. The transfer of learning is tested when the target and the background orientations are swapped (Fig. 3a, second from left). Ahissar and Hochstein (1997) reported that perceptual learning transfers nearly completely to the swapped target-background orientations only if the target-background orientation difference is large (i.e., 30°, an easy task made difficult by the brief duration). If the orientation difference is small (e.g., 16°, a hard task), learning does not transfer much. This result provided the core evidence for Ahissar and Hochstein's influential reverse-hierarchy theory of perceptual learning (Ahissar and Hochstein, 1997, 2004), which asserts that easy-task learning can be accomplished by cognitive mechanisms at a high level of the information-processing hierarchy, but hard-task learning requires modification of the stimulus representation in early visual areas at the bottom of the hierarchy.
We first successfully replicated Ahissar and Hochstein's hard-task learning data in seven new observers by showing that the very substantial learning at the trained target-background orientations (MPI = 61.4 ± 3.9%, p < 0.001, after four sessions of training) did not transfer much to the swapped target-background orientations (MPI = 19.7 ± 9.5%, p = 0.041) (Fig. 3a, far right panel). As in Ahissar and Hochstein's experiment, there was no pretest at the swapped orientations. Therefore, the pretraining threshold at the trained orientations was used to calculate the MPI at the swapped orientations. We then had another five observers undergo the TPE procedure, in which the original feature-detection training alternated in blocked trials with repeated exposure to the swapped-background orientation at a fixed stimulus-mask SOA (106.7 ms, which was near the average pretraining threshold). In the exposure condition, the observers judged whether the stimuli were bars (uniformly oriented at the swapped-background orientation without the odd element presented on 80% trials) (Fig. 3b, left) or circles (20% trials) (Fig. 3b, middle) in each 60-trial block. This time, feature-detection learning (MPI = 43.9 ± 3.9%, p < 0.001) transferred completely to the swapped orientations (MPI = 44.7 ± 4.8%, p < 0.001) (Fig. 3b, right). The average TI was significantly higher with the TPE condition (TI = 1.04) than with the baseline condition (TI = 0.32) (p = 0.002, one-tailed parametric t test).
Up to 8 weeks after the initial training, five of the seven observers who performed baseline training in Figure 3a returned and were exposed to the swapped-background orientation (judging bars or circles) for four sessions before the odd-bar detection performance at the swapped orientations was remeasured (Fig. 3c). For these observers, the original baseline feature detection training led to an insignificant performance improvement (MPI = 9.6 ± 6.9%; p = 0.12) at the swapped orientations (Fig. 3c, left red triangle). Subsequent repeated exposure to the swapped-background orientation significantly improved performance at the swapped orientations by another 33.9 ± 5.3% (p = 0.002) by the final (10th) session (Fig. 3c, right red triangle). The overall MPI was 56.2 ± 3.4% (p < 0.001) with the trained orientations and 40.3 ± 7.0% (p = 0.004) with the swapped orientations, which did not differ significantly (p = 0.12).
Existing models of perceptual learning predicting specificity, not transfer
The demonstration of complete transfer of perceptual learning across orientations (Figs. 1–3) and retinal locations (Xiao et al., 2008; Zhang et al., 2010) challenges the existing models of perceptual learning. First, the complete learning transfer contradicts V1-based models (Adini et al., 2002; Teich and Qian, 2003; Zhaoping et al., 2003) as well as reweighting models that place the decision unit in the visual cortex at a post-V1 stage (Poggio et al., 1992; Dosher and Lu, 1998). These models predict orientation and location specificities, but not transfer, because of the retinotopy and orientation selectivity of the visual cortex. Second, the complete learning transfer also contradicts all existing reweighting models, even if the decision unit is placed in nonretinotopic high brain areas (Mollon and Danilova, 1996; Law and Gold, 2009). All reweighting models assume training improved readout of visual inputs from a specific population of neurons, which also predicts orientation and location specificities, but not transfer. The transfer results also run counter to the recent claim that orientation specificity results from response reweighting within the same orientation channel (Jeter et al., 2009). Again, within-channel reweighting would not predict complete transfer of learning to a new orientation.
Third, the complete learning transfer, especially with the feature detection task (Fig. 3), also challenges the influential reverse-hierarchy theory (Ahissar and Hochstein, 1997, 2004). Reverse-hierarchy theory postulates that high-level easy-task learning directs the early visual cortex to modify the stimulus representations to achieve low-level and nontransferrable hard-task learning. However, the complete transfer of the same hard-task learning after the TPE procedure (Fig. 3) demonstrates that both easy- and hard-task learning can be accomplished by a single architecture (Dosher and Lu, 2007) in high-level brain areas.
Finally, the successive TPE training data (Figs. 1b, 3c), in which the training and exposure phases were often separated by several weeks, rule out the possibility that the transfer of learning could result from temporal associations between the trained orientation and the exposed transfer orientation. Similar associations have been used to explain task irrelevant perceptual learning under significantly different stimulus conditions (Seitz and Watanabe, 2005).
A rule-based learning model
Our results suggest that what is really learned in perceptual learning are the heuristics or rules for performing a visual task efficiently. To overcome the limitations of the previous models, we propose a rule-based learning model to explain perceptual learning and its specificity and transfer.
Existing reweighting models, as constrained by the learning specificities, focus on the weight retuning of specific V1 inputs. However, the transfer of learning suggests that the decision unit has deduced the rules of reweighting the inputs from learning a task at a specific orientation or retinal location. For example, assuming that the V1 inputs for a stimulus orientation are Gaussian distributed with the most relevant (strongest) inputs at the center of the distribution, training would enable the decision unit to form a set of algorithms for assigning a weight to each input on the basis of its relative distance from the mean. These algorithms or rules are independent of the absolute orientations or locations that the V1 inputs represent, so they are potentially applicable to other orientations and retinal locations.
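The Gaussian example above can be made concrete with a toy calculation. This is purely an illustrative sketch, not a fitted or published model: the function name, the Gaussian form of the rule, the 15° width, and the input spacing are all assumptions. The point it shows is that a rule defined on relative orientation distance yields identical weights at any absolute orientation.

```python
import math


def rule_weights(preferred_oris, stimulus_ori, sigma=15.0):
    """Weight each V1 input by its orientation distance from the stimulus.

    Hypothetical rule: Gaussian falloff with an assumed width (sigma, deg).
    """
    weights = []
    for pref in preferred_oris:
        # Signed orientation difference, wrapped to [-90, 90) degrees.
        d = (pref - stimulus_ori + 90.0) % 180.0 - 90.0
        weights.append(math.exp(-d * d / (2.0 * sigma * sigma)))
    total = sum(weights)
    return [w / total for w in weights]  # normalize to sum to 1


# Inputs tuned around each stimulus orientation (assumed spacing).
offsets = [-30.0, -15.0, 0.0, 15.0, 30.0]
weights_ori1 = rule_weights([36.0 + o for o in offsets], 36.0)
weights_ori2 = rule_weights([126.0 + o for o in offsets], 126.0)
```

Here `weights_ori1` and `weights_ori2` come out the same, because the rule never refers to the absolute orientation. In the model, what blocks transfer is not the rule but the missing functional connection to the inputs at the new orientation.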
Rule application: specificity and transfer
Our results demonstrate that the transfer of learning requires two things: learning the rules for the task and exposure of the new features to which the learned rules must be applied. The application of learned rules to a new orientation or retinal location is not automatic, or learning would always transfer. When training is at a specific orientation or retinal location, V1 inputs representing other untrained orientations or retinal locations are unattended and likely suppressed (Treue, 2001; Vidnyánszky and Sohn, 2005; Gál et al., 2009), so that the decision unit cannot functionally connect to these V1 inputs to apply the learned rules. Orientation exposure (Figs. 1–3) or location training (Xiao et al., 2008) may establish such connections to enable learning transfer by reactivating the unattended or suppressed V1 inputs, either during training in separate blocks of trials or after training.
A key finding is that there is no transfer of learning if the order of the TPE procedure is reversed (Fig. 1d), indicating that the order is crucial. In order for transfer to occur, training on the orientation task had to precede exposure to the new orientation, suggesting that mere exposure of the transfer orientation does not lead to the transfer of learning. Rather, the rule learning may require substantial experience with a near-threshold and demanding task.
We note that there is much to be done to flesh out our model. For example, we need to know how analogous in terms of context and task parameters the second task needs to be to the first in order for transfer to occur. We are also currently investigating how much exposure is needed for reactivation. In one study, we found that ∼200 pretest trials in the periphery enable complete transfer of foveal orientation learning to the periphery (Zhang et al., 2010), indicating that this process could be very fast for certain tasks. Without the pretest, there is no transfer (Schoups et al., 1995).
This research was supported by a Natural Science Foundation of China Grant 30725018 (C.Y.) and National Institutes of Health Grants R01-04776 (S.A.K.) and R01-01728 (D.M.L.). We thank Drs. Roger Li, Wu Li, and Li Zhaoping for their helpful comments.
Correspondence should be addressed to Cong Yu, State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing 100875, China.