Spatial frequency tuning in the lateral geniculate nucleus of the thalamus (LGN) differs substantially from that in primary visual cortex (V1). LGN responses are largely low-pass in spatial frequency, whereas the majority of V1 neurons have bandpass characteristics. To study this transformation in spatial selectivity, we measured the dynamics of spatial frequency tuning using a reverse correlation technique. We find that a large proportion of V1 cells show inseparable responses in spatial frequency and time. In several cases, tuning becomes more selective over the course of the response, and the preferred spatial frequency shifts from low to higher frequencies. Many responses also show suppression at low spatial frequencies, which correlates with the increases in response selectivity and the shifts of preferred spatial frequency. These results indicate that suppression plays an important role in the generation of bandpass selectivity in V1.
- primate vision
- striate cortex
- spatial frequency tuning
- response suppression
- dynamic tuning
- quality factor
Neurons in the lateral geniculate nucleus of the thalamus (LGN) have antagonistic center-surround receptive fields, which are poorly tuned for orientation and predominantly low-pass in spatial frequency (Rodieck and Stone, 1965; Derrington and Fuchs, 1979; So and Shapley, 1979; Kaplan and Shapley, 1982; Hicks et al., 1983; Irvin et al., 1993). In contrast, many primary visual cortex (V1) simple cell receptive fields are elongated and have between two and three subfields of alternating polarity (Hubel and Wiesel, 1959, 1962). This type of receptive field can be sharply tuned for both orientation and spatial frequency (De Valois et al., 1982; De Valois and Tootell, 1983).
Two major classes of models have emerged to explain the transformation of receptive fields and spatial tuning properties between the LGN and V1. Feed-forward models suggest that elongated receptive fields and sharp cortical tuning are a result of input from spatially collinear LGN receptive fields (Hubel and Wiesel, 1959, 1962). The summation of spatially aligned input in V1 could improve both orientation and spatial selectivity. A recent version of the feed-forward model that also accounts for contrast invariance (Sclar and Freeman, 1982; Skottun et al., 1986) incorporates feed-forward inhibition as well as feed-forward excitation (Troyer et al., 1998).
In contrast to the hierarchical organization of feed-forward models, feedback models suggest that cortical selectivity is produced primarily through intracortical mechanisms. These models suggest that cortical tuning is only loosely biased by LGN input and is refined through intracortical excitatory and inhibitory influences (Benevento et al., 1972; Worgotter and Koch, 1991; Ben-Yishai et al., 1995; Somers et al., 1995; Carandini and Ringach, 1997; Adorjan et al., 1999; Anderson et al., 2000; Pugh et al., 2000). The cortical feedback interactions determine, at least in part, the elongation of the subfields in simple cells as well as their effective number (Sabatini et al., 1997).
The feed-forward and feedback models make testable predictions about both the time course of cortical tuning and the characteristics of suppressive input. Feed-forward models postulate that excitation and inhibition have similar spatial frequency tuning. Thus, suppression produces a change in the magnitude of the response and not in its tuning shape. In other words, the resulting spatial frequency tuning function should be separable in spatial frequency and time. In contrast, feedback models suggest that cortical tuning could emerge over the time course of the response; tuning should be initially broad, reflecting the tuning of the LGN input, and become more selective as the effect of intracortical feedback increases. This could be the result of the contribution of cortical inhibition with a different tuning shape than the LGN component.
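As a compact restatement of the separability prediction, the relation can be written out as below. The symbols S(f) (a fixed tuning shape) and g(τ) (a time-varying gain) are introduced here only for illustration and are not the authors' notation.

```latex
R(f,\tau) = S(f)\,g(\tau)
\quad\Longrightarrow\quad
\frac{R(f,\tau_1)}{R(f,\tau_2)} = \frac{g(\tau_1)}{g(\tau_2)}
\quad \text{for all } f
```

Under this form, tuning curves at any two delays differ only by a scale factor, so neither the peak nor the bandwidth can change with time. A shift in either one is therefore a sign of inseparability.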
Here, we measure the dynamics of spatial frequency tuning in macaque V1 using a reverse correlation method. Our primary goal is to determine whether the dynamics of spatial frequency tuning exhibit spatiotemporal separability. In cases of inseparability, we are also interested in describing what the main forms of inseparability are and how we can account for these responses with feed-forward or feedback mechanisms. To study these issues we focus on three questions of interest. (1) Does the peak and/or shape of the tuning curve change over the course of the response? (2) Is there evidence of suppressive input in the spatial frequency tuning curve, and if so, what are the tuning characteristics of suppression? (3) If suppression is evident in the tuning curve, how does it affect the preferred spatial frequency and the selectivity of the tuning curve?
MATERIALS AND METHODS
We performed acute experiments on adult Old World monkeys (Macaca fascicularis), using methods described elsewhere (Ringach et al., 1997). All procedures were performed in compliance with National Institutes of Health and University of California Los Angeles/Animal Research Committee guidelines.
The dynamics of spatial frequency tuning in V1 were measured using the reverse correlation technique described by Ringach et al. (1997). Receptive fields were located at eccentricities of 1–6°. Visual stimulation was monocular through the dominant eye (the other eye was occluded). We recorded the responses of individual cells to a rapid sequence of luminance-modulated sinusoidal gratings at a fixed orientation (optimal for the cell) but with varying spatial frequencies and spatial phases (Fig. 1). The stimulus was presented at an effective rate of 50 Hz by showing each grating for two consecutive frames on a monitor with a refresh rate of 100 Hz. The optimal orientation for each cell was determined using conventional steady-state orientation tuning curves run before the experiment. Test spatial frequencies were selected to cover the full response range of each cell, spanning between 3.33 and 6 octaves centered at the preferred spatial frequency of the cell. Spatial frequencies were logarithmically spaced. The preferred spatial frequency of the cell was defined by the peak of the steady-state spatial frequency curve in response to drifting gratings. Each test spatial frequency was presented in eight equally spaced spatial phases spanning 360°. Blank images of uniform luminance were interleaved in the sequence to provide a measure of baseline response. The probability of presentation of a blank image was equal to that of any one spatial frequency independent of spatial phase.
The stimulus was presented in a circular window 1.5–3× the size of the classical receptive field of the cell. The size of the receptive field was defined as the saturation point or peak of an area summation curve, run at the preferred orientation, temporal frequency, and spatial frequency of the cell (Levitt and Lund, 1997; Sceniak et al., 1999). The Michelson contrast of the stimulus was 99%. Each sequence was composed of 1500 images drawn randomly from the stimulus set of all test spatial frequencies and blanks. Each sequence lasted 30 sec. Thirty sequences were run consecutively for each cell, for a total experimental time of 15 min.
The response of the cell consisted of the arrival time of each action potential elicited during the visual stimulus. For a fixed time lag, τ, we calculate the probability that a spike was preceded by a grating of spatial frequency f at that delay, independent of spatial phase: Pr(f, τ). The baseline, B(τ), is calculated as the probability that a blank image with uniform luminance preceded a spike by τ msec. By dividing the magnitude of the response to a given stimulus grating by the magnitude of the response to a blank and taking the logarithm, we calculate the relative strength of the response:

$$R(f, \tau) = \log \frac{\Pr(f, \tau)}{B(\tau)} \qquad (1)$$
The log transformation makes the absolute value of R(f, τ) proportional to the d′ value between the response and the baseline assuming a Poisson firing rate (Green and Swets, 1974; Ringach et al., 2002).
We studied the dynamics of tuning by calculating R(f, τ) at values of τ ranging from 0 to 150 msec. R(f, τ) equals zero when the response of the cell to a test stimulus equals the response to a blank stimulus. Positive values indicate enhancement of the response of the cell, whereas negative values indicate suppression. For both short and long time lags (τ < 30 and τ > 130 msec), the response distribution should be flat and near zero, indicating no correlation between the stimulus and the response. At some intermediate values of τ, the response may show both positive and negative values, indicating enhancement or suppression of the response for different spatial frequencies.
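The calculation of R(f, τ) described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function name, the argument layout, the use of `None` to mark blank frames, and the choice of the natural logarithm are all hypothetical (the text does not state the base of the logarithm).

```python
import math
from collections import Counter

def relative_response(spike_times_ms, frames, frame_ms, tau_ms, n_freqs):
    """Return R(f, tau) for each spatial-frequency index f.

    frames: one label per stimulus frame. An int in 0..n_freqs-1 is the
            spatial-frequency index of a grating (phase is ignored);
            None marks a blank image.
    """
    counts = Counter()
    n_spikes = 0
    for t in spike_times_ms:
        # Index of the frame that was on screen tau_ms before the spike.
        idx = int((t - tau_ms) // frame_ms)
        if 0 <= idx < len(frames):
            counts[frames[idx]] += 1
            n_spikes += 1
    if n_spikes == 0 or counts[None] == 0:
        raise ValueError("no usable spikes, or no spikes preceded by a blank")
    baseline = counts[None] / n_spikes            # B(tau)
    result = []
    for f in range(n_freqs):
        p = counts[f] / n_spikes                  # Pr(f, tau)
        # Natural log is an assumption; zero counts map to -infinity.
        result.append(math.log(p / baseline) if p > 0 else float("-inf"))
    return result
```

A value of zero means that a grating of that spatial frequency preceded spikes as often as a blank did. Repeating the call over a range of `tau_ms` values gives the full time course.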
Curve fitting. Steady-state spatial frequency tuning curves are sometimes fit with a Gaussian curve to reduce the effect of response noise on estimates of the peak and selectivity of the tuning curve (Hawken and Parker, 1987; DeAngelis et al., 1994). However, as will be described below, many of the cells in our data set showed changes in the peak and selectivity of the response, as well as response suppression for nonoptimal spatial frequencies. These features could not be well fit by a single curve in which only the amplitude of the curve is allowed to vary as a function of time. Instead, we modeled our data as the sum of excitatory and inhibitory components, each of which is separable in spatial frequency and time:

$$R(f, \tau) = g_E(\tau)\,E(f) - g_I(\tau)\,I(f) \qquad (2)$$

where the component profiles E(f) and I(f) are given by Equation 3.
In this model, E(f) acts as an excitatory component, with fctr(E) and ςE as the center and width of the component respectively, and gE(τ) as the gain parameter. I(f) is the inhibitory component. fctr(I) and ςI represent the center and width of the inhibitory component, and gI(τ) its gain. The center and width of each component is invariant across time. The amplitudes of both components are the best fitting positive coefficients, using a least squares measurement of error. Unless specifically stated otherwise, estimates of tuning curve properties are based on the fit of the two-component model.
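The gain-fitting step can be sketched as follows. This is a minimal illustration and not the authors' fitting procedure. It assumes that the model has the form R ≈ gE·E − gI·I with nonnegative gains, and that the component profiles are Gaussian on a log2 frequency axis. Both the Gaussian form and the axis are assumptions, and the function names are hypothetical. The search over centers and widths is omitted; only the gains at a single time lag are fit.

```python
import math

def gaussian(freqs, center, width):
    """Gaussian profile on a log2 frequency axis (the axis is an assumption)."""
    return [math.exp(-((math.log2(f) - math.log2(center)) ** 2) / (2 * width ** 2))
            for f in freqs]

def fit_gains(r, e, i):
    """Least-squares gains (gE, gI), both >= 0, for r ~ gE*e - gI*i."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    ee, ii, ei = dot(e, e), dot(i, i), dot(e, i)
    re, ri = dot(r, e), dot(r, i)

    # Boundary solutions: one or both gains pinned at zero.
    candidates = [(0.0, 0.0),
                  (max(re / ee, 0.0), 0.0),
                  (0.0, max(-ri / ii, 0.0))]

    # Interior solution from the 2x2 normal equations, kept only if feasible.
    det = ee * ii - ei * ei
    if abs(det) > 1e-12:
        g_e = (re * ii - ri * ei) / det
        g_i = (re * ei - ri * ee) / det
        if g_e >= 0 and g_i >= 0:
            candidates.append((g_e, g_i))

    def sse(g):
        return sum((x - g[0] * a + g[1] * b) ** 2 for x, a, b in zip(r, e, i))

    return min(candidates, key=sse)
```

Because the problem has only two unknowns, checking the interior solution and the three boundary cases is enough to find the constrained optimum. With more components, a general nonnegative least-squares solver would be the usual choice.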
Time course of the response. For most cells, the response starts around τ ≈ 30 msec and returns to baseline shortly after τ ≈ 130 msec, although there was considerable variation in the time of the onset (τonset) and decay (τfinal) of the response. We use the variance of the response over time to identify τonset and τfinal. For short time delays, before the stimulus signal has reached the cortex, any deviation of the response from the zero baseline is attributable to noise in the measurement. The magnitude of the deviations from baseline can be measured as the variance of the signal (across all spatial frequencies) at each time delay, V(τ) (Equation 4).
We use the first 20 msec of the response to provide an estimate of the “variance of the noise,” which is defined as the average response variance during the first 20 msec after the stimulus presentation, V̄(τ ≤ 20). For time delays that produce a stimulus-driven response, the variance should increase significantly. τonset and τfinal were defined as the time lags τ at which the variance of the response, V(τ), crossed a threshold of 4 SD above the variance of the noise.
Figure 2A illustrates V(τ) for one example cell. The first 20 msec of the curve are shown to the left of the short vertical line. These values are averaged to provide an estimate of V̄(τ ≤ 20). The SD of the noise is calculated as the square root of this value. The response criterion level is set at 4 SD above V̄(τ ≤ 20) and is indicated by a dashed line intersecting the curve. τonset and τfinal are defined as the first and last level-crossings of the curve with the criterion level.
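The criterion described above can be sketched as follows. This is an illustrative sketch, not the authors' code. The function and parameter names are hypothetical, and the level-crossings are approximated by the first and last sampled time lags above the criterion, without interpolating between samples.

```python
import math

def response_window(variance, taus, noise_max_tau=20, n_sd=4):
    """Return (tau_onset, tau_final), or None if V(tau) never exceeds the criterion.

    variance: V(tau) sampled at the time lags in `taus` (msec, ascending).
    """
    # Noise variance: average of V(tau) over the first 20 msec.
    noise = [v for v, t in zip(variance, taus) if t <= noise_max_tau]
    noise_var = sum(noise) / len(noise)
    # The SD of the noise is the square root of the noise variance.
    criterion = noise_var + n_sd * math.sqrt(noise_var)
    above = [t for v, t in zip(variance, taus)
             if t > noise_max_tau and v > criterion]
    if not above:
        return None
    return above[0], above[-1]
```

Returning `None` when the criterion is never exceeded corresponds to a response that would not be counted as significant.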
We define the maximum and minimum amplitudes of the response at each time lag as the largest and smallest values of R(f, τ) across spatial frequency, respectively (Equation 5); the maximum amplitude is denoted Mx(τ). Figure 2B shows an example of how the maximum and minimum amplitudes of the response vary over time. τmax and τmin indicate the time lag that produced maximum response enhancement and suppression, respectively. In addition, we define τdev and τdecay as the points at which Mx(τ) has achieved or decayed back to half of its maximum amplitude, R(fpk, τmax)/2, indicated by the dashed line. τdev and τdecay occur where the maximum amplitude curve intersects the dashed line.
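The half-height times can be located on a sampled Mx(τ) curve as sketched below. This is a sketch under stated assumptions: the function name is hypothetical, and linear interpolation between neighbouring samples is an assumption, since the text does not say how the intersections were computed.

```python
def half_height_times(taus, mx):
    """Return (tau_dev, tau_max, tau_decay) from the maximum-amplitude curve Mx(tau)."""
    k = max(range(len(mx)), key=lambda j: mx[j])   # index of tau_max
    half = mx[k] / 2

    def interp(a, b):
        # Time between samples a and b at which the curve equals `half`.
        return taus[a] + (half - mx[a]) * (taus[b] - taus[a]) / (mx[b] - mx[a])

    # Walk back from the peak to the rising half-height crossing.
    tau_dev = taus[0]
    for j in range(k, 0, -1):
        if mx[j - 1] < half:
            tau_dev = interp(j - 1, j)
            break
    # Walk forward from the peak to the falling half-height crossing.
    tau_decay = taus[-1]
    for j in range(k, len(mx) - 1):
        if mx[j + 1] < half:
            tau_decay = interp(j, j + 1)
            break
    return tau_dev, taus[k], tau_decay
```

By construction the curve has the same amplitude at the two returned crossing times, which is what allows selectivity at τdev and τdecay to be compared at matched response strength.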
Spatial frequency tuning characteristics. Figure 2C illustrates a time slice of the spatial frequency tuning curve at τmax = 64 msec. fpk denotes the preferred spatial frequency at this time delay. The dashed line indicates R(fpk, τ)/√2; the points at which the curve intersects this value are defined as the low and high spatial frequency cutoffs, flow and fhigh. Responses above zero indicate enhancement of the spike rate relative to the baseline; responses below zero indicate suppression. We define total response enhancement at each time lag τ as the total area of response enhancement [AE(τ)], indicated by the horizontal lines in Figure 2C. Total response suppression [AS(τ)] was defined similarly and is indicated by vertical lines.
We estimated the preferred spatial frequency of the cell at τmax and for the time-averaged tuning curve, R̄(f) (Equation 6). The peak of the time-averaged tuning curve will be referred to as f̄pk.
As described below, we found that fpk changes with time in a number of cells. To quantify this change, we estimated fpk as a function of τ. In most cases, but not always, the change was monotonic and increasing with time. We compare fpk at τfinal with fpk at τonset to measure the overall change in the location of the preferred spatial frequency:

$$\Delta f_{pk} = \log_2 \frac{f_{pk}(\tau_{final})}{f_{pk}(\tau_{onset})} \qquad (7)$$
This value estimates Δfpk in octaves of spatial frequency. We calculated Δfpk using the raw data, smoothed to reduce the effects of noise. Raw data were interpolated to 300 log-spaced data points and smoothed by convolving with a Gaussian filter with a ς of 0.4 log units. The peak of the tuning curve was defined by the peak of the smoothed curve.
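The smoothing and peak-shift steps can be sketched as follows. This is an illustrative sketch only. The function names are hypothetical, and the smoothing width is interpreted here in log10 units of frequency, which is an assumption because the text says only "log units". Edge handling by renormalizing the kernel weights is also an assumption.

```python
import math

def smoothed_peak(freqs, resp, n=300, sigma=0.4):
    """Peak frequency of a tuning curve after interpolation and Gaussian smoothing.

    freqs must be ascending. sigma is in log10 units of frequency (assumed).
    """
    x = [math.log10(f) for f in freqs]
    grid = [x[0] + (x[-1] - x[0]) * k / (n - 1) for k in range(n)]
    # Piecewise-linear interpolation of the raw data onto the dense grid.
    dense, j = [], 0
    for g in grid:
        while j < len(x) - 2 and g > x[j + 1]:
            j += 1
        w = (g - x[j]) / (x[j + 1] - x[j])
        dense.append(resp[j] + w * (resp[j + 1] - resp[j]))
    # Gaussian smoothing, with weights renormalized near the edges.
    smooth = []
    for g in grid:
        wts = [math.exp(-((g - h) ** 2) / (2 * sigma ** 2)) for h in grid]
        smooth.append(sum(w * d for w, d in zip(wts, dense)) / sum(wts))
    k = max(range(n), key=lambda m: smooth[m])
    return 10 ** grid[k]

def delta_fpk_octaves(fpk_onset, fpk_final):
    """Shift of the preferred spatial frequency in octaves (positive = upward)."""
    return math.log2(fpk_final / fpk_onset)
```

For instance, a peak that moves from 1 to 4 cpd is a shift of two octaves, and the same move in the opposite direction gives minus two octaves.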
Spatial frequency selectivity was measured by the “quality factor” of the tuning curve. The quality factor is given by:

$$Q(\tau) = \frac{f_{pk}(\tau)}{f_{high}(\tau) - f_{low}(\tau)} \qquad (8)$$

where fpk(τ) is the preferred spatial frequency at time τ, and fhigh(τ) and flow(τ) represent the high and low cutoff spatial frequencies, respectively (Horowitz and Hill, 1989). Cells that have sharp spatial frequency tuning have a large Q-factor, whereas cells with very broad tuning have a Q-factor close to 0.
The use of the quality factor to define selectivity has an advantage over more traditional estimates such as response bandwidth. Traditional estimates of bandwidth that consider the log ratio between fhigh(τ) and flow(τ) are undefined when the response is low-pass. Because many of the cells in our sample are low-pass for spatial frequency for at least a portion of the duration of their response, bandwidth measures would limit our ability to assess selectivity for non-bandpass cells. The Q-factor does not suffer from this problem, but is similar to bandwidth measures in that it remains constant if the tuning curve is translated along a logarithmic frequency axis.
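A Q-factor computation for a single time slice can be sketched as below. Several details are assumptions of this sketch rather than statements from the text: the cutoff level defaults to the peak divided by √2 (the conventional half-power point), cutoffs are found by linear interpolation in log frequency, and flow is set to 0 when the low-frequency flank never drops below the cutoff level (a low-pass curve). The function name is hypothetical.

```python
import math

def q_factor(freqs, resp, frac=1 / math.sqrt(2)):
    """Q = f_pk / (f_high - f_low) for one time slice of the tuning curve.

    freqs must be ascending. For a low-pass curve f_low is taken as 0,
    which is an assumption of this sketch.
    """
    k = max(range(len(resp)), key=lambda j: resp[j])
    level = resp[k] * frac

    def crossing(a, b):
        # Frequency between samples a and b where the response equals `level`,
        # interpolated linearly on a log2 frequency axis.
        w = (level - resp[a]) / (resp[b] - resp[a])
        la, lb = math.log2(freqs[a]), math.log2(freqs[b])
        return 2 ** (la + w * (lb - la))

    f_low = 0.0
    for j in range(k, 0, -1):
        if resp[j - 1] < level:
            f_low = crossing(j - 1, j)
            break
    f_high = None
    for j in range(k, len(resp) - 1):
        if resp[j + 1] < level:
            f_high = crossing(j, j + 1)
            break
    if f_high is None:
        raise ValueError("response does not fall below the cutoff at high frequencies")
    return freqs[k] / (f_high - f_low)
```

Multiplying every frequency by the same factor scales fpk, fhigh, and flow together, so Q is unchanged. This is the translation invariance along a logarithmic axis noted above, and it holds for the low-pass case as well, where a bandwidth in octaves would be undefined.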
As described below, we found that many responses become more selective over time. We measure the change in selectivity as a function of time, ΔQ (Equation 9).
Positive values of ΔQ indicate that the cell became more selective over time, whereas negative values indicate that the cell became less selective over time.
When measured with luminance-modulated sinusoidal drifting gratings, the spatial frequency tuning curves of the LGN have a characteristic low-pass shape in which the high spatial frequency limb of the tuning curve returns to baseline, but the low spatial frequency limb remains elevated. Tuning in V1 is generally bandpass, with both the high and low spatial frequency limbs decaying to baseline. To examine the steepness of the spatial frequency limbs in our data, we use two indices that allow us to separately examine the low and high spatial frequency flanks of the tuning curve. These indices, ML(τ) for the low spatial frequency limb and MH(τ) for the high spatial frequency limb, are given by Equation 10.
An index close to 0 indicates that the spatial frequency limb has a shallow slope, whereas a very steep slope is indicated by an index close to 1.
Suppression. Cells were identified as having a significant suppressive response if the minimum response at τmin was below the suppression criterion point and remained below this level for a period of at least 10 msec. The suppression criterion point was defined as 4 SD of the noise below zero. For cells with significant suppression, the strength of suppression was measured as the ratio of the area of suppression over all response τ values, versus the total area beneath the curve, including enhancement and suppression:

$$\mathrm{SupIndex} = \frac{\sum_{\tau} A_S(\tau)}{\sum_{\tau} \left[ A_E(\tau) + A_S(\tau) \right]} \qquad (11)$$
AE(τ) and AS(τ) are approximations of the excitatory and suppressive area under the curve, as illustrated in Figure 2C. This measure indicates the relative strength of suppression for different values of τ.
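The suppression ratio can be sketched as follows. This is a minimal sketch, not the authors' code. It assumes the areas are approximated by summing the sampled response values, which is reasonable when the test frequencies are equally spaced on a log axis because the constant spacing cancels in the ratio. The function name and the nested-list layout are hypothetical.

```python
def sup_index(r):
    """Suppression index from r[t][f], the relative response at each
    time lag t (within the response window) and spatial-frequency index f.
    """
    a_e = sum(v for row in r for v in row if v > 0)    # total enhancement area
    a_s = sum(-v for row in r for v in row if v < 0)   # total suppression area
    total = a_e + a_s
    return a_s / total if total > 0 else 0.0
```

The ratio is 0 for a purely enhanced response, 1 for a purely suppressed one, and 0.5 when the two areas are equal.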
The SupIndex measures the net suppression in the tuning curve. However, if inhibitory and excitatory inputs have similar, but not equal, tuning, the strength and contribution of the suppressive input to the net tuning curve may be masked by overlapping excitatory input. We measure the relative location of excitation and suppression on the basis of the centers of the two components of the model (Equation 12).
As described below, for some responses the centers of the excitatory and inhibitory inputs are more than two octaves apart. For such responses, we would not expect the two inputs to overlap significantly. If the tuning of the components does not overlap, the inhibitory input could not cause a change in the selectivity of the net tuning curve. However, the overlap between the excitatory and inhibitory components depends on both the relative location and the bandwidth of the two inputs. Figure 2D illustrates the model components used to fit the response in Figure 2C. We measure the overlap as given in Equation 13. This measure represents the shaded region in Figure 2D. Using this measure of the overlap between excitatory and inhibitory input, we can then measure how the interaction between suppressive and excitatory inputs changes the tuning of the response by looking at the change in overlap over time, Δoverlap (Equation 14).
Data selection. We recorded from a total of 208 cells from 10 animals. Multiple electrodes were used on some animals to allow us to gather data from several cells concurrently. Because we recorded from multiple cells simultaneously, in some experiments cells were not stimulated at their optimal orientation. This study includes only cells that were stimulated at orientations within 20° of their preferred orientation; 56 cells were excluded from the data set for this reason. From this subset of cells, we excluded cells for which the response was not considered significant (n = 43). A significant response requires that V(τ > 20) was more than 4 SD above the average variance of the noise, V̄(τ ≤ 20). Finally, some responses did not return to baseline for the highest spatial frequencies measured. These cells (n = 15) were excluded from the analysis. A total of 94 cells passed these criteria and form the dataset analyzed in this study.
RESULTS
Figure 3 illustrates the dynamic responses of three cells that are representative of our data. In Figure 3A–C, we plot R(f, τ) for a range of time delays between τonset and τfinal. Positive values indicate that the response of the cell was enhanced for a given spatial frequency relative to the response to a blank stimulus. Negative values indicate that a stimulus with a given spatial frequency suppressed the response of the cell below the level produced by a blank stimulus. The dashed line is the zero level, defined as the response to a blank. Figure 3D–F shows changes in spatial frequency selectivity over the response period of the neuron for the examples shown in A–C. Similarly, Figure 3G–I shows the location of the peak spatial frequency as a function of time for the examples in A–C.
Figure 3A depicts a response that is initially broadly bandpass (τ = 36–42 msec), which becomes more selective over the course of the response. Although the magnitude of the response is approximately equal at τ = 42 and τ = 66 msec, the Q-factor at τ = 66 msec is approximately double the Q-factor at the earlier time delay. The change in selectivity over time is illustrated in Figure 3D. This increase in selectivity is a result of two potentially related phenomena that are apparent in Figure 3A: the low spatial frequency limb of the tuning curve becomes markedly steeper over time, and the response at the lowest spatial frequencies is suppressed, starting at τmax = 60 msec.
Figure 3B shows another common phenomenon that may be related to suppression at low spatial frequencies: a change in peak spatial frequency over time from low to high spatial frequencies. At τ = 42 msec, the response peaks at 1.8 cycles per degree (cpd). Later in the response, at τ = 74 msec, the response peaks at 4.7 cpd. The change in fpk can be seen more clearly in Figure 3H, which plots fpk as a function of τ. The total shift in fpk for this cell was 1.9 octaves. The shift was accompanied by a decrease in responsiveness at low spatial frequencies to baseline levels by τ = 66 msec, and below baseline levels as the response decays.
The increase in selectivity, change in peak spatial frequency, and suppression at low spatial frequencies frequently occur together and, as we show below, appear to be related. Both the responses in Figure 3, A and B, show all three phenomena, whereas the example illustrated in C shows virtually none of these response characteristics. There is no overall change in either selectivity or peak spatial frequency, and the apparent high spatial frequency suppression (τ = 120–130 msec) is not significantly different from noise activity.
Preferred spatial frequency
We measured the preferred spatial frequency of R(f, τmax) and of the time-averaged response, R̄(f). There was no significant difference between these two measures (Wilcoxon sign rank test; p > 0.5). Figure 4A shows the distribution of f̄pk for the time-averaged data. f̄pk ranges from 0.13 to 9.72 cpd. The average f̄pk was 3.7 cpd (SD = 2.1 cpd). The distribution is skewed toward low- to mid-range spatial frequencies, with a sharp dropoff above 4.5 cpd.
To determine whether fpk is invariant with time, we measured the magnitude of Δfpk for each cell (Eq. 7). Figure 4B shows how Δfpk is distributed in the population. Positive values indicate that fpk shifted from low spatial frequencies toward higher spatial frequencies, whereas negative values indicate a spatial frequency preference shift from higher to lower spatial frequencies. On average there is a positive change in fpk over time, averaging 0.62 octaves ± 0.69 (1 SD). The largest change in fpk was slightly over two octaves. Figure 3, A and B, both provide examples of responses with large changes in fpk over the course of the response; for both cells, plots of fpk as a function of τ are shown in Figure 3, G and H, respectively. For both responses, the peak spatial frequency increases at an approximately constant rate over the entire time course of the response enhancement. The change in the preferred spatial frequency over time is a clear form of inseparability in spatial frequency and time that we observed in many V1 neurons.
We used the Q-factor to estimate spatial frequency selectivity. As we did for fpk, we measured selectivity for both R(f, τmax) and for the time-averaged curve, R̄(f). The Q-factor was significantly higher for the time-averaged curve than for τmax (Wilcoxon sign rank; p < 0.01). Figure 5A shows a histogram of the Q-factor for R̄(f). Selectivity ranged from very untuned (Q-factor = 0.36) to highly tuned (Q-factor = 2.12), with a mean of 1.24 ± 0.38 (1 SD).
The Q-factor in V1 cells may change over time. Figure 5B shows the Q-factor during development of the response compared with the Q-factor during decay. For the majority of cells, Q(τdecay) is higher than Q(τdev) (Wilcoxon sign rank test; p < 0.001), although the amplitude of the curve at the two time points is equal by definition.
We next asked whether the increase in selectivity is similarly distributed on both flanks of the tuning curve (Fig. 5C). The increase in selectivity should be accompanied by an increase in the steepness of at least one of the limbs of the tuning curve. We compare the change in steepness (Δsteepness) for the response on both sides of the peak spatial frequency to determine whether Δsteepness is evenly distributed. For the low spatial frequency flank, Δsteepness is measured as ML(τdecay) − ML(τdev); similarly, Δsteepness for the high spatial frequency flank is measured as MH(τdecay) − MH(τdev). Positive values of Δsteepness indicate an increase in steepness over time. Figure 5C shows the results of the comparison. The low spatial frequency side of the tuning curve shows a significantly larger increase in steepness than the high spatial frequency flank (Wilcoxon sign rank test; p < 0.001). Δsteepness(f < fctr) ranges from just below zero, or no change, to ∼0.7. In contrast, values of Δsteepness(f ≥ fctr) tend to be clustered near zero. These results indicate that the increase in selectivity is primarily dependent on changes in the response to low spatial frequencies.
The increase in the steepness of the low-frequency limb of the tuning curve often appeared to be accompanied by suppression at the lowest spatial frequencies. A majority of neurons (69%; 65 of 94) showed significant response suppression, usually at spatial frequencies lower than the optimal. The distribution of suppression strength in our data is shown in Figure 6A. The suppression index is a measure of how strongly suppression contributed to the overall response (Eq. 11). A value of 1 indicates a purely suppressive response, whereas a value of 0 indicates pure enhancement. A suppression index of 0.5 indicates that enhancement and suppression are equally balanced. For the majority of cells with significant suppression, suppression accounted for a little less than one-third of the total area under the curve (mean SupIndex = 0.27 ± 0.19).
For most cells, maximal suppression occurred at the lowest spatial frequencies measured, regardless of the spatial frequency of excitation. Figure 6B indicates the location of suppression, relative to the location of excitation (Eq. 12). For the majority of cells, excitation is centered at higher spatial frequencies than suppression. This pattern is consistent with a role of suppression in increasing the steepness of the low spatial frequency limb of the tuning curve. In addition, we observed from the model fits that the peak amplitude of the suppressive component was delayed relative to the peak of the excitatory component by ∼5 msec on average (data not shown).
Relative location and widths of enhancement and suppression
On the basis of changes in selectivity and fpk over time, as well as the location of suppression at frequencies other than fpk, we conclude that, as a whole, the dynamics of spatial frequency tuning are not separable in spatial frequency and time. Instead, we found that the response could be well fit by a model based on two input components, each of which is separable in space and time. The components of the model act as excitatory and inhibitory inputs to the response, with separate temporal profiles. Figure 7 gives an example of the model fit for the response shown in Figure 3A. The response shows a change in both selectivity and peak spatial frequency as a function of time, which are both reasonably well fit by the model. The spatial and temporal profiles of the model components for this cell are illustrated in Figure 7, A and B, respectively. For both plots, the solid line indicates the excitatory component; the dashed line indicates the inhibitory component. The inhibitory component is centered at lower spatial frequencies but largely overlaps the excitatory component, and its time course is slightly delayed relative to the development of the excitatory component. Figure 7C shows the model fit to the data for τdev, τmax, and τdecay. The fit is able to capture the change from broad bandpass tuning at τdev to sharp tuning with low spatial frequency suppression at τdecay.
Figure 8 shows the population data for the best fitting curves. Figure 8A compares fctr(E) with fctr(I), in a log-log scatter plot. fctr(I) tends to be lower than fctr(E), by up to 3.5 octaves. The large degree of separation between the centers of the two components is a consequence of the tendency for fctr(I) to be located at the lowest spatial frequencies that we measured, whereas fctr(E) tends to be located at spatial frequencies very similar to the peak of the response. The degree of separation between the centers of the components might be surprising at first, because the components can only interact if they overlap.
Figure 8B compares ςI and ςE. A large proportion of cells are located near the unit line, indicating that they tend to covary. In other words, cells with broad excitatory components also tend to have broad suppressive components. Cells with the largest separation between fctr(E) and fctr(I) also tend to have large ς values, suggesting that the increase in the width of the components compensates for the separation between their centers (data not shown). The result is that there is a large degree of overlap between the components, regardless of the separation between fctr(E) and fctr(I).
Correlation between suppression, Δfpk, and ΔQ
Many cells show both a net effect of suppression and a change in selectivity and fpk. We asked whether suppression plays a role in generating these characteristics. In principle, a time-delayed suppressive component at low spatial frequencies could both increase selectivity and cause a shift in fpk toward higher frequencies over time. If the suppressive and excitatory inputs overlap but have different tuning, the result will be an increase in the slope of the tuning curve. The location of suppression relative to excitation determines whether the increase in selectivity will be symmetrical or biased toward one flank of the tuning curve. If suppression is located at the same spatial frequency as excitation, but is broader in tuning, both flanks of the tuning curve would be equally suppressed, resulting in a symmetrical tuning function. However, if suppression is located primarily at spatial frequencies either higher or lower than excitation, the increase in selectivity would be biased toward the corresponding flank of the tuning curve. In addition, the overlap between suppressive and excitatory input might push the peak of the resulting tuning curve away from the location of suppression over time.
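The argument above can be illustrated with a toy calculation. Everything in the sketch below is hypothetical: the Gaussian component shapes on a log2 frequency axis, their centers (2 cpd for excitation, 0.5 cpd for suppression), their widths of one octave, and the gain values are chosen only to illustrate the qualitative effect and are not fitted to any data.

```python
import math

def tuning(u, g_e, g_i, c_e=1.0, c_i=-1.0, w_e=1.0, w_i=1.0):
    """Net response at log2 frequency u for excitatory gain g_e and
    suppressive gain g_i (all parameter values are illustrative)."""
    e = math.exp(-((u - c_e) ** 2) / (2 * w_e ** 2))
    i = math.exp(-((u - c_i) ** 2) / (2 * w_i ** 2))
    return g_e * e - g_i * i

def peak_log2_freq(g_e, g_i):
    """Location of the tuning-curve peak on a grid from log2 f = -3 to 4."""
    grid = [-3 + 0.01 * k for k in range(701)]
    return max(grid, key=lambda u: tuning(u, g_e, g_i))

early = peak_log2_freq(1.0, 0.0)   # excitation only, before suppression develops
late = peak_log2_freq(1.0, 0.6)    # after the delayed suppression has grown
print(f"peak shift: {late - early:.2f} octaves")
```

With suppression centered below excitation, the net curve dips below zero at low frequencies and its peak moves to a higher frequency than the excitatory center, which is the direction of shift described in the text.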
Suppression and selectivity
We examined the possible role of suppression in increasing selectivity by looking at the correlation between spatial frequency selectivity and suppression (Fig. 9A). In general, cells with a high suppression index (SupIndex > 0.23), which indicates the relative contribution of suppression to the total response (Eq. 11), tend to be more selective than cells with a lower suppression index (Wilcoxon sign rank test; p < 0.001). This relationship suggests that at least one of the roles of suppressive input may be to increase the selectivity of the tuning of the cell.
We investigated the relationships between the inhibitory and excitatory inputs in our model to examine the influence of suppression on selectivity. If the delayed development of suppression is responsible for the increase in selectivity shown by many of the cells in our sample, we should find a correlation between changes in selectivity over time, ΔQ (Eq. 9), and the amount of overlap between the components of the model, Δoverlap (Eq. 14).
Figure 9, B and C, compares ΔQ and Δoverlap, on the basis of whether the suppression was centered at lower or higher spatial frequencies than excitation. When fctr(E) is higher than fctr(I), there is a significant positive relationship between ΔQ and Δoverlap (Fig. 9B) (r² = 0.6; p < 0.001). The opposite is true when fctr(E) is lower than fctr(I) (r² = 0.4; p < 0.05). These correlations suggest that the relationship between suppression and excitation contributes to changes in selectivity over time.
Suppression and spatial frequency shift
In addition to changes in selectivity, changes in the amount of overlap between the excitatory and suppressive components could also produce changes in the peak of dynamic tuning. An increase in Δoverlap would tend to shift the peak of the tuning curve away from fctr(E), toward the nonsuppressed flank of the tuning curve. We tested this hypothesis by comparing Δoverlap with Δfpk (Fig. 10A,B); for larger values of Δoverlap, we expect to find larger shifts in fpk.
Similar to the results for selectivity, there is a positive correlation when fctr(E) is greater than fctr(I) and a negative correlation otherwise (r² = 0.3; p < 0.001 and r² = 0.3; p < 0.1, respectively). These results suggest that suppression may also be involved in producing the shift of fpk over time.
In this study, we measured how spatial frequency tuning evolves as a function of time from stimulus presentation. We found that, for a majority of cells, the tuning curve is not separable in spatial frequency and time. The three most salient patterns in the data were (1) shifts in the preferred spatial frequency toward higher spatial frequencies over the duration of the response, (2) increases in selectivity over time, and (3) suppression at low spatial frequencies.
The role of suppression in generating cortical spatial frequency tuning
Of particular interest in this study was whether suppression plays a role in spatial frequency tuning. We found suppression primarily at low spatial frequencies, slightly lagged in time relative to the development of excitation. For individual cells, the relative amount of suppression correlates with the selectivity of the response, suggesting that the generation of sharp cortical spatial frequency tuning may be directly dependent on suppression of nonpreferred stimuli. This hypothesis is supported by the dynamics of the relationship between suppression and selectivity revealed by the two-component model. The model suggests that the relative location, timing, and the amount of overlap between excitatory and inhibitory inputs are all important in understanding the relationship between suppression and selectivity. When lagged suppression is located at lower spatial frequencies than excitation, it inhibits the response to low spatial frequencies, sharpening the low spatial frequency limb of the tuning curve. As the magnitude of suppression increases relative to the magnitude of excitation, the increasing overlap between the inhibitory and excitatory components pushes the peak of the spatial frequency tuning curve toward higher frequencies. Such a mechanism might be partly responsible for transforming low-pass tuning input from the LGN into the more typical bandpass shape observed in V1. This process is seen in the dynamic responses of many cells, which are initially low-pass but become bandpass over the time course of their responses.
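The low-pass to bandpass transition described above can be sketched numerically. The following is a toy illustration and not the two-component model fit to our data: the low-pass profiles, cutoff frequencies, temporal envelopes, 20 msec lag, and suppression gain are all assumed values, and the function names are hypothetical. Excitation is a broad low-pass profile, and suppression is a narrower low-pass profile that develops later in time.

```python
import math

def gaussian(x, center, width):
    """Unit-height Gaussian, used here as a temporal envelope."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)

def low_pass(sf, cutoff):
    """Low-pass profile on a linear spatial-frequency axis (cycles/deg)."""
    return math.exp(-(sf / cutoff) ** 2)

def response(sf, t_ms):
    """Excitation minus lagged suppression (all parameters are illustrative)."""
    # Assumed: broad low-pass excitation peaking 60 msec after stimulus onset.
    excitation = low_pass(sf, cutoff=4.0) * gaussian(t_ms, center=60.0, width=15.0)
    # Assumed: narrower low-pass suppression, lagged by 20 msec, gain 0.8.
    suppression = 0.8 * low_pass(sf, cutoff=2.0) * gaussian(t_ms, center=80.0, width=15.0)
    return excitation - suppression

def preferred_sf(t_ms, sf_max=10.0, step=0.05):
    """Spatial frequency (cycles/deg) giving the largest response at time t_ms."""
    n = int(round(sf_max / step)) + 1
    sfs = [i * step for i in range(n)]
    return max(sfs, key=lambda sf: response(sf, t_ms))

for t in (45.0, 60.0, 75.0):
    print(f"t={t:.0f} msec  preferred SF={preferred_sf(t):.2f} c/deg  "
          f"response at 0 c/deg={response(0.0, t):+.3f}")
```

With these assumed parameters, the early response is low-pass (largest at the lowest spatial frequency). As the lagged suppression develops, the preferred spatial frequency moves to higher values and the response at the lowest spatial frequencies falls below zero, giving a bandpass profile.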
The relationship between suppression and selectivity in our data is consistent to some extent with the study of Bauman and Bonds (1991) in cat area 17. These investigators suggested that suppression occurs on the sharper limb of the spatial frequency tuning curve. For both “low-pass” and “high-pass” cells, suppression was most often revealed on the complementary limb of the tuning curve, i.e., on the high spatial frequency limb of a low-pass tuning curve. For bandpass cells, suppression was often located on both sides of the tuning curve. On the basis of their results, Bauman and Bonds (1991) suggested that suppression was involved in sharpening the slope of spatial frequency tuning.
For reasons that are not completely clear, however, another study of spatial frequency suppression has reached different conclusions (Ramoa et al., 1986). In this study, when spontaneous firing rates were elevated using pharmacological agents, there was no evidence for suppression in the spatial frequency tuning curve, although orientation tuning showed robust suppression.
Comparison with the dynamics of orientation tuning
There are several similarities between the dynamics of spatial frequency tuning and the dynamics of orientation tuning, suggesting that cortical selectivity for both orientation and spatial frequency relies on similar mechanisms. First, the dynamics of cortical tuning revealed suppression in both the orientation and spatial frequency domains. For both orientation and spatial frequency, the suppression is correlated with higher tuning selectivity (Ringach et al., 1997, 2002). In addition, dynamic orientation tuning responses are well described by a two-component model, in which an oriented excitatory component and a delayed, iso-oriented suppressive component can accurately capture the salient aspects of the data (Pugh et al., 2000). For orientation tuning, suppression tends to be broader than excitation, producing symmetrical flank suppression or global inhibition. A few cells also show changes in orientation preferences over time, which could be the result of suppression centered slightly off the excitatory peak. Such changes in the orientation peak could be analogous to the shift in peak spatial frequency found in this experiment. The similarities between the dynamics of orientation and spatial frequency tuning in V1 suggest that a single model in Fourier space can probably account for the transformation of thalamic input into sharp cortical selectivity.
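The symmetrical case mentioned above, in which suppression shares the excitatory center but is more broadly tuned, can be sketched in the same way. This is a hedged illustration only: the Gaussian profiles, the 20° and 40° widths, the suppression gain, and the function names are assumptions and are not taken from the cited orientation studies.

```python
import math

def gaussian(x, center, width):
    """Unit-height Gaussian on an orientation axis (degrees from preferred)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)

def net_response(theta, supp_gain):
    """Excitation minus broader, iso-oriented suppression (illustrative values)."""
    excitation = gaussian(theta, center=0.0, width=20.0)   # assumed excitatory width
    suppression = gaussian(theta, center=0.0, width=40.0)  # assumed: same center, broader
    return excitation - supp_gain * suppression

def peak_and_half_width(supp_gain, step=0.5):
    """Return (peak orientation, half width at half height), both in degrees."""
    n = int(round(180.0 / step)) + 1
    thetas = [-90.0 + i * step for i in range(n)]
    ys = [net_response(th, supp_gain) for th in thetas]
    i_pk = max(range(n), key=ys.__getitem__)
    half = 0.5 * ys[i_pk]
    # Walk toward positive orientations until the response drops below half height.
    right = i_pk
    while right < n - 1 and ys[right + 1] >= half:
        right += 1
    return thetas[i_pk], thetas[right] - thetas[i_pk]

for gain in (0.0, 0.4):
    pk, hw = peak_and_half_width(gain)
    print(f"gain={gain:.1f}  peak={pk:+.1f} deg  half width={hw:.1f} deg")
```

Because the two components share a center in this sketch, the peak does not move. The curve narrows equally on both flanks, and the response on the far flanks falls below zero, in contrast to the off-center case, in which the peak shifts.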
Models of cortical tuning and the role of suppression
There are two major classes of models that attempt to explain the emergence of cortical selectivity. In their simplest forms, feed-forward models suggest that cortical tuning is generated from the organized convergence of thalamic input (Hubel and Wiesel, 1959, 1962; Troyer et al., 1998; Ferster and Miller, 2000). In contrast, feedback models hypothesize that the excitatory thalamic input provides only a weak tuning bias, which is then refined through intracortical excitatory and inhibitory interactions (Benevento et al., 1972; Worgotter and Koch, 1991; Ben-Yishai et al., 1995; Somers et al., 1995; Carandini and Ringach, 1997; Adorjan et al., 1999; Anderson et al., 2000; Pugh et al., 2000).
There is evidence that aligned LGN input does play at least a partial role in generating cortical selectivity (Bullier et al., 1982; Ferster, 1987; Chapman et al., 1991; Reid and Alonso, 1995); the extent of this role is at the center of the feed-forward/feedback debate. As feed-forward models have become more sophisticated, an additional question has arisen regarding the role of suppressive input. Current feed-forward models draw on evidence that V1 receptive fields receive push–pull suppression (Heggelund, 1981; Palmer and Davis, 1981; Ferster, 1988). Push–pull suppression is required by these models to explain contrast invariance, while maintaining a primary dependence on feed-forward influences to produce cortical selectivity (Troyer et al., 1998). Feedback models, on the other hand, focus on evidence of a more broadly tuned suppression in the orientation domain (Benevento et al., 1972; Nelson, 1991; Sato et al., 1996; Ringach et al., 1997) that could improve cortical selectivity by suppressing nonoptimal stimuli (Ben-Yishai et al., 1995; Somers et al., 1995). Such suppression could also produce contrast invariance (Wielaard et al., 2001). Both classes of models were developed to account for the properties of orientation selectivity. Here, we consider how well these classes of models account for dynamic spatial frequency tuning in V1.
Our results suggest that response suppression has a strong influence on the spatial frequency tuning characteristics of V1 neurons. The influence of response suppression is seen most clearly in spatial frequency tuning selectivity, which is sharpened through inhibitory influences. The effects of suppression on tuning selectivity indicate that, although response suppression may be responsible for contrast invariance and contrast gain control, it is also involved in producing sharp tuning. We think a single inhibitory circuit could potentially explain, in a parsimonious way, gain control phenomena, contrast invariance, and the dynamics of both orientation and spatial frequency tuning selectivity.
The effect of suppression on cortical selectivity is most consistent with feedback models, which predict that LGN input provides a tuning bias, but that this bias is sharpened by intracortical excitation and suppression (Ben-Yishai et al., 1995; Somers et al., 1995; Adorjan et al., 1999). Thus, feedback models predict that the dynamic response should become more selective over time as the intracortical feedback develops. These predictions are supported by the increase in selectivity over time observed in our data. In addition, the broad tuning of the excitatory input components fit by our model (Fig. 8A) is similar to estimates of thalamic spatial frequency tuning (Derrington and Fuchs, 1979; Kaplan and Shapley, 1982), suggesting that the excitatory components of the model could be thought of as biased LGN input.
Current feedback models have not made specific predictions about the tuning of intracortical spatial frequency suppression. In the orientation domain, broad iso-oriented inhibition suppresses both flanks of the symmetrical orientation tuning curve (Ringach et al., 1997). The flank suppression increases the selectivity of orientation tuning while maintaining a symmetrical orientation tuning curve. In the spatial frequency domain, the LGN inputs to V1 are low-pass rather than symmetrically bandpass; thus inhibitory inputs at low spatial frequencies, such as those described by our model, would suppress the response at the lowest spatial frequencies, producing the bandpass curves characteristic of visual cortex.
The dynamics of spatial frequency tuning are not consistent with current instantiations of the push–pull model (Troyer et al., 1998), because this model does not predict the suppressive component at low spatial frequencies that is seen in our data. However, we emphasize that our results do not reject the entire class of feed-forward models. Inhibitory influences could be caused by feed-forward input from the LGN that is inverted by cortical interneurons. Thus, it might be possible to modify the push–pull model to account for some or all of the effects seen in our data.
To summarize, we think that a goal for future theoretical work is to explain the correlation between suppression and selectivity observed in the data. It might be possible to identify a single inhibitory circuit that can adequately explain bandwidth, gain control, and contrast invariance simultaneously. Further experiments can also shed new light on the relationship between these properties. If inhibition is responsible for most of these phenomena, one could ask whether correlations are observed on a cell-by-cell basis.
This research was supported by National Institutes of Health Grant EY-12816 and National Science Foundation Grant IBN-9720305 (D.L.R.).
Correspondence should be addressed to Christine Bredfeldt, Department of Psychology, 405 Hilgard Avenue, Franz Hall, University of California, Los Angeles, Los Angeles, CA 90095.