The organization of cat primary visual cortex has been well mapped using simple stimuli such as sinusoidal gratings, revealing superimposed maps of orientation and spatial frequency preferences. However, it is not yet understood how complex images are represented across these maps. In this study, we ask whether a linear filter model can explain how cortical spatial frequency domains are activated by complex images. The model assumes that the response to a stimulus at any point on the cortical surface can be predicted by its individual orientation, spatial frequency, and temporal frequency tuning curves. To test this model, we imaged the pattern of activity within cat area 17 in response to stimuli composed of multiple spatial frequencies. Consistent with the predictions of the model, the stimuli activated low and high spatial frequency domains differently: at low stimulus drift speeds, both domains were strongly activated, but activity fell off in high spatial frequency domains as drift speed increased. To determine whether the filter model quantitatively predicted the activity patterns, we measured the spatiotemporal tuning properties of the functional domains in vivo and calculated expected response amplitudes from the model. The model accurately predicted cortical response patterns for two types of complex stimuli drifting at a variety of speeds. These results suggest that the distributed activity of primary visual cortex can be predicted from cortical maps like those of orientation and SF preference generated using simple, sinusoidal stimuli, and that dynamic visual acuity is degraded at or before the level of area 17.
The surface of primary visual cortex is tiled with small domains sensitive to specific image features. In cat area 17, functional imaging has revealed a quasi-periodic pattern of domains that are selective for stimulus orientation, spatial frequency (SF), and ocular dominance (Hubel and Wiesel, 1962; Shatz and Stryker, 1978; Tolhurst and Thompson, 1981; Tootell et al., 1981; Lowel and Singer, 1987; Bonhoeffer and Grinvald, 1991; Hubener et al., 1997; Shoham et al., 1997; Issa et al., 2000). Implicit in such mapping studies is the assumption that complex visual scenes are represented linearly within cortical domains based on the orientations and spatial frequencies found in the images.
A recent study, however, has called into question the idea that separable cortical maps, like those of orientation or SF preference, can predict population responses to complex stimuli. Basole et al. (2003) imaged activity in cortical orientation domains in response to drifting complex images (textures composed of bars) and found that the orientations in a stimulus were not sufficient to determine how domains are activated. Instead, response patterns changed with the direction or speed at which a stimulus moved. Based on these findings, they argued that cortical responses cannot be characterized by the intersection of separate maps of tuning properties and that a single map of spatiotemporal energy is instead needed to understand population responses to complex stimuli.
An alternative hypothesis is that response patterns within primary visual cortex (V1) can be predicted using a small set of maps that define spatiotemporal tuning curves for each location on the cortical surface. These tuning properties are defined by linear filter theory and explain many of the response properties of individual V1 neurons (Adelson and Bergen, 1985; van Santen and Sperling, 1985; Watson and Ahumada, 1985). According to this theory, neurons respond only to stimuli that fall within their orientation, SF, and temporal frequency (TF) pass bands. The maps of orientation and SF preference would therefore be major determinants of the activity patterns but would not by themselves be sufficient to predict them. Two groups have suggested that four additional maps (those of TF preference and of orientation, SF, and TF bandwidth) are also needed to predict cortical responses (Baker and Issa, 2005; Mante and Carandini, 2005). Linear filter models that include cortical maps of all of these parameters predict the velocity-dependent changes in orientation domain activity observed by Basole et al. (2003).
Here we test whether the linear filter model also predicts activity patterns within SF domains. We used optical imaging to measure activity in SF domains in response to complex stimuli and compared the measurements with predictions of the model. The model was constrained by measuring all of its parameters in vivo. The linear filter model performed well, predicting the different activity patterns observed in response to two types of complex stimuli moving at a variety of speeds. These findings imply that linear filtering accounts for much of the population response of V1 and that maps of cortical tuning properties generated using simple stimuli can predict cortical responses to complex images.
Materials and Methods
Cortical responses were recorded in cat areas 17 and 18. The University of Chicago Institutional Animal Care and Use Committee approved all procedures.
Female cats aged 14–16 weeks were used in acute experiments. Anesthesia was induced (20–30 mg/kg, i.v. loading) and maintained (2–3 mg/kg, i.v. as needed) with thiopental. Animals received Baytril IM (2.5–5 mg/kg, s.c.) as prophylaxis against infection, dexamethasone (1.0–2.0 mg/kg, i.v. or s.c.) to reduce cerebral edema, and atropine (0.04 mg/kg, s.c.) to decrease tracheal secretions. Ophthalmic atropine (1%) and phenylephrine (10%) were instilled in the eyes to dilate the pupils and retract the nictitating membrane, respectively.
The animal was placed in a stereotaxic apparatus in which core temperature was maintained at 37.5 ± 0.5°C by a thermostatically controlled heating pad. The animal received lactated Ringer's solution with 2.5% dextrose (LRS) through a venous cannula (2–10 ml · kg−1 · h−1). Positive pressure ventilation (1:2 O2/N2O) was adjusted to maintain end-tidal CO2 between 3.8 and 5.0% with peak inspiratory pressure at 10–20 cm H2O. Electrocardiogram and EEG were monitored throughout the experiment. The scalp was incised, and an opening was made in the skull using a dental drill. Paralysis was then induced with pancuronium bromide added to the LRS (0.1 mg/kg, i.v. induction; 0.04–0.1 ml · kg−1 · h−1 maintenance). The dura was incised and reflected, and the brain was stabilized with 3% agarose in sterile saline. The opening was covered with a glass coverslip. A combination of contact lenses and trial lenses was used to focus the eyes at 40 cm. At the end of the acute procedure, the animal was killed with an overdose of pentobarbital (60 mg/kg).
Extracellular multiunit recordings were made with 0.5–5 MΩ tungsten electrodes (Frederick Haer Company, Bowdoinham, ME). Potentials were amplified and filtered at 5 kHz (model 1800; A-M Systems, Carlsborg, WA) and then digitally sampled at 10 kHz and stored for off-line analysis (Micro 1400; Cambridge Electronic Design, Cambridge, UK). Visual stimuli were the same as those used in imaging, except for the timing: each stimulus condition was initially shown without movement for 1 s and then drifted for 2 s, with drift direction reversing after 1 s. All conditions were presented pseudorandomly interleaved with two equal-luminance gray stimuli, and conditions were repeated four to eight times. Multiunit responses to a stimulus were calculated by averaging the spike rate during the 2 s drift period and subtracting the average spike rate from 2 s presentations of the gray stimuli.
For comparison with imaging data, multiunit sites were classified as preferring either “low” or “high” SF based on the ratio of their responses to the low and high SF pure sine waves; sites with a ratio >1.0 were classified as low SF units, and those with a ratio <1.0 were classified as high SF units.
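The baseline subtraction and SF classification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names and spike rates are hypothetical, and the treatment of a ratio of exactly 1.0 (which the text does not specify) is an assumption.

```python
def evoked_rate(drift_rates, gray_rates):
    """Mean spike rate during drift minus mean rate during gray stimuli (spikes/s)."""
    return sum(drift_rates) / len(drift_rates) - sum(gray_rates) / len(gray_rates)


def classify_site(low_sf_response, high_sf_response):
    """Label a site by the ratio of its responses to the low and high SF gratings.

    Assumes both responses are positive. A ratio of exactly 1.0 is not covered
    by the text; returning "unclassified" is an assumption made here.
    """
    ratio = low_sf_response / high_sf_response
    if ratio > 1.0:
        return "low"
    if ratio < 1.0:
        return "high"
    return "unclassified"


# Hypothetical spike rates (spikes/s), one value per stimulus repeat.
gray = [4.0, 6.0]
low = evoked_rate([20.0, 24.0, 22.0, 26.0], gray)   # response to the 0.3 c/deg grating
high = evoked_rate([12.0, 10.0, 14.0, 12.0], gray)  # response to the 0.9 c/deg grating
label = classify_site(low, high)
```

With these made-up rates the site responds more strongly to the low SF grating and is therefore labeled as a low SF site.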
Stimulus presentation and intrinsic signal imaging.
The visual stimuli used to map response properties consisted of sine wave gratings at 80% contrast drifting across the central 60° of visual space (mean luminance, 27.5 cd/m2). The portions of areas 17 and 18 that were imaged correspond to the central 5–10° of visual space (Tusa et al., 1978, 1979). All stimulus sets included four randomly interleaved mean-luminance gray stimuli; the responses averaged over these four stimuli were used to produce a “blank” response image. Stimuli were generated by computer and displayed on a gamma corrected 21 inch cathode ray tube (CRT) monitor using the Psychophysics Toolbox extensions for Matlab (MathWorks, Natick, MA) (Brainard, 1997; Pelli, 1997). The contrast linearity of the stimulus system was confirmed by imaging the CRT using a 1M30 CCD camera (supplemental Fig. 1, available at www.jneurosci.org as supplemental material).
Stimulus gratings first appeared as stationary images for 6 s (the interstimulus interval) and then drifted across the screen for 6 s. The drift direction of the gratings was reversed every 2 s. Primary visual cortex was illuminated with 610 nm light and, during the last 5.5 s of stimulus drift, a CCD camera (Dalsa 1M30) mounted on a macroscope (as described by Bonhoeffer and Grinvald, 1996) acquired images at 30 frames/s (a total of 165 frames were averaged for each stimulus presentation). The camera was focused 0.6 mm below the surface of cortex. Each stimulus condition was shown 16 times, presented in pseudorandom order to minimize adaptation effects (Dragoi et al., 2000). Responses to each condition were averaged over all stimulus trials to produce a single-condition response image. After acquisition, each single-condition response image was normalized to the blank response image and spatially bandpass filtered with uniform kernels (72 × 72 μm kernel for the low-pass filter and 1.68 × 1.68 mm kernel for the high-pass filter).
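The image-processing steps in this paragraph (frame averaging, blank normalization, and band-pass filtering with uniform kernels) can be illustrated on a one-dimensional profile. The sketch below makes several assumptions that are not stated in the text: blank normalization is taken to be a division by the blank image, the kernels are applied as a moving average with truncated windows at the edges, and the kernel widths are given in pixels because the pixel size is not reported here. All function names are hypothetical.

```python
def box_mean(signal, width):
    """Moving average with a uniform kernel; windows are truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def bandpass(signal, low_width, high_width):
    """Smooth with a small uniform kernel, then subtract a large-kernel average."""
    smoothed = box_mean(signal, low_width)
    background = box_mean(smoothed, high_width)
    return [s - b for s, b in zip(smoothed, background)]


def blank_normalize(response, blank):
    """Fractional change relative to the blank image (division is an assumption)."""
    return [r / b - 1.0 for r, b in zip(response, blank)]


# Frames averaged per presentation: 5.5 s of acquisition at 30 frames/s.
frames_per_presentation = int(5.5 * 30)

# Hypothetical reflectance profile with a small localized response.
blank = [1000.0] * 9
response = [1000.0, 1000.0, 1000.0, 999.0, 998.0, 999.0, 1000.0, 1000.0, 1000.0]
normalized = blank_normalize(response, blank)
filtered = bandpass(normalized, low_width=3, high_width=7)
```

In two dimensions the same idea applies along both image axes. The frame count computed here matches the 165 frames reported in the text.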
Response amplitudes were measured in restricted regions (templates) of the imaged field. Templates were generated using a combination of manual and automated methods to exclude areas that were either nonresponsive or contaminated by vascular artifacts. Portions of the map were included if they (1) were in area 17 (the method for distinguishing areas 17 and 18 is described below), (2) showed a well organized orientation map from all datasets used in analysis (when responses in a region degrade between experiments, the structure of the orientation map also degrades, so we required that responses in a region be strong enough to generate an orientation map in all the datasets), and (3) were not dominated by blood vessels (contaminating blood vessel patterns were excluded if they showed large reflectance changes in response to visual stimuli; such vessels were identified using a map of the SD of responses to four oriented stimuli, in which blood vessels appear bright because their modulation is often stronger than that of neural tissue). Portions of the imaged field that were active in some datasets were therefore sometimes excluded (see Fig. 1).
Orientation maps were generated using a standard vector averaging procedure (Blasdel and Salama, 1986). SF preference maps were constructed from responses to gratings of 0.3 and 0.9 cycles (c)/° presented at four orientations. The stimulus SF that best activated a given pixel was defined as the preferred SF of that pixel; this is the same method used to construct SF maps by Issa et al. (2000). Maps generated using this method have been verified previously using targeted extracellular electrophysiology (Issa et al., 2000).
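Vector averaging for a single pixel can be sketched as below. This is a generic illustration of the technique rather than the exact procedure of Blasdel and Salama (1986); the function name and the response values are hypothetical. Stimulus angles are doubled before summation so that orientations 180° apart are treated as identical.

```python
import cmath
import math


def preferred_orientation(responses):
    """Vector-average orientation preference for one pixel.

    `responses` maps stimulus orientation (degrees) to response amplitude.
    Returns (preferred orientation in degrees within [0, 180), vector magnitude).
    """
    vector = sum(r * cmath.exp(2j * math.radians(theta))
                 for theta, r in responses.items())
    angle = math.degrees(cmath.phase(vector)) / 2.0
    return angle % 180.0, abs(vector)


# Hypothetical responses of one pixel to the four stimulus orientations.
pixel = {0: 1.0, 45: 3.0, 90: 1.0, 135: 1.0}
pref, magnitude = preferred_orientation(pixel)
```

In this example the responses to 0°, 90°, and 135° cancel part of the 45° response, leaving a preferred orientation of 45° with a vector magnitude of 2. The magnitude is often used as a measure of selectivity.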
We used sinusoidal stimuli to measure SF and TF tuning curves in low and high SF domains. TF tuning curves were measured in response to sinusoidal gratings at four orientations, two SFs (0.3 and 0.9 c/°), and six TFs (0.5, 1, 2, 4, 8, and 16 c/s). For each pixel, TF tuning curves were measured using the stimuli closest to its preferred orientation and SF. SF tuning curves were measured from responses to six SFs (0.1, 0.25, 0.5, 0.75, 1.0, and 1.5 c/°) presented at the preferred orientation of each pixel (selected from four orientations: 0, 45, 90, and 135°). We measured contrast tuning curves by presenting sine wave gratings (0.5 c/° at 2 c/s) at four orientations (0, 45, 90, and 135°) and six contrasts (5, 10, 20, 40, 60, and 80%). Each image was blank normalized, and then the response to each SF, TF, or contrast was determined separately within low and high SF domains. These measurements were used to fit functions (Eqs. 2–4) from which all parameters needed by the linear model were estimated; tuning curves were normalized to a maximum value of 1 for display.
Areas 17 and 18 were distinguished in optical images based on the methods of Hubener et al. (1997) and Issa et al. (2000). Using the SF map, the boundary between areas 17 and 18 was determined based on two criteria: first, there is a transition from an area preferring high SFs to an area preferring low SFs; second, the transition occurred along a boundary running from the caudolateral portion of the lateral gyrus to the rostromedial portion of the lateral gyrus. Our estimate of the area 17–18 boundary is shown for three experiments in supplemental Figure 2 (available at www.jneurosci.org as supplemental material). The two regions conformed to previously described features of areas 17 and 18 (Movshon et al., 1978); measured optically, the average TF preference of area 18 (2.72 ± 0.19 c/s, mean ± SD; n = 2) was higher than that of area 17 (2.07 ± 0.34 c/s; n = 3), and the SF preference in area 18 (0.18 ± 0.05 c/°; n = 3) was lower than that of area 17 (0.47 ± 0.05 c/°; n = 3).
Measuring responses to complex stimuli.
Two types of complex stimuli were presented: paired sine gratings and square wave gratings. Paired sine gratings were constructed by adding two sine wave gratings (each at 30% contrast and of the same orientation), one at a low SF (0.3 c/°) and the other at a high SF (0.9 c/°). These SFs were selected because they activate well separated SF domains in cat area 17. The relative phase of the sine wave gratings was fixed to zero, such that the pair moved as a coherent whole. Both sinusoidal components therefore moved with the same speed, but each component grating had a different TF because they had different SFs. Example stimuli can be seen in supplemental Figure 3 (available at www.jneurosci.org as supplemental material) (an example of drifting stimuli can be seen in the supplemental movie, also available at www.jneurosci.org as supplemental material). Paired sine gratings and square wave gratings were presented at speeds from 0.6 to 20°/s. These stimuli were randomly interleaved with four blanks and one low SF and one high SF sine wave grating drifting at 2 c/s (SF ratio maps were generated from responses to these pure sine wave gratings to classify each pixel into a low or high SF domain). All stimuli were presented at four orientations (0, 45, 90, and 135°).
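Because both components move at the same speed, the TF of each component is its SF multiplied by the drift speed. The short sketch below tabulates this for a few speeds. The two SFs come from the text; the list of speeds only spans the stated 0.6–20°/s range and is otherwise an assumption, and the function name is hypothetical.

```python
LOW_SF = 0.3   # c/deg, from the text
HIGH_SF = 0.9  # c/deg, from the text


def component_tf(speed_deg_per_s, sf_cyc_per_deg):
    """Temporal frequency (c/s) of a grating component drifting at a given speed."""
    return speed_deg_per_s * sf_cyc_per_deg


# Illustrative speeds within the stated 0.6-20 deg/s range; the middle value
# is an assumption, not necessarily a speed used in the experiments.
speeds = [0.6, 3.0, 20.0]
table = [(v, component_tf(v, LOW_SF), component_tf(v, HIGH_SF)) for v in speeds]
for v, tf_low, tf_high in table:
    print(f"{v:5.1f} deg/s -> low SF {tf_low:5.2f} c/s, high SF {tf_high:5.2f} c/s")
```

At any speed the high SF component has three times the TF of the low SF component, so it leaves the cortical TF pass band first as speed increases.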
For each experiment, images were first blank normalized, and then responses to a given stimulus were averaged over all pixels classified as belonging to a particular domain (either low SF or high SF). Imaged responses to the complex stimuli were normalized by the maximum response to individual sine wave gratings (either 0.3 or 0.9 c/° gratings drifting at 2 c/s). This normalization allowed us to group response profiles from different pixels, regardless of the baseline reflectance of a pixel or maximum modulation. For comparison with the predictions of the model, responses were normalized by the maximum response in the dataset; responses in only one SF domain and to one stimulus were therefore maximal (with a value of 1), with all the others below 1. Unless otherwise noted, all values are reported as mean ± SEM, in which the number of samples (N) was the number of cats in which the experiment was run. Pearson's correlation coefficients were calculated between the measurements made in each SF domain and the predictions for the same SF domain.
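The normalization to the dataset maximum and the Pearson correlation between measured and predicted responses can be sketched as follows. The response values are invented for illustration only, and the function names are hypothetical.

```python
import math


def normalize_to_dataset_max(dataset):
    """Divide every response in every domain by the single largest response."""
    peak = max(max(values) for values in dataset.values())
    return {domain: [v / peak for v in values] for domain, values in dataset.items()}


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical responses at six drift speeds (arbitrary units).
measured = normalize_to_dataset_max({
    "low SF": [2.0, 3.0, 4.0, 3.5, 3.0, 2.5],
    "high SF": [3.0, 4.0, 3.5, 2.0, 1.0, 0.5],
})
predicted = {
    "low SF": [0.4, 0.7, 0.9, 0.9, 0.8, 0.6],
    "high SF": [0.8, 1.0, 0.8, 0.5, 0.2, 0.1],
}
# One coefficient per SF domain, comparing measurement and prediction.
r_by_domain = {d: pearson_r(measured[d], predicted[d]) for d in measured}
```

Because the normalization uses one maximum for the whole dataset, exactly one domain and stimulus reaches a value of 1, as described in the text.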
The linear filter model.
We implemented the linear filter model as specified by Baker and Issa (2005), modifying it to include the nonlinear contrast–response function of cat area 17 neurons. Stimuli were defined by their spatial power spectrum (for example spectra, see supplemental Fig. 3, available at www.jneurosci.org as supplemental material). The response in a cortical domain was calculated using Equation 1 (schematized in supplemental Fig. 4, available at www.jneurosci.org as supplemental material):
$$R(\nu) = \sum_{p} N\big(A(p)\big)\, S(p)\, T(\nu, p) \qquad (1)$$
in which A is the amplitude of the stimulus at spatial frequency p, determined from the power spectrum. Because components in a stimulus had the same orientation and responses were measured at the preferred orientation of a pixel, the orientation parameter is left out of Equation 1. The function N is the contrast–response function, defined by the following Naka–Rushton equation:
$$N(c) = G\, \frac{c^{n}}{c^{n} + c_{50}^{\,n}} \qquad (2)$$
in which c is the contrast of the stimulus, G is the maximum value, n determines the maximum slope of the function, and c50 is the contrast that generates the half-maximum response.
S(p) and T(ν,p) in Equation 1 are the spatial frequency and temporal frequency tuning curves, respectively. They are defined by log-Gaussian relationships as follows:
$$S(p) = \exp\!\left(-\frac{\left[\log p - \log p_0\right]^{2}}{2\sigma_S^{2}}\right) \qquad (3)$$
$$T(\nu, p) = \exp\!\left(-\frac{\left[\log(\nu p) - \log \tau_0\right]^{2}}{2\sigma_T^{2}}\right) \qquad (4)$$
in which p0 is the preferred SF, σS is the SD of the SF Gaussian (this parameter is reported as “SF bandwidth”), τ0 is the preferred TF, and σT is the SD of the TF Gaussian (“TF bandwidth”). Because the drift direction of an image component is always perpendicular to its orientation, the TF of the component is the product of drift speed and its SF; so, in Equation 4, TF is written in terms of speed (ν) and SF (p). Each of the tuning curves has a maximum value of 1; for example, if the stimulus SF p is the same as p0, the output of the function S is 1, otherwise it is <1.
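The model described in this section can be sketched in a few lines. This is an illustrative implementation under explicit assumptions, not the code used in the study: the response is taken to be the sum over stimulus components of the product of the contrast–response, SF, and TF terms; the logarithms are taken in base 2 so that bandwidths are in octaves; and the values of G, the SF and TF bandwidths, and the TF peaks are placeholders, because Table 1 is not reproduced in the text. The SF peaks (0.35 and 0.62 c/°), c50 (28.5%), and slope (1.63) are the values reported elsewhere in the paper.

```python
import math


def naka_rushton(c, g, n, c50):
    """Contrast-response function: g * c^n / (c^n + c50^n)."""
    return g * c ** n / (c ** n + c50 ** n)


def log_gaussian(x, peak, sigma):
    """Log-Gaussian tuning curve with a maximum of 1 at x == peak.

    sigma is expressed in octaves (log base 2); the base is an assumption.
    """
    return math.exp(-(math.log2(x / peak)) ** 2 / (2.0 * sigma ** 2))


def domain_response(components, speed, domain, contrast_params):
    """Sum N(A) * S(p) * T(speed * p) over (sf, contrast) stimulus components."""
    total = 0.0
    for sf, contrast in components:
        n_out = naka_rushton(contrast, **contrast_params)
        s_out = log_gaussian(sf, domain["sf_peak"], domain["sf_sigma"])
        t_out = log_gaussian(speed * sf, domain["tf_peak"], domain["tf_sigma"])
        total += n_out * s_out * t_out
    return total


# c50 and n are reported in the text; g is a placeholder.
CONTRAST = {"g": 1.0, "n": 1.63, "c50": 28.5}
# sf_peak values are reported in the text; all other entries are placeholders.
LOW = {"sf_peak": 0.35, "sf_sigma": 1.0, "tf_peak": 2.2, "tf_sigma": 1.5}
HIGH = {"sf_peak": 0.62, "sf_sigma": 1.0, "tf_peak": 1.9, "tf_sigma": 1.5}

# Paired sine grating: 0.3 and 0.9 c/deg components, each at 30% contrast.
paired_sine = [(0.3, 30.0), (0.9, 30.0)]

low_slow = domain_response(paired_sine, 2.0, LOW, CONTRAST)
high_slow = domain_response(paired_sine, 2.0, HIGH, CONTRAST)
low_fast = domain_response(paired_sine, 20.0, LOW, CONTRAST)
high_fast = domain_response(paired_sine, 20.0, HIGH, CONTRAST)
```

With these placeholder parameters, the predicted response of the high SF domain falls more steeply between 2 and 20°/s than that of the low SF domain. The exact amplitudes depend on the assumed bandwidths and TF peaks and should not be read as the published predictions.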
Assessing linearity of the imaging system.
Intrinsic signal imaging is based on hemodynamic changes induced by neural activity and is therefore an indirect measure of cortical activity. To determine whether imaged responses are linearly related to neural responses over the range of contrasts used, we measured contrast–response curves in three experiments (sine wave gratings at each of four orientations had contrasts of 5, 10, 20, 40, 60, and 80%). Consistent with previous studies (Carandini and Sengpiel, 2004), the contrast–response curve measured by imaging shows a compressive nonlinearity with 50% saturation at a contrast of 28.5% and a maximum slope of 1.63 [average of low and high SF domains reported in Table 1; compare with a c50 of 32.5% and a maximum slope of 1.64 in the study of Carandini and Sengpiel (2004)]. This compressive nonlinearity is similar in form to that observed in electrophysiologically measured responses of cortical neurons (Albrecht and Hamilton, 1982; Sclar et al., 1990), suggesting that imaged responses are monotonically related to average neuronal responses. A previous comparison of optical and neuronal contrast–response curves in cat area 18 also suggested that optical responses are monotonically related to neuronal activity, but that the optical response is more sensitive to low contrasts than is neuronal spiking [optical c50 and slope, 8%, 1.42; neuronal c50 and slope, 16.8%, 2.31 (Zhan et al., 2005)]. Although the differences between the optical and neuronal contrast–response curves were small, they suggest that there is a nonlinear transform between neuronal firing and optical responses. It is important to note, however, that the optically measured contrast–response functions in area 17, which appear linear up to 60% contrast, differ substantially from those in area 18, which saturated at 20% contrast (Zhan et al., 2005).
Optical responses in macaque V1 and secondary visual cortex (V2) show a similar pattern, in which responses in V2 saturate at low contrasts but responses in V1 are linearly proportional to contrast over a substantially larger contrast range (Lu and Roe, 2007). Therefore, most of the stimulus contrasts used to test the filter model were in the linear range of the optical contrast–response function and near the top of the linear range of electrophysiologically measured contrast–response functions (Albrecht and Hamilton, 1982), so the difference between neuronal activity and imaged response profiles is likely to be small.
In a separate experiment, we also measured responses to electrical stimulation (supplemental Fig. 5, available at www.jneurosci.org as supplemental material). The maximum response to electrical stimulation was linearly proportional to the stimulating current even up to very high currents (current ranged from 20 to 200 μA). This indirectly suggests that the compressive nonlinearity observed in the imaged contrast–response curve is attributable to the nonlinearity in the neuronal responses rather than to a nonlinearity in optical imaging. Although we cannot rule out all nonlinearities in intrinsic signal imaging, we show that intrinsic optical signals are monotonically related to both neuronal responses and local potentials generated by electrical stimulation.
Results
To determine whether complex images are represented linearly within the spatial frequency domains of cat area 17, we imaged activity patterns in response to both simple and complex stimuli. Simple sinusoidal stimuli were used to determine the spatiotemporal tuning properties of different SF domains within area 17. We then measured the activity in those same domains in response to stimuli composed of multiple SFs and compared the measured responses with predictions based on the tuning properties.
Spatial frequency and orientation domains were identified using intrinsic signal imaging in area 17 of four cats. Preference maps were generated in response to sine wave gratings (Fig. 1) (example stimuli shown in supplemental Fig. 3, available at www.jneurosci.org as supplemental material). The structure of the orientation map was consistent with many previous reports in having repeating stereotyped pinwheel structures (Blasdel and Salama, 1986; Bonhoeffer and Grinvald, 1991). The SF map was also consistent with previous reports in showing separate domains sensitive to different ranges of SF (Hubener et al., 1997; Shoham et al., 1997; Everson et al., 1998; Issa et al., 2000). In subsequent experiments, we measured responses to a variety of stimuli within these identified domains.
Optical measurements of spatial and temporal tuning parameters in cat area 17
The linear filter model has several parameters, and, if they are allowed to vary, the model could be used to fit a range of arbitrary datasets. Instead, we constrained all the parameters of the model by measuring their values in vivo. The linear filter model as defined by Baker and Issa (2005) contained three filters, one each for orientation, SF, and TF. We made two modifications to this model. First, we incorporated a contrast–response function to account for the well described compressive nonlinearity that occurs at high stimulus contrasts (Albrecht and Hamilton, 1982; Sclar et al., 1990). Second, we disregarded the orientation filter because all components in each stimulus had the same orientation, and responses were measured in the orientation domain that matched the stimulus orientation. In this modified model, we therefore needed to measure seven parameters for each SF domain: SF peak and bandwidth, TF peak and bandwidth, contrast gain, contrast semi-saturation value, and contrast slope (for details, see Materials and Methods).
SF and TF tuning curves were measured in response to drifting sine wave gratings and are shown in Figure 2 and summarized in Table 1. The average SF preference of low SF domains was 0.35 c/°, and the SF preference of high SF domains was 0.62 c/° (n = 3). In addition, low SF domains had a slightly higher TF preference than did high SF domains (Fig. 2B), consistent with a weak inverse relationship between SF and TF preferences observed electrophysiologically (Baker, 1990; DeAngelis et al., 1993). SF and TF bandwidths were within electrophysiologically observed ranges in both low and high SF domains.
We also measured the contrast–response curves in these areas. Although the contrast–response relationship does not determine which domains are activated by a stimulus, it does affect the amplitude of response and thus the quantitative predictions of the model. Figure 2C shows contrast–response curves measured in low and high SF domains. The contrast–response curve was parameterized using the Naka–Rushton function (Eq. 2); parameter values are shown in Table 1. There was no significant difference in the contrast–response functions measured in the two regions, consistent with the findings of Carandini and Sengpiel (2004). We therefore used the average contrast–response relationship for calculating responses in both low and high SF domains.
Complex images are represented across spatial frequency domains
We next imaged the activity of SF domains in response to stimuli composed of multiple superimposed sine wave gratings. We used two types of complex stimuli, paired sine gratings (the sum of a low SF grating and a high SF grating) and square wave gratings (the sum of an infinite series of sine wave gratings). As Figures 3 and 4 show, a complex stimulus activates both low and high SF domains. By comparison, pure sine wave gratings strongly activate only low or high SF domains but not both (see SF tuning curves in Fig. 2A). This is the pattern of activity expected if different SF components in an image are represented in different SF domains in the cortex.
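The description of a square wave grating as a sum of sine wave gratings can be made concrete with its Fourier series: a square wave contains only odd harmonics k of its fundamental, with amplitudes that fall off as 1/k. The sketch below is illustrative; it assumes a fundamental of 0.3 c/° (so that the third harmonic falls at 0.9 c/°) and unit contrast, and the function names are hypothetical.

```python
import math


def square_wave_harmonics(fundamental_sf, contrast, max_harmonic):
    """Sine-series components of a square wave grating.

    Returns (spatial frequency, amplitude) pairs for the odd harmonics k,
    each with amplitude (4 / pi) * contrast / k.
    """
    return [(k * fundamental_sf, 4.0 * contrast / (math.pi * k))
            for k in range(1, max_harmonic + 1, 2)]


def partial_sum(x_deg, components):
    """Evaluate the truncated sine series at position x (degrees of visual angle)."""
    return sum(a * math.sin(2.0 * math.pi * sf * x_deg) for sf, a in components)


components = square_wave_harmonics(0.3, 1.0, 99)
fundamental, third = components[0], components[1]

# At the center of a bright bar (one quarter period), the series approaches
# the square wave contrast as more harmonics are included.
quarter_period = 1.0 / (4.0 * 0.3)
value_at_peak = partial_sum(quarter_period, components)
```

The third harmonic has one third the amplitude of the fundamental, and at the peak of the fundamental it takes its most negative value, so it subtracts from the fundamental there.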
The pattern of activity in the SF domains changed with the drift speed of the stimulus (seen qualitatively in Fig. 3A,B). To quantify these changes, we measured responses to complex stimuli drifting at six speeds. Response amplitudes were first averaged across pixels that preferred the same SF and then normalized by the maximum response measured in either SF domain. Figure 4 shows how responses to paired sine gratings in SF domains change as a function of the drift speed (average of n = 4 experiments). The activity of high SF domains fell sharply when the image drift speed exceeded 3°/s, whereas responses in low SF domains remained strong even up to a drift speed of 20°/s. The fall off in high SF domains at high speeds is generally consistent with the predictions of a linear filtering model (Baker and Issa, 2005).
These imaging results are supported by extracellular recordings from the same experiments (supplemental Fig. 6, available at www.jneurosci.org as supplemental material). Similar but more pronounced speed-dependent shifts were observed in multiunit responses to paired sine gratings. Electrode penetrations were not targeted to any particular location, so units were grouped together for analysis based on a post hoc determination of their SF bias (for locations of electrode penetrations, see supplemental Fig. 2, available at www.jneurosci.org as supplemental material). Supplemental Figure 6 (available at www.jneurosci.org as supplemental material) shows the average responses from 14 recording sites that preferred low SFs (average SF preference, 0.35 ± 0.09 c/°, mean ± SD) and 17 sites that preferred high SFs (average, 0.89 ± 0.19 c/°, mean ± SD). As was observed with optical imaging, responses at high SF recording sites fell off at speeds >3°/s, whereas responses at low SF sites were maintained. The similarity in response profiles observed with imaging and microelectrode recordings supports the conclusion that speed-dependent changes are different in low and high SF domains.
Comparison of model predictions to measured speed-dependent shifts in activity
If the linear filter model is a good description of cortical responses, it should predict the pattern of activity to both simple and complex images drifting at any speed. We therefore compared the speed-dependent shifts in activity measured in area 17 with the predictions of the model.
The model acts on a stimulus in four steps (schematized in supplemental Fig. 4, available at www.jneurosci.org as supplemental material). First, the spatial power spectrum of the stimulus, confirmed by imaging the stimulus monitor, is passed through the contrast–response function. Second, the stimulus is filtered by the TF tuning curve for the SF domain: the TF of a component is determined by its SF and the speed of the stimulus. Third, the stimulus is filtered by the SF tuning curve of each SF channel. Finally, the area under the spatiotemporally filtered power spectrum is integrated to provide the scalar prediction of response.
We initially consider how well the model predicts responses to a sine wave grating. Responses to a 0.3 c/° sine wave grating are plotted in Figure 5 as a function of drift speed. The responses were well predicted by the linear filter model: responses were stronger in low SF domains than in high domains, and responses peaked at slightly higher drift speeds in low SF domains than in the high domains. Overall, the model accounted for >50% of the variance in responses in the two domains (pure sine gratings, r = 0.77 for low SF domains, r = 0.71 for high SF domains). The reasonable fit to responses in low SF domains is not surprising, because this dataset was used to generate the TF tuning parameters shown in Table 1. However, the model also predicted responses in the high SF domain, which is an independent dataset because it was not used to estimate parameter values.
The predictions of the model were also compared with activity patterns in response to two similar complex stimuli. The paired sine and square wave stimuli we used have different power spectra, but both have power at a fundamental frequency and at the third harmonic (supplemental Fig. 3, available at www.jneurosci.org as supplemental material). It is not surprising, therefore, that the measured activity patterns were similar for the two stimuli. In both cases, the peak activity was nearly equal in low and high SF domains (compare Figs. 4 and 6), with responses in low SF domains being stronger at high drift speeds. The predictions of the model capture these response trends, accounting for 49–77% of the variance in the two SF domains (Fig. 4C: paired sine gratings, r = 0.72 for low SF domains, r = 0.70 for high SF domains; Fig. 6C: square wave gratings, r = 0.88 for low SF domains, r = 0.87 for high SF domains). It is important to note, however, that the similarity in the fits of the model to these two stimuli is not trivial: whereas both stimuli have power at the first and third harmonics, the phase relationship of the two components differs between stimuli. In the paired-sine stimulus, the peaks of the components sum, but in the square wave grating, the peaks subtract (Graham and Nachmias, 1971). Thus, despite the fact that most of the energy is at the same two frequencies, the stimuli have very different appearances.
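The variance-explained range quoted above follows from squaring the reported correlation coefficients. The short check below uses only the r values given in this paragraph; the variable names are arbitrary.

```python
# Correlation coefficients reported in the text for the two complex stimuli.
reported_r = {
    ("paired sine", "low SF"): 0.72,
    ("paired sine", "high SF"): 0.70,
    ("square wave", "low SF"): 0.88,
    ("square wave", "high SF"): 0.87,
}

# Fraction of variance accounted for is the square of Pearson's r.
variance_explained = {key: r ** 2 for key, r in reported_r.items()}
lowest = min(variance_explained.values())
highest = max(variance_explained.values())
print(f"{100 * lowest:.0f}-{100 * highest:.0f}% of variance")
```

The smallest value comes from the high SF domain with paired sine gratings and the largest from the low SF domain with square wave gratings.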
Although the spectral differences between paired sine and square wave gratings are subtle, the model still correctly predicts how responses to these stimuli differ: the strongest response to paired sine gratings is found in high SF domains, whereas the strongest response to square wave gratings is found in low SF domains. These results suggest that the linear filter model predicts the detailed patterns of cortical activity in response to a variety of complex stimuli.
To determine whether both spatiotemporal components of the model (SF and TF tuning) are necessary to predict responses, we compared the data with predictions of simpler alternative models. In the first alternative, we assumed that orientation and SF vary across the cortical surface but disregarded the TF tuning curve (“No-TF” model). If this model were correct, responses would not vary with drift speed. We found that the full linear filter model always did a better job of predicting responses than did a model that lacks TF tuning (Fig. 7).
In the second alternative, we assumed that there was no variation in SF preference across the cortical surface (“No-SF” model). In this model, all the cortical domains were assumed to have the same SF preference, so the maximum response to a stimulus would be the same everywhere (within the appropriate orientation domain) (Fig. 8). The small variations in predicted response profiles are attributable to differences in TF tuning for the two domains (Fig. 8, compare black and gray dashed lines). By varying the value of the SF preference parameter, we could minimize the error of the fit to the three datasets (sine wave, paired sine, and square wave gratings) (Fig. 9). For the No-SF map model, the error was minimized at an SF preference value of 0.7 c/°, but, even at this optimal value, the total error for the simplified model was still larger than that of the full linear filter model (χ² error of 0.59 for the No-SF model and 0.53 for the full linear filter model). The full model, with no free parameters, therefore provided a better description of both simple and complex responses than did a simpler model in which SF preference was a free parameter.
A full description of the organization of visual cortex would ideally allow us to predict the pattern of cortical activity generated by a visual stimulus and then predict perception based on the pattern of activity. Progress toward this goal started with the description of cortical maps of individual response selectivities, such as orientation preference, direction preference, and SF preference (Blasdel and Salama, 1986; Bonhoeffer and Grinvald, 1991; Weliky et al., 1996; Hubener et al., 1997; Shoham et al., 1997; Everson et al., 1998; Issa et al., 2000). Equally important, however, was the recognition that the pattern of cortical activity cannot be predicted from individual selectivity maps (Basole et al., 2003). Instead, theoretical studies suggest that a combination of maps, including maps of orientation and SF and TF tuning properties, should predict the distributed responses of V1 to rigidly drifting images (Baker and Issa, 2005; Mante and Carandini, 2005). This combination of maps was sufficient to predict the qualitative transitions observed within ferret orientation domains with changes in the drift speed, aspect ratio, and drift angle of texture stimuli (Basole et al., 2003; Baker and Issa, 2005; Mante and Carandini, 2005). In addition, the model predicted that response patterns within SF domains should depend on stimulus complexity and drift speed (Baker and Issa, 2005). The results reported here confirm these predictions and suggest that linear filtering is a valid method for describing the distributed response patterns within primary visual cortex.
If responses in SF domains are determined by linear filtering, the pattern of activity should depend on the speed at which a stimulus moves. In the model, sensitivity to the speed of motion derives from the TF tuning of area 17 neurons (Baker and Issa, 2005). Image components with very high TFs should not activate neurons well, whereas components with lower TFs should. Because the TF of an image component is proportional to both its SF and the speed of the image motion, low SF components should drive cortical responses better than high SF components when the stimulus moves quickly. The pattern of activity in SF domains should therefore shift between SF domains as image speed changes.
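The reasoning above can be sketched numerically. The following is a minimal illustration rather than the model as implemented in this study: it assumes separable log-Gaussian SF and TF tuning curves, omits orientation tuning and the contrast–response nonlinearity, and uses hypothetical values for every tuning parameter and stimulus component. The names `log_gaussian` and `domain_response` are likewise hypothetical.

```python
import math

def log_gaussian(x, preferred, bandwidth_octaves):
    """Gaussian tuning on a log2 axis; equals 1.0 at the preferred value."""
    octaves = math.log2(x / preferred)
    return math.exp(-0.5 * (octaves / bandwidth_octaves) ** 2)

def domain_response(components, speed, sf_pref, tf_pref, sf_bw=1.0, tf_bw=1.5):
    """Sum the filtered contributions of (sf, contrast) components that drift
    rigidly at `speed` (deg/s). Bandwidths are hypothetical, in octaves."""
    total = 0.0
    for sf, contrast in components:
        tf = sf * speed  # temporal frequency (Hz) = SF (c/deg) * speed (deg/s)
        total += contrast * log_gaussian(sf, sf_pref, sf_bw) * log_gaussian(tf, tf_pref, tf_bw)
    return total

# Hypothetical compound stimulus: a fundamental and its third harmonic.
stimulus = [(0.2, 1.0), (0.6, 1.0)]

# Hypothetical domains that differ only in preferred SF (c/deg).
LOW_SF_DOMAIN = dict(sf_pref=0.2, tf_pref=2.0)
HIGH_SF_DOMAIN = dict(sf_pref=0.6, tf_pref=2.0)

speeds = (2.0, 10.0, 40.0)
ratios = []
for speed in speeds:
    low = domain_response(stimulus, speed, **LOW_SF_DOMAIN)
    high = domain_response(stimulus, speed, **HIGH_SF_DOMAIN)
    ratios.append(high / low)
    print(f"{speed:5.1f} deg/s  low={low:.3f}  high={high:.3f}  high/low={high / low:.3f}")
```

With these illustrative parameters, the ratio of high SF domain response to low SF domain response falls steadily as speed increases, because the TF of the high SF component leaves the TF pass band first.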
The linear filter model
The degree to which linear filtering accounted for activity patterns across area 17 was surprising. A variety of nonlinearities have been described in both neuronal processing and perception that potentially could have affected response patterns, but instead they appear not to be major determinants of the imaged population response in V1. For example, both adaptation and masking psychophysical studies have suggested that high SF image components can suppress activity in low SF channels (De Valois, 1977; Tolhurst and Barfield, 1978; Switkes and De Valois, 1983). Single-unit recordings also show evidence of asymmetric suppression: neuronal responses in area 17 are more often suppressed by masking gratings of high SF than of low SF (De Valois and Tootell, 1983). However, the linearity in population responses that we observed suggests that the psychophysical suppression is accomplished either at higher cortical stages or within a small subset of V1 cells whose activity is swamped out in the optical signal by the linear responses of the majority of neurons. The only nonlinearity included in our model was the compressive nonlinearity associated with the contrast–response curve, but, because most of the stimuli were presented at relatively low contrasts, even this nonlinearity had only a modest effect on predicted responses.
The spatial frequency map
Although several previous imaging studies provide evidence for an SF map (Hubener et al., 1997; Shoham et al., 1997; Everson et al., 1998; Issa et al., 2000), consensus has not been reached as to whether it is a genuine organizational feature of area 17. An analysis by Sirovich and Uglesich (2004) suggested that SF domains might be artifacts of intrinsic signal imaging and that SF preference does not vary with position along the cortical surface. However, a subsequent study using flavoprotein autofluorescence imaging, a metabolic but nonhemodynamic mapping signal, produced SF maps with statistically similar structure to those of intrinsic signal imaging, suggesting that the variations in SF preference measured optically are likely genuine (Husson et al., 2006).
Our findings also support a topological organization of SF preference. Activity patterns generated in response to sinusoidal gratings were different from patterns generated by complex stimuli. If there were no differences in spatiotemporal tuning within orientation domains (that is, if there were no SF map), response amplitudes in an orientation domain would vary with the SFs and TFs in a stimulus, but the pattern of activity across the orientation domain would be the same whether there was one component or 20 components with different SFs. This seems to be the case in ferret V1, in which there is little variation in SF preference over the area of a hypercolumn (Yu et al., 2002; Schwartz, 2003; Basole et al., 2004), and patterns of activity in response to drifting textures were indistinguishable from patterns generated by gratings drifting at different angles (Basole et al., 2003). This was not the case in cat area 17, however, in which the pattern of activity generated by a sine wave grating was distinct from patterns generated by square wave or paired sine stimuli. More importantly, the different response patterns were well predicted by a linear model in which all tuning parameters were measured optically. Together, there is now a wide array of evidence, both direct and indirect, that SF preference varies across cat area 17.
Visual acuity is affected by the speed of a moving image (Brown, 1972; Westheimer and McKee, 1975; Levi, 1996; Chung and Bedell, 2003): as the speed of an image increases, less detail is perceived. Dynamic visual acuity, which formally describes the loss of acuity with increasing image speed, plays an important role in daily activities from sports to driving; for example, a driver's dynamic acuity is one of the best predictors of his or her driving safety (Burg, 1964). The neuroanatomical locus that limits dynamic acuity, however, is not known. Although temporal integration in the retina can smear the perception of moving images (Barlow, 1958; Ross and Hogben, 1974), neurons in the retina and lateral geniculate nucleus can still follow much faster temporal changes in a stimulus than most cells in area 17 (Ogawa et al., 1966; Wollman and Palmer, 1995). As a result, cortical response properties likely underlie the speed-dependent changes in acuity (Brown, 1972; Westheimer and McKee, 1975; Levi, 1996; Chung and Bedell, 2003), but psychophysical studies cannot determine whether it is primary visual cortex or higher visual areas that limit dynamic acuity.
The linear filter model suggests that the properties of area 17 underlie the speed-dependent loss of visual acuity (Baker and Issa, 2005). For rigidly drifting images (images in which all components move at the same speed), the TF of a component is proportional to its SF, so the TF of a high SF component is always greater than that of a lower SF component. Therefore, as the speed of the image increases, the TF of the high SF component should be the first to exceed the TF pass band of the cortical neurons. This prediction is confirmed by the imaged responses to both paired sine and square wave stimuli: responses in high SF domains fall off at lower speeds than do responses in low SF domains. Similarly, the data and model show that a wide range of spatial frequencies should be perceived at slow drift speeds, consistent with human psychophysical findings that even hyperacuity is not degraded by image drift speeds <2°/s (Westheimer and McKee, 1978). The consistency between activity patterns in SF domains and the psychophysical phenomenon suggests that dynamic acuity is most likely determined by the properties of the primary visual cortex.
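The order in which components drop out follows directly from the proportionality between TF and SF. As a small worked example, assuming a hypothetical TF cutoff and a hypothetical fundamental SF (neither value is taken from the measurements reported here), the speed at which each component reaches the cutoff is the cutoff divided by the SF of that component:

```python
def speed_at_cutoff(sf_cpd, tf_cutoff_hz):
    """Drift speed (deg/s) at which a component of spatial frequency
    `sf_cpd` (c/deg) reaches the temporal frequency `tf_cutoff_hz` (Hz)."""
    return tf_cutoff_hz / sf_cpd

TF_CUTOFF_HZ = 8.0      # hypothetical upper edge of the cortical TF pass band
FUNDAMENTAL_CPD = 0.2   # hypothetical fundamental SF of the stimulus

cutoff_speeds = {}
for harmonic in (1, 3):
    sf = harmonic * FUNDAMENTAL_CPD
    cutoff_speeds[harmonic] = speed_at_cutoff(sf, TF_CUTOFF_HZ)
    print(f"harmonic {harmonic}: SF={sf:.1f} c/deg reaches cutoff at {cutoff_speeds[harmonic]:.1f} deg/s")
```

Whatever cutoff is assumed, the third harmonic reaches it at one-third the speed of the fundamental, so fine detail is lost first as the image speeds up.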
This work was supported by Department of Homeland Security Fellowship DE-AC05-00OR22750 (A.R.), National Institutes of Health Grant T32GM07281 (A.K.M.), and grants from the Brain Research Foundation and Mallinckrodt Foundation (N.P.I.). We thank Dr. Kimberly Grossman for her help during experimentation. Members of the Issa laboratory provided comments on this manuscript.
Correspondence should be addressed to Naoum P. Issa, 947 East 58th Street, MC0926, Department of Neurobiology, University of Chicago, Chicago, IL 60637.