Genetic methods available in mice are likely to be powerful tools in dissecting cortical circuits. However, the visual cortex, in which sensory coding has been most thoroughly studied in other species, has essentially been neglected in mice perhaps because of their poor spatial acuity and the lack of columnar organization such as orientation maps. We have now applied quantitative methods to characterize visual receptive fields in mouse primary visual cortex V1 by making extracellular recordings with silicon electrode arrays in anesthetized mice. We used current source density analysis to determine laminar location and spike waveforms to discriminate putative excitatory and inhibitory units. We find that, although the spatial scale of mouse receptive fields is up to one or two orders of magnitude larger, neurons show selectivity for stimulus parameters such as orientation and spatial frequency that is near to that found in other species. Furthermore, typical response properties such as linear versus nonlinear spatial summation (i.e., simple and complex cells) and contrast-invariant tuning are also present in mouse V1 and correlate with laminar position and cell type. Interestingly, we find that putative inhibitory neurons generally have less selective, and nonlinear, responses. This quantitative description of receptive field properties should facilitate the use of mouse visual cortex as a system to address longstanding questions of visual neuroscience and cortical processing.
In the nearly half century since visual responses were first described in the mammalian visual cortex (Hubel and Wiesel, 1962), there has been intensive research into the neural circuit and developmental mechanisms that give rise to selective receptive field (RF) properties. However, although the description of visual encoding has become increasingly quantitative (Ringach, 2004; Carandini et al., 2005), as has the anatomy of neuronal subtypes (Gilbert, 1983; Douglas and Martin, 2004), it has been difficult to link these functional and anatomical findings into a local circuit model of cortical processing. Even more questions remain about how such a circuit might be assembled during development, despite advances in understanding the molecular mechanisms involved (Waites et al., 2005; Rash and Grove, 2006; Polleux et al., 2007) and the consequences of altered sensory input (Hensch, 2005; Hofer et al., 2006).
The recent proliferation of genetic technology in mice may provide tools to answer many of these questions (Callaway, 2005). Targeted gene disruption and transgene expression can result in much more specific manipulations than have been possible via pharmacology or sensory alteration and can allow perturbation of cellular signaling (Huang et al., 1999; Karpova et al., 2005), synaptic plasticity (Zeng et al., 2001; Sawtell et al., 2003), and firing patterns (Tan et al., 2006; Zhang et al., 2007a), even at the single-cell level (Brecht et al., 2004). Furthermore, fluorescent protein labeling provides precise anatomical techniques to yield information about cell type (Feng et al., 2000; Tamamaki et al., 2003) or even synaptic connectivity (Wickersham et al., 2007) in the intact brain, bridging the gap between circuit structure and function. Indeed, studies have already begun to take advantage of genetic methods to study visual system development (Fagiolini et al., 2003; Cang et al., 2005), plasticity (Fagiolini et al., 2004; Syken et al., 2006), and function (Sohya et al., 2007).
Studies of cortical visual processing have typically used carnivores or primates, which are considered to have a more refined visual system, including a much larger cortical region for visual processing, higher acuity, extensive visual behaviors, and orientation, ocular dominance, and spatial frequency columns (Issa et al., 2000; Ohki and Reid, 2007; Van Hooser, 2007). Understanding visual processing in such a simple system as the mouse cortex, which lacks both fine-scale spatial acuity and maps such as orientation columns, should provide insight into the minimal mechanisms necessary for receptive field development and function. However, although recent studies in rat and squirrel have described receptive field properties in the absence of orientation maps (Fagiolini et al., 1994; Girman et al., 1999; Heimel et al., 2005; Ohki et al., 2005), much less is known about visual encoding and processing in mice. Several early electrophysiology studies (Dräger, 1975; Mangini and Pearlman, 1980; Metin et al., 1988) (for review, see Hubener, 2003) indicated that, although mouse cortical neurons can be classified into categories similar to those described in other species, their overall level of stimulus selectivity may be significantly less than that of other species. However, these studies are difficult to evaluate because they were performed before many of the recent quantitative techniques for receptive field characterization were developed. A recent study, using two-photon imaging of calcium signals, measured differences in orientation selectivity in inhibitory and excitatory neurons of mouse primary visual cortex V1 (Sohya et al., 2007) but did not perform a thorough characterization of other visual responses and selectivity and was restricted to imaging the superficial layers.
We therefore undertook a quantitative survey of receptive field properties in V1 of the anesthetized mouse. In addition to determining the types of stimuli and range of stimulus parameters that are appropriate for probing vision in mice, we sought to confirm that the basic properties of visual processing that have been studied in other species are present in mice. These include orientation and spatial frequency tuning, the presence of both simple and complex response types, and higher-order properties such as contrast-invariant tuning. Furthermore, to relate these functional properties to computation in the cortical circuit, we analyzed responses by their laminar location within cortex and by putative identity as inhibitory versus excitatory neurons based on spike waveform.
Materials and Methods
In vivo physiology.
Recordings were made from adult C57BL/6 mice, 2–6 months of age. The animals were maintained in the animal facility at University of California, San Francisco (UCSF) and used in accordance with protocols approved by the UCSF Institutional Animal Care and Use Committee. For surgery, mice were sedated with an intraperitoneal injection of chlorprothixene (5 mg/kg) and then anesthetized with urethane (0.5–1.0 g/kg, i.p., at 10% w/v in saline). Administration of chlorprothixene several minutes before urethane greatly reduced the dosage of urethane necessary to induce surgical anesthesia. Additionally, atropine (0.3 mg/kg) and dexamethasone (2 mg/kg) were administered subcutaneously to reduce secretions and edema, respectively. The animal was maintained at 37.5°C by a feedback-controlled heating pad. A tracheotomy was performed, and a small glass capillary tube was inserted to maintain a free airway. After retracting the scalp, we performed a small craniotomy, ∼1 mm in diameter, and nicked a slit in the dura to allow insertion of the multisite electrode. The exposed cortical surface was covered with 2.5% agarose in extracellular saline (in mM: 125 NaCl, 5 KCl, 10 glucose, 10 HEPES, and 2 CaCl2, pH 7.4) to prevent drying and provide mechanical support. The electrode was lowered into the brain through the agarose to an appropriate depth and was allowed to settle for 30–45 min before the beginning of recording. The electrode was placed without regard for the presence of visually responsive units, and all units stably isolated over the recording period were included. For superficial recordings, the electrode was often inserted at an angle, to increase the distance between the insertion and recording sites. The eyes were covered with ophthalmic lubricant ointment until recording, at which time the eyes were rinsed with saline and a thin layer of silicone oil (30,000 centistokes) was applied to prevent drying while allowing clear optical transmission.
We did not induce cycloplegia, and the resting pupil diameter was ∼1 mm. We recorded only at sites with receptive fields located at least 20° lateral to the vertical meridian, to avoid confounding effects attributable to the binocular zone of vision.
At the end of recording, the animal was killed by overdose of barbiturates. For histology, the animal was intracardially perfused with 4% paraformaldehyde in PBS, and the brain was sectioned coronally at 100 μm with a vibratome (Lancer). Sections were incubated for several hours in 0.1 μg/ml 4′,6-diamidino-2-phenylindole (DAPI; Sigma-Aldrich) to stain nuclei and imaged on an epifluorescence microscope.
Recordings were made with silicon microprobes from NeuroNexus Technologies. Two configurations were used: a linear probe with 16 sites spaced at 50 μm intervals (model a1x16-3mm50-177), which could be used to span across multiple layers of cortex; and a tetrode configuration, with four tetrode clusters, each consisting of four sites separated by 25 μm on a side (model a2x2-tet-3mm-150-121), which was used primarily to provide better isolation of units in layers 2/3 and 4. Approximately two-thirds of all units were recorded with the tetrode configuration. The shanks of the probes were 15 μm thick and 3 mm long, with a maximum width at the top of the shank of 94 μm (tetrode) or 123 μm (linear). For experiments followed by histology to reconstruct penetrations, the electrode was coated with a small amount of the lipophilic vital dye DiI (Invitrogen). Signals were acquired using a System 3 workstation (Tucker-Davis Technologies) and analyzed with custom software in Matlab (MathWorks). For local field potential (LFP) recording, the extracellular signal was filtered from 1 to 300 Hz and sampled at 1.5 kHz. Current source density (CSD) was computed from the average LFP by taking the discrete second derivative across the electrode sites, at two-site spacing to reduce noise, and interpolated to produce a smooth CSD map.
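The second-derivative step of the CSD estimate can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name `csd_profile`, the sign convention (sinks negative), and the omission of tissue conductivity and of the final interpolation are all assumptions made here.

```python
def csd_profile(lfp, spacing_um=50.0, step=2):
    """Estimate CSD at interior sites from depth-ordered LFP values.

    lfp        : average LFP at one time point, one value per electrode site
    spacing_um : distance between adjacent sites (50 um on the linear probe)
    step       : number of sites between the points of the finite difference
                 (2 = "two-site spacing", which trades resolution for noise)

    Assumptions of this sketch: conductivity is set to 1, and the sign is
    chosen so that current sinks come out negative.
    """
    h = step * spacing_um  # effective spacing of the finite difference
    csd = []
    for i in range(step, len(lfp) - step):
        second_diff = lfp[i - step] - 2.0 * lfp[i] + lfp[i + step]
        csd.append(-second_diff / h ** 2)
    return csd
```

With 16 sites and two-site spacing, the estimate is defined only at the 12 interior sites; the two sites at each end of the shank have no neighbor at the required distance.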
For single-unit recording, the extracellular signal was filtered from 0.7 to 7 kHz and sampled at 25 kHz. Spiking events were detected on-line by voltage threshold crossing, and a 1 ms waveform sample was acquired around the time of threshold crossing. To improve isolation of single units, recordings from groups of four neighboring sites were linked, so that a waveform was acquired on all four sites in response to a threshold crossing on any of the four. This procedure was used for both the tetrode and linear configuration electrodes; in the latter case, sites 1–4, 5–8, 9–12, and 13–16 (in order along the shank) were grouped together. In both cases, this “virtual tetrode” acquisition had two primary benefits: improved discriminability when a waveform appeared on more than one site, and common-mode noise rejection of signals shared on all four sites. Whereas the larger amplitude spikes of layer 5 and layer 6 units were sometimes recorded simultaneously on adjacent sites at the 50 μm spacing of the linear electrode, layer 2/3 neurons often appeared predominantly on one site, even at the 25 μm spacing of the tetrode configuration. In both cases, many units had signals on nondominant sites that were below the voltage trigger threshold; however, the simultaneous acquisition allowed this low-amplitude information to be integrated to improve discriminability.
The individual waveform samples were aligned by their most negative time point. To identify single units, the spike waveforms from the four sites together were parameterized by 6–10 independent components using the FastICA package for Matlab (http://www.cis.hut.fi/projects/ica/fastica/) and clustered by a mixture-of-Gaussians model using KlustaKwik (Harris et al., 2000). Using independent components, rather than principal components, allowed us to take advantage of common-mode noise rejection from waveforms simultaneously acquired across multiple sites in the virtual tetrode configuration. Whereas independent components analysis (ICA) has been used previously to filter the continuous data across channels (Snellings et al., 2006), we instead performed ICA on the individual waveforms, to integrate waveform parameterization and noise rejection. Quality of separation was determined based on the Mahalanobis distance and L-ratio (Schmitzer-Torbert et al., 2005) and the presence of a clear refractory period.
Units were then classified as narrow or broad spiking based on properties of their average waveforms, at the electrode site with largest amplitude. Three parameters were used for discrimination (Fig. 1 F,G): the height of the positive peak relative to the initial negative trough, the time from the minimum of the initial trough to maximum of the following peak, and the slope of the waveform 0.5 ms after the initial trough. This third measure provided a proxy for the total duration of the slower positive peak, because our waveform sampling was not of sufficient duration to measure the entire return to baseline. Two linearly separable clusters were found, corresponding to narrow-spiking (putative inhibitory) and broad-spiking (putative excitatory) neurons. These clusters were separated identically by both k-means and linkage clustering. Unit classification was stable and did not change when only two of the three measurements were used. The width of the initial trough, as used previously (Bruno and Simons, 2002), did not give good separation, which may be attributable to filtering by the electrodes or acquisition system because this is the shortest timescale in the waveform, or to its sensitivity to the alignment of individual waveforms in computing the average waveform. Because of the small numbers of narrow-spiking units in each layer, in cases in which no significant difference was seen across the individual layers, we pool together data for narrow-spiking units from all layers into one category for presentation.
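The three waveform measures can be sketched as below, assuming an average waveform sampled at 25 kHz. The function name `waveform_features`, the use of a central difference for the slope, and the choice to express the slope in amplitude units per millisecond are illustrative assumptions; the clustering step itself is not shown.

```python
def waveform_features(wave, fs_hz=25000.0):
    """Return (peak/trough height ratio, trough-to-peak time in ms,
    slope 0.5 ms after the trough) for one average spike waveform."""
    dt_ms = 1000.0 / fs_hz
    # Initial negative trough: the most negative sample.
    i_trough = min(range(len(wave)), key=lambda i: wave[i])
    # Following positive peak: the largest sample at or after the trough.
    i_peak = max(range(i_trough, len(wave)), key=lambda i: wave[i])
    height_ratio = wave[i_peak] / abs(wave[i_trough])
    trough_to_peak_ms = (i_peak - i_trough) * dt_ms
    # Sample index 0.5 ms after the trough (truncated to a whole sample).
    i_late = i_trough + int(0.0005 * fs_hz)
    if i_late + 1 >= len(wave):
        raise ValueError("waveform too short to measure the late slope")
    # Central-difference slope, in amplitude units per ms.
    slope = (wave[i_late + 1] - wave[i_late - 1]) / (2.0 * dt_ms)
    return height_ratio, trough_to_peak_ms, slope
```

A narrow-spiking unit would be expected to show a shorter trough-to-peak time and a more steeply negative late slope than a broad-spiking unit, because its positive peak has already begun to decay by 0.5 ms.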
A challenge involved in our approach of unbiased recording from a number of neurons simultaneously is that stimuli cannot be tailored to individual neurons. For example, it is much faster to measure spatial frequency tuning of a single unit by finding the optimal orientation and then varying the spatial frequency only at this orientation than by presenting all orientations and all spatial frequencies, as in this study. This also limited our ability to vary multiple parameters in a single stimulus set: for example, to measure contrast-invariant tuning, we chose one spatial frequency and then varied orientation and contrast, because the curse of dimensionality makes it impractical to sample finely across all three parameters. Thus, the response was limited to those neurons tuned to the spatial frequency we chose. This will become even more of a limitation in new techniques, such as two-photon calcium imaging, which can sample large populations but have poorer temporal resolution. Stimuli similar to the contrast-modulated noise movies described below, which can provide rapid measures of responsiveness across a widely tuned population, may help to address these challenges.
Stimuli were generated in Matlab using the Psychophysics Toolbox extensions (Brainard, 1997; Pelli, 1997) and displayed with gamma correction on a monitor (Nanao Flexscan, 30 × 40 cm, 60 Hz refresh rate, 32 cd/m2 mean luminance) placed 25 cm from the mouse, subtending ∼60 × 75° of visual space. Episodic stimuli were repeated five to seven times, with stimulus conditions randomly interleaved, and a gray blank condition (mean luminance) was included in all stimulus sets to estimate the spontaneous firing rate.
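The angular extent of the display follows from simple geometry. The sketch below assumes the eye is centered on the monitor and the screen is perpendicular to the line of sight, neither of which is stated explicitly; `visual_angle_deg` is a hypothetical helper name.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Full angle subtended by a flat extent centered on the line of sight."""
    return 2.0 * math.degrees(math.atan((size_cm / 2.0) / distance_cm))

# Monitor dimensions and viewing distance from the text.
height_deg = visual_angle_deg(30.0, 25.0)
width_deg = visual_angle_deg(40.0, 25.0)
```

Under these assumptions the screen spans roughly 62 × 77°, in line with the approximate figure of ∼60 × 75° given above.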
Episodic stimuli included drifting sinusoidal gratings [1.5 s duration, temporal frequency of 2 Hz, 12 directions, spatial frequency of 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, and 0 cycles/° (cpd), i.e., full-field flicker], full-length drifting bars (width of 5°, velocity of 30°/s, 16 directions), drifting short bars (4 × 8°, 30°/s, four directions), contrast-reversing (counter phase) sinusoidal gratings (0.04 cpd, sinusoidally reversing at temporal frequencies 1, 2, 4, and 8 Hz, eight orientations, 2 s duration), and contrast-reversing square checkerboard (0.04 cpd, square-wave reversing at 0.5 Hz). Episodic stimuli were shown at 100% contrast with the background at mean luminance, except in a subset of experiments on contrast-invariant tuning, in which drifting sinusoidal gratings were presented as described above but at fixed spatial frequency of 0.04 cpd and contrast 6.25, 12.5, 25, 50, and 100%.
Gaussian noise movies were created by first generating a random spatiotemporal frequency spectrum in the Fourier domain with defined spectral characteristics. To drive as many simultaneously recorded units as possible, we used a spatial frequency spectrum that dropped off as A(f) ∼ 1/(f + fc), with fc = 0.05 cpd, and a sharp cutoff at 0.12 cpd, to approximately match the stimulus energy to the distribution of spatial frequency preferences. The temporal frequency spectrum was flat with a sharp low-pass cutoff at 4 Hz. This three-dimensional (ωx, ωy, ωt) spectrum was then inverted to generate a spatiotemporal movie. This stimulus is related to the subspace reverse correlation method (Ringach et al., 1997), in that both explicitly restrict the region of frequency space that is sampled. To provide contrast modulation, this movie was multiplied by a sinusoidally varying contrast. Movies were generated at 60 × 60 pixels and then smoothly interpolated to 480 × 480 pixels by the video card to appear at 60 × 60° on the monitor and played at 30 frames per second. Each movie was 5 min long and was repeated two to three times, for 10–15 min total presentation. A brief clip of the contrast-modulated movie is available as supplemental data (available at www.jneurosci.org as supplemental material).
The average spontaneous rate for each unit was calculated by averaging the rate over all blank condition presentations. For drifting gratings, responses at each orientation and spatial frequency were calculated by averaging the spike rate during the 1.5 s presentation and subtracting the spontaneous rate. The preferred orientation was determined by averaging the response across all spatial frequencies and calculating half the complex phase of the value S = Σθ F(θ) e^(2iθ) / Σθ F(θ), where F(θ) is the response at direction θ. The orientation tuning curve was constructed for the spatial frequency that gave peak response at this orientation. Given this fixed preferred orientation θpref, the tuning curve was fitted as the sum of two Gaussians centered on θpref and θpref + π, of different amplitudes A1 and A2 but equal width σ, with a constant baseline B. From this fit, we calculated two metrics: an orientation selectivity index (OSI) representing the ratio of the tuned versus untuned component of the response, and the width of the tuned component. OSI was calculated as the depth of modulation from the preferred orientation to its orthogonal orientation θortho = θpref + π/2, as (Rpref − Rortho)/(Rpref + Rortho). Tuning width was the half-width at half-maximum of the fit above the baseline, Rortho. In addition, direction selectivity was calculated from the fitted function as (Rpref − Ropposite)/(Rpref + Ropposite).
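The preferred-orientation step can be illustrated with a short sketch. It assumes the complex value in question is the response-weighted vector sum S = Σ F(θ) e^(2iθ) / Σ F(θ), which is the quantity whose absolute value equals 1 − circular variance; the function names are hypothetical, and the sketch further assumes the summed response is positive after spontaneous-rate subtraction.

```python
import cmath
import math

def orientation_vector(directions_deg, responses):
    """Response-weighted vector sum in doubled-angle space, so that
    opposite directions of motion (same orientation) add rather than cancel."""
    total = sum(responses)
    weighted = sum(r * cmath.exp(2j * math.radians(d))
                   for d, r in zip(directions_deg, responses))
    return weighted / total

def preferred_orientation_deg(directions_deg, responses):
    """Half the complex phase of S, wrapped to the range [0, 180)."""
    s = orientation_vector(directions_deg, responses)
    return (math.degrees(cmath.phase(s)) / 2.0) % 180.0

# Twelve drift directions, with hypothetical rates peaking at 60 and 240 degrees.
directions = [30.0 * k for k in range(12)]
rates = [1, 2, 10, 2, 1, 0, 1, 2, 8, 2, 1, 0]
pref = preferred_orientation_deg(directions, rates)
global_selectivity = abs(orientation_vector(directions, rates))
```

Doubling the angle is what makes this an orientation rather than a direction measure: gratings drifting at 60° and 240° share an orientation and therefore map to the same point on the unit circle.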
We used these measures of selectivity, rather than the circular variance, because the circular variance, which is a single global measure of the tuning curve, combines aspects of both depth of modulation and tuning width into one value and thus does not give as intuitive a description of the tuning curve (but see Ringach et al., 2002 for a thorough exposition). Furthermore, because it has not yet gained common usage, there is less previous data for comparison. However, measuring orientation selectivity as (1 − circular variance), i.e., the absolute value of S as defined above, gave similar results (supplemental Fig. S2, available at www.jneurosci.org as supplemental material) and should be useful in the future, particularly in cases in which a single metric for orientation selectivity is desirable.
The spatial frequency tuning curve was determined from drifting gratings from the response at orientation θpref described above and was fit to a difference of two Gaussians (Hawken and Parker, 1987). Bandwidth was calculated from the fit as the ratio of the spatial frequencies that gave half-maximal response. Units were classified as having low-pass spatial frequency tuning if the response to 0 cpd, the full-field flicker, was at least 50% of the peak response.
Linearity of response was calculated from drifting gratings, at the orientation and spatial frequency that gave peak response. First, we binned the 1.5 s presentation into a spike histogram at 100 ms intervals and subtracted the spontaneous rate. We then applied the discrete Fourier transform and computed F 1/F 0, the ratio of the first harmonic (response at the drift frequency) to the 0th harmonic (mean response).
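A minimal sketch of the F1/F0 computation follows. With 100 ms bins over a 1.5 s presentation there are 15 bins, and a 2 Hz drift completes three cycles, so the first harmonic falls on the third Fourier component. The factor of 2 that converts the Fourier coefficient into a sinusoid amplitude is a common convention assumed here, not one stated in the text, and `f1_f0` is a hypothetical name.

```python
import cmath
import math

def f1_f0(rates, n_cycles):
    """Ratio of modulation amplitude at the drift frequency (F1) to the
    mean response (F0).

    rates    : spontaneous-subtracted firing rate in each time bin
    n_cycles : number of stimulus cycles spanned by the histogram
    Assumes the mean response F0 is positive.
    """
    n = len(rates)
    f0 = sum(rates) / n
    coeff = sum(r * cmath.exp(-2j * math.pi * n_cycles * i / n)
                for i, r in enumerate(rates))
    f1 = 2.0 * abs(coeff) / n  # amplitude of the sinusoidal component
    return f1 / f0

# Hypothetical fully modulated response: 15 bins, 3 cycles, mean 10 spikes/s.
modulated = [10.0 + 10.0 * math.cos(2.0 * math.pi * 3 * i / 15) for i in range(15)]
ratio = f1_f0(modulated, n_cycles=3)
```

Under this convention a response that is sinusoidally modulated down to zero gives a ratio of 1, and an unmodulated response gives a ratio of 0.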
Responses to drifting full-field bars were analyzed by computing peristimulus time histograms with bin size 100 ms. The spontaneous rate was subtracted, and the peak firing rate for each orientation was used to generate an orientation tuning curve, which was fit to the sum of two Gaussians and analyzed as described above.
Receptive field size was calculated from 4 × 8° light bars. Bars were swept across the visual field at eight locations along the axis perpendicular to the direction of motion, e.g., for horizontally moving bars, each presentation swept a bar across at a different vertical position. The responses from the eight sweeps were binned at 100 ms and used to construct firing rate as a function of bar position. This was fitted with a two-dimensional Gaussian, with independent widths σ x and σ y , and RF radius was calculated by averaging the half-width at half-maximum of the two axes of the Gaussian fit (equivalent to the semi-major and semi-minor axes of the ellipse generated by the half-maximum contour). This process was repeated for four different directions and averaged across all conditions that gave a sufficient response. The bar length, 8°, was chosen to elicit strong responses from as many units as possible. However, it puts a lower limit on our measurement of RF size, ∼4°, so we may have overestimated the size of the smallest receptive fields.
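The conversion from the fitted Gaussian widths to an RF radius can be written out explicitly. For a Gaussian exp(−x²/2σ²), the half-width at half-maximum is σ√(2 ln 2); the sketch below assumes σx and σy are the standard deviations of the fit in degrees, and the helper names are illustrative.

```python
import math

def hwhm(sigma):
    """Half-width at half-maximum of a Gaussian with standard deviation sigma."""
    return sigma * math.sqrt(2.0 * math.log(2.0))

def rf_radius(sigma_x, sigma_y):
    """RF radius as the mean of the half-widths along the two fitted axes."""
    return 0.5 * (hwhm(sigma_x) + hwhm(sigma_y))
```

For example, fitted widths of σx = 5° and σy = 7° would give a radius of about 7°.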
Temporal frequency was calculated from the response to contrast-reversing sinusoidal gratings at a fixed spatial frequency of 0.04 cpd. The average firing rate over each 2 s presentation was calculated, and the spontaneous rate was subtracted. For the orientation that gave the largest total response, a temporal frequency tuning curve was fit using a difference of two Gaussians.
To analyze the response to contrast-modulated white noise movies, we binned the number of spikes in response to each frame of the movie. A responsiveness metric was calculated by taking the discrete Fourier transform at the modulation frequency, normalized by the average firing rate. Additionally, the spike-triggered average (STA) of contrast-modulated movie responses was computed by the mean of the frames preceding each spike. Because we used a 1/f power spectrum for the stimulus set, the raw STA is broadened by the correlations in the stimulus set. However, because the stimulus is Gaussian and therefore only contains second-order correlations, we were able to correct the STA exactly by normalizing its Fourier transform by the power spectrum of the stimulus set (Theunissen et al., 2001; Sharpee et al., 2004). In general, the strongest STAs preceded spikes by a lag of two movie frames (66 ms). Preferred orientation and spatial frequency were calculated by finding the peak of the spatial Fourier transform of the STA at 66 ms lag. STA receptive fields were also fit to Gabor functions (a sinusoid with a Gaussian envelope) expressed in rotated and translated coordinates x′ and y′, defined by x′ = cosθ(x − xc) + sinθ(y − yc) and y′ = −sinθ(x − xc) + cosθ(y − yc). Fits were performed in Matlab, based on routines provided by Michael Lewicki and colleagues.
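As an illustration of the Gabor model, the sketch below evaluates one common parameterization: a cosine carrier along the rotated x axis under a Gaussian envelope with independent widths. The exact functional form and parameter names used for the fits are not given here, so `gabor` and all of its arguments should be read as assumptions.

```python
import math

def gabor(x, y, amp, xc, yc, theta, sigma_x, sigma_y, freq, phase):
    """Evaluate a Gabor function at (x, y).

    The coordinates are first translated to the RF center (xc, yc) and
    rotated by theta (radians); freq is in cycles per unit of x.
    """
    xr = math.cos(theta) * (x - xc) + math.sin(theta) * (y - yc)
    yr = -math.sin(theta) * (x - xc) + math.cos(theta) * (y - yc)
    envelope = math.exp(-xr ** 2 / (2.0 * sigma_x ** 2)
                        - yr ** 2 / (2.0 * sigma_y ** 2))
    return amp * envelope * math.cos(2.0 * math.pi * freq * xr + phase)
```

The rotation must use a sine term with opposite signs in the two coordinate equations; otherwise the transformation does not preserve distances and the fitted envelope widths lose their meaning.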
Contrast–response curves were generated from the response to drifting gratings with spontaneous rate subtracted, for the preferred orientation determined as above. Because stimuli were presented at a preselected spatial frequency (0.04 cpd) rather than the preferred spatial frequency of each unit, the peak firing rate is not indicative of the maximal responsiveness of the unit, and the curves were thus normalized to maximum response of 1. Normalized curves were fit to the Naka–Rushton equation (Naka and Rushton, 1966): R(C) = g/(1 + (C50/C)^b), where C is contrast, g is the gain, C50 is the midsaturation contrast, and b is a fitting exponent that describes the shape of the curve.
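The contrast–response model can be sketched as below, written in the saturating form R(C) = g/(1 + (C50/C)^b) so that the response rises with contrast and reaches half of g at C = C50. The function name and the treatment of zero contrast are assumptions of this sketch, and the parameter values in the example are hypothetical rather than fitted.

```python
def naka_rushton(contrast, gain, c50, exponent):
    """Naka-Rushton contrast response: gain / (1 + (c50 / contrast) ** exponent).

    Zero contrast is handled separately to avoid division by zero; the
    response there is taken to be zero (the spontaneous rate is subtracted).
    """
    if contrast <= 0.0:
        return 0.0
    return gain / (1.0 + (c50 / contrast) ** exponent)

# Contrasts used in the contrast-invariance experiments (percent).
contrasts = [6.25, 12.5, 25.0, 50.0, 100.0]
curve = [naka_rushton(c, gain=1.0, c50=25.0, exponent=2.0) for c in contrasts]
```

A larger exponent makes the transition around C50 steeper, whereas a smaller C50 shifts the whole curve toward lower contrasts.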
Statistical significance was determined by Mann–Whitney U test, except when otherwise stated. In the figures, *p < 0.05, **p < 0.01, and ***p < 0.001. For figures representing the median of data, error bars show standard error of the median as calculated by a bootstrap. In other cases, error bars represent SEM.
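A bootstrap estimate of the standard error of the median can be sketched as follows. The number of resamples, the fixed seed, and the function name are choices made for this illustration; the text does not specify them.

```python
import random
import statistics

def bootstrap_se_median(values, n_boot=2000, seed=0):
    """Standard error of the median, estimated as the standard deviation of
    medians across resamples drawn with replacement from the data."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    n = len(values)
    medians = [statistics.median(rng.choices(values, k=n))
               for _ in range(n_boot)]
    return statistics.stdev(medians)
```

The same resampling scheme applies to any statistic without a simple closed-form standard error, which is why it suits medians of skewed distributions such as tuning widths.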
Laminar and cell-type identity
We recorded 235 isolated single units from 27 adult mice. Individual recording sessions consisted of simultaneous recordings of ∼4–12 isolated single units across the 16 sites of the multielectrode array. Of these units, 87% (204 of 235) were responsive to at least one episodic visual stimulus. Nonresponsive units were not included in analysis of tuning properties. We generally performed only one electrode insertion per animal to avoid damage to cortex from multiple penetrations and to maintain a stable anesthetic state without needing to redose.
To verify the laminar location of each recording site of the multielectrode array, in addition to noting electrode depth, we measured the LFP response to square-wave contrast reversal of a checkerboard. As shown in Figure 1 B, the averaged responses show a large deflection starting ∼40 ms after the contrast reversal, which varies in both amplitude and waveform across the recording sites. CSD analysis provides a method to transform the set of LFP recordings into the locations of current sources and sinks, with current sinks generally corresponding to sites of synaptic conductances (Mitzdorf, 1985; Swadlow et al., 2002). This transformation revealed a laminar distribution of activation in response to checkerboard reversal (Fig. 1 C), with a current sink beginning in layer 4, in which sensory input first arrives, spreading up to layer 2/3, and finally a weak sustained sink in layer 5. Retrospective histology (supplemental Fig. S1, available at www.jneurosci.org as supplemental material) confirmed the correspondence between CSD and laminar identity. Furthermore, layer 4 corresponded to the maximum negativity of the initial dip in the LFP (Fig. 1 B,C) (supplemental Fig. S1, available at www.jneurosci.org as supplemental material), which allowed us to deduce the laminar location even for recordings that did not span the entire depth of cortex and therefore could not generate a full CSD map of the layers. As shown below, this assignment of layers correlates with different properties of the visual response. Furthermore, during recording, it was generally possible to recognize these laminar positions qualitatively by the nature of ongoing activity: layer 2/3 has very sparse background activity; in layer 4, the spontaneous rate increases and more units are present on each recording site; layer 5 has dramatically higher spontaneous rates and much larger spike amplitudes; and, in layer 6, spontaneous rates and spike amplitudes decrease and evoked responses are more sparse.
Interestingly, we often saw a fast oscillation, at ∼50–55 Hz, superimposed on the LFP and CSD (Fig. 1 B, arrows). This oscillation was stimulus evoked, with coherent activity commencing with the onset of the evoked LFP. Furthermore, its frequency spectrum was distinct from typical 60 Hz noise and did not change as we varied the refresh rate of the display. We thus presume that this corresponds to gamma oscillations that have been described in many species (Buzsáki and Draguhn, 2004), including mice (Nase et al., 2003). However, we did not investigate this oscillation further.
As an additional means of correlating visual responses with cortical circuit elements, we classified units based on their spike waveform. Figure 1 D shows the average waveforms from all the units recorded, except for four units that had unusual waveforms and were excluded from classification. The waveforms can be seen to fall into two general classes, narrow spiking (blue) and broad spiking (green), with averages across the two classes shown in Figure 1 E. Previous results have suggested that narrow-spiking cells correspond to inhibitory, predominantly fast-spiking, interneurons (McCormick et al., 1985; Connors and Kriegstein, 1986; Barthó et al., 2004), and recent studies have used this extracellular signature as a means of distinguishing excitatory and inhibitory neurons (Swadlow, 2003; Andermann et al., 2004; Lee et al., 2007; Mitchell et al., 2007; Atencio and Schreiner, 2008). A variety of waveform parameters have been used to separate these two classes, including trough-to-peak time (Barthó et al., 2004; Mitchell et al., 2007), the ratio of trough-to-peak amplitude (Andermann et al., 2004; Hasenstaub et al., 2005), and the duration of the longer, positive peak (Bruno and Simons, 2002; Atencio and Schreiner, 2008). Because our acquisition did not record the entire duration of the positive peak, we used the slope 0.5 ms after the initial trough (i.e., how quickly the voltage is returning to baseline) as a proxy for this parameter. These three parameters provided good separability and distinguished the same two groups of waveforms, as shown in Figure 1, F and G. Nineteen percent of the total population of recorded cells (45 units) fell into the narrow-spiking category, and there was not a significantly greater fraction of narrow-spiking units in any particular layer (p > 0.1, χ2 test). This proportion matches the fraction of GAD-positive neurons previously identified histochemically (Tamamaki et al., 2003). 
For simplicity, we refer to these two groups as “putative inhibitory” and “putative excitatory,” but this should not imply that there is a perfect correspondence between the spike waveform and cell type assignment (see Discussion).
Selective and nonselective responses
A characteristic of visual cortex responses is the transformation from primarily center-surround, nonoriented receptive fields observed in retina and lateral geniculate nucleus of thalamus (LGN), to responses in V1 that are optimal for bars and edges of a particular orientation. To measure the degree of orientation and other response selectivity in mouse visual cortex, we presented full-length 5° wide bars of varying orientations drifting at 30°/s and drifting sinusoidal gratings moving at 2 Hz of varying orientation and spatial frequency. We found a range of response types, including many units that were highly selective for stimulus orientation, other units that responded to a wide range of stimuli, and a small fraction of units that did not significantly change their firing in response to visual stimuli. Examples of a highly selective and a poorly selective response are shown in Figures 2 and 3, respectively.
The unit shown in Figure 2 was a broad-spiking, putative excitatory neuron located in layer 4. Figure 2 A shows spike rasters for repeated presentations of a bar drifting in 16 different directions. It gives a strong response to bars over a narrow range of orientations and has a short duration of peak response, corresponding to <10° of visual space, as the bar crosses its receptive field. The tuning curve shown in Figure 2 B, measuring peak response for each orientation, demonstrates that the orientation tuning width is less than ±30° and that it has ∼2:1 preference for one direction of motion. Figure 2 C shows spike rasters from the same unit in response to full-field sinusoidal drifting gratings, which demonstrate a similar orientation tuning and selectivity (Fig. 2 D). Furthermore, it responds to gratings only over a particular range of spatial frequencies (Fig. 2 E), with maximal response ∼0.08 cpd, corresponding to grating bars spaced 12° apart. Finally, the unit responds to the drifting grating with a periodically modulated response. This is characteristic of a linear, or simple cell, response, such that the unit responds maximally when an oriented stimulus is in phase, i.e., aligned, with ON and OFF subregions of its receptive field, and responds minimally or reduces its response when the stimulus is out of phase with the receptive field. The amplitude of this modulation relative to the average firing rate, the F 1/F 0 ratio, has become a standard quantitative metric for the qualitatively defined simple and complex cell distinction (Skottun et al., 1991).
Figure 3 shows a much less selective response, from a narrow-spiking putative inhibitory unit also in layer 4. This unit responds to bars of all orientations (Fig. 3 A), with only a slightly elevated response to certain orientations (Fig. 3 B). It also responds over a much larger region of visual space, up to 40° across. The response to drifting gratings (Fig. 3 C) is similarly untuned for orientation (Fig. 3 D), and it extends over a wider range of spatial frequencies (Fig. 3 E), even giving a small response to 0 cpd, i.e., a full-field flicker. This unit also shows a relatively constant rate of firing as the grating drifts rather than the periodically modulated response seen in Figure 2 C. This would thus be classified as a nonlinear, or phase-invariant, response, because its response does not depend on the precise alignment of the grating within the receptive field.
Response properties across the population
The responses of these two units are representative of the two ends of the selectivity spectrum. We used two measures to describe the selectivity of the orientation tuning curves in response to drifting gratings. First, the OSI measures the relative response at the preferred orientation versus the orthogonal orientation, in other words, the tuned versus untuned response. Figure 4 A shows the OSI across the entire population of units that responded to drifting gratings (n = 182). The OSI was defined as (R pref − R ortho)/(R pref + R ortho), where R pref is the peak response at the preferred orientation, and R ortho is the response at the orthogonal orientation. By this criterion, perfect orientation selectivity would give an OSI of 1, an equal response to all orientations would give OSI = 0, and 3:1 selectivity corresponds to OSI = 0.5. A large fraction of units thus show orientation selectivity, with 74% (135 of 182) having OSI > 0.5, and many showing nearly complete selectivity, such as the unit in Figure 2. Figure 4 B shows the preferred angle for units that were orientation selective and responded strongly to both gratings and bars (n = 87), demonstrating that this tuning was consistent between stimuli (r 2 = 0.85), and showed little systematic preference for certain orientations. For orientation-selective units (OSI > 0.5), we measured the width of the tuned response, as the half-width at half-maximum above the baseline. Figure 4 C shows that most of the oriented units show relatively narrow tuning, similar to that seen in the unit of Figure 2, with median tuning width ±28.0°. It should be noted that some studies have measured tuning width as half-width at half-maximum including the untuned response rather than above the untuned baseline. Because this predominantly affects the poorly selective units, including the untuned baseline only shifted the median width of units with OSI > 0.5 to 28.9°.
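The OSI definition above can be sketched in a few lines. This is a minimal illustration, not the analysis code used in the study: the function names `osi` and `osi_from_tuning` are hypothetical, the responses are assumed to be nonnegative (for example, spontaneous rate already subtracted), and the orthogonal response is taken from the sampled orientation closest to 90° from the peak.

```python
def osi(r_pref, r_ortho):
    """Orientation selectivity index: (R_pref - R_ortho) / (R_pref + R_ortho)."""
    total = r_pref + r_ortho
    if total <= 0:
        raise ValueError("summed response must be positive")
    return (r_pref - r_ortho) / total


def osi_from_tuning(orientations_deg, rates):
    """OSI from a sampled tuning curve (orientations in degrees, period 180).

    The preferred orientation is the sample with the largest response; the
    orthogonal response is read from the sample nearest to preferred + 90.
    """
    peak = max(range(len(rates)), key=lambda i: rates[i])
    target = (orientations_deg[peak] + 90.0) % 180.0

    def circular_distance(angle):
        d = abs(angle - target) % 180.0
        return min(d, 180.0 - d)

    ortho = min(range(len(rates)),
                key=lambda i: circular_distance(orientations_deg[i]))
    return osi(rates[peak], rates[ortho])
```

With these definitions, a 3:1 ratio of preferred to orthogonal response gives an index of 0.5, the threshold used in the text to call a unit orientation selective.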
The degree of orientation selectivity varied dramatically across layers and between putative excitatory and inhibitory units, as shown in Figure 4 D, a scatter plot of orientation selectivity for all units that responded to drifting gratings, and Figure 4 E, which shows the mean orientation selectivity for each of these groups. In layers 2/3, 4, and 6, there is a sharp distinction between cell types, because almost all putative excitatory units were highly orientation selective, whereas most putative inhibitory units were untuned. Although there was not a great difference in the orientation selectivity index (tuned vs untuned response) between excitatory units across these layers, we did observe a slight sharpening in the width of orientation tuning in upper layer 2/3, as shown in Figure 4 F. In layer 5, putative excitatory units showed a range of orientation selectivity, but, overall, the selectivity was much less (p < 0.001) and very few showed the high orientation selectivity that was typical of the more superficial layers. Furthermore, even among orientation-selective units, the width of tuning in layer 5 was much broader (p < 0.001), demonstrating that both the width of the tuned response and the magnitude of the untuned response are increased. A second measure of orientation selectivity, based on the circular variance of response across orientations, gave similar results for both individual units and population comparisons (supplemental Fig. S2, available at www.jneurosci.org as supplemental material).
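For readers unfamiliar with the second measure, the sketch below shows one common way to compute a circular-variance index from an orientation tuning curve. The exact formula used for supplemental Figure S2 is not given in this section, so the form here (one minus the normalized length of the response-weighted resultant at twice the orientation angle) is an assumption, and `circular_variance` is a hypothetical name.

```python
import cmath
import math


def circular_variance(orientations_deg, rates):
    """Circular variance of an orientation tuning curve.

    Angles are doubled so that orientations 180 degrees apart coincide.
    Returns 0 for a response at a single orientation and approaches 1 for
    equal responses at evenly spaced orientations.
    """
    total = sum(rates)
    if total <= 0:
        raise ValueError("summed response must be positive")
    resultant = sum(r * cmath.exp(2j * math.radians(theta))
                    for theta, r in zip(orientations_deg, rates))
    return 1.0 - abs(resultant) / total
```

Unlike the OSI, which uses only two points of the tuning curve, this index uses every sampled orientation, so it is sensitive to both tuning width and the untuned response.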
Although many cells showed preferential responses for a particular orientation, it was less common for them to be selective for the direction of motion at that orientation. Figure 5 A shows the response to drifting bars for one such direction-selective unit, which responded strongly to bars moving at 202.5° but negligibly to bars moving in the opposite direction, 22.5° (which corresponds to the same bar orientation). Figure 5 B shows the prevalence of direction selectivity in response to drifting gratings across the population, demonstrating that, although a small fraction show quite strong selectivity, most (77%, 140 of 182) are substantially less selective, with indices <0.5, which corresponds to a 3:1 bias for preferred versus opposite direction. As Figure 5 C shows, almost all direction-selective units were putative excitatory units in layers 2/3 and 4.
As shown in Figures 2 and 3, units in mouse V1 responded preferentially to particular spatial frequencies of sinusoidal grating. Figure 6 A presents a histogram of the spatial frequency that elicited the maximum response at the optimal orientation, which generally ranged from 0.02 to 0.08 cpd (median of 0.036 cpd), although several units responded optimally at spatial frequencies up to 0.32 cpd, the highest tested. A small fraction responded best to 0 cpd, full-field contrast reversal. The preferred spatial frequency did not show any systematic variation across layers (Fig. 6 B), except for layer 6, which responded to significantly lower spatial frequencies, as did putative inhibitory units. The preferred spatial frequency did not show any significant variation with receptive field eccentricity, which ranged from 20 to 90° in azimuth (r 2 = 0.009; p = 0.37), consistent with the lack of a fovea/area centralis, the relatively constant photoreceptor and retinal ganglion cell spacing across the mouse retina (Jeon et al., 1998), and the uniform magnification of the cortical representation (Kalatsky and Stryker, 2003). Most neurons showed bandpass spatial frequency tuning (Fig. 6 C). The median bandwidth of spatial frequency tuning for bandpass units was 2.46 octaves, although some neurons responded to only one of the spatial frequencies presented, and a small fraction showed low-pass tuning (Fig. 6 C). The width of spatial frequency tuning was greater for both putative inhibitory and layer 5 putative excitatory units (Fig. 6 D).
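To make the notion of bandwidth in octaves concrete, the following sketch estimates the full width at half maximum of a sampled bandpass tuning curve on a log2 frequency axis. The half-maximum convention, the linear interpolation between samples, and the name `bandwidth_octaves` are assumptions for illustration; the procedure used in the study is not described in this section.

```python
import math


def bandwidth_octaves(sfs_cpd, rates):
    """Full width at half maximum of a tuning curve, in octaves.

    sfs_cpd must be positive and increasing (a 0 cpd condition is excluded).
    Returns None when the response does not fall to half maximum on both
    sides of the peak, as for low-pass tuning.
    """
    peak = max(range(len(rates)), key=lambda i: rates[i])
    if rates[peak] <= 0:
        return None
    half = rates[peak] / 2.0
    logs = [math.log2(f) for f in sfs_cpd]

    def crossing(step):
        i = peak
        while 0 <= i + step < len(rates):
            j = i + step
            if rates[j] <= half:
                # Interpolate linearly between samples i and j on the log2 axis.
                frac = (rates[i] - half) / (rates[i] - rates[j])
                return logs[i] + frac * (logs[j] - logs[i])
            i = j
        return None

    low, high = crossing(-1), crossing(+1)
    if low is None or high is None:
        return None
    return high - low
```

A unit whose response falls to half maximum at 0.02 and 0.08 cpd spans a factor of 4 in frequency, which is 2 octaves.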
A common classification of visual responses is into simple cells, whose response to stimuli can be predicted by the linear sum of responses to spots at individual locations, and complex cells, which demonstrate nonlinear spatial summation. Correspondingly, simple cells are phase dependent, in that they generally respond to an edge of particular orientation at a specific location, whereas complex cells are phase invariant, in that they respond to the edge at any location within the receptive field. This distinction may be measured using the modulation of response to drifting gratings (Skottun et al., 1991). A high F 1/F 0 ratio indicates that the firing of the unit is modulated at the temporal frequency of the grating, whereas a low F 1/F 0 indicates that the unit fires relatively constantly throughout the presentation of the grating. The units in Figures 2 and 3 had F 1/F 0 of 1.43 and 0.36, respectively. Figure 7 A shows the bimodal distribution of the F 1/F 0 ratio for all units that responded to gratings (n = 182), emphasizing the distinction of linear and nonlinear units by this metric. Most putative inhibitory units showed nonlinear responses, as did layer 5 broad-spiking units (Fig. 7 B,C). In layers 2/3, 4, and 6, the majority of putative excitatory units were linear, although a sizeable fraction were nonlinear.
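The F1/F0 ratio can be illustrated with a short calculation on a binned firing-rate trace. This is a sketch under stated assumptions: the trace is assumed to span a whole number of stimulus cycles, F0 is taken as the mean rate, F1 as the amplitude of the Fourier component at the grating temporal frequency, and the name `f1_f0` is hypothetical.

```python
import cmath
import math


def f1_f0(rates, dt, freq_hz):
    """Ratio of the response amplitude at freq_hz (F1) to the mean rate (F0).

    rates: firing rate samples, one every dt seconds, covering an integer
    number of stimulus cycles.
    """
    n = len(rates)
    f0 = sum(rates) / n
    if f0 <= 0:
        raise ValueError("mean rate must be positive")
    component = sum(r * cmath.exp(-2j * math.pi * freq_hz * k * dt)
                    for k, r in enumerate(rates)) / n
    return 2.0 * abs(component) / f0
```

Under this definition a constant firing rate gives 0, a sinusoid riding on an equal baseline gives 1, and a half-wave rectified sinusoid (firing during only half of each cycle) gives approximately π/2, which is why values above 1 are taken to indicate linear, simple-cell-like responses.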
The size of the receptive fields was measured by sweeping short bars (4° wide × 8° long) of different orientations across the visual field. Figure 8 A shows the radius of the receptive field (half-width at half-maximum) for 108 units that gave sufficiently strong response to this stimulus. The radius of the receptive field was smallest among layer 2/3 and 4 broad-spiking units and was generally ∼5–7° (Fig. 8 B). Putative inhibitory and layer 6 units tended to be somewhat larger, and layer 5 units were nearly twice as large (p < 0.001), with some receptive fields up to 15° radius by this measure.
The spontaneous firing rate also varied dramatically across layers and cell type (Fig. 8 C,D). Putative excitatory units in the superficial layers had extremely low spontaneous rates, often firing less than once every 10 s, whereas layer 4 broad-spiking units were almost twice as active (p < 0.05). Putative inhibitory units had a wide range of spontaneous rates but tended to fire at rates almost an order of magnitude higher. Nearly all units in layer 5 had high spontaneous rates: the dramatic increase in firing rate generally provided a clear indication of the layer 5 boundary. It is interesting to note that cell types with high spontaneous rate (layer 5 and putative inhibitory) have the lowest stimulus selectivity.
The magnitude of the evoked firing rate did not vary as dramatically across layers as the spontaneous rate. Figure 8 E shows the average firing rate over 1.5 s in response to the optimal grating stimulus (with spontaneous rate subtracted), for all units (n = 204) that responded to at least one of the episodic stimuli, even if they were not considered responsive to the gratings. Figure 8 F demonstrates that the evoked rate is relatively constant across layers, although greater in putative inhibitory neurons. The median evoked firing rate across the population was 6.7 spikes/s, which is lower than that reported in studies of V1 in other species (Girman et al., 1999; Heimel et al., 2005) but higher than or consistent with that seen in other regions of rodent cortex (Brecht et al., 2003; DeWeese et al., 2003; Sato et al., 2007). It is possible that anesthesia level contributed to these lower rates or that our method of isolating units without regard to responsiveness may allow us to detect units with lower firing rates. It should also be noted that these firing rates are averages over 1.5 s. The peak instantaneous firing rate can be significantly higher (supplemental Fig. S3, available at www.jneurosci.org as supplemental material), particularly for simple cells, which modulate their firing rate periodically in response to gratings. Additional details of response magnitude, spontaneous rate, and variability are presented in supplemental Figure S4 (available at www.jneurosci.org as supplemental material).
We measured temporal frequency response by presenting contrast-reversing sinusoidal gratings at a fixed spatial frequency (0.04 cpd) and varying temporal frequency (Fig. 8 G). Most units responded optimally at ∼2 Hz (median of 1.68 Hz) although we saw a significant increase in peak temporal frequency in layer 4. This is shown in Figure 8 H, which presents the average temporal frequency tuning curves for layers 2/3 and 4, demonstrating an increased response by layer 4 units to stimuli at 4 and 8 Hz (p < 0.05).
The relative proportion of different functional response categories across the layers and cell types is illustrated in Figure 9, using OSI = 0.5 and F 1/F 0 = 1 as thresholds for orientation selectivity and linearity. A binary classification, which depends on an arbitrary threshold, is difficult when the parameter is not bimodally distributed. Even when a parameter does show a bimodal distribution, such as the F 1/F 0 ratio, this may not represent a true distinction in either taxonomy or mechanism (Mechler and Ringach, 2002). Our categorization is thus meant only to provide a summary overview of response types for comparison with other studies.
Across the population, 13% of units were not responsive to any of the stimuli we used, and 9% were left unclassified because they failed to give sufficient response to drifting gratings to estimate orientation selectivity and linearity. In layers 2/3 and 4, broad-spiking putative excitatory units were nearly always simple and orientation selective, with a fraction of nonlinear oriented units and a small number of simple nonoriented units in layer 4. Layer 5, in contrast, showed mostly nonlinear responses, with a large fraction of classic complex oriented units and a smaller number of nonlinear nonoriented units. Layer 6 was similar to layers 2/3 and 4, although with a greater number of nonlinear oriented units and the highest proportion of nonresponsive units. Finally, three-quarters of putative inhibitory units were nonlinear, of which the vast majority were nonoriented.
Responses to noise movies
To measure responses to more complex, yet still well parameterized, stimuli, we generated movies of stochastic noise with defined spatial and temporal frequency spectra (Fig. 10 A) (supplemental movie, available at www.jneurosci.org as supplemental material). We used these movies to probe the overall visual responsiveness of units by periodically modulating the contrast so that each movie transitioned sinusoidally from a gray background to full contrast movie and back to gray again, with a 10 s period. This generally resulted in a periodic modulation of firing, as demonstrated in Figure 10 B. Modulating the contrast also served to maintain high firing rates throughout the presentation, because units often habituated and firing rates decreased during long movies without varying contrast.
By measuring the ratio of the response amplitude at the frequency of the movie modulation to the average firing rate throughout the movie (Fig. 10 C), we obtained a measure of overall visual drive. This value is 0 if the firing rate is constant during the movie (no response) and 1 for a perfect sinusoidal modulation with no baseline firing. It should be noted that, because this is measuring the net firing in response to a rich stimulus, it thereby combines both peak responsiveness and broadness of tuning in a single metric. Figure 10 D demonstrates that most units were responsive to this visual stimulus, including many of the units (55%, 17 of 31) for which we could not elicit a response and measure receptive field properties using the episodic stimuli described previously. The phase of the Fourier component at the contrast-modulation frequency describes the time of optimal response, which was generally slightly before the contrast maximum at 180°. Interestingly, a few units showed the opposite response and actually decreased their firing rate in response to the movies, as indicated by a phase near 0°, opposite to the rest of the population. Figure 10 E shows the responsiveness by layer and cell type, demonstrating that layer 5 and 6 units tended to modulate their firing rate less in response to the white noise movies (p < 0.001), whereas upper layer 2/3 was the most responsive (p < 0.01). Modulating the contrast in this manner and plotting the averaged response over all cycles also permitted a form of contrast–response curve, as well as a rough measure of contrast adaptation over periods <10 s (supplemental Fig. S5, available at www.jneurosci.org as supplemental material).
Noise movies have been used extensively to measure linear spatiotemporal receptive field structure directly by calculating the spike-triggered average (Jones and Palmer, 1987a; Chichilnisky, 2001). For each spike that a neuron fires, we collected the frame that preceded the spike by a time τ. The average of all these frames is known as the STA and, for a Gaussian white noise stimulus, represents the linear kernel of the spatiotemporal response of a neuron. In this case, we corrected the STA to account for the non-white Gaussian spectrum of the stimuli, via normalization by the power spectrum of the stimulus set (Theunissen et al., 2001; Sharpee et al., 2004).
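The core of the spike-triggered average is a simple accumulation, sketched below in plain Python. This is an illustration only: the function name and the data layout (flattened frames, spikes given as frame indices, lag in frames) are assumptions, and the correction for the non-white stimulus spectrum described above is deliberately omitted.

```python
def spike_triggered_average(frames, spike_frames, lag):
    """Average the stimulus frame that preceded each spike by `lag` frames.

    frames: list of equal-length flattened frames (lists of floats).
    spike_frames: frame index at which each spike occurred.
    lag: delay tau, expressed in frames.
    Spikes without a valid preceding frame are skipped. No whitening by the
    stimulus power spectrum is applied here.
    """
    n_pixels = len(frames[0])
    accumulator = [0.0] * n_pixels
    count = 0
    for t in spike_frames:
        source = t - lag
        if 0 <= source < len(frames):
            for p, value in enumerate(frames[source]):
                accumulator[p] += value
            count += 1
    if count == 0:
        raise ValueError("no spikes with a valid preceding frame")
    return [a / count for a in accumulator]
```

Repeating the calculation over a range of lags yields the spatiotemporal kernel; for spectrally shaped stimuli such as those used here, the raw average would additionally need to be normalized by the stimulus power spectrum before it can be read as a receptive field.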
For units that responded linearly to drifting gratings, we were generally able to recover a STA receptive field from responses to the contrast-modulated noise movies. Approximately two-thirds (64%, 63 of 98) of simple cells produced a receptive field by STA for a 10–15 min movie presentation. In general, the primary limitation in generating an STA receptive field for linear units was the number of spikes observed; for simple cells that generated at least 400 spikes, >90% produced an STA with sufficient signal-to-noise, suggesting that longer movie presentations could reveal an STA for more units. Note that, because the stimulus set was spatially frequency limited, the noise in the STA receptive fields is also frequency limited, leading to the beaded or rippled appearance of the residual noise in the estimated RFs. Monte Carlo simulations of Gabor receptive fields with our movie stimuli show similar rippled appearance at low numbers of spikes, which average out for longer movie presentations or higher spike count. For nonlinear units, with a low F 1/F 0 ratio, the STA generally showed no structure, corresponding to the lack of a linear kernel. Because of the relatively small number of spikes collected during these movie presentations, we did not attempt to use spike-triggered covariance or other methods to extract nonlinear receptive fields.
Figure 11 A–C shows examples of STA receptive fields, including the unit (Fig. 11 B1) whose response to gratings and bars was illustrated in Figure 2. Many RFs had both an ON and OFF subregion (55%), or often three subregions (27%), whereas 18% of units only had one subregion (Fig. 11 A–C, respectively). The orientation of the STA receptive fields generally agreed quite well with the preferred orientation determined from drifting gratings (Fig. 11 D) (r 2 = 0.91). The correspondence between the spatial frequency of the STA and the peak response to gratings was also significant (Fig. 11 E) (r 2 = 0.63) but only for units that had a peak spatial frequency > cpd as measured by drifting gratings (blue circles). It is not surprising that the units with the lowest spatial frequencies (< cpd; gray circles) gave STAs with inaccurate spatial frequencies, because the total size of the movie was 60° (corresponding to cpd), preventing us from matching lower spatial frequencies than this. Interestingly, the more subunits found in the STA, the narrower the orientation tuning width as determined from gratings (Fig. 11 F), consistent with linear models of orientation selectivity. Similarly, spatial frequency bandwidth narrowed with increasing number of subunits (Fig. 11 G).
To further analyze the structure of STA receptive fields, we fit the measured RFs to Gabor functions, which are sinusoids multiplied by a Gaussian envelope. Figure 11 H shows the fraction of signal variance that was accounted for by the fit (n = 63), demonstrating that, for many units, the Gabor provided a good description of the data. Because our measured RFs had relatively large noise (up to 20% of the STA variance) as a result of the small number of spikes, we expect that longer movie presentations, or further optimization of movie parameters based on the tuning properties determined in this study, would produce STAs with lower noise and better fits.
The shape of a Gabor can be characterized by two numbers, n_x = σ_x f and n_y = σ_y f, which correspond to width along the sinusoid and perpendicular to the sinusoid, in terms of number of cycles of the sinusoid (Ringach, 2002). The ratio of n_x to n_y can be thought of as the aspect ratio, whereas the value of n_x is related to the number of subunits that can fit in the Gaussian. Figure 11 I shows the values of n_x and n_y for units that were best fit by the Gabor (fit error <0.33; n = 50) compared with similar data from macaque (Ringach, 2002). The mouse STA receptive fields overlap with most of the macaque data, indicating that they have similar shapes. However, the mouse does not have units with very large n_x, consistent with the fact that we did not observe STAs with more than three strong subunits. A histogram of the phase of the sinusoid (Fig. 11 J), mapped onto the range 0–90° to remove symmetries (Field and Tolhurst, 1986), shows an over-representation of odd-symmetric receptive fields. Estimates of the spatial phase distribution show a wide range in other studies, including uniform, peaked at 90°, or bimodal at 0 and 90° (Field and Tolhurst, 1986; Jones and Palmer, 1987b; DeAngelis et al., 1993; Ringach, 2002).
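A Gabor and its two shape numbers can be written down compactly. The sketch below assumes a standard parameterization (Gaussian envelope with widths σ along and across the direction of modulation, multiplied by a cosine of spatial frequency f and a phase offset); the parameter names and the functions `gabor` and `gabor_shape` are hypothetical and are not taken from the fitting procedure used in the study.

```python
import math


def gabor(x, y, sigma_x, sigma_y, freq, phase, theta=0.0, amplitude=1.0):
    """Value of a Gabor function at position (x, y), in degrees.

    sigma_x: envelope width along the direction in which the sinusoid varies.
    sigma_y: envelope width along the perpendicular direction.
    freq: spatial frequency in cycles per degree; phase, theta in radians.
    """
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-xr ** 2 / (2 * sigma_x ** 2)
                        - yr ** 2 / (2 * sigma_y ** 2))
    return amplitude * envelope * math.cos(2 * math.pi * freq * xr + phase)


def gabor_shape(sigma_x, sigma_y, freq):
    """Envelope widths expressed in cycles of the sinusoid: (n_x, n_y)."""
    return sigma_x * freq, sigma_y * freq
```

In this parameterization a phase of 0 gives an even-symmetric field with a central subregion, whereas a phase of π/2 gives an odd-symmetric field with a zero crossing at the center, the type reported as over-represented above.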
An important aspect of cortical visual responses is that orientation selectivity does not vary with increasing stimulus contrast, a phenomenon known as contrast-invariant tuning (Ferster and Miller, 2000; Priebe and Ferster, 2008). This constancy constrains models of cortical processing, because simple feedforward models such as the original Hubel and Wiesel proposal (Hubel and Wiesel, 1962) generally predict an increase in tuning width as the visual drive (e.g., contrast) is increased (Fig. 12 A, right). However, it is generally observed that, as contrast increases, the visual response increases but units do not lose selectivity, resulting in a multiplicative scaling of the tuning curve (Fig. 12 A, left). Numerous factors have been identified that could explain this property, including recurrent connectivity, inhibition, membrane potential fluctuations attributable to background activity, and the spike threshold, among others (Sompolinsky and Shapley, 1997; Ferster and Miller, 2000; Shapley et al., 2003; Finn et al., 2007; Priebe and Ferster, 2008).
We sought to determine whether this constancy is present in the mouse as well, to both assess the similarity in cortical processing to other species and ascertain whether genetic manipulations in mice could provide causal tests of proposed mechanisms of invariant tuning. For a subset of recordings (n = 6 animals), we presented drifting gratings of varying contrast and orientation, at fixed spatial frequency (0.04 cpd). A typical response from an oriented unit is shown in Figure 12 B, demonstrating contrast-invariant tuning, as the tuning curve only changes in amplitude, not shape, as contrast increases. Aligning and averaging the tuning curves of all orientation-selective units that responded to this spatial frequency (n = 22 units) resulted in the mean tuning curve observed in Figure 12 C, which shows no increase in width with increased contrast. Furthermore, a comparison of all individual orientation tuning curves at two different contrasts shows no systematic change in either orientation selectivity (Fig. 12 D) or tuning width (Fig. 12 E).
In addition to testing contrast-invariant tuning, this stimulus set allowed us to measure the response as a function of contrast for those units that were responsive to this spatial frequency. Figure 12 F shows the value of contrast that resulted in half-maximal response, which showed a wide range, with a median value of 19.8%. The slope of response over contrast at the half-maximal response is shown in Figure 12 G; for normalized contrast–response curves, a slope of 1 indicates a relatively linear increase in response with contrast, whereas a greater slope indicates a sharp transition from low to high response at the midsaturation contrast. Although many units had slope near 1, again there was a wide range, with a median value of 2.4.
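The two contrast-response summaries can be illustrated on a sampled curve. The sketch below is one possible reading, not the method used in the study: it normalizes the responses to their maximum, expresses contrast as a fraction between 0 and 1, and uses piecewise linear interpolation (rather than a fitted function) to locate the half-maximal contrast and the slope there. The function name is hypothetical.

```python
def half_max_contrast_and_slope(contrasts, rates):
    """Contrast at half-maximal response and the slope of the normalized curve.

    contrasts: increasing contrast values as fractions (0 to 1).
    rates: evoked response at each contrast.
    Returns (c50, slope), or None if the normalized curve never crosses 0.5.
    """
    r_max = max(rates)
    if r_max <= 0:
        return None
    normalized = [r / r_max for r in rates]
    for i in range(1, len(normalized)):
        low, high = normalized[i - 1], normalized[i]
        if low < 0.5 <= high:
            slope = (high - low) / (contrasts[i] - contrasts[i - 1])
            c50 = contrasts[i - 1] + (0.5 - low) / slope
            return c50, slope
    return None
```

A response that grows in direct proportion to contrast gives a half-maximal contrast of 0.5 and a slope of 1; a steeper, more switch-like curve gives a larger slope.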
Despite the poor visual acuity (Prusky et al., 2000) and relatively small region of cortex devoted to visual processing, we find that most neurons in mouse V1 respond to at least one type of visual stimulus. We have determined the relevant range of stimulus parameters, such as spatial and temporal frequency, and orientation selectivity, which are effective in driving the strongest responses. This broad characterization should benefit future studies by guiding the design of appropriate visual stimuli. For instance, knowledge of orientation and spatial frequency tuning width can help specify the resolution at which these parameters should be sampled to properly characterize all cells. These results should also facilitate the design of optimal stimuli for other functional recording techniques that lack the single-spike temporal resolution of electrophysiology, such as intrinsic signal (Kalatsky and Stryker, 2003) or calcium imaging (Stosiek et al., 2003). Importantly, our finding that responses vary greatly across the layers indicates that there can be widely different results for simple measurements, such as the degree of orientation selectivity or receptive field size, depending on the distribution of recording sites in a given study.
Several technical differences from previous experiments may contribute to the responsiveness and selectivity we observed. In terms of surgery, performing a small craniotomy and inserting the electrode at an angle to the cortical surface, which may reduce damage to superficial recording sites, aided in isolating units and evoking strong responses from the superficial layers, in which the most selective cells were found. In contrast to traditional single-electrode recording, which involves advancing an electrode until a responding unit is found, we allowed our multisite electrode to settle at one position and recorded any units that could be detected over 1 h or more of recording time. This enabled us to isolate units with low spontaneous rates and sparse responses. Indeed, some units were isolated that produced only a few hundred spikes over 1 h of recording. Traditional recording techniques that search for responses are most likely to detect units with high firing rates or large extracellular action potentials, which in mice tend to be the less selective layer 5 and putative inhibitory units. Furthermore, because isolated units were generally stable over this period, we were able to present a large number of stimuli, sampling a broad parameter space, to evoke responses from units with either very selective or poor responses.
We verified laminar identity using the LFP from geometrically defined recording sites established by the silicon probes, confirmed in several cases by retrospective histology. Although the layers in mouse cortex are not as clearly defined as in other species, with less distinct cytoarchitectural boundaries and a more extensive thalamic projection (Frost and Caviness, 1980; Antonini et al., 1999), we found clear differences in response types between these laminas.
Furthermore, spike waveforms were used to segregate units into narrow-spiking and broad-spiking categories, which have been used frequently as a rough classification of inhibitory versus excitatory neurons (Rao et al., 1999; Bruno and Simons, 2002; Andermann et al., 2004; Mitchell et al., 2007). There is still some uncertainty as to the exact cell types represented by this division, in particular whether the narrow-spiking class includes just the parvalbumin-positive fast-spiking units and whether some small fraction of narrow-spiking units may be excitatory (Swadlow, 2003). However, the profound difference in functional responses we find between these groups suggests that, although their exact categorization may be uncertain, narrow-spiking neurons are primarily a distinct class from broad-spiking neurons. Our designation of narrow-spiking units as putative inhibitory neurons is further supported by the recent finding, using two-photon calcium imaging in mouse, that inhibitory GABAergic neurons are much less orientation selective than non-GABAergic neurons (Sohya et al., 2007). It will be interesting to see whether such a dichotomy in orientation selectivity exists in other species that have a significant number of nonoriented units. Two recent studies in cat, in which almost all units are orientation selective, did show decreased orientation selectivity among presumed inhibitory neurons in layer 4 (Cardin et al., 2007; Nowak et al., 2008).
Comparison with previous studies
Our results generally agree with previous studies in mice, particularly the finding of greater orientation selectivity in superficial layers (Mangini and Pearlman, 1980; Metin et al., 1988), and the large number of poorly orientation-selective units with large receptive fields in layer 5, which have been suggested previously to be corticotectal neurons (Mangini and Pearlman, 1980). Our findings are also in agreement with recent studies in other rodents such as rat (Girman et al., 1999; Ohki et al., 2005) and squirrel (Van Hooser et al., 2005), confirming that high orientation selectivity can exist in the absence of large-scale organization such as an orientation map. We did not observe many of the simple, nonoriented responses in layer 4 that have been described in previous mouse studies (Mangini and Pearlman, 1980; Metin et al., 1988). It is likely that previous recordings with sharp tungsten electrodes more readily isolated afferents of LGN neurons, which generally have linear nonoriented responses. It is also possible that sinusoidal gratings elicit greater selectivity than small spots or flashed stimuli, used in some previous studies.
Our results are also consistent with previous findings in the mouse dorsal lateral geniculate nucleus (dLGN) of thalamus, which provides direct input to V1. Grubb and Thompson (2003) found a median peak spatial frequency response of 0.027 cpd, similar to our finding of 0.036 cpd. Our measurement of peak temporal frequency of 1.7 Hz, obtained with contrast-reversing rather than drifting gratings, shows a decrease from 3.8 Hz measured in dLGN, similar to the decrease in temporal frequency response from thalamus to cortex found in other species (Hawken et al., 1996). We do find slightly higher temporal frequency tuning in layer 4, consistent with its receiving direct input from thalamus.
It is also interesting to compare these findings in mouse with those of “higher” species whose visual systems have been more thoroughly studied (for an extensive comparison of tuning properties across species, see Van Hooser, 2007). The most readily apparent difference is in spatial scale, with median spatial frequency in mouse of 0.04 cpd compared with 0.9 cpd in cat and up to 4 cpd in primates. The total percentage of oriented units (74% of responsive units) is slightly lower, with many other species having 80% or more, although >95% of units in cat are reported to be orientation selective (Van Hooser, 2007). Within the population of oriented units, the width of tuning was close to that of other model species, with median orientation tuning half-width of 28–29° for mouse compared with 19–25° in cat and 24° in macaque (Van Hooser, 2007). The spatial frequency bandwidth of 2.5 octaves in mouse is somewhat larger than the value of 1.5 octaves in cat and macaque but close to owl monkey, 2.1 octaves, and squirrel, 2.3 octaves (Van Hooser, 2007). Thus, even at the low-resolution limit of the mouse visual system, neurons can still be nearly as selective for orientation and spatial frequency as in species with more developed visual systems. As pointed out previously (Van Hooser, 2007), these tuning widths seem to be fairly invariant across species and may represent common constraints on cortical processing.
One striking difference in mice is the relative paucity of complex cells in layer 2/3, in which we found that 75% of responsive units were simple, whereas in other species the majority of units outside of layer 4 are complex (Van Hooser, 2007). It is interesting to speculate that this may be related to the absence of orientation columns in mouse. Most models of complex responses involve pooling inputs of similar orientation, which can result from nearest-neighbor connectivity in the presence of orientation columns. However, without columnar anatomical organization, greater constraints on synaptic specificity would be needed to preserve orientation selectivity while pooling inputs. It must be noted, however, that the squirrel has a large proportion of complex cells but lacks an orientation map (Van Hooser et al., 2005).
The relative lack of orientation and phase selectivity in putative inhibitory neurons has interesting implications for cortical processing. They would be well situated to provide signals such as divisive normalization (Carandini et al., 1997), balanced inhibition and excitation (Chance et al., 2002), and untuned inhibition for contrast-invariant tuning (Lauritzen and Miller, 2003), as well as to couple net activity to metabolism or blood flow (Buzsáki et al., 2007). Conversely, many models of orientation selectivity require orientation-selective inhibition (Ferster and Miller, 2000), and the net inhibitory input to cat cortical cells has in fact been shown to be orientation tuned (Anderson et al., 2000). Only a small fraction of putative inhibitory units we recorded have sufficient orientation selectivity to provide such a signal. It is possible that this fraction of oriented narrow-spiking units, or perhaps inhibitory neurons among the broad-spiking units, serves this distinct role; alternatively, tuned inhibition may not be necessary for contrast-invariant orientation selectivity (Finn et al., 2007; Priebe and Ferster, 2008).
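To make concrete how an untuned signal could support contrast-invariant tuning, the following sketch applies a generic divisive normalization to a population of orientation-tuned units. It is an illustration under stated assumptions rather than a reimplementation of any of the cited models: the Gaussian tuning of the drive, the exponent `n`, the constant `sigma`, and all function names are hypothetical choices made here.

```python
import numpy as np

# Preferred orientations of the model units, evenly tiling 0-180 degrees (assumed).
PREFERRED = np.deg2rad(np.arange(0, 180, 15))


def tuned_drive(stim_ori, contrast, width=np.deg2rad(30)):
    """Excitatory drive to each unit: Gaussian in orientation difference, scaled by contrast."""
    # Wrap the orientation difference into (-90, 90] degrees (orientation is 180-periodic).
    delta = np.angle(np.exp(2j * (PREFERRED - stim_ori))) / 2.0
    return contrast * np.exp(-0.5 * (delta / width) ** 2)


def normalized_response(drive, sigma=0.1, n=2.0):
    """Divide each unit's drive by an untuned pool summed over all units."""
    pool = np.sum(drive ** n)  # identical for every unit, hence untuned
    return drive ** n / (sigma ** n + pool)


stim = np.deg2rad(90)
low = normalized_response(tuned_drive(stim, contrast=0.1))
high = normalized_response(tuned_drive(stim, contrast=0.8))

# The peak response grows with contrast, but the shape of the tuning curve does not change.
print("peak response, low vs high contrast:", low.max(), high.max())
print("tuning shape unchanged:", np.allclose(low / low.max(), high / high.max()))
```

Because the divisor is the same for every unit, it rescales the whole tuning curve without changing its width, so the normalized curves at the two contrasts coincide. Whether the weakly tuned, narrow-spiking units described here actually play this role is not something the sketch can address.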
The presence of strong selectivity and differential responses by layer and cell type opens the door for additional study of cortical development and function, taking advantage of recent technical advances using mouse genetics to study neural circuitry. For example, viral-based labels for synaptic connectivity (Wickersham et al., 2007) could be used to investigate the relative tuning properties of connected neurons and, in combination with cell-type-specific markers, could elucidate the microcircuit that gives rise to selective responses. The use of quantitative visual analysis with manipulations of genes involved in growth and synaptic plasticity may elucidate the developmental processes that result in specific receptive field properties, such as organized ON and OFF subregions resulting in orientation selectivity and simple versus complex cell types. Finally, the ability to make precise manipulations of neural activity in defined cell populations (Boyden et al., 2005; Tan et al., 2006; Zhang et al., 2007b) should allow direct causal tests of models of cortical processing.
This work was supported by National Institutes of Health Grant EY02874 (M.P.S.) and a Helen Hay Whitney Foundation fellowship (C.M.N.). We thank Drs. Jonathan Horton, Steve Lisberger, and Linda Wilbrecht and members of the Stryker laboratory for comments on this manuscript and helpful discussions. We also thank Drs. Michael Lewicki and Eizaburo Doi for providing routines to fit Gabor functions and Dr. Dario Ringach for data on Gabor fits and circular variance from macaque.
- Correspondence should be addressed to Michael P. Stryker, Department of Physiology, 513 Parnassus Avenue, Room HSE-802, University of California, San Francisco, San Francisco, CA 94143-0444.
- Andermann et al., 2004.
- Anderson et al., 2000.
- Antonini et al., 1999.
- Atencio and Schreiner, 2008.
- Barthó et al., 2004.
- Boyden et al., 2005.
- Brainard, 1997.
- Brecht et al., 2003.
- Brecht et al., 2004.
- Bruno and Simons, 2002.
- Buzsáki and Draguhn, 2004.
- Buzsáki et al., 2007.
- Callaway, 2005.
- Cang et al., 2005.
- Carandini et al., 1997.
- Carandini et al., 2005.
- Cardin et al., 2007.
- Chance et al., 2002.
- Chichilnisky, 2001.
- Connors and Kriegstein, 1986.
- DeAngelis et al., 1993.
- DeWeese et al., 2003.
- Douglas and Martin, 2004.
- Dräger, 1975.
- Fagiolini et al., 1994.
- Fagiolini et al., 2003.
- Fagiolini et al., 2004.
- Feng et al., 2000.
- Ferster and Miller, 2000.
- Field and Tolhurst, 1986.
- Finn et al., 2007.
- Frost and Caviness, 1980.
- Gilbert, 1983.
- Girman et al., 1999.
- Grubb and Thompson, 2003.
- Harris et al., 2000.
- Hasenstaub et al., 2005.
- Hawken and Parker, 1987.
- Hawken et al., 1996.
- Heimel et al., 2005.
- Hensch, 2005.
- Hofer et al., 2006.
- Huang et al., 1999.
- Hubel and Wiesel, 1962.
- Hübener, 2003.
- Issa et al., 2000.
- Jeon et al., 1998.
- Jones and Palmer, 1987a.
- Jones and Palmer, 1987b.
- Kalatsky and Stryker, 2003.
- Karpova et al., 2005.
- Lauritzen and Miller, 2003.
- Lee et al., 2007.
- Mangini and Pearlman, 1980.
- McCormick et al., 1985.
- Mechler and Ringach, 2002.
- Métin et al., 1988.
- Mitchell et al., 2007.
- Mitzdorf, 1985.
- Naka and Rushton, 1966.
- Nase et al., 2003.
- Nowak et al., 2008.
- Ohki and Reid, 2007.
- Ohki et al., 2005.
- Pelli, 1997.
- Polleux et al., 2007.
- Priebe and Ferster, 2008.
- Prusky et al., 2000.
- Rao et al., 1999.
- Rash and Grove, 2006.
- Ringach, 2002.
- Ringach, 2004.
- Ringach et al., 1997.
- Ringach et al., 2002.
- Sato et al., 2007.
- Sawtell et al., 2003.
- Schmitzer-Torbert et al., 2005.
- Shapley et al., 2003.
- Sharpee et al., 2004.
- Skottun et al., 1991.
- Snellings et al., 2006.
- Sohya et al., 2007.
- Sompolinsky and Shapley, 1997.
- Stosiek et al., 2003.
- Swadlow, 2003.
- Swadlow et al., 2002.
- Syken et al., 2006.
- Tamamaki et al., 2003.
- Tan et al., 2006.
- Theunissen et al., 2001.
- Van Hooser, 2007.
- Van Hooser et al., 2005.
- Waites et al., 2005.
- Wickersham et al., 2007.
- Zeng et al., 2001.
- Zhang et al., 2007a.
- Zhang et al., 2007b.